diff --git a/MANIFEST.in b/MANIFEST.in
index 16425816f..7e29b03cc 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,4 +1,4 @@
include requirements*.txt
recursive-include docs *.xml *.xsl *.tpl *.html *.js *.css *.cfg *.bft *.rst
recursive-include invenio *.xml *.xsl *.tpl *.html *.js *.css *.cfg *.bft *.rst
-recursive-include invenio_atlantis *.xml *.xsl *.tpl *.html *.js *.css *.cfg *.bft *.rst
+recursive-include invenio_demosite *.xml *.xsl *.tpl *.html *.js *.css *.cfg *.bft *.rst
diff --git a/docs/developers/templates.rst b/docs/developers/templates.rst
index 17ee76787..4418404a4 100644
--- a/docs/developers/templates.rst
+++ b/docs/developers/templates.rst
@@ -1,122 +1,122 @@
.. _developers-templates:
Templates
=========
`Jinja2`_ is a modern and designer-friendly templating language for Python.
We will summarize the adoption of Jinja2 and describe best
practices for writing easily reusable and extendable templates.
.. note:: Please follow these simple rules when creating new templates, to
allow different Invenio installations (CDS, Inspire, ZENODO, ...) to
easily customize templates and keep in sync with updates to templates.
Using the rules will greatly reduce the amount of copy/pasting the
installations will have to do.
* ``<module>/views.py``::
# ... somewhere in the blueprint code:
- render_template(['<module>_<name>.html'], ctx)
+ render_template(['<module>/<name>.html'], ctx)
* ``<module>/templates/<module>/<name>_base.html``::
{#
# The base template usually extends from the main Invenio template, or perhaps
# from a module specific main template. It contains nearly all the HTML code.
#}
{% extends "page.html" %}
{#
# Macros
# * Make template blocks clearer so you don't clutter up a lot of HTML with
#   rendering logic - e.g. you wouldn't put your Python code in one big
#   function; instead you split it up into small functions with clearly
#   defined responsibilities.
# * Macros can be parameterized.
#}
{%- macro action_bar(show_delete=True) %}
{# Macros can be overwritten in child template, but only calls within the
# child template will call the new macro. Hence, if you just want to
- # overwrite the action_bar macro in <module>_<name>.html, you must also
+ # overwrite the action_bar macro in <module>/<name>.html, you must also
# copy/paste the form_header and form_footer blocks where it's used,
# otherwise the old macro will be used. To avoid this problem, please
# instead just include a template inside the macro. This allows another
# Invenio installation to overwrite just this part.
#}
- {% include "<module>_<name>_action_bar.html"%}
+ {% include "<module>/<name>_action_bar.html"%}
{% endmacro %}
{%- macro render_field(thisfield, with_label=True) %}
- {% include "<module>_<name>_render_field.html"%}
+ {% include "<module>/<name>_render_field.html"%}
{%- endmacro %}
{#
# Blocks
# * Think of template blocks as the API other Invenio installations will
# use to customize the Invenio layout. An Invenio installation can override
# blocks defined in your templates so that they keep their own changes
# to a minimum, and don't copy/paste large parts of the template code.
# * Use blocks liberally - to allow customizations of your template.
# * Add the template block name to the {% endblock <name> %} to increase
# readability of template code.
#}
{% block body %}
<div>
{%- block form_header scoped %}{{action_bar()}}{% endblock form_header %}
{%- block form_title scoped %}<h1>{{ form._title }}</h1>{% endblock form_title %}
{%- block form_body scoped %}
<fieldset>
{%- for field in fields %}
{#
# Use the "scoped" modifier to make outer variables available inside
# the block; e.g. without it, the loop variable would not be
# available inside the block.
#}
{%- block field_body scoped %}
{{ render_field(field) }}
{% if loop.last %}<hr />{% endif %}
{%- endblock field_body %}
{%- endfor %}
</fieldset>
{% endblock form_body %}
{% block form_footer scoped %}{{action_bar(show_delete=False)}}{% endblock form_footer %}
</div>
{% endblock body %}
* ``<module>/templates/<module>/<name>.html``::
{#
# The template actually being rendered by the blueprint. It only extends the
# base template. Doing it this way allows an Invenio installation to overwrite
# just the blocks it needs, instead of having to implement the entire
# template.
#}
- {% extends "<module>_<name>_base.html" %}
+ {% extends "<module>/<name>_base.html" %}
* ``<mypackage>/templates/<module>/<name>.html``::
{#
# Here's an example of an Invenio installation which just overwrites the
# necessary template block.
#}
- {% extends "<module>_<name>_base.html" %}
+ {% extends "<module>/<name>_base.html" %}
{%- block field_body %}
{%- if field.name == 'awesomefield' %}
{{ render_field(field, class="awesomeness") }}
{% else %}
{{ render_field(field) }}
{%- endif %}
{% if loop.last %}<hr />{% endif %}
{%- endblock field_body %}
.. _Flask: http://flask.pocoo.org/
.. _Jinja2: http://jinja.pocoo.org/2/
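The rename from ``<module>_<name>.html`` to ``<module>/<name>.html`` above relies on ``render_template`` accepting a list of candidate names and using the first one that exists, which is what lets an installation's template shadow the module's default. A minimal stdlib-only sketch of that fallback lookup (a plain dict stands in for the Jinja2 template loader; all names are made up for illustration):

```python
# Minimal sketch of "first existing template wins", as used by
# render_template(['<module>/<name>.html'], ctx). A dict stands in
# for the template loader used by the real framework.
templates = {
    'deposit/form.html': 'site override: {title}',
    'deposit/form_base.html': 'module default: {title}',
}

def select_template(candidates, store):
    """Return the first candidate name present in the store."""
    for name in candidates:
        if name in store:
            return name
    raise LookupError('none of %r found' % (candidates,))

def render(candidates, store, **ctx):
    return store[select_template(candidates, store)].format(**ctx)

# The installation's template shadows the module's base template:
print(render(['deposit/form.html', 'deposit/form_base.html'],
             templates, title='Upload'))  # site override: Upload
```

If the installation-specific template is absent, the lookup silently falls back to the module's base template, which is why the convention keeps copy/pasting to a minimum.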
diff --git a/invenio/base/decorators.py b/invenio/base/decorators.py
index 961e2a204..d5afce12e 100644
--- a/invenio/base/decorators.py
+++ b/invenio/base/decorators.py
@@ -1,204 +1,204 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
invenio.base.decorators
-----------------------
Implements custom decorators.
Requires::
* `invenio.ext.sqlalchemy:db`
* `invenio.ext.template.context_processor:\
register_template_context_processor`
"""
from flask import request, jsonify, current_app, stream_with_context, \
Response, render_template, get_flashed_messages, flash, g
from functools import wraps
from sqlalchemy.sql import operators
from invenio.ext.sqlalchemy import db
from invenio.ext.template.context_processor import \
register_template_context_processor
def templated(template=None, stream=False, mimetype='text/html'):
"""
The idea of this decorator is that the view function returns a dictionary
of values to be passed to the template; the template is then rendered
automatically, or a JSON response is generated if the request is an
AJAX request.
see::
http://flask.pocoo.org/docs/patterns/viewdecorators/
"""
def stream_template(template_name, **context):
"""
Jinja2 supports rendering templates piece by piece; however, the
request context is not kept for the whole duration of the template
rendering process.
see::
http://flask.pocoo.org/docs/patterns/streaming/
"""
current_app.update_template_context(context)
t = current_app.jinja_env.get_template(template_name)
rv = t.stream(context)
return stream_with_context(rv)
if stream:
render = lambda template, **ctx: Response(
stream_template(template, **ctx), mimetype=mimetype)
else:
render = render_template
def decorator(f):
@wraps(f)
def decorated_function(*args, **kwargs):
template_name = template
if template_name is None:
template_name = request.endpoint \
.replace('.', '/') + '.html'
ctx = f(*args, **kwargs)
if ctx is None:
ctx = {}
elif not isinstance(ctx, dict):
return ctx
if request.is_xhr:
#FIXME think about more possible types
for k, v in ctx.iteritems():
if isinstance(v, list):
try:
ctx[k] = [dict(zip(x.keys(),
[dict(i) for i in x])
) for x in v]
except:
ctx[k] = v
else:
try:
ctx[k] = dict(v)
except:
ctx[k] = v
ctx['_messages'] = get_flashed_messages(with_categories=True)
return jsonify(**ctx)
return render(template_name, **ctx)
return decorated_function
return decorator
def sorted_by(model=None, cols=None):
"""
This decorator fills the `sort` argument with the `ORDER BY` expression
used by SQLAlchemy, based on the URL arguments `sort_by` and `order`.
"""
def decorator(f):
@wraps(f)
def decorated_function(*args, **kwargs):
sort_by = request.args.get('sort_by', None)
order_fn = {'asc': db.asc,
'desc': db.desc}.get(request.args.get('order', 'asc'),
db.asc)
sort = False
if model is not None and sort_by is not None and (
cols is None or sort_by in cols):
try:
sort_keys = sort_by.split('.')
if hasattr(model, sort_keys[0]):
sort = order_fn(reduce(lambda x, y: getattr(
x.property.table.columns, y), sort_keys[1:],
getattr(model, sort_keys[0])))
except:
flash(g._("Invalid sorting key '%s'.") % sort_by)
kwargs['sort'] = sort
return f(*args, **kwargs)
return decorated_function
return decorator
def filtered_by(model=None, columns=None, form=None, filter_empty=False):
"""
This decorator adds a `filter` argument with the `WHERE` expression.
The `filter_form` is also injected into the template context.
"""
def decorator(f):
@wraps(f)
def decorated_function(*args, **kwargs):
if not model or not columns:
return f(*args, **kwargs)
where = []
for column, op in columns.iteritems():
try:
values = request.values.getlist(column)
if not values:
continue
column_keys = column.split('.')
if hasattr(model, column_keys[0]):
cond = reduce(lambda x, y:
getattr(x.property.table.columns, y),
column_keys[1:],
getattr(model, column_keys[0]))
current_app.logger.debug("Filtering by: %s = %s" %
(cond, values))
# Multi-values support
if len(values) > 0:
# Ignore empty values when using start with,
# contains or similar.
# FIXME: add per field configuration
values = [value for value in values
if len(value) > 0 or filter_empty]
if op == operators.eq:
where.append(db.in_(values))
else:
or_list = []
for value in values:
or_list.append(op(cond, value))
where.append(db.or_(*or_list))
else:
where.append(op(cond, value))
except:
flash(g._("Invalid filtering key '%s'.") % column)
if form is not None:
filter_form = form(request.values)
@register_template_context_processor
def inject_filter_form():
return dict(filter_form=filter_form)
# Generate ClauseElement for filtered columns.
kwargs['filter'] = db.and_(*where)
return f(*args, **kwargs)
return decorated_function
return decorator
def wash_arguments(config):
def _decorated(f):
@wraps(f)
def decorator(*args, **kwargs):
- from invenio.base.washers import wash_urlargd
+ from invenio.utils.washers import wash_urlargd
argd = wash_urlargd(request.values, config)
argd.update(kwargs)
return f(*args, **argd)
return decorator
return _decorated
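Both ``sorted_by`` and ``filtered_by`` above resolve a dotted URL argument such as ``sort_by=creator.name`` into a column by reducing an attribute lookup over the path segments. The same idea, stripped of SQLAlchemy (the classes here are hypothetical stand-ins for mapped models, and the simplified ``reduce(getattr, ...)`` replaces the column-table traversal in the real code):

```python
from functools import reduce

# Hypothetical stand-ins for SQLAlchemy models with nested attributes.
class Name:
    first = 'first-name-column'

class Creator:
    name = Name

class Record:
    creator = Creator

def resolve(model, dotted):
    """Walk the attribute path 'a.b.c' on model, the way sorted_by
    walks sort_keys = sort_by.split('.')."""
    keys = dotted.split('.')
    return reduce(getattr, keys, model)

print(resolve(Record, 'creator.name.first'))  # first-name-column
```

In the decorators the reduction additionally hops through ``x.property.table.columns`` at each step, but the control flow - split on dots, then fold ``getattr`` over the segments - is the same.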
diff --git a/invenio/base/factory.py b/invenio/base/factory.py
index 543fb8c50..c8d3fbd9a 100644
--- a/invenio/base/factory.py
+++ b/invenio/base/factory.py
@@ -1,217 +1,219 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
invenio.base.factory
--------------------
Implements application factory.
"""
+import warnings
+
#from invenio.ext.logging import register_exception
from .helpers import with_app_context, unicodifier
from .utils import collect_blueprints, register_extensions, \
register_configurations
from .wrappers import Flask
__all__ = ['create_app', 'with_app_context']
def create_app(**kwargs_config):
"""
Prepare WSGI Invenio application based on Flask.
Invenio consists of a new Flask application with support for the
legacy WSGI application and the legacy Python scripts
(URLs to *.py files).
An incoming request is processed in the following manner:
* The Flask application first routes request via its URL routing
system (see LegacyAppMiddleware.__call__()).
* One route in the Flask system will match Python legacy
scripts (see static_handler_with_legacy_publisher()).
* If the Flask application aborts the request with a 404 error, the request
is passed on to the WSGI legacy application (see page_not_found()). E.g.
either the Flask application did not find a route, or a view aborted the
request with a 404 error.
"""
## The Flask application instance
_app = Flask('.'.join(__name__.split('.')[0:2]),
## Static files are usually handled directly by the webserver (e.g. Apache)
## However in case WSGI is required to handle static files too (such
## as when running simple server), then this flag can be
## turned on (it is done automatically by wsgi_handler_test).
## We assume anything under '/' which is static to be served directly
## by the webserver from CFG_WEBDIR. In order to generate an independent
## url for static files use func:`url_for('static', filename='test')`.
static_url_path='',
template_folder='templates',
instance_relative_config=True,
)
# Handle urls both with and without trailing slashes in Flask.
# @blueprint.route('/test')
# @blueprint.route('/test/') -> not necessary when strict_slashes == False
_app.url_map.strict_slashes = False
# Load invenio.conf
_app.config.from_object('invenio.base.config')
try:
#print _app.instance_path
import os
os.makedirs(_app.instance_path)
except:
pass
# Load invenio.cfg
_app.config.from_pyfile('invenio.cfg', silent=True)
## Update application config from parameters.
_app.config.update(kwargs_config)
## Database was here.
## First check that you have all rights to logs
- #from invenio.bibtask import check_running_process_user
+ #from invenio.legacy.bibsched.bibtask import check_running_process_user
#check_running_process_user()
#from invenio.base.i18n import language_list_long
def language_list_long():
return []
# Jinja2 hacks were here.
# See note on Jinja2 string decoding using ASCII codec instead of UTF8 in
# function documentation
# SECRET_KEY is needed by Flask Debug Toolbar
SECRET_KEY = _app.config.get('SECRET_KEY') or \
_app.config.get('CFG_SITE_SECRET_KEY', 'change_me')
if not SECRET_KEY or SECRET_KEY == 'change_me':
fill_secret_key = """
Set variable SECRET_KEY with random string in invenio.cfg.
You can use the following command:
$ %s
""" % ('inveniomanage config create secret-key', )
- print fill_secret_key
+ warnings.warn(fill_secret_key, UserWarning)
#try:
# raise Exception(fill_secret_key)
#except Exception:
# #register_exception(alert_admin=True,
# # subject="Missing CFG_SITE_SECRET_KEY")
# raise Exception(fill_secret_key)
_app.config["SECRET_KEY"] = SECRET_KEY
# Register extensions listed in invenio.cfg
register_extensions(_app)
# Extend application config with packages configuration.
register_configurations(_app)
# Debug toolbar was here
# Set email backend for Flask-Email plugin
# Mailutils were here
# SSLify was here
# Legacy was here
# Jinja2 Memcache Bytecode Cache was here.
# Jinja2 custom loader was here.
# SessionInterface was here.
## Set custom request class was here.
## ... and map certain common parameters
_app.config['CFG_LANGUAGE_LIST_LONG'] = [(lang, longname.decode('utf-8'))
for (lang, longname) in language_list_long()]
## Invenio is all using str objects. Let's change them to unicode
_app.config.update(unicodifier(dict(_app.config)))
from invenio.base import before_request_functions
before_request_functions.setup_app(_app)
# Cache was here
# Logging was here.
# Login manager was here.
# Main menu was here.
# Jinja2 extensions loading was here.
# Custom template filters were here.
# Gravatar bridge was here.
# Set the user language was here.
# Custom template filters loading was here.
def _invenio_blueprint_plugin_builder(plugin):
"""
Handy function to bridge pluginutils with (Invenio) blueprints.
"""
from flask import Blueprint
if 'blueprints' in dir(plugin):
candidates = getattr(plugin, 'blueprints')
elif 'blueprint' in dir(plugin):
candidates = [getattr(plugin, 'blueprint')]
else:
candidates = []
for candidate in candidates:
if isinstance(candidate, Blueprint):
if candidate.name in _app.config.get('CFG_FLASK_DISABLED_BLUEPRINTS', []):
_app.logger.info('%s is excluded by CFG_FLASK_DISABLED_BLUEPRINTS' % candidate.name)
return
return candidate
_app.logger.error('%s is not a valid blueprint plugin' % plugin.__name__)
## Let's load all the blueprints that are composing this Invenio instance
_BLUEPRINTS = [m for m in map(_invenio_blueprint_plugin_builder,
collect_blueprints(app=_app))
if m is not None]
## Let's attach all the blueprints
for plugin in _BLUEPRINTS:
_app.register_blueprint(plugin,
url_prefix=_app.config.get(
'BLUEPRINTS_URL_PREFIXES',
{}).get(plugin.name))
# Flask-Admin was here.
@_app.route('/testing')
def testing():
from flask import render_template
return render_template('404.html')
return _app
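``create_app`` above layers configuration from three sources: module defaults via ``from_object('invenio.base.config')``, then ``invenio.cfg`` from the instance folder via ``from_pyfile``, then the explicit ``kwargs_config`` overrides. That precedence can be sketched with plain dicts (file loading is replaced by a dict purely for illustration; the keys shown are examples):

```python
def build_config(defaults, instance_cfg, **kwargs_config):
    """Later sources win, mirroring the from_object ->
    from_pyfile -> config.update(kwargs_config) order in create_app."""
    config = dict(defaults)
    config.update(instance_cfg)   # invenio.cfg overrides package defaults
    config.update(kwargs_config)  # explicit kwargs override everything
    return config

defaults = {'SECRET_KEY': 'change_me', 'DEBUG': False}
instance = {'SECRET_KEY': 's3cret'}
cfg = build_config(defaults, instance, DEBUG=True)
print(cfg)  # {'SECRET_KEY': 's3cret', 'DEBUG': True}
```

This ordering is also why the ``SECRET_KEY`` check comes after all three layers have been merged: only then can the factory tell whether any source replaced the ``change_me`` placeholder.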
diff --git a/invenio/base/scripts/cache.py b/invenio/base/scripts/cache.py
index 993177a4c..94a50fbf0 100644
--- a/invenio/base/scripts/cache.py
+++ b/invenio/base/scripts/cache.py
@@ -1,102 +1,102 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from flask import current_app
from invenio.ext.script import Manager, change_command_name
manager = Manager(usage="Perform cache operations")
def reset_rec_cache(output_format, get_record, split_by=1000):
"""Reset the cache for the given output_format.
If CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE is changed, this function
will adapt the database to either store or not store the output_format."""
import sys
try:
import cPickle as pickle
except:
import pickle
from itertools import islice
from invenio.intbitset import intbitset
- from invenio.bibsched import server_pid, pidfile
+ from invenio.legacy.bibsched.scripts.bibsched import server_pid, pidfile
from invenio.ext.sqlalchemy import db
from invenio.modules.record_editor.models import Bibrec
from invenio.modules.formatter.models import Bibfmt
pid = server_pid(ping_the_process=False)
if pid:
print >> sys.stderr, "ERROR: bibsched seems to run with pid %d, according to %s." % (pid, pidfile)
print >> sys.stderr, " Please stop bibsched before running this procedure."
sys.exit(1)
if current_app.config.get('CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE'):
print ">>> Searching records which need %s cache resetting; this may take a while..." % (output_format, )
all_recids = intbitset(db.session.query(Bibrec.id).all())
#TODO: prevent doing all records?
recids = all_recids
print ">>> Generating %s cache..." % (output_format, )
tot = len(recids)
count = 0
it = iter(recids)
while True:
rec_group = tuple(islice(it, split_by))
if not len(rec_group):
break
Bibfmt.query.filter(db.and_(
Bibfmt.id_bibrec.in_(rec_group),
Bibfmt.format == output_format)).delete(synchronize_session=False)
db.session.commit()
#TODO: Update the cache or wait for the first access
map(get_record, rec_group)
count += len(rec_group)
print " ... done records %s/%s" % (count, tot)
if len(rec_group) < split_by or count >= tot:
break
print ">>> %s cache generated successfully." % (output_format, )
else:
print ">>> Cleaning %s cache..." % (output_format, )
Bibfmt.query.filter(Bibfmt.format == output_format).delete(synchronize_session=False)
db.session.commit()
@manager.command
@change_command_name
def reset_recjson(split_by=1000):
"""Reset record json structure cache."""
from invenio.legacy.bibfield.bibfield_manager import reset
reset(split_by)
@manager.command
@change_command_name
def reset_recstruct(split_by=1000):
"""Reset record structure cache."""
from invenio.legacy.bibrecord.bibrecord_manager import reset
reset(split_by)
def main():
from invenio.base.factory import create_app
app = create_app()
manager.app = app
manager.run()
if __name__ == '__main__':
main()
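``reset_rec_cache`` above walks the record ids in groups of ``split_by`` by repeatedly calling ``tuple(islice(it, split_by))`` on a shared iterator; each call consumes the next batch, and an empty tuple signals exhaustion. The chunking loop, isolated from the database work:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield tuples of up to `size` items, like the
    rec_group = tuple(islice(it, split_by)) loop above."""
    it = iter(iterable)
    while True:
        group = tuple(islice(it, size))
        if not group:  # iterator exhausted
            break
        yield group

print(list(chunked(range(7), 3)))  # [(0, 1, 2), (3, 4, 5), (6,)]
```

Because ``islice`` draws from the same iterator each time, the groups never overlap, and the final group is simply shorter when the total is not a multiple of ``size``.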
diff --git a/invenio/base/scripts/database.py b/invenio/base/scripts/database.py
index f97ad691f..689658210 100644
--- a/invenio/base/scripts/database.py
+++ b/invenio/base/scripts/database.py
@@ -1,376 +1,376 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
import sys
import shutil
import datetime
from pipes import quote
from flask import current_app
from invenio.ext.script import Manager, change_command_name, print_progress
manager = Manager(usage="Perform database operations")
# Shortcuts for manager options to keep code DRY.
option_yes_i_know = manager.option('--yes-i-know', action='store_true',
dest='yes_i_know', help='use with care!')
option_default_data = manager.option('--no-data', action='store_false',
dest='default_data',
help='do not populate tables with default data')
@manager.option('-u', '--user', dest='user', default="root")
@manager.option('-p', '--password', dest='password', default="")
@option_yes_i_know
def init(user='root', password='', yes_i_know=False):
"""Initializes database and user."""
from invenio.ext.sqlalchemy import db
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
## Step 0: confirm deletion
wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy your database tables! Run first `inveniomanage database drop`."""))
## Step 1: drop database and recreate it
if db.engine.name == 'mysql':
#FIXME improve escaping
args = dict((k, str(v).replace('$', '\$'))
for (k, v) in current_app.config.iteritems()
if k.startswith('CFG_DATABASE'))
args = dict(zip(args, map(quote, args.values())))
prefix = ('{cmd} -u {user} --password={password} '
'-h {CFG_DATABASE_HOST} -P {CFG_DATABASE_PORT} ')
cmd_prefix = prefix.format(cmd='mysql', user=user, password=password,
**args)
cmd_admin_prefix = prefix.format(cmd='mysqladmin', user=user,
password=password,
**args)
cmds = [
cmd_prefix + '-e "DROP DATABASE IF EXISTS {CFG_DATABASE_NAME}"',
(cmd_prefix + '-e "CREATE DATABASE IF NOT EXISTS '
'{CFG_DATABASE_NAME} DEFAULT CHARACTER SET utf8 '
'COLLATE utf8_general_ci"'),
# Create user and grant access to database.
(cmd_prefix + '-e "GRANT ALL PRIVILEGES ON '
'{CFG_DATABASE_USER}.* TO {CFG_DATABASE_NAME}@localhost '
'IDENTIFIED BY {CFG_DATABASE_PASS}"'),
cmd_admin_prefix + 'flush-privileges'
]
for cmd in cmds:
cmd = cmd.format(**args)
print cmd
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print '>>> Database has been installed.'
@option_yes_i_know
def drop(yes_i_know=False):
"""Drops database tables"""
print ">>> Going to drop tables and related data on filesystem ..."
from sqlalchemy import event
from invenio.utils.date import get_time_estimator
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
- from invenio.webstat import destroy_customevents
+ from invenio.legacy.webstat.api import destroy_customevents
from invenio.legacy.inveniocfg import test_db_connection
from invenio.base.utils import autodiscover_models
from invenio.ext.sqlalchemy import db
- from invenio.bibdocfile import _make_base_dir
+ from invenio.legacy.bibdocfile.api import _make_base_dir
## Step 0: confirm deletion
wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy your database tables and related data on filesystem!"""))
## Step 1: test database connection
test_db_connection()
list(autodiscover_models())
## Step 2: disable foreign key checks
if db.engine.name == 'mysql':
db.engine.execute('SET FOREIGN_KEY_CHECKS=0;')
## Step 3: destroy associated data
try:
msg = destroy_customevents()
if msg:
print msg
except:
print "ERROR: Could not destroy customevents."
## FIXME: move to bibedit_model
def bibdoc_before_drop(target, connection_dummy, **kw_dummy):
print
print ">>> Going to remove records data..."
for (docid,) in db.session.query(target.c.id).all():
directory = _make_base_dir(docid)
if os.path.isdir(directory):
print ' >>> Removing files for docid =', docid
shutil.rmtree(directory)
db.session.commit()
print ">>> Data has been removed."
from invenio.modules.record_editor.models import Bibdoc
event.listen(Bibdoc.__table__, "before_drop", bibdoc_before_drop)
tables = list(reversed(db.metadata.sorted_tables))
N = len(tables)
prefix = '>>> Dropping %d tables ...' % N
e = get_time_estimator(N)
dropped = 0
for i, table in enumerate(tables):
try:
print_progress(1.0 * i / N, prefix=prefix,
suffix=str(datetime.timedelta(seconds=e()[0])))
table.drop(bind=db.engine)
dropped += 1
except:
print '\r', '>>> problem with dropping table', table
print
if dropped == N:
print ">>> Tables dropped successfully."
else:
print "ERROR: not all tables were properly dropped."
print ">>> Dropped", dropped, 'out of', N
@option_default_data
def create(default_data=True):
"""Creates database tables from sqlalchemy models"""
print ">>> Going to create tables..."
from sqlalchemy import event
from invenio.utils.date import get_time_estimator
from invenio.legacy.inveniocfg import test_db_connection
from invenio.base.utils import autodiscover_models
from invenio.ext.sqlalchemy import db
try:
test_db_connection()
except:
from invenio.ext.logging import get_tracestack
print get_tracestack()
list(autodiscover_models())
def cfv_after_create(target, connection, **kw):
print
print ">>> Modifying table structure..."
from invenio.legacy.dbquery import run_sql
run_sql('ALTER TABLE collection_field_fieldvalue DROP PRIMARY KEY')
run_sql('ALTER TABLE collection_field_fieldvalue ADD INDEX id_collection(id_collection)')
run_sql('ALTER TABLE collection_field_fieldvalue CHANGE id_fieldvalue id_fieldvalue mediumint(9) unsigned')
#print run_sql('SHOW CREATE TABLE collection_field_fieldvalue')
from invenio.modules.search.models import CollectionFieldFieldvalue
event.listen(CollectionFieldFieldvalue.__table__, "after_create", cfv_after_create)
tables = db.metadata.sorted_tables
N = len(tables)
prefix = '>>> Creating %d tables ...' % N
e = get_time_estimator(N)
created = 0
for i, table in enumerate(tables):
try:
print_progress(1.0 * i / N, prefix=prefix,
suffix=str(datetime.timedelta(seconds=e()[0])))
table.create(bind=db.engine)
created += 1
except:
print '\r', '>>> problem with creating table', table
print
if created == N:
print ">>> Tables created successfully."
else:
print "ERROR: not all tables were properly created."
print ">>> Created", created, 'out of', N
populate(default_data)
@option_yes_i_know
@option_default_data
def recreate(yes_i_know=False, default_data=True):
"""Recreates database tables (same as issuing 'drop' and then 'create')"""
drop()
create(default_data)
@manager.command
def uri():
"""Prints SQLAlchemy database uri."""
from flask import current_app
print current_app.config['SQLALCHEMY_DATABASE_URI']
def load_fixtures(packages=['invenio.modules.*'], truncate_tables_first=False):
from invenio.base.utils import autodiscover_models, \
import_module_from_packages
from invenio.ext.sqlalchemy import db
from fixture import SQLAlchemyFixture
fixture_modules = list(import_module_from_packages('fixtures',
packages=packages))
model_modules = list(autodiscover_models())
fixtures = dict((f, getattr(ff, f)) for ff in fixture_modules
for f in dir(ff) if f[-4:] == 'Data')
fixture_names = fixtures.keys()
models = dict((m+'Data', getattr(mm, m)) for mm in model_modules
for m in dir(mm) if m+'Data' in fixture_names)
dbfixture = SQLAlchemyFixture(env=models, engine=db.metadata.bind,
session=db.session)
data = dbfixture.data(*[f for (n, f) in fixtures.iteritems() if n in models])
if len(models) != len(fixtures):
print ">>> ERROR: There are", len(models), "tables and", len(fixtures), "fixtures."
print ">>>", set(fixture_names) ^ set(models.keys())
else:
print ">>> There are", len(models), "tables to be loaded."
if truncate_tables_first:
print ">>> Going to truncate following tables:",
print map(lambda t: t.__tablename__, models.values())
db.session.execute("TRUNCATE %s" % ('collectionname', ))
db.session.execute("TRUNCATE %s" % ('collection_externalcollection', ))
for m in models.values():
db.session.execute("TRUNCATE %s" % (m.__tablename__, ))
db.session.commit()
data.setup()
db.session.commit()
@option_default_data
@manager.option('--truncate', action='store_true',
dest='truncate_tables_first', help='use with care!')
def populate(default_data=True, truncate_tables_first=False):
"""Populate database with default data"""
from invenio.config import CFG_PREFIX
from invenio.base.scripts.config import get_conf
if not default_data:
print '>>> No data filled...'
return
print ">>> Going to fill tables..."
load_fixtures(truncate_tables_first=truncate_tables_first)
conf = get_conf()
from invenio.legacy.inveniocfg import cli_cmd_reset_sitename, \
cli_cmd_reset_siteadminemail, cli_cmd_reset_fieldnames
cli_cmd_reset_sitename(conf)
cli_cmd_reset_siteadminemail(conf)
cli_cmd_reset_fieldnames(conf)
for cmd in ["%s/bin/webaccessadmin -u admin -c -a" % CFG_PREFIX]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
from invenio.modules.upgrader.engine import InvenioUpgrader
iu = InvenioUpgrader()
map(iu.register_success, iu.get_upgrades())
print ">>> Tables filled successfully."
def version():
""" Get running version of database driver."""
from invenio.ext.sqlalchemy import db
try:
return db.engine.dialect.dbapi.__version__
except:
import MySQLdb
return MySQLdb.__version__
@manager.option('-v', '--verbose', action='store_true', dest='verbose',
help='Display more details (driver version).')
@change_command_name
def driver_info(verbose=False):
""" Get name of running database driver."""
from invenio.ext.sqlalchemy import db
try:
return db.engine.dialect.dbapi.__name__ + (('==' + version())
if verbose else '')
except:
import MySQLdb
return MySQLdb.__name__ + (('==' + version()) if verbose else '')
@manager.option('-l', '--line-format', dest='line_format', default="%s: %s")
@manager.option('-s', '--separator', dest='separator', default="\n")
@change_command_name
def mysql_info(separator=None, line_format=None):
"""
Detect and print MySQL details useful for debugging problems on various OS.
"""
from invenio.ext.sqlalchemy import db
if db.engine.name != 'mysql':
raise Exception('Database engine is not mysql.')
from invenio.legacy.dbquery import run_sql
out = []
for key, val in run_sql("SHOW VARIABLES LIKE 'version%'") + \
run_sql("SHOW VARIABLES LIKE 'charact%'") + \
run_sql("SHOW VARIABLES LIKE 'collat%'"):
if key in ['version',
'character_set_client',
'character_set_connection',
'character_set_database',
'character_set_results',
'character_set_server',
'character_set_system',
'collation_connection',
'collation_database',
'collation_server']:
out.append((key, val))
if separator is not None:
if line_format is None:
line_format = "%s: %s"
return separator.join(map(lambda i: line_format % i, out))
return dict(out)
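The tail of `mysql_info` has a dual return contract: a joined string when a separator is given, otherwise a dict of the collected pairs. A standalone sketch of that formatting logic, with hypothetical sample values rather than real server output:

```python
def render_variables(out, separator=None, line_format=None):
    # Same contract as the tail of mysql_info(): a joined string when a
    # separator is given, otherwise a dict of the collected (key, value) pairs.
    if separator is not None:
        if line_format is None:
            line_format = "%s: %s"
        return separator.join(line_format % pair for pair in out)
    return dict(out)

# Hypothetical sample values, not real server output.
pairs = [('version', '5.5.62'), ('collation_server', 'utf8_general_ci')]
print(render_variables(pairs, separator="\n"))
print(render_variables(pairs))
```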
def main():
from invenio.base.factory import create_app
app = create_app()
manager.app = app
manager.run()
if __name__ == '__main__':
main()
diff --git a/invenio/base/scripts/demosite.py b/invenio/base/scripts/demosite.py
index b381ccad9..dbca7d06b 100644
--- a/invenio/base/scripts/demosite.py
+++ b/invenio/base/scripts/demosite.py
@@ -1,127 +1,127 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
import sys
from invenio.ext.script import Manager
manager = Manager(usage="Perform demosite operations")
# Shortcuts for manager options to keep code DRY.
option_yes_i_know = manager.option('--yes-i-know', action='store_true',
dest='yes_i_know', help='use with care!')
option_default_data = manager.option('--no-data', action='store_false',
dest='default_data',
help='do not populate tables with default data')
@option_default_data
-def populate(packages=['invenio_atlantis'], default_data=True):
+def populate(packages=['invenio_demosite'], default_data=True):
"""Load demo records. Useful for testing purposes."""
if not default_data:
print '>>> Default data has been skipped (--no-data).'
return
from werkzeug.utils import import_string
map(import_string, packages)
from invenio.config import CFG_PREFIX
from invenio.ext.sqlalchemy import db
print ">>> Going to load demo records..."
db.session.execute("TRUNCATE schTASK")
db.session.commit()
for cmd in ["%s/bin/bibupload -u admin -i %s/var/tmp/demobibdata.xml" % (CFG_PREFIX, CFG_PREFIX),
"%s/bin/bibupload 1" % CFG_PREFIX,
"%s/bin/bibdocfile --textify --with-ocr --recid 97" % CFG_PREFIX,
"%s/bin/bibdocfile --textify --all" % CFG_PREFIX,
"%s/bin/bibindex -u admin" % CFG_PREFIX,
"%s/bin/bibindex 2" % CFG_PREFIX,
"%s/bin/bibreformat -u admin -o HB" % CFG_PREFIX,
"%s/bin/bibreformat 3" % CFG_PREFIX,
"%s/bin/webcoll -u admin" % CFG_PREFIX,
"%s/bin/webcoll 4" % CFG_PREFIX,
"%s/bin/bibrank -u admin" % CFG_PREFIX,
"%s/bin/bibrank 5" % CFG_PREFIX,
"%s/bin/bibsort -u admin -R" % CFG_PREFIX,
"%s/bin/bibsort 6" % CFG_PREFIX,
"%s/bin/oairepositoryupdater -u admin" % CFG_PREFIX,
"%s/bin/oairepositoryupdater 7" % CFG_PREFIX,
"%s/bin/bibupload 8" % CFG_PREFIX, ]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> Demo records loaded successfully."
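Both `populate` and `create` use the same run-commands-and-abort-on-first-failure loop. A Python 3 sketch of that pattern using `subprocess` instead of `os.system` (the sample commands are placeholders and assume a POSIX shell):

```python
import subprocess
import sys

def run_or_die(commands):
    # Same pattern as the loops above: run each shell command in order
    # and abort with exit code 1 at the first non-zero return code.
    for cmd in commands:
        if subprocess.call(cmd, shell=True) != 0:
            print("ERROR: failed execution of", cmd)
            sys.exit(1)

run_or_die(["true", "true"])  # both succeed, so execution continues
```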
@manager.command
-def create(packages=['invenio_atlantis']):
+def create(packages=['invenio_demosite']):
"""Populate database with demo site data."""
from invenio.ext.sqlalchemy import db
from invenio.config import CFG_PREFIX
from invenio.modules.accounts.models import User
from invenio.base.scripts.config import get_conf
print ">>> Going to create demo site..."
db.session.execute("TRUNCATE schTASK")
try:
db.session.execute("TRUNCATE session")
except:
pass
User.query.filter(User.email == '').delete()
db.session.commit()
from werkzeug.utils import import_string
map(import_string, packages)
from invenio.base.scripts.database import load_fixtures
load_fixtures(packages=packages, truncate_tables_first=True)
db.session.execute("UPDATE idxINDEX SET stemming_language='en' WHERE name IN ('global','abstract','keyword','title','fulltext');")
db.session.commit()
conf = get_conf()
from invenio.legacy.inveniocfg import cli_cmd_reset_sitename, \
cli_cmd_reset_siteadminemail, cli_cmd_reset_fieldnames
cli_cmd_reset_sitename(conf)
cli_cmd_reset_siteadminemail(conf)
cli_cmd_reset_fieldnames(conf) # needed for I18N demo ranking method names
for cmd in ["%s/bin/webaccessadmin -u admin -c -r -D" % CFG_PREFIX,
"%s/bin/webcoll -u admin" % CFG_PREFIX,
"%s/bin/webcoll 1" % CFG_PREFIX,
"%s/bin/bibsort -u admin --load-config" % CFG_PREFIX,
"%s/bin/bibsort 2" % CFG_PREFIX, ]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> Demo site created successfully."
def main():
from invenio.base.factory import create_app
app = create_app()
manager.app = app
manager.run()
if __name__ == '__main__':
main()
diff --git a/invenio/celery/__init__.py b/invenio/celery/__init__.py
index 92d54221d..9fe171b01 100644
--- a/invenio/celery/__init__.py
+++ b/invenio/celery/__init__.py
@@ -1,151 +1,153 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio Celery Loader
The loader's purpose is to load the modules defined in the invenio.*_tasks modules
"""
# Important for 'from celery import Celery' to find the right celery module.
from __future__ import absolute_import
from celery import Celery, signals
from celery.datastructures import DictAttribute
from celery.loaders.base import BaseLoader
from invenio.base.utils import autodiscover_celery_tasks
+from invenio.base.factory import with_app_context
class InvenioLoader(BaseLoader):
"""
The Invenio Celery loader - modeled after the Django Celery loader.
"""
def __init__(self, *args, **kwargs):
super(InvenioLoader, self).__init__(*args, **kwargs)
self._install_signal_handlers()
self.flask_app = None
self.db = None
def _install_signal_handlers(self):
# Need to close any open database connection after
# any embedded celerybeat process forks.
signals.beat_embedded_init.connect(self.close_database)
# Handlers for setting up the Flask request context
signals.task_prerun.connect(self.on_task_prerun)
signals.task_postrun.connect(self.on_task_postrun)
def _init_flask(self):
"""
Initialize Flask application.
The Flask application should only be created in the workers, thus
this method should not be called from the __init__ method.
"""
if not self.flask_app:
from flask import current_app
if current_app:
self.flask_app = current_app
else:
from invenio.base.factory import create_app
self.flask_app = create_app()
from invenio.ext.sqlalchemy import db
self.db = db
def on_task_init(self, task_id, task):
"""Called before every task."""
try:
is_eager = task.request.is_eager
except AttributeError:
is_eager = False
if not is_eager:
self.close_database()
def on_task_prerun(self, task=None, **dummy_kwargs):
"""
Called before a task is run - pushes a new Flask request context
for the task.
"""
task.request.flask_ctx = self.flask_app.test_request_context()
task.request.flask_ctx.push()
def on_task_postrun(self, task=None, **dummy_kwargs):
"""
Called after a task is run - pops the pushed Flask request context
for the task.
"""
task.request.flask_ctx.pop()
def on_process_cleanup(self):
"""Does everything necessary for Invenio to work in a long-living,
multiprocessing environment. Called after on_task_postrun.
"""
self.close_database()
def on_worker_init(self):
"""Called when the worker starts.
Automatically discovers any ``*_tasks.py`` files in the Invenio module.
"""
self.close_database()
def on_worker_process_init(self):
self.close_database()
+ @with_app_context()
def read_configuration(self):
""" Read configuration defined in invenio.celery.config """
usercfg = self._import_config_module('invenio.celery.config')
self.configured = True
from werkzeug.datastructures import CombinedMultiDict
from flask import current_app
return DictAttribute(CombinedMultiDict([current_app.config, usercfg]))
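In the `CombinedMultiDict` built above, the Flask application config takes precedence over the module config for duplicate keys. The stdlib `ChainMap` shows the same first-mapping-wins lookup behaviour; the config keys below are hypothetical examples, not values from the real `invenio.celery.config`:

```python
from collections import ChainMap

# Hypothetical config values for illustration only.
app_config = {'CELERY_RESULT_BACKEND': 'redis://localhost', 'DEBUG': True}
module_config = {'CELERY_RESULT_BACKEND': 'amqp://',
                 'CELERY_TASK_SERIALIZER': 'json'}

# The first mapping wins on duplicate keys, as with CombinedMultiDict lookups.
merged = ChainMap(app_config, module_config)
print(merged['CELERY_RESULT_BACKEND'])   # comes from app_config
print(merged['CELERY_TASK_SERIALIZER'])  # falls through to module_config
```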
def close_database(self, **dummy_kwargs):
if self.db:
self.db.session.remove()
def import_default_modules(self):
""" Called before on_worker_init """
# First setup Flask application
self._init_flask()
# Next import all task modules with a request context (otherwise
# the SQLAlchemy models cannot be imported).
with self.flask_app.test_request_context():
super(InvenioLoader, self).import_default_modules()
self.autodiscover()
def autodiscover(self):
"""
Discover task modules named 'invenio.modules.*.tasks'
"""
from invenio.celery import tasks
self.task_modules.update(tasks.__name__)
self.task_modules.update(
mod.__name__ for mod in autodiscover_celery_tasks() or ())
#
# Create main celery application
#
celery = Celery(
'invenio',
loader=InvenioLoader,
)
diff --git a/invenio/core/record/config_engine.py b/invenio/core/record/config_engine.py
index 654fe08ea..a9901f36e 100644
--- a/invenio/core/record/config_engine.py
+++ b/invenio/core/record/config_engine.py
@@ -1,712 +1,710 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
invenio.core.record.config_engine
---------------------------------
Record fields and models configuration loader.
This module uses `pyparsing <http://pyparsing.wikispaces.com/>` to read from the
different configuration files the field and model definitions.
"""
import os
import re
from invenio.base.globals import cfg
from invenio.base.utils import autodiscover_non_python_files
from pyparsing import ParseException, FollowedBy, Suppress, OneOrMore, Literal, \
LineEnd, ZeroOrMore, Optional, Forward, Word, QuotedString, alphas, nums, \
alphanums, originalTextFor, oneOf, nestedExpr, quotedString, removeQuotes, \
lineEnd, empty, col, restOfLine, delimitedList, indentedBlock
class FieldParserException(Exception):
"""Exception raised when some error happens parsing field definitions"""
pass
class ModelParserException(Exception):
"""Exception raised when some error happens parsing model definitions"""
pass
def _create_record_field_parser():
"""
Creates a parser that can handle field definitions.
BNF-like grammar::
line ::= python_doc_string | python_comment | include | field_def
include ::= "include(" PATH ")"
field_def ::= [persistent_identifier] [inherit_from] [override] json_id ["[0]" | "[n]"] "," aliases ":" INDENT field_def_body UNDENT
field_def_body ::= (creator | derived | calculated) [checker] [producer] [documentation]
aliases ::= json_id ["[0]" | "[n]"] ["," aliases]
json_id ::= (alphas + '_') (alphanums + '_')
creator ::= "creator:" INDENT creator_body+ UNDENT
creator_body ::= [decorators] source_format "," source_tags "," python_allowed_expr
source_format ::= MASTER_FORMATS
source_tag ::= QUOTED_STRING #This quoted string can contain a space separated list
derived ::= "derived:" INDENT derived_calculated_body UNDENT
calculated ::= "calculated:" INDENT derived_calculated_body UNDENT
derived_calculated_body ::= [decorators] python_allowed_exp
decorators ::= (legacy | do_not_cache | parse_first | depends_on | only_if | only_if_master_value)*
legacy ::= "@legacy(" correspondences+ ")"
correspondences ::= "(" source_tag [ "," tag_name ] "," json_id ")"
parse_first ::= "@parse_first(" json_id+ ")"
depends_on ::= "@depends_on(" json_id+ ")"
only_if ::= "@only_if(" python_condition+ ")"
only_if_master_value ::= "@only_if_master_value(" python_condition+ ")"
persistent_identifier ::= @persistent_identifier( level )
inherit_from ::= "@inherit_from(" json_id+ ")"
override ::= "@override"
extend ::= "@extend"
do_not_cache ::= "@do_not_cache"
checker ::= "checker:" INDENT checker_function+ UNDENT
documentation ::= INDENT doc_string UNDENT
doc_string ::= QUOTED_STRING #reStructuredText string
producer ::= "producer:" INDENT producer_body UNDENT
producer_body ::= producer_code "," python_dictionary
producer_code ::= ident
python_allowed_exp ::= ident | list_def | dict_def | list_access | dict_access | function_call | one_line_expr
"""
indent_stack = [1]
def check_sub_indent(str, location, tokens):
cur_col = col(location, str)
if cur_col > indent_stack[-1]:
indent_stack.append(cur_col)
else:
raise ParseException(str, location, "not a subentry")
def check_unindent(str, location, tokens):
if location >= len(str):
return
cur_col = col(location, str)
if not(cur_col < indent_stack[-1] and cur_col <= indent_stack[-2]):
raise ParseException(str, location, "not an unindent")
def do_unindent():
indent_stack.pop()
INDENT = lineEnd.suppress() + empty + empty.copy().setParseAction(check_sub_indent)
UNDENT = FollowedBy(empty).setParseAction(check_unindent)
UNDENT.setParseAction(do_unindent)
json_id = (Word(alphas + "_", alphanums + "_") + Optional(oneOf("[0] [n]")))\
.setResultsName("json_id", listAllMatches=True)\
.setParseAction(lambda tokens: "".join(tokens))
aliases = delimitedList((Word(alphanums + "_") + Optional(oneOf("[0] [n]")))
.setParseAction(lambda tokens: "".join(tokens)))\
.setResultsName("aliases")
python_allowed_expr = Forward()
ident = Word(alphas + "_", alphanums + "_")
dict_def = originalTextFor(nestedExpr('{', '}'))
list_def = originalTextFor(nestedExpr('[', ']'))
dict_access = list_access = originalTextFor(ident + nestedExpr('[', ']'))
function_call = originalTextFor(ZeroOrMore(ident + ".") + ident + nestedExpr('(', ')'))
python_allowed_expr << (ident ^ dict_def ^ list_def ^ dict_access ^ list_access ^ function_call ^ restOfLine)\
.setResultsName("value", listAllMatches=True)
persistent_identifier = (Suppress("@persistent_identifier") + nestedExpr("(", ")"))\
.setResultsName("persistent_identifier")
legacy = (Suppress("@legacy") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("legacy", listAllMatches=True)
only_if = (Suppress("@only_if") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("only_if")
only_if_master_value = (Suppress("@only_if_value") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("only_if_master_value")
depends_on = (Suppress("@depends_on") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("depends_on")
parse_first = (Suppress("@parse_first") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("parse_first")
do_not_cache = (Suppress("@") + "do_not_cache")\
.setResultsName("do_not_cache")
field_decorator = parse_first ^ depends_on ^ only_if ^ only_if_master_value ^ do_not_cache ^ legacy
#Independent decorators
inherit_from = (Suppress("@inherit_from") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("inherit_from")
override = (Suppress("@") + "override")\
.setResultsName("override")
extend = (Suppress("@") + "extend")\
.setResultsName("extend")
master_format = (Suppress("@master_format") + originalTextFor(nestedExpr("(", ")")))\
.setResultsName("master_format")
derived_calculated_body = ZeroOrMore(field_decorator) + python_allowed_expr
derived = "derived" + Suppress(":") + INDENT + derived_calculated_body + UNDENT
calculated = "calculated" + Suppress(":") + INDENT + derived_calculated_body + UNDENT
source_tag = quotedString\
.setParseAction(removeQuotes)\
.setResultsName("source_tag", listAllMatches=True)
source_format = oneOf(cfg['CFG_RECORD_MASTER_FORMATS'])\
.setResultsName("source_format", listAllMatches=True)
creator_body = (ZeroOrMore(field_decorator) + source_format + Suppress(",") + source_tag + Suppress(",") + python_allowed_expr)\
.setResultsName("creator_def", listAllMatches=True)
creator = "creator" + Suppress(":") + INDENT + OneOrMore(creator_body) + UNDENT
checker_function = (Optional(master_format) + ZeroOrMore(ident + ".") + ident + originalTextFor(nestedExpr('(', ')')))\
.setResultsName("checker_function", listAllMatches=True)
checker = "checker" + Suppress(":") + INDENT + OneOrMore(checker_function) + UNDENT
doc_string = QuotedString(quoteChar='"""', multiline=True) | quotedString.setParseAction(removeQuotes)
documentation = "documentation" + Suppress(":") + \
INDENT + Optional(doc_string).setResultsName("documentation") + UNDENT
producer_code = Word(alphas + "_", alphanums + "_")\
.setResultsName("producer_code", listAllMatches=True)
producer_body = (producer_code + Suppress(",") + python_allowed_expr)\
.setResultsName("producer_def", listAllMatches=True)
producer = "producer" + Suppress(":") + INDENT + OneOrMore(producer_body) + UNDENT
field_def = (creator | derived | calculated)\
.setResultsName("type_field", listAllMatches=True)
body = Optional(field_def) + Optional(checker) + Optional(producer) + Optional(documentation)
comment = Literal("#") + restOfLine + LineEnd()
include = (Suppress("include") + quotedString)\
.setResultsName("includes", listAllMatches=True)
rule = (Optional(persistent_identifier) + Optional(inherit_from) + Optional(override) + json_id + Optional(Suppress(",") + aliases) + Suppress(":") + INDENT + body + UNDENT)\
.setResultsName("rules", listAllMatches=True)
return OneOrMore(rule | include | comment.suppress())
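A hypothetical field definition that exercises this grammar might look as follows. The tag numbers, the `marc` master format, and the expressions are illustrative only, not taken from the real demo configuration:

```
title, main_title:
    creator:
        @legacy(("245__a", "title"))
        marc, "245__", {'title': value['a']}
    documentation:
        "Main title of the record."
```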
def _create_record_model_parser():
"""
Creates a parser that can handle model definitions.
BNF-like grammar::
record_model ::= python_doc_string | python_comment | fields [documentation] [checker]
fields ::= "fields:" INDENT [inherit_from] [list_of_fields]
inherit_from ::= "@inherit_from(" json_id+ ")"
list_of_fields ::= json_id [ "=" json_id ]
documentation ::= INDENT QUOTED_STRING UNDENT
checker ::= "checker:" INDENT checker_function+ UNDENT
Note: Unlike the field configuration files, where more than one field can
- be specified per file, the record model files allow only one definition
+ be specified per file, the record model files allow only one definition
per file.
"""
indentStack = [1]
field_def = (Word(alphas + "_", alphanums + "_") + \
Optional(Suppress("=") + \
Word(alphas + "_", alphanums + "_")))\
.setResultsName("field_definition")
inherit_from = (Suppress("@inherit_from") + \
originalTextFor(nestedExpr("(", ")")))\
.setResultsName("inherit_from")
fields = (Suppress("fields:") + \
indentedBlock(inherit_from | field_def, indentStack))\
.setResultsName("fields")
ident = Word(alphas + "_", alphanums + "_")
checker_function = (ZeroOrMore(ident + ".") +\
ident + \
originalTextFor(nestedExpr('(', ')')))\
.setResultsName("checker_function", listAllMatches=True)
checker = (Suppress("checker:") + \
indentedBlock(OneOrMore(checker_function), indentStack))\
.setResultsName("checker")
doc_string = QuotedString(quoteChar='"""', multiline=True) | \
quotedString.setParseAction(removeQuotes)
documentation = (Suppress("documentation:") + \
indentedBlock(doc_string, indentStack))\
.setResultsName("documentation")
comment = Literal("#") + restOfLine + LineEnd()
return OneOrMore(comment | fields + Optional(documentation) + Optional(checker))
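Following the model grammar above, a hypothetical model `.cfg` file could look like this (the model, parent, and field names are made up for illustration):

```
fields:
    @inherit_from(("base",))
    title
    creator_name = authors
```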
class FieldParser(object):
"""Field definitions parser"""
- def __init__(self):
+ def __init__(self, name):
#Autodiscover .cfg files
- self.files = autodiscover_non_python_files('.*\.cfg',
- 'recordext.fields')
+ self.files = autodiscover_non_python_files('.*\.cfg', name)
self.field_definitions = {}
self.legacy_field_matchings = {}
self.__inherit_rules = []
self.__unresolved_inheritence = []
self.__override_rules = []
self.__extend_rules = []
def create(self):
"""
Fills the field_definitions and legacy_field_matchings dictionaries with
the rules defined inside the configuration files.
It also resolves the includes present inside the configuration files and,
recursively, the ones in the included files.
"""
parser = _create_record_field_parser()
already_included = [os.path.basename(f) for f in self.files]
for field_file in self.files:
field_descs = parser.parseFile(field_file, parseAll=True)
for include in field_descs.includes:
if include[0] in already_included:
continue
if not os.path.exists(include[0]):
raise FieldParserException("Can't find file: %s" % (include[0], ))
self.files.append(include[0])
for rule in field_descs.rules:
if rule.override:
self.__override_rules.append(rule)
elif rule.extend:
self.__extend_rules.append(rule)
elif rule.type_field[0] == 'creator':
self._create_creator_rule(rule)
elif rule.type_field[0] == "derived" or rule.type_field[0] == "calculated":
self._create_derived_calculated_rule(rule)
elif rule.inherit_from:
self.__inherit_rules.append(rule)
else:
assert False, 'Type creator, derived or calculated, or inherit field or overwrite field expected'
self.__resolve_inherit_rules()
self.__resolve_override_rules()
self.__resolve_extend_rules()
return (self.field_definitions, self.legacy_field_matchings)
def _create_creator_rule(self, rule, override=False, extend=False):
"""
Creates the config_rule entries for the creator rules.
The result looks like this::
{'json_id':{'rules': { 'inherit_from' : (inherit_from_list),
'source_format' : [translation_rules],
'parse_first' : (parse_first_json_ids),
'depends_on' : (depends_on_json_id),
'only_if' : (only_if_boolean_expressions),
'only_if_master_value': (only_if_master_value_boolean_expressions),
},
'checker': [(function_name, arguments), ...]
'documentation' : {'doc_string': '...',
'subfields' : .....},
'type' : 'real'
'aliases' : [list_of_aliases_ids]
},
....
}
"""
json_id = rule.json_id[0]
# Check for duplicate names
if json_id in self.field_definitions and not override and not extend:
raise FieldParserException("Name error: '%s' field name already defined"
% (rule.json_id[0],))
if not json_id in self.field_definitions and (override or extend):
raise FieldParserException("Name error: '%s' field name not defined"
% (rule.json_id[0],))
#Workaround to keep clean doctype files
#Just creates a dict entry with the main json field name and points it to
#the full one i.e.: 'authors' : ['authors[0]', 'authors[n]']
if '[0]' in json_id or '[n]' in json_id:
main_json_id = re.sub('(\[n\]|\[0\])', '', json_id)
if not main_json_id in self.field_definitions:
self.field_definitions[main_json_id] = []
self.field_definitions[main_json_id].append(json_id)
aliases = []
if rule.aliases:
aliases = rule.aliases.asList()
persistent_id = None
if rule.persistent_identifier:
persistent_id = int(rule.persistent_identifier[0][0])
inherit_from = None
if rule.inherit_from:
self.__unresolved_inheritence.append(json_id)
inherit_from = eval(rule.inherit_from[0])
rules = {}
for creator in rule.creator_def:
source_format = creator.source_format[0]
if source_format not in rules:
# Allow several tags to point to the same json id
rules[source_format] = []
(depends_on, only_if, only_if_master_value, parse_first) = self.__create_decorators_content(creator)
self._create_legacy_rules(creator.legacy, json_id, source_format)
rules[source_format].append({'source_tag' : creator.source_tag[0].split(),
'value' : creator.value[0],
'depends_on' : depends_on,
'only_if' : only_if,
'only_if_master_value' : only_if_master_value,
'parse_first' : parse_first})
if not override and not extend:
self.field_definitions[json_id] = {'inherit_from' : inherit_from,
'rules' : rules,
'checker' : [],
'documentation' : '',
'producer' : {},
'type' : 'real',
'aliases' : aliases,
'persistent_identifier': persistent_id,
'overwrite' : False}
elif override:
self.field_definitions[json_id]['overwrite'] = True
self.field_definitions[json_id]['rules'].update(rules)
self.field_definitions[json_id]['aliases'] = \
aliases or self.field_definitions[json_id]['aliases']
self.field_definitions[json_id]['persistent_identifier'] = \
persistent_id or self.field_definitions[json_id]['persistent_identifier']
self.field_definitions[json_id]['inherit_from'] = \
inherit_from or self.field_definitions[json_id]['inherit_from']
elif extend:
pass
self._create_checkers(rule)
self._create_documentation(rule)
self._create_producer(rule)
def _create_derived_calculated_rule(self, rule, override=False):
"""
Creates the field_definitions entries for the virtual fields
The result is similar to that of the real fields, but in this case there is
only one rule.
"""
json_id = rule.json_id[0]
# Check for duplicate names
if json_id in self.field_definitions and not override:
raise FieldParserException("Name error: '%s' field name already defined"
% (rule.json_id[0],))
if not json_id in self.field_definitions and override:
raise FieldParserException("Name error: '%s' field name not defined"
% (rule.json_id[0],))
aliases = []
if rule.aliases:
aliases = rule.aliases.asList()
if re.search('^_[a-zA-Z0-9]', json_id):
aliases.append(json_id[1:])
do_not_cache = False
if rule.do_not_cache:
do_not_cache = True
persistent_id = None
if rule.persistent_identifier:
persistent_id = int(rule.persistent_identifier[0][0])
(depends_on, only_if, only_if_master_value, parse_first) = self.__create_decorators_content(rule)
self._create_legacy_rules(rule.legacy, json_id)
self.field_definitions[json_id] = {'rules' : {},
'checker' : [],
'documentation': '',
'producer' : {},
'aliases' : aliases,
'type' : rule.type_field[0],
'persistent_identifier' : persistent_id,
'overwrite' : False}
self.field_definitions[json_id]['rules'] = {'value' : rule.value[0],
'depends_on' : depends_on,
'only_if' : only_if,
'only_if_master_value': only_if_master_value,
'parse_first' : parse_first,
'do_not_cache' : do_not_cache}
self._create_checkers(rule)
self._create_documentation(rule)
self._create_producer(rule)
def _create_legacy_rules(self, legacy_rules, json_id, source_format=None):
"""
Creates the legacy rules dictionary::
{'100' : ['authors[0]'],
'100__' : ['authors[0]'],
'100__%': ['authors[0]'],
'100__a': ['authors[0].full_name'],
.......
}
"""
if not legacy_rules:
return
for legacy_rule in legacy_rules:
legacy_rule = eval(legacy_rule[0])
if source_format is None:
inner_source_format = legacy_rule[0]
legacy_rule = legacy_rule[1]
else:
inner_source_format = source_format
if not inner_source_format in self.legacy_field_matchings:
self.legacy_field_matchings[inner_source_format] = {}
for field_legacy_rule in legacy_rule:
#Allow string and tuple in the config file
legacy_fields = isinstance(field_legacy_rule[0], basestring) and (field_legacy_rule[0], ) or field_legacy_rule[0]
json_field = json_id
if field_legacy_rule[-1]:
json_field = '.'.join((json_field, field_legacy_rule[-1]))
for legacy_field in legacy_fields:
if not legacy_field in self.legacy_field_matchings[inner_source_format]:
self.legacy_field_matchings[inner_source_format][legacy_field] = []
self.legacy_field_matchings[inner_source_format][legacy_field].append(json_field)
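The tag-to-field expansion performed by `_create_legacy_rules` can be sketched as a standalone Python 3 function. This is a simplified illustration, not the real method: it drops the `eval` step and the nested source-format handling, and the `marc` source format is a hypothetical example value:

```python
def expand_legacy(json_id, correspondences, source_format='marc'):
    # Simplified sketch of _create_legacy_rules(): map each legacy tag to
    # the json field, optionally extended with a dotted subfield name.
    matchings = {source_format: {}}
    for corr in correspondences:
        # Allow a single tag string or a tuple of tags, as the real
        # config files do.
        tags = (corr[0],) if isinstance(corr[0], str) else corr[0]
        json_field = json_id if not corr[-1] else json_id + '.' + corr[-1]
        for tag in tags:
            matchings[source_format].setdefault(tag, []).append(json_field)
    return matchings

print(expand_legacy('authors[0]',
                    [(('100__', '100__%'), ''), ('100__a', 'full_name')]))
```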
def _create_checkers(self, rule):
"""
Creates the list of checker functions and arguments for the given rule
"""
json_id = rule.json_id[0]
assert json_id in self.field_definitions
if rule.checker_function:
if self.field_definitions[json_id]['overwrite']:
self.field_definitions[json_id]['checker'] = []
for checker in rule.checker_function:
if checker.master_format:
master_format = eval(rule.master_format[0])
checker_function_name = checker[1]
arguments = checker[2][1:-1]
else:
master_format = ('all',)
checker_function_name = checker[0]
arguments = checker[1][1:-1]
#json_id : (master_format, checker_name, parameters)
self.field_definitions[json_id]['checker'].append((master_format,
checker_function_name,
arguments))
def _create_documentation(self, rule):
"""
Creates the documentation dictionary for the given rule
"""
json_id = rule.json_id[0]
assert json_id in self.field_definitions
if rule.documentation:
self.field_definitions[json_id]['documentation'] = rule.documentation
def _create_producer(self, rule):
"""
Creates the dictionary of possible producer formats for the given rule
"""
json_id = rule.json_id[0]
assert json_id in self.field_definitions
if rule.producer_def:
if self.field_definitions[json_id]['overwrite']:
self.field_definitions[json_id]['producer'] = {}
for producer in rule.producer_def:
producer_code = producer.producer_code[0]
rule = producer.value[0]
if not producer_code in self.field_definitions[json_id]['producer']:
self.field_definitions[json_id]['producer'][producer_code] = []
self.field_definitions[json_id]['producer'][producer_code].append(eval(rule))
def __create_decorators_content(self, rule):
"""
Extracts from the rule all the possible decorators.
"""
depends_on = only_if = only_if_master_value = parse_first = None
if rule.depends_on:
depends_on = rule.depends_on[0]
if rule.only_if:
only_if = rule.only_if[0]
if rule.only_if_master_value:
only_if_master_value = rule.only_if_master_value[0]
if rule.parse_first:
parse_first = rule.parse_first[0]
return (depends_on, only_if, only_if_master_value, parse_first)
def __resolve_inherit_rules(self):
"""
Iterates over all the 'inherit' fields after all the normal field
creation to avoid problem when creating this rules.
"""
def resolve_inheritance(json_id):
rule = self.field_definitions[json_id]
inherit_from_list = self.field_definitions[json_id]['inherit_from']
for inherit_json_id in inherit_from_list:
# Check that everything is fine
if inherit_json_id == json_id:
raise FieldParserException("Inheritance from itself")
if inherit_json_id not in self.field_definitions:
raise FieldParserException("Unable to solve %s inheritance" % (inherit_json_id,))
if inherit_json_id in self.__unresolved_inheritence:
resolve_inheritance(inherit_json_id)
self.__unresolved_inheritence.remove(inherit_json_id)
inherit_rule = self.field_definitions[inherit_json_id]
for format in inherit_rule['rules']:
if not format in rule['rules']:
rule['rules'][format] = []
rule['rules'][format].extend(inherit_rule['rules'][format])
rule['checker'].extend(inherit_rule['checker'])
for rule in self.__inherit_rules:
if rule.type_field[0] == 'creator':
self._create_creator_rule(rule)
elif rule.type_field[0] == "derived" or rule.type_field[0] == "calculated":
self._create_derived_calculated_rule(rule)
#Resolve inheritance
for i in xrange(len(self.__unresolved_inheritence) - 1, -1, -1):
resolve_inheritance(self.__unresolved_inheritence[i])
del self.__unresolved_inheritence[i]
def __resolve_override_rules(self):
"""
Iterates over all the 'override' fields to override the already created
fields.
"""
for rule in self.__override_rules:
if rule.type_field[0] == 'creator':
self._create_creator_rule(rule, override=True)
elif rule.type_field[0] == "derived" or rule.type_field[0] == "calculated":
self._create_derived_calculated_rule(rule, override=True)
def __resolve_extend_rules(self):
"""
Iterates over all the 'extend' fields to extend the rule definition of
each field.
"""
for rule in self.__extend_rules:
if rule.type_field[0] == 'creator':
self._create_creator_rule(rule, extend=True)
elif rule.type_field[0] == "derived" or rule.type_field[0] == "calculated":
self._create_derived_calculated_rule(rule, extend=True)
class ModelParser(object):
"""Record model parser"""
- def __init__(self):
+ def __init__(self, name):
#Autodiscover .cfg files
- self.files = autodiscover_non_python_files('.*\.cfg',
- 'recordext.models')
+ self.files = autodiscover_non_python_files('.*\.cfg', name)
self.model_definitions = {}
def create(self):
"""
Fills up the model_definitions dictionary with what is written inside the
*.cfg files present in the base directory / models.
It also resolves inheritance at creation time and checks name matching for
the field names present inside the model file.
The result looks like this::
{'model': {'fields': {json_id1: 'name_for_field1',
json_id2: 'name_for_field2',
...},
'super': [(inherit_from_list), ...],
'documentation': 'doc_string',
'checker': [(function_name, arguments), ...]
},
...
...
:raises: ModelParserException in case of missing model definition
(helpful if we use inheritance) or in case of unknown field name.
"""
from .definitions import field_definitions
parser = _create_record_model_parser()
for model_file in self.files:
model_name = os.path.basename(model_file).split('.')[0]
if model_name in self.model_definitions:
raise ModelParserException("Already defined record model: %s" % (model_name,))
self.model_definitions[model_name] = {'fields': {},
'super': [],
'documentation': '',
'checker': []
}
model_definition = parser.parseFile(model_file, parseAll=True)
if model_definition.documentation:
self.model_definitions[model_name]['documentation'] = model_definition.documentation[0][0][0]
if model_definition.checker:
for checker in model_definition.checker[0][0].checker_function:
self.model_definitions[model_name]['checker'].append((checker[0], checker[1][1:-1]))
if not model_definition.fields:
raise ModelParserException("Field definition needed")
for field_def in model_definition.fields[0]:
if field_def.inherit_from:
self.model_definitions[model_name]['super'].extend(eval(field_def[0]))
else:
if len(field_def) == 1:
json_id = field_def[0]
else:
json_id = field_def[1]
if not json_id in field_definitions:
raise ModelParserException("Unknown field name: %s" % (json_id,))
self.model_definitions[model_name]['fields'][json_id] = field_def[0]
for model, model_definition in self.model_definitions.iteritems():
model_definition['fields'] = self.__resolve_inheritance(model)
return self.model_definitions
def __resolve_inheritance(self, model):
"""
Resolves the inheritance
:param model: name of the super model
:type model: string
:return: List of new fields to be added to the child model
:raises: ModelParserException if the super model does not exist.
"""
try:
model_definition = self.model_definitions[model]
except KeyError:
raise ModelParserException("Missing model definition for %s" % (model,))
fields = {}
for super_model in model_definition['super']:
fields.update(self.__resolve_inheritance(super_model))
fields.update(model_definition['fields'])
return fields
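__resolve_inheritance builds the field mapping depth-first: inherited fields are collected first and the model's own fields are applied last, so a child model overrides anything it inherits. A minimal standalone sketch of that ordering, with hypothetical model data:

```python
def resolve_model_fields(model, models):
    """Return the field dict for model, with its own fields winning
    over anything inherited from the models listed in 'super'."""
    definition = models[model]
    fields = {}
    for super_model in definition['super']:
        fields.update(resolve_model_fields(super_model, models))
    fields.update(definition['fields'])  # own fields override inherited ones
    return fields

models = {
    'base':   {'super': [],       'fields': {'title': 'title',
                                             'abstract': 'abstract'}},
    'thesis': {'super': ['base'], 'fields': {'abstract': 'thesis_abstract'}},
}
fields = resolve_model_fields('thesis', models)
# 'thesis' keeps the inherited 'title' but overrides 'abstract'
```

Because later `dict.update` calls win, the call order (supers first, own fields last) is what implements the override semantics.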
diff --git a/invenio/core/record/definitions.py b/invenio/core/record/definitions.py
index 7799c2581..aba3db5e8 100644
--- a/invenio/core/record/definitions.py
+++ b/invenio/core/record/definitions.py
@@ -1,63 +1,64 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#FIXME: add signal management and relation between them
from invenio.utils.datastructures import LazyDict
from invenio.ext.cache import cache
__all__ = ['field_definitions', 'legacy_field_matchings', 'model_definitions']
def _rebuild_cache():
print ">>> Recreating the cache for fields!"
from invenio.core.record.config_engine import FieldParser
- field_definitions, legacy_field_matchings = FieldParser().create()
+ field_definitions, legacy_field_matchings = FieldParser(
+ 'recordext.fields').create()
cache.set('RECORD_FIELD_DEFINITIONS', field_definitions)
cache.set('LEGACY_FIELD_MATCHINGS', legacy_field_matchings)
return field_definitions, legacy_field_matchings
def _field_definitions():
field_definitions = cache.get('RECORD_FIELD_DEFINITIONS')
if field_definitions is None:
field_definitions, _ = _rebuild_cache()
return field_definitions
def _legacy_field_matchings():
legacy_field_matchings = cache.get('LEGACY_FIELD_MATCHINGS')
if legacy_field_matchings is None:
_, legacy_field_matchings = _rebuild_cache()
return legacy_field_matchings
def _model_definitions():
model_definitions = cache.get('RECORD_MODEL_DEFINITIONS')
if model_definitions is None:
print ">>> Recreating the cache for models"
from invenio.core.record.config_engine import ModelParser
- model_definitions = ModelParser().create()
+ model_definitions = ModelParser('recordext.models').create()
cache.set('RECORD_MODEL_DEFINITIONS', model_definitions)
return model_definitions
field_definitions = LazyDict(_field_definitions)
legacy_field_matchings = LazyDict(_legacy_field_matchings)
model_definitions = LazyDict(_model_definitions)
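The definitions above are LazyDict instances: the expensive parse runs only on the first access, and the result is then served from the cache. A simplified stand-in for that lazy-build pattern (this is a sketch, not the real `invenio.utils.datastructures.LazyDict`):

```python
class LazyDict(object):
    """Dict-like object that builds its contents on first access."""

    def __init__(self, builder):
        self._builder = builder
        self._data = None

    def _ensure(self):
        # run the builder exactly once, on demand
        if self._data is None:
            self._data = self._builder()

    def __getitem__(self, key):
        self._ensure()
        return self._data[key]

    def __contains__(self, key):
        self._ensure()
        return key in self._data

calls = []
def build_definitions():
    calls.append(1)  # record that the expensive build ran
    return {'title': {'rules': {}}}

field_definitions = LazyDict(build_definitions)
assert calls == []            # nothing built at import time
_ = field_definitions['title']
_ = 'title' in field_definitions
```

Importing the module therefore stays cheap; the parser (or the cache lookup) only runs when some code actually touches the definitions.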
diff --git a/invenio/ext/email/__init__.py b/invenio/ext/email/__init__.py
index c1479be40..8dc8f74b0 100644
--- a/invenio/ext/email/__init__.py
+++ b/invenio/ext/email/__init__.py
@@ -1,430 +1,430 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio mail sending utilities. send_email() is the main API function
people should be using; just check out its docstring.
"""
__revision__ = "$Id$"
from time import sleep
import re
import os
import sys
from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email import Encoders
from email.MIMEImage import MIMEImage
from email.Utils import formatdate
from cStringIO import StringIO
from flask import g
from formatter import DumbWriter, AbstractFormatter
from flask.ext.email.message import EmailMultiAlternatives, EmailMessage
from invenio.base.globals import cfg
default_ln = lambda ln: cfg.get('CFG_SITE_LANG') if ln is None else ln
from .errors import EmailError
from invenio.ext.template import render_template_to_string
from invenio.base.helpers import unicodifier
def setup_app(app):
"""
Prepare application config from Invenio configuration.
@see: https://flask-email.readthedocs.org/en/latest/#configuration
"""
cfg = app.config
app.config.setdefault('EMAIL_BACKEND', cfg.get(
'CFG_EMAIL_BACKEND', 'flask.ext.email.backends.smtp.Mail'))
app.config.setdefault('DEFAULT_FROM_EMAIL', cfg['CFG_SITE_SUPPORT_EMAIL'])
app.config.setdefault('SERVER_EMAIL', cfg['CFG_SITE_ADMIN_EMAIL'])
app.config.setdefault('ADMINS', (cfg['CFG_SITE_ADMIN_EMAIL'], ))
app.config.setdefault('MANAGERS', (cfg['CFG_SITE_SUPPORT_EMAIL'], ))
CFG_MISCUTIL_SMTP_HOST = cfg.get('CFG_MISCUTIL_SMTP_HOST')
CFG_MISCUTIL_SMTP_PORT = cfg.get('CFG_MISCUTIL_SMTP_PORT')
CFG_MISCUTIL_SMTP_USER = cfg.get('CFG_MISCUTIL_SMTP_USER', '')
CFG_MISCUTIL_SMTP_PASS = cfg.get('CFG_MISCUTIL_SMTP_PASS', '')
CFG_MISCUTIL_SMTP_TLS = cfg.get('CFG_MISCUTIL_SMTP_TLS', False)
app.config.setdefault('EMAIL_HOST', CFG_MISCUTIL_SMTP_HOST)
app.config.setdefault('EMAIL_PORT', CFG_MISCUTIL_SMTP_PORT)
app.config.setdefault('EMAIL_HOST_USER', CFG_MISCUTIL_SMTP_USER)
app.config.setdefault('EMAIL_HOST_PASSWORD', CFG_MISCUTIL_SMTP_PASS)
app.config.setdefault('EMAIL_USE_TLS', CFG_MISCUTIL_SMTP_TLS)
# app.config['EMAIL_USE_SSL']: defaults to False
app.config.setdefault('EMAIL_FILE_PATH', cfg['CFG_LOGDIR'])
return app
def scheduled_send_email(fromaddr,
toaddr,
subject="",
content="",
header=None,
footer=None,
copy_to_admin=0,
attempt_times=1,
attempt_sleeptime=10,
user=None,
other_bibtasklet_arguments=None,
replytoaddr=""):
"""
Like send_email, but sends the email via the bibsched
infrastructure.
@param fromaddr: sender
@type fromaddr: string
@param toaddr: list of receivers
@type toaddr: string (comma separated) or list of strings
@param subject: the subject
@param content: the body of the message
@param header: optional header, otherwise default is used
@param footer: optional footer, otherwise default is used
@param copy_to_admin: set to 1 in order to send email to the admins
@param attempt_times: try up to n times before giving up sending
@param attempt_sleeptime: number of seconds to sleep between two attempts
@param user: the user name to user when scheduling the bibtasklet. If
None, the sender will be used
@param other_bibtasklet_arguments: other arguments to append to the list
of arguments to the call of task_low_level_submission
@param replytoaddr: [string or list-of-strings] to be used for the
reply-to header of the email (if string, then
receivers are separated by ',')
@return: the scheduled bibtasklet
"""
- from invenio.bibtask import task_low_level_submission
+ from invenio.legacy.bibsched.bibtask import task_low_level_submission
if not isinstance(toaddr, (unicode, str)):
toaddr = ','.join(toaddr)
if not isinstance(replytoaddr, (unicode, str)):
replytoaddr = ','.join(replytoaddr)
toaddr = remove_temporary_emails(toaddr)
if user is None:
user = fromaddr
if other_bibtasklet_arguments is None:
other_bibtasklet_arguments = []
else:
other_bibtasklet_arguments = list(other_bibtasklet_arguments)
if not header is None:
other_bibtasklet_arguments.extend(("-a", "header=%s" % header))
if not footer is None:
other_bibtasklet_arguments.extend(("-a", "footer=%s" % footer))
return task_low_level_submission(
"bibtasklet", user, "-T", "bst_send_email",
"-a", "fromaddr=%s" % fromaddr,
"-a", "toaddr=%s" % toaddr,
"-a", "replytoaddr=%s" % replytoaddr,
"-a", "subject=%s" % subject,
"-a", "content=%s" % content,
"-a", "copy_to_admin=%s" % copy_to_admin,
"-a", "attempt_times=%s" % attempt_times,
"-a", "attempt_sleeptime=%s" % attempt_sleeptime,
*other_bibtasklet_arguments)
def send_email(fromaddr,
toaddr,
subject="",
content="",
html_content='',
html_images=None,
header=None,
footer=None,
html_header=None,
html_footer=None,
copy_to_admin=0,
attempt_times=1,
attempt_sleeptime=10,
debug_level=0,
ln=None,
charset=None,
replytoaddr="",
attachments=None
):
"""Send a forged email to TOADDR from FROMADDR with message created from subjet, content and possibly
header and footer.
@param fromaddr: [string] sender
@param toaddr: [string or list-of-strings] list of receivers (if string, then
receivers are separated by ',')
@param subject: [string] subject of the email
@param content: [string] content of the email
@param html_content: [string] html version of the email
@param html_images: [dict] dictionary of image id, image path
@param header: [string] header to add, None for the Default
@param footer: [string] footer to add, None for the Default
@param html_header: [string] header to add to the html part, None for the Default
@param html_footer: [string] footer to add to the html part, None for the Default
@param copy_to_admin: [int] if 1 add CFG_SITE_ADMIN_EMAIL in receivers
@param attempt_times: [int] number of tries
@param attempt_sleeptime: [int] seconds in between tries
@param debug_level: [int] debug level
@param ln: [string] invenio language
@param charset: [string] the content charset. By default is None which means
to try to encode the email as ascii, then latin1 then utf-8.
@param replytoaddr: [string or list-of-strings] to be used for the
reply-to header of the email (if string, then
receivers are separated by ',')
@param attachments: list of paths of files to be attached. Alternatively,
every element of the list could be a tuple: (filename, mimetype)
If sending fails, try to send it ATTEMPT_TIMES, and wait for
ATTEMPT_SLEEPTIME seconds in between tries.
e.g.:
send_email('foo.bar@cern.ch', 'bar.foo@cern.ch', 'Let\'s try!', 'check 1234', '<strong>check</strong> <em>1234</em><img src="cid:image1">', {'image1': '/tmp/quantum.jpg'})
@return: [bool]: True if email was sent okay, False if it was not.
"""
from invenio.ext.logging import register_exception
ln = default_ln(ln)
if html_images is None:
html_images = {}
if type(toaddr) is not list:
toaddr = toaddr.strip().split(',')
toaddr = remove_temporary_emails(toaddr)
usebcc = len(toaddr.split(',')) > 1 # More than one address, let's use Bcc in place of To
if copy_to_admin:
if cfg['CFG_SITE_ADMIN_EMAIL'] not in toaddr:
# toaddr is a comma separated string at this point
toaddr += ',' + cfg['CFG_SITE_ADMIN_EMAIL']
body = forge_email(fromaddr, toaddr, subject, content, html_content,
html_images, usebcc, header, footer, html_header,
html_footer, ln, charset, replytoaddr, attachments)
if attempt_times < 1 or not toaddr:
try:
raise EmailError(g._('The system is not attempting to send an email from %s, to %s, with body %s.') % (fromaddr, toaddr, body))
except EmailError:
register_exception()
return False
sent = False
while not sent and attempt_times > 0:
try:
sent = body.send()
except Exception:
register_exception()
if debug_level > 1:
try:
raise EmailError(g._('Error in sending message. Waiting %s seconds. Exception is %s, while sending email from %s to %s with body %s.') % (attempt_sleeptime, sys.exc_info()[0], fromaddr, toaddr, body))
except EmailError:
register_exception()
if not sent:
attempt_times -= 1
if attempt_times > 0: # sleep only if we shall retry again
sleep(attempt_sleeptime)
if not sent:
try:
raise EmailError(g._('Error in sending email from %s to %s with body %s.') % (fromaddr, toaddr, body))
except EmailError:
register_exception()
return sent
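The loop above implements retry-with-sleep: try up to attempt_times, sleeping attempt_sleeptime seconds between failed attempts, and only sleep when another attempt will follow. The skeleton in isolation, with a hypothetical `send` callable standing in for `body.send()`:

```python
import time

def send_with_retries(send, attempt_times=3, attempt_sleeptime=0):
    """Call send() until it returns truthy or the attempts run out."""
    sent = False
    while not sent and attempt_times > 0:
        try:
            sent = send()
        except Exception:
            pass  # a real caller would log/register the exception here
        if not sent:
            attempt_times -= 1
            if attempt_times > 0:  # sleep only if we shall retry again
                time.sleep(attempt_sleeptime)
    return sent

attempts = []
def flaky_send():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("SMTP connection refused")
    return True

result = send_with_retries(flaky_send, attempt_times=5)
```

Here the simulated sender fails twice and succeeds on the third try, so two sleeps happen and the loop stops early.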
def attach_embed_image(email, image_id, image_path):
"""
Attach an image to the email.
"""
with open(image_path, 'rb') as image_data:
img = MIMEImage(image_data.read())
img.add_header('Content-ID', '<%s>' % image_id)
img.add_header('Content-Disposition', 'attachment', filename=os.path.split(image_path)[1])
email.attach(img)
def forge_email(fromaddr, toaddr, subject, content, html_content='',
html_images=None, usebcc=False, header=None, footer=None,
html_header=None, html_footer=None, ln=None,
charset=None, replytoaddr="", attachments=None):
"""Prepare email. Add header and footer if needed.
@param fromaddr: [string] sender
@param toaddr: [string or list-of-strings] list of receivers (if string, then
receivers are separated by ',')
@param usebcc: [bool] True for using Bcc in place of To
@param subject: [string] subject of the email
@param content: [string] content of the email
@param html_content: [string] html version of the email
@param html_images: [dict] dictionary of image id, image path
@param header: [string] None for the default header
@param footer: [string] None for the default footer
@param ln: language
@param charset: [string] the content charset. By default is None which means
to try to encode the email as ascii, then latin1 then utf-8.
@param replytoaddr: [string or list-of-strings] to be used for the
reply-to header of the email (if string, then
receivers are separated by ',')
@param attachments: list of paths of files to be attached. Alternatively,
every element of the list could be a tuple: (filename, mimetype)
@return: forged email as a string"""
ln = default_ln(ln)
if html_images is None:
html_images = {}
content = render_template_to_string('mail_text.tpl',
content=unicodifier(content),
header=unicodifier(header),
footer=unicodifier(footer)
).encode('utf8')
if type(toaddr) is list:
toaddr = ','.join(toaddr)
if type(replytoaddr) is list:
replytoaddr = ','.join(replytoaddr)
toaddr = remove_temporary_emails(toaddr)
headers = {}
kwargs = {'to': [], 'cc': [], 'bcc': []}
if replytoaddr:
headers['Reply-To'] = replytoaddr
if usebcc:
headers['Bcc'] = toaddr
kwargs['bcc'] = toaddr.split(',')
kwargs['to'] = ['Undisclosed.Recipients:']
else:
kwargs['to'] = toaddr.split(',')
headers['From'] = fromaddr
headers['Date'] = formatdate(localtime=True)
headers['User-Agent'] = 'Invenio %s at %s' % (cfg['CFG_VERSION'],
cfg['CFG_SITE_URL'])
if html_content:
html_content = render_template_to_string(
'mail_html.tpl',
content=unicodifier(html_content),
header=unicodifier(html_header),
footer=unicodifier(html_footer)
).encode('utf8')
msg_root = EmailMultiAlternatives(subject=subject, body=content,
from_email=fromaddr,
headers=headers, **kwargs)
msg_root.attach_alternative(html_content, "text/html")
#if not html_images:
# # No image? Attach the HTML to the root
# msg_root.attach(msg_text)
#else:
if html_images:
# Image(s)? Attach the HTML and image(s) as children of a
# "related" block
msg_related = MIMEMultipart('related')
#msg_related.attach(msg_text)
for image_id, image_path in html_images.iteritems():
attach_embed_image(msg_related, image_id, image_path)
msg_root.attach(msg_related)
else:
msg_root = EmailMessage(subject=subject, body=content,
from_email=fromaddr, headers=headers, **kwargs)
if attachments:
- from invenio.bibdocfile import _mimes, guess_format_from_url
+ from invenio.legacy.bibdocfile.api import _mimes, guess_format_from_url
#old_msg_root = msg_root
#msg_root = MIMEMultipart()
#msg_root.attach(old_msg_root)
for attachment in attachments:
try:
mime = None
if type(attachment) in (list, tuple):
attachment, mime = attachment
if mime is None:
## Automatic guessing of mimetype
mime = _mimes.guess_type(attachment)[0]
if mime is None:
ext = guess_format_from_url(attachment)
mime = _mimes.guess_type("foo" + ext)[0]
if not mime:
mime = 'application/octet-stream'
part = MIMEBase(*mime.split('/', 1))
part.set_payload(open(attachment, 'rb').read())
Encoders.encode_base64(part)
part.add_header('Content-Disposition', 'attachment; filename="%s"' % os.path.basename(attachment))
msg_root.attach(part)
except:
from invenio.ext.logging import register_exception
register_exception(alert_admin=True, prefix="Can't attach %s" % attachment)
return msg_root
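The attachment loop above guesses each file's MIME type and falls back to application/octet-stream when nothing matches. A sketch of that fallback chain using the stdlib `mimetypes` module instead of bibdocfile's `_mimes` and `guess_format_from_url` (an assumption, since the real code also consults Invenio's own extension table):

```python
import mimetypes

def guess_attachment_mime(path, default='application/octet-stream'):
    """Guess a MIME type from the file name, falling back to a generic
    binary type when the extension is unknown."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime or default

pdf_mime = guess_attachment_mime('report.pdf')
unknown_mime = guess_attachment_mime('data.unknownext')
```

The guessed string is then split on '/' to build the MIMEBase part, so the fallback guarantees a well-formed `maintype/subtype` pair even for unrecognised files.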
RE_NEWLINES = re.compile(r'<br\s*/?>|</p>', re.I)
RE_SPACES = re.compile(r'\s+')
RE_HTML_TAGS = re.compile(r'<.+?>')
def email_strip_html(html_content):
"""Strip html tags from html_content, trying to respect formatting."""
html_content = RE_SPACES.sub(' ', html_content)
html_content = RE_NEWLINES.sub('\n', html_content)
html_content = RE_HTML_TAGS.sub('', html_content)
html_content = html_content.split('\n')
out = StringIO()
out_format = AbstractFormatter(DumbWriter(out))
for row in html_content:
out_format.add_flowing_data(row)
out_format.end_paragraph(1)
return out.getvalue()
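email_strip_html runs three regex passes before the final reflow: collapse whitespace, turn explicit breaks into newlines, then drop the remaining tags. The regex stage on its own (the `formatter`-based reflow step is omitted here; that stdlib module was removed in Python 3.10):

```python
import re

RE_NEWLINES = re.compile(r'<br\s*/?>|</p>', re.I)
RE_SPACES = re.compile(r'\s+')
RE_HTML_TAGS = re.compile(r'<.+?>')

def strip_html(html):
    """Collapse whitespace, map explicit breaks to newlines, drop tags."""
    text = RE_SPACES.sub(' ', html)       # normalise runs of whitespace
    text = RE_NEWLINES.sub('\n', text)    # <br>, <br/>, </p> become newlines
    return RE_HTML_TAGS.sub('', text)     # strip everything else

out = strip_html('<p>Hello <b>world</b></p><br/>bye')
```

The order matters: whitespace is collapsed first so that the break-to-newline substitution produces the only newlines in the output.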
def remove_temporary_emails(emails):
"""
Removes the temporary emails (which are constructed randomly when a user logs in
with an external authentication provider which doesn't supply an email
address) from an email list.
@param emails: email list (if string, then receivers are separated by ',')
@type emails: str|[str]
@rtype: str
"""
from invenio.modules.access.local_config import CFG_TEMP_EMAIL_ADDRESS
if not isinstance(emails, (str, unicode)):
emails = ','.join(emails)
# Remove all of the spaces
emails = emails.replace(' ', '')
# Remove all of the emails formatted like CFG_TEMP_EMAIL_ADDRESS
emails = re.sub((CFG_TEMP_EMAIL_ADDRESS % '\w+') + '(,|$)', '', emails,
flags=re.IGNORECASE)
# Remove all consecutive commas
emails = re.sub(',+', ',', emails)
if emails.startswith(','):
# Remove the comma at the beginning of the string
emails = emails[1:]
if emails.endswith(','):
# Remove the comma at the end of the string
emails = emails[:-1]
return emails
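remove_temporary_emails does regex surgery on the comma-joined string, which is why it needs the trailing comma cleanup. An equivalent, easier-to-test formulation filters the split list instead (the temp-address pattern below is hypothetical, standing in for CFG_TEMP_EMAIL_ADDRESS):

```python
import re

# hypothetical stand-in for the configured temporary-address template
TEMP_EMAIL_PATTERN = re.compile(r'^temp-\w+@example\.org$', re.IGNORECASE)

def remove_temporary_emails(emails):
    """Drop addresses matching the temporary-account pattern.

    Accepts a comma separated string or a list; returns a string.
    """
    if not isinstance(emails, str):
        emails = ','.join(emails)
    kept = [addr.strip() for addr in emails.split(',')
            if addr.strip() and not TEMP_EMAIL_PATTERN.match(addr.strip())]
    return ','.join(kept)

cleaned = remove_temporary_emails('a@cern.ch, temp-x1@example.org,b@cern.ch')
```

Filtering the list sidesteps the stray-comma cases (leading, trailing, consecutive) that the string-based version has to patch up afterwards.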
diff --git a/invenio/ext/legacy/layout.py b/invenio/ext/legacy/layout.py
index 0abd4eb73..2d8ba0d0a 100644
--- a/invenio/ext/legacy/layout.py
+++ b/invenio/ext/legacy/layout.py
@@ -1,369 +1,369 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Global organisation of the application's URLs.
This module binds together Invenio's modules and maps them to
their corresponding URLs (ie, /search to the websearch modules,...)
"""
from invenio.ext.legacy.handler import create_handler
from invenio.ext.logging import register_exception
from invenio.ext.legacy.handler import WebInterfaceDirectory
from invenio.utils import apache
from invenio.config import CFG_DEVEL_SITE, CFG_OPENAIRE_SITE
class WebInterfaceDumbPages(WebInterfaceDirectory):
"""This class implements a dumb interface to use as a fallback in case of
errors importing particular module pages."""
_exports = ['']
def __call__(self, req, form):
try:
from invenio.legacy.webpage import page
except ImportError:
page = lambda *args: args[1]
req.status = apache.HTTP_INTERNAL_SERVER_ERROR
msg = "<p>This functionality is experiencing a temporary failure.</p>"
msg += "<p>The administrator has been informed about the problem.</p>"
try:
from invenio.config import CFG_SITE_ADMIN_EMAIL
msg += """<p>You can contact <code>%s</code>
in case of questions.</p>""" % \
CFG_SITE_ADMIN_EMAIL
except ImportError:
pass
msg += """<p>We hope to restore the service soon.</p>
<p>Sorry for the inconvenience.</p>"""
try:
return page('Service failure', msg)
except:
return msg
def _lookup(self, component, path):
return WebInterfaceDumbPages(), path
index = __call__
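Every legacy web interface below is imported inside try/except so that one broken module degrades to this dumb fallback page instead of taking the whole URL map down. The pattern in miniature, with a hypothetical module name and a simplified fallback:

```python
import importlib

class DumbPage(object):
    """Fallback handler used when the real interface fails to import."""
    def __call__(self, *args):
        return "This functionality is experiencing a temporary failure."

def load_interface(module_name, attr, fallback=DumbPage):
    """Return attr from module_name, or the fallback class on any import
    problem (the real code also alerts the admins via register_exception)."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception:
        return fallback

handler = load_interface('no.such.legacy.module', 'WebInterfacePages')
```

Catching broadly here is deliberate: a syntax error or missing dependency deep inside one legacy module should cost only that URL subtree, not the whole site.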
try:
from invenio.legacy.websearch.webinterface import WebInterfaceSearchInterfacePages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceSearchInterfacePages = WebInterfaceDumbPages
try:
from invenio.legacy.websearch.webinterface import WebInterfaceRSSFeedServicePages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceRSSFeedServicePages = WebInterfaceDumbPages
try:
from invenio.legacy.websearch.webinterface import WebInterfaceUnAPIPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceUnAPIPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibdocfile.webinterface import bibdocfile_legacy_getfile
except:
register_exception(alert_admin=True, subject='EMERGENCY')
bibdocfile_legacy_getfile = WebInterfaceDumbPages
try:
from invenio.legacy.websubmit.webinterface import WebInterfaceSubmitPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceSubmitPages = WebInterfaceDumbPages
try:
from invenio.legacy.websession.webinterface import WebInterfaceYourAccountPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourAccountPages = WebInterfaceDumbPages
try:
from invenio.legacy.websession.webinterface import WebInterfaceYourTicketsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourTicketsPages = WebInterfaceDumbPages
try:
from invenio.legacy.websession.webinterface import WebInterfaceYourGroupsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourGroupsPages = WebInterfaceDumbPages
try:
from invenio.legacy.webalert.webinterface import WebInterfaceYourAlertsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourAlertsPages = WebInterfaceDumbPages
try:
from invenio.legacy.webbasket.webinterface import WebInterfaceYourBasketsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourBasketsPages = WebInterfaceDumbPages
try:
from invenio.legacy.webcomment.webinterface import WebInterfaceCommentsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceCommentsPages = WebInterfaceDumbPages
try:
from invenio.legacy.weblinkback.webinterface import WebInterfaceRecentLinkbacksPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceRecentLinkbacksPages = WebInterfaceDumbPages
try:
from invenio.legacy.webmessage.webinterface import WebInterfaceYourMessagesPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourMessagesPages = WebInterfaceDumbPages
try:
from invenio.errorlib_webinterface import WebInterfaceErrorPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceErrorPages = WebInterfaceDumbPages
try:
from invenio.oai_repository_webinterface import WebInterfaceOAIProviderPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceOAIProviderPages = WebInterfaceDumbPages
try:
from invenio.legacy.webstat.webinterface import WebInterfaceStatsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceStatsPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibcirculation.webinterface import WebInterfaceYourLoansPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourLoansPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibcirculation.webinterface import WebInterfaceILLPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceILLPages = WebInterfaceDumbPages
try:
from invenio.legacy.webjournal.webinterface import WebInterfaceJournalPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceJournalPages = WebInterfaceDumbPages
try:
from invenio.webdoc_webinterface import WebInterfaceDocumentationPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceDocumentationPages = WebInterfaceDumbPages
try:
from invenio.bibexport_method_fieldexporter_webinterface import \
WebInterfaceFieldExporterPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceFieldExporterPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibknowledge.webinterface import WebInterfaceBibKnowledgePages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceBibKnowledgePages = WebInterfaceDumbPages
try:
from invenio.batchuploader_webinterface import \
WebInterfaceBatchUploaderPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceBatchUploaderPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibsword.webinterface import \
WebInterfaceSword
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceSword = WebInterfaceDumbPages
try:
from invenio.ping_webinterface import \
WebInterfacePingPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfacePingPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibauthorid.webinterface import WebInterfaceBibAuthorIDPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceBibAuthorIDPages = WebInterfaceDumbPages
try:
- from invenio.bibcirculationadmin_webinterface import \
+ from invenio.legacy.bibcirculation.admin_webinterface import \
WebInterfaceBibCirculationAdminPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceBibCirculationAdminPages = WebInterfaceDumbPages
try:
from invenio.legacy.bibsched.webinterface import \
WebInterfaceBibSchedPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceBibSchedPages = WebInterfaceDumbPages
try:
from invenio.legacy.webauthorprofile.webinterface import WebAuthorPages
WebInterfaceWebAuthorPages = WebAuthorPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceWebAuthorPages = WebInterfaceDumbPages
try:
from invenio.legacy.docextract.webinterface import WebInterfaceDocExtract
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceDocExtract = WebInterfaceDumbPages
try:
from invenio.legacy.webcomment.webinterface import WebInterfaceYourCommentsPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceYourCommentsPages = WebInterfaceDumbPages
try:
from invenio.goto_webinterface import WebInterfaceGotoPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceGotoPages = WebInterfaceDumbPages
if CFG_OPENAIRE_SITE:
try:
from invenio.openaire_deposit_webinterface import \
WebInterfaceOpenAIREDepositPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceOpenAIREDepositPages = WebInterfaceDumbPages
openaire_exports = ['deposit']
else:
openaire_exports = []
if CFG_DEVEL_SITE:
try:
from invenio.httptest_webinterface import WebInterfaceHTTPTestPages
except:
register_exception(alert_admin=True, subject='EMERGENCY')
WebInterfaceHTTPTestPages = WebInterfaceDumbPages
test_exports = ['httptest']
else:
test_exports = []
class WebInterfaceAdminPages(WebInterfaceDirectory):
"""This class implements /admin2 admin pages."""
_exports = ['index', 'bibcirculation', 'bibsched']
def index(self, req, form):
return "FIXME: return /help/admin content"
bibcirculation = WebInterfaceBibCirculationAdminPages()
bibsched = WebInterfaceBibSchedPages()
class WebInterfaceInvenio(WebInterfaceSearchInterfacePages):
""" The global URL layout is composed of the search API plus all
the other modules."""
_exports = WebInterfaceSearchInterfacePages._exports + \
[
'youraccount',
'youralerts',
'yourbaskets',
'yourmessages',
'yourloans',
'yourcomments',
'ill',
'yourgroups',
'yourtickets',
'comments',
'error',
'oai2d', ('oai2d.py', 'oai2d'),
('getfile.py', 'getfile'),
'submit',
'rss',
'stats',
'journal',
'help',
'unapi',
'exporter',
'kb',
'batchuploader',
'bibsword',
'ping',
'person',
'admin2',
'linkbacks',
'author',
'textmining',
'goto',
] + test_exports + openaire_exports
def __init__(self):
self.getfile = bibdocfile_legacy_getfile
if CFG_DEVEL_SITE:
self.httptest = WebInterfaceHTTPTestPages()
if CFG_OPENAIRE_SITE:
self.deposit = WebInterfaceOpenAIREDepositPages()
submit = WebInterfaceSubmitPages()
youraccount = WebInterfaceYourAccountPages()
youralerts = WebInterfaceYourAlertsPages()
yourbaskets = WebInterfaceYourBasketsPages()
yourmessages = WebInterfaceYourMessagesPages()
yourloans = WebInterfaceYourLoansPages()
ill = WebInterfaceILLPages()
yourgroups = WebInterfaceYourGroupsPages()
yourtickets = WebInterfaceYourTicketsPages()
comments = WebInterfaceCommentsPages()
error = WebInterfaceErrorPages()
oai2d = WebInterfaceOAIProviderPages()
rss = WebInterfaceRSSFeedServicePages()
stats = WebInterfaceStatsPages()
journal = WebInterfaceJournalPages()
help = WebInterfaceDocumentationPages()
unapi = WebInterfaceUnAPIPages()
exporter = WebInterfaceFieldExporterPages()
kb = WebInterfaceBibKnowledgePages()
admin2 = WebInterfaceAdminPages()
batchuploader = WebInterfaceBatchUploaderPages()
bibsword = WebInterfaceSword()
ping = WebInterfacePingPages()
person = WebInterfaceBibAuthorIDPages()
linkbacks = WebInterfaceRecentLinkbacksPages()
#redirects author to the new webauthor
author = WebInterfaceWebAuthorPages()
#author = WebInterfaceAuthorPages()
textmining = WebInterfaceDocExtract()
yourcomments = WebInterfaceYourCommentsPages()
goto = WebInterfaceGotoPages()
# This creates the 'handler' function, which will be invoked directly
# by mod_python.
invenio_handler = create_handler(WebInterfaceInvenio())
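The `_exports` list above wires URL segments to attributes on the interface object, which the mod_python handler then resolves per request. A minimal, self-contained Python sketch of that attribute-based dispatch pattern (the class and function names here are illustrative, not Invenio's actual API):

```python
class DemoInterface:
    """Toy URL-layout object: exported names resolve to handler attributes."""
    _exports = ['ping', 'help']

    def ping(self):
        return "pong"

    def help(self):
        return "documentation index"


def dispatch(root, path):
    """Resolve the first path segment against root._exports, roughly as
    the mod_python publisher does for WebInterfaceInvenio."""
    segment = path.strip('/').split('/')[0]
    if segment not in root._exports:
        raise LookupError("no handler exported for %r" % segment)
    return getattr(root, segment)()
```

Unknown segments fail instead of exposing arbitrary object attributes, which is the point of keeping an explicit `_exports` whitelist.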
diff --git a/invenio/legacy/batchuploader/cli.py b/invenio/legacy/batchuploader/cli.py
index 59483a2fe..1ad374849 100644
--- a/invenio/legacy/batchuploader/cli.py
+++ b/invenio/legacy/batchuploader/cli.py
@@ -1,146 +1,146 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
""" Invenio bibsched task for uploading multiple documents
or metadata files. This task can run in two different modes:
metadata or documents.
The parent directory where the metadata and documents
folders are expected to be found has to be specified
in the Invenio config file.
"""
import os.path
__revision__ = "$Id$"
import sys
import os
import time
import tempfile
import shutil
from invenio.config import CFG_TMPSHAREDDIR, \
CFG_BATCHUPLOADER_DAEMON_DIR, \
CFG_BATCHUPLOADER_FILENAME_MATCHING_POLICY, \
CFG_PREFIX
-from invenio.bibtask import task_init, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, task_set_option, \
task_get_option, task_update_progress, task_low_level_submission, \
write_message, task_sleep_now_if_required
from invenio.batchuploader_engine import document_upload
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key, checks its meaning and returns True if
has elaborated the key.
Possible keys:
"""
if key in ('-d', '--documents'):
task_set_option('documents', "documents")
return True
elif key in ('-m', '--metadata'):
task_set_option('metadata', "metadata")
return True
return False
def task_run_core():
""" Walks through all directories where metadata files are located
and uploads them.
Files are then moved to the corresponding DONE folders.
"""
daemon_dir = CFG_BATCHUPLOADER_DAEMON_DIR[0] == '/' and CFG_BATCHUPLOADER_DAEMON_DIR \
or CFG_PREFIX + '/' + CFG_BATCHUPLOADER_DAEMON_DIR
# Check if directory /batchupload exists
if not task_get_option('documents'):
# Metadata upload
parent_dir = daemon_dir + "/metadata/"
progress = 0
try:
os.makedirs(parent_dir)
except OSError:
pass
for folder in ["insert/", "append/", "correct/", "replace/"]:
files_dir = parent_dir + folder
files_done_dir = files_dir + "DONE/"
try:
files = os.listdir(files_dir)
except OSError, e:
os.mkdir(files_dir)
files = []
write_message(e, sys.stderr)
# Create directory DONE/ if it doesn't exist
try:
os.mkdir(files_done_dir)
except OSError:
# Directory exists
pass
for metafile in files:
if os.path.isfile(os.path.join(files_dir, metafile)):
# Create temporary file to be uploaded
(fd, filename) = tempfile.mkstemp(prefix=metafile + "_" + time.strftime("%Y%m%d%H%M%S", time.localtime()) + "_", dir=CFG_TMPSHAREDDIR)
shutil.copy(os.path.join(files_dir, metafile), filename)
# Send bibsched task
mode = "-" + folder[0]
jobid = str(task_low_level_submission('bibupload', 'batchupload', mode, filename))
# Move file to done folder
filename = metafile + "_" + time.strftime("%Y%m%d%H%M%S", time.localtime()) + "_" + jobid
os.rename(os.path.join(files_dir, metafile), os.path.join(files_done_dir, filename))
task_sleep_now_if_required(can_stop_too=True)
progress += 1
task_update_progress("Done %d out of 4." % progress)
else:
# Documents upload
parent_dir = daemon_dir + "/documents/"
try:
os.makedirs(parent_dir)
except OSError:
pass
matching_order = CFG_BATCHUPLOADER_FILENAME_MATCHING_POLICY
for folder in ["append/", "revise/"]:
try:
os.mkdir(parent_dir + folder)
except:
pass
for matching in matching_order:
errors = document_upload(folder=parent_dir + folder, matching=matching, mode=folder[:-1])[0]
if not errors:
break # All documents succeeded with that matching
for error in errors:
write_message("File: %s - %s with matching %s" % (error[0], error[1], matching), sys.stderr)
task_sleep_now_if_required(can_stop_too=True)
return 1
def main():
""" Main that constructs all the bibtask. """
task_init(authorization_action='runbatchuploader',
authorization_msg="Batch Uploader",
description="""Description:
The batch uploader has two different run modes.
If --metadata is specified (by default) then all files in folders insert,
append, correct and replace are uploaded using the corresponding mode.
If mode --documents is selected all documents present in folders named
append and revise are uploaded using the corresponding mode.
Parent directory for batch uploader must be specified in the
invenio configuration file.\n""",
help_specific_usage=""" -m, --metadata\t Batch Uploader will look for metadata files in the corresponding folders
-d, --documents\t Batch Uploader will look for documents in the corresponding folders
""",
version=__revision__,
specific_params=("md:", ["metadata", "documents"]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
if __name__ == '__main__':
main()
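The per-folder loop in `task_run_core` above follows one workflow: copy each metadata file to a timestamped temporary copy for the upload task, then archive the original under `DONE/` with a timestamp and job-id suffix. A standalone Python 3 sketch of that workflow, where `submit` is a hypothetical stand-in for `task_low_level_submission`:

```python
import os
import shutil
import tempfile
import time

def process_metadata_folder(files_dir, submit):
    """Copy each regular file to a timestamped temporary upload copy,
    hand that copy to submit(), then archive the original under DONE/
    with a timestamp and job-id suffix (mirrors task_run_core above).

    submit stands in for task_low_level_submission and is assumed to
    return a job id string."""
    done_dir = os.path.join(files_dir, "DONE")
    os.makedirs(done_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(files_dir)):
        src = os.path.join(files_dir, name)
        if not os.path.isfile(src):
            continue  # skip DONE/ and any other subdirectories
        stamp = time.strftime("%Y%m%d%H%M%S", time.localtime())
        fd, tmp_copy = tempfile.mkstemp(prefix=name + "_" + stamp + "_")
        os.close(fd)
        shutil.copy(src, tmp_copy)      # the copy handed to the upload task
        jobid = submit(tmp_copy)
        done_name = "%s_%s_%s" % (name, stamp, jobid)
        os.rename(src, os.path.join(done_dir, done_name))
        moved.append(done_name)
    return moved
```

Archiving under `DONE/` only after a job id is obtained means a crashed run leaves unprocessed files in place to be retried.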
diff --git a/invenio/legacy/batchuploader/engine.py b/invenio/legacy/batchuploader/engine.py
index bfe62145e..902e9a92a 100644
--- a/invenio/legacy/batchuploader/engine.py
+++ b/invenio/legacy/batchuploader/engine.py
@@ -1,663 +1,663 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Batch Uploader core functions. Uploading metadata and documents.
"""
import os
import pwd
import grp
import sys
import time
import tempfile
import cgi
import re
from invenio.legacy.dbquery import run_sql, Error
from invenio.modules.access.engine import acc_authorize_action
from invenio.legacy.webuser import collect_user_info, page_not_authorized
from invenio.config import CFG_BINDIR, CFG_TMPSHAREDDIR, CFG_LOGDIR, \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, \
CFG_OAI_ID_FIELD, CFG_BATCHUPLOADER_DAEMON_DIR, \
CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS, \
CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS, \
CFG_PREFIX, CFG_SITE_LANG
from invenio.utils.text import encode_for_xml
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.base.i18n import gettext_set_language
from invenio.legacy.bibrecord.scripts.textmarc2xmlmarc import transform_file
from invenio.utils.shell import run_shell_command
-from invenio.bibupload import xml_marc_to_records, bibupload
+from invenio.legacy.bibupload.engine import xml_marc_to_records, bibupload
-import invenio.bibupload as bibupload_module
+import invenio.legacy.bibupload as bibupload_module
from invenio.legacy.bibrecord import create_records, \
record_strip_empty_volatile_subfields, \
record_strip_empty_fields
try:
from cStringIO import StringIO
except ImportError:
from StringIO import StringIO
PERMITTED_MODES = ['-i', '-r', '-c', '-a', '-ir',
'--insert', '--replace', '--correct', '--append']
_CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS_RE = re.compile(CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS)
def cli_allocate_record(req):
req.content_type = "text/plain"
req.send_http_header()
# check IP and useragent:
if not _check_client_ip(req):
msg = "[ERROR] Sorry, client IP %s cannot use the service." % _get_client_ip(req)
_log(msg)
return _write(req, msg)
if not _check_client_useragent(req):
msg = '[ERROR] Sorry, the "%s" useragent cannot use the service.' % _get_useragent(req)
_log(msg)
return _write(req, msg)
recid = run_sql("insert into bibrec (creation_date,modification_date) values(NOW(),NOW())")
return recid
def cli_upload(req, file_content=None, mode=None, callback_url=None, nonce=None, special_treatment=None):
""" Robot interface for uploading MARC files
"""
req.content_type = "text/plain"
req.send_http_header()
# check IP and useragent:
if not _check_client_ip(req):
msg = "[ERROR] Sorry, client IP %s cannot use the service." % _get_client_ip(req)
_log(msg)
return _write(req, msg)
if not _check_client_useragent(req):
msg = "[ERROR] Sorry, the %s useragent cannot use the service." % _get_useragent(req)
_log(msg)
return _write(req, msg)
arg_mode = mode
if not arg_mode:
msg = "[ERROR] Please specify upload mode to use."
_log(msg)
return _write(req, msg)
if not arg_mode in PERMITTED_MODES:
msg = "[ERROR] Invalid upload mode."
_log(msg)
return _write(req, msg)
arg_file = file_content
if hasattr(arg_file, 'read'):
## We've been passed a readable file, e.g. req
arg_file = arg_file.read()
if not arg_file:
msg = "[ERROR] Please provide a body to your request."
_log(msg)
return _write(req, msg)
else:
if not arg_file:
msg = "[ERROR] Please specify file body to input."
_log(msg)
return _write(req, msg)
if hasattr(arg_file, "filename"):
arg_file = arg_file.value
else:
msg = "[ERROR] 'file' parameter must be a (single) file"
_log(msg)
return _write(req, msg)
# write temporary file:
(fd, filename) = tempfile.mkstemp(prefix="batchupload_" + \
time.strftime("%Y%m%d%H%M%S", time.localtime()) + "_",
dir=CFG_TMPSHAREDDIR)
filedesc = os.fdopen(fd, 'w')
filedesc.write(arg_file)
filedesc.close()
# check if this client can run this file:
client_ip = _get_client_ip(req)
permitted_dbcollids = CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS[client_ip]
if permitted_dbcollids != ['*']: # wildcard
allow = _check_client_can_submit_file(client_ip, filename, req, 0)
if not allow:
msg = "[ERROR] Cannot submit such a file from this IP. (Wrong collection.)"
_log(msg)
return _write(req, msg)
# check validity of marcxml
xmlmarclint_path = CFG_BINDIR + '/xmlmarclint'
xmlmarclint_output, dummy1, dummy2 = run_shell_command('%s %s' % (xmlmarclint_path, filename))
if xmlmarclint_output != 0:
msg = "[ERROR] MARCXML is not valid."
_log(msg)
return _write(req, msg)
args = ['bibupload', "batchupload", arg_mode, filename]
# run upload command
if callback_url:
args += ["--callback-url", callback_url]
if nonce:
args += ["--nonce", nonce]
if special_treatment:
args += ["--special-treatment", special_treatment]
task_low_level_submission(*args)
msg = "[INFO] %s" % ' '.join(args)
_log(msg)
return _write(req, msg)
def metadata_upload(req, metafile=None, filetype=None, mode=None, exec_date=None,
exec_time=None, metafilename=None, ln=CFG_SITE_LANG,
priority="1", email_logs_to=None):
"""
Metadata web upload service. Get upload parameters and exec bibupload for the given file.
Finally, write upload history.
@return: tuple (error code, message)
error code: code that indicates if an error occurred
message: message describing the error
"""
# start output:
req.content_type = "text/html"
req.send_http_header()
error_codes = {'not_authorized': 1}
# write temporary file:
if filetype != 'marcxml':
metafile = _transform_input_to_marcxml(file_input=metafile)
user_info = collect_user_info(req)
(fd, filename) = tempfile.mkstemp(prefix="batchupload_" + \
user_info['nickname'] + "_" + time.strftime("%Y%m%d%H%M%S",
time.localtime()) + "_", dir=CFG_TMPSHAREDDIR)
filedesc = os.fdopen(fd, 'w')
filedesc.write(metafile)
filedesc.close()
# check if this client can run this file:
if req is not None:
allow = _check_client_can_submit_file(req=req, metafile=metafile, webupload=1, ln=ln)
if allow[0] != 0:
return (error_codes['not_authorized'], allow[1])
# run upload command:
task_arguments = ('bibupload', user_info['nickname'], mode, "--name=" + metafilename, "--priority=" + priority)
if exec_date:
date = exec_date
if exec_time:
date += ' ' + exec_time
task_arguments += ("-t", date)
if email_logs_to:
task_arguments += ('--email-logs-to', email_logs_to)
task_arguments += (filename, )
jobid = task_low_level_submission(*task_arguments)
# write batch upload history
run_sql("""INSERT INTO hstBATCHUPLOAD (user, submitdate,
filename, execdate, id_schTASK, batch_mode)
VALUES (%s, NOW(), %s, %s, %s, "metadata")""",
(user_info['nickname'], metafilename,
exec_date != "" and (exec_date + ' ' + exec_time)
or time.strftime("%Y-%m-%d %H:%M:%S"), str(jobid), ))
return (0, "Task %s queued" % str(jobid))
def document_upload(req=None, folder="", matching="", mode="", exec_date="", exec_time="", ln=CFG_SITE_LANG, priority="1", email_logs_to=None):
""" Take files from the given directory and upload them with the appropiate mode.
@parameters:
+ folder: Folder where the files to upload are stored
+ matching: How to match file names with record fields (report number, barcode,...)
+ mode: Upload mode (append, revise, replace)
@return: tuple (file, error code)
file: file name causing the error to notify the user
error code:
1 - More than one possible recID, ambiguous behaviour
2 - No records match that file name
3 - File already exists
"""
import sys
- from invenio.bibdocfile import BibRecDocs, file_strip_ext
+ from invenio.legacy.bibdocfile.api import BibRecDocs, file_strip_ext
from invenio.utils.hash import md5
import shutil
from invenio.legacy.search_engine import perform_request_search, \
search_pattern, \
guess_collection_of_a_record
_ = gettext_set_language(ln)
errors = []
info = [0, []] # Number of files read, name of the files
try:
files = os.listdir(folder)
except OSError, error:
errors.append(("", error))
return errors, info
err_desc = {1: _("More than one possible recID, ambiguous behaviour"), 2: _("No records match that file name"),
3: _("File already exists"), 4: _("A file with the same name and format already exists"),
5: _("No rights to upload to collection '%s'")}
# Create directory DONE/ if it doesn't exist
folder = (folder[-1] == "/") and folder or (folder + "/")
files_done_dir = folder + "DONE/"
try:
os.mkdir(files_done_dir)
except OSError:
# Directory exists or no write permission
pass
for docfile in files:
if os.path.isfile(os.path.join(folder, docfile)):
info[0] += 1
identifier = file_strip_ext(docfile)
extension = docfile[len(identifier):]
rec_id = None
if identifier:
rec_id = search_pattern(p=identifier, f=matching, m='e')
if not rec_id:
errors.append((docfile, err_desc[2]))
continue
elif len(rec_id) > 1:
errors.append((docfile, err_desc[1]))
continue
else:
rec_id = str(list(rec_id)[0])
rec_info = BibRecDocs(rec_id)
if rec_info.bibdocs:
for bibdoc in rec_info.bibdocs:
attached_files = bibdoc.list_all_files()
file_md5 = md5(open(os.path.join(folder, docfile), "rb").read()).hexdigest()
num_errors = len(errors)
for attached_file in attached_files:
if attached_file.checksum == file_md5:
errors.append((docfile, err_desc[3]))
break
elif attached_file.get_full_name() == docfile:
errors.append((docfile, err_desc[4]))
break
if len(errors) > num_errors:
continue
# Check if user has rights to upload file
if req is not None:
file_collection = guess_collection_of_a_record(int(rec_id))
auth_code, auth_message = acc_authorize_action(req, 'runbatchuploader', collection=file_collection)
if auth_code != 0:
error_msg = err_desc[5] % file_collection
errors.append((docfile, error_msg))
continue
# Move document to be uploaded to temporary folder
(fd, tmp_file) = tempfile.mkstemp(prefix=identifier + "_" + time.strftime("%Y%m%d%H%M%S", time.localtime()) + "_", suffix=extension, dir=CFG_TMPSHAREDDIR)
shutil.copy(os.path.join(folder, docfile), tmp_file)
# Create MARC temporary file with FFT tag and call bibupload
(fd, filename) = tempfile.mkstemp(prefix=identifier + '_', dir=CFG_TMPSHAREDDIR)
filedesc = os.fdopen(fd, 'w')
marc_content = """ <record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="n">%(name)s</subfield>
<subfield code="a">%(path)s</subfield>
</datafield>
</record> """ % {'rec_id': rec_id,
'name': encode_for_xml(identifier),
'path': encode_for_xml(tmp_file),
}
filedesc.write(marc_content)
filedesc.close()
info[1].append(docfile)
user = ""
if req is not None:
user_info = collect_user_info(req)
user = user_info['nickname']
if not user:
user = "batchupload"
# Execute bibupload with the appropriate mode
task_arguments = ('bibupload', user, "--" + mode, "--name=" + docfile, "--priority=" + priority)
if exec_date:
date = '--runtime=' + "\'" + exec_date + ' ' + exec_time + "\'"
task_arguments += (date, )
if email_logs_to:
task_arguments += ("--email-logs-to", email_logs_to)
task_arguments += (filename, )
jobid = task_low_level_submission(*task_arguments)
# write batch upload history
run_sql("""INSERT INTO hstBATCHUPLOAD (user, submitdate,
filename, execdate, id_schTASK, batch_mode)
VALUES (%s, NOW(), %s, %s, %s, "document")""",
(user_info['nickname'], docfile,
exec_date != "" and (exec_date + ' ' + exec_time)
or time.strftime("%Y-%m-%d %H:%M:%S"), str(jobid)))
# Move file to DONE folder
done_filename = docfile + "_" + time.strftime("%Y%m%d%H%M%S", time.localtime()) + "_" + str(jobid)
try:
os.rename(os.path.join(folder, docfile), os.path.join(files_done_dir, done_filename))
except OSError:
errors.append('MoveError')
return errors, info
def get_user_metadata_uploads(req):
"""Retrieve all metadata upload history information for a given user"""
user_info = collect_user_info(req)
upload_list = run_sql("""SELECT DATE_FORMAT(h.submitdate, '%%Y-%%m-%%d %%H:%%i:%%S'), \
h.filename, DATE_FORMAT(h.execdate, '%%Y-%%m-%%d %%H:%%i:%%S'), \
s.status \
FROM hstBATCHUPLOAD h INNER JOIN schTASK s \
ON h.id_schTASK = s.id \
WHERE h.user=%s and h.batch_mode="metadata"
ORDER BY h.submitdate DESC""", (user_info['nickname'],))
return upload_list
def get_user_document_uploads(req):
"""Retrieve all document upload history information for a given user"""
user_info = collect_user_info(req)
upload_list = run_sql("""SELECT DATE_FORMAT(h.submitdate, '%%Y-%%m-%%d %%H:%%i:%%S'), \
h.filename, DATE_FORMAT(h.execdate, '%%Y-%%m-%%d %%H:%%i:%%S'), \
s.status \
FROM hstBATCHUPLOAD h INNER JOIN schTASK s \
ON h.id_schTASK = s.id \
WHERE h.user=%s and h.batch_mode="document"
ORDER BY h.submitdate DESC""", (user_info['nickname'],))
return upload_list
def get_daemon_doc_files():
""" Return all files found in batchuploader document folders """
files = {}
for folder in ['/revise', '/append']:
try:
daemon_dir = CFG_BATCHUPLOADER_DAEMON_DIR[0] == '/' and CFG_BATCHUPLOADER_DAEMON_DIR \
or CFG_PREFIX + '/' + CFG_BATCHUPLOADER_DAEMON_DIR
directory = daemon_dir + '/documents' + folder
files[directory] = [(filename, []) for filename in os.listdir(directory) if os.path.isfile(os.path.join(directory, filename))]
for file_instance, info in files[directory]:
stat_info = os.lstat(os.path.join(directory, file_instance))
info.append("%s" % pwd.getpwuid(stat_info.st_uid)[0]) # Owner
info.append("%s" % grp.getgrgid(stat_info.st_gid)[0]) # Group
info.append("%d" % stat_info.st_size) # Size
time_stat = stat_info.st_mtime
time_fmt = "%Y-%m-%d %R"
info.append(time.strftime(time_fmt, time.gmtime(time_stat))) # Last modified
except OSError:
pass
return files
def get_daemon_meta_files():
""" Return all files found in batchuploader metadata folders """
files = {}
for folder in ['/correct', '/replace', '/insert', '/append']:
try:
daemon_dir = CFG_BATCHUPLOADER_DAEMON_DIR[0] == '/' and CFG_BATCHUPLOADER_DAEMON_DIR \
or CFG_PREFIX + '/' + CFG_BATCHUPLOADER_DAEMON_DIR
directory = daemon_dir + '/metadata' + folder
files[directory] = [(filename, []) for filename in os.listdir(directory) if os.path.isfile(os.path.join(directory, filename))]
for file_instance, info in files[directory]:
stat_info = os.lstat(os.path.join(directory, file_instance))
info.append("%s" % pwd.getpwuid(stat_info.st_uid)[0]) # Owner
info.append("%s" % grp.getgrgid(stat_info.st_gid)[0]) # Group
info.append("%d" % stat_info.st_size) # Size
time_stat = stat_info.st_mtime
time_fmt = "%Y-%m-%d %R"
info.append(time.strftime(time_fmt, time.gmtime(time_stat))) # Last modified
except OSError:
pass
return files
def user_authorization(req, ln):
""" Check user authorization to visit page """
auth_code, auth_message = acc_authorize_action(req, 'runbatchuploader')
if auth_code != 0:
referer = '/batchuploader/'
return page_not_authorized(req=req, referer=referer,
text=auth_message, navmenuid="batchuploader")
else:
return None
def perform_basic_upload_checks(xml_record):
""" Performs tests that would provoke the bibupload task to fail with
an exit status 1, to prevent batchupload from crashing while alerting
the user about the issue
"""
- from invenio.bibupload import writing_rights_p
+ from invenio.legacy.bibupload.engine import writing_rights_p
errors = []
if not writing_rights_p():
errors.append("Error: BibUpload does not have rights to write fulltext files.")
recs = create_records(xml_record, 1, 1)
if recs == []:
errors.append("Error: Cannot parse MARCXML file.")
elif recs[0][0] is None:
errors.append("Error: MARCXML file has wrong format: %s" % recs)
return errors
def perform_upload_check(xml_record, mode):
""" Performs a upload simulation with the given record and mode
@return: string describing errors
@rtype: string
"""
error_cache = []
def my_writer(msg, stream=sys.stdout, verbose=1):
if verbose == 1:
if 'DONE' not in msg:
error_cache.append(msg.strip())
orig_writer = bibupload_module.write_message
bibupload_module.write_message = my_writer
error_cache.extend(perform_basic_upload_checks(xml_record))
if error_cache:
# There has been some critical error
return '\n'.join(error_cache)
recs = xml_marc_to_records(xml_record)
try:
upload_mode = mode[2:]
# Adapt input data for bibupload function
if upload_mode == "r insert-or-replace":
upload_mode = "replace_or_insert"
for record in recs:
if record:
record_strip_empty_volatile_subfields(record)
record_strip_empty_fields(record)
bibupload(record, opt_mode=upload_mode, pretend=True)
finally:
bibupload_module.write_message = orig_writer
return '\n'.join(error_cache)
def _get_useragent(req):
"""Return client user agent from req object."""
user_info = collect_user_info(req)
return user_info['agent']
def _get_client_ip(req):
"""Return client IP address from req object."""
return str(req.remote_ip)
def _check_client_ip(req):
"""
Is this client permitted to use the service?
"""
client_ip = _get_client_ip(req)
if client_ip in CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS.keys():
return True
return False
def _check_client_useragent(req):
"""
Is this user agent permitted to use the service?
"""
client_useragent = _get_useragent(req)
if _CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS_RE.match(client_useragent):
return True
return False
def _check_client_can_submit_file(client_ip="", metafile="", req=None, webupload=0, ln=CFG_SITE_LANG):
"""
Is this client able to upload such a FILENAME?
check 980 $a values and collection tags in the file to see if they are among the
permitted ones as specified by CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS and ACC_AUTHORIZE_ACTION.
Useful to make sure that the client does not override other records by
mistake.
"""
_ = gettext_set_language(ln)
recs = create_records(metafile, 0, 0)
user_info = collect_user_info(req)
filename_tag980_values = _detect_980_values_from_marcxml_file(recs)
for filename_tag980_value in filename_tag980_values:
if not filename_tag980_value:
if not webupload:
return False
else:
return(1, "Invalid collection in tag 980")
if not webupload:
if not filename_tag980_value in CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS[client_ip]:
return False
else:
auth_code, auth_message = acc_authorize_action(req, 'runbatchuploader', collection=filename_tag980_value)
if auth_code != 0:
error_msg = _("The user '%(x_user)s' is not authorized to modify collection '%(x_coll)s'") % \
{'x_user': user_info['nickname'], 'x_coll': filename_tag980_value}
return (auth_code, error_msg)
filename_rec_id_collections = _detect_collections_from_marcxml_file(recs)
for filename_rec_id_collection in filename_rec_id_collections:
if not webupload:
if not filename_rec_id_collection in CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS[client_ip]:
return False
else:
auth_code, auth_message = acc_authorize_action(req, 'runbatchuploader', collection=filename_rec_id_collection)
if auth_code != 0:
error_msg = _("The user '%(x_user)s' is not authorized to modify collection '%(x_coll)s'") % \
{'x_user': user_info['nickname'], 'x_coll': filename_rec_id_collection}
return (auth_code, error_msg)
if not webupload:
return True
else:
return (0, " ")
def _detect_980_values_from_marcxml_file(recs):
"""
Read MARCXML file and return list of 980 $a values found in that file.
Useful for checking rights.
"""
from invenio.legacy.bibrecord import record_get_field_values
collection_tag = run_sql("SELECT value FROM tag, field_tag, field \
WHERE tag.id=field_tag.id_tag AND \
field_tag.id_field=field.id AND \
field.code='collection'")
collection_tag = collection_tag[0][0]
dbcollids = {}
for rec, dummy1, dummy2 in recs:
if rec:
for tag980 in record_get_field_values(rec,
tag=collection_tag[:3],
ind1=collection_tag[3],
ind2=collection_tag[4],
code=collection_tag[5]):
dbcollids[tag980] = 1
return dbcollids.keys()
def _detect_collections_from_marcxml_file(recs):
"""
Extract all possible recIDs from MARCXML file and guess collections
for these recIDs.
"""
from invenio.legacy.bibrecord import record_get_field_values
from invenio.legacy.search_engine import guess_collection_of_a_record
- from invenio.bibupload import find_record_from_sysno, \
+ from invenio.legacy.bibupload.engine import find_record_from_sysno, \
find_records_from_extoaiid, \
find_record_from_oaiid
dbcollids = {}
sysno_tag = CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG
oaiid_tag = CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG
oai_tag = CFG_OAI_ID_FIELD
for rec, dummy1, dummy2 in recs:
if rec:
for tag001 in record_get_field_values(rec, '001'):
collection = guess_collection_of_a_record(int(tag001))
dbcollids[collection] = 1
for tag_sysno in record_get_field_values(rec, tag=sysno_tag[:3],
ind1=sysno_tag[3],
ind2=sysno_tag[4],
code=sysno_tag[5]):
record = find_record_from_sysno(tag_sysno)
if record:
collection = guess_collection_of_a_record(int(record))
dbcollids[collection] = 1
for tag_oaiid in record_get_field_values(rec, tag=oaiid_tag[:3],
ind1=oaiid_tag[3],
ind2=oaiid_tag[4],
code=oaiid_tag[5]):
try:
records = find_records_from_extoaiid(tag_oaiid)
except Error:
records = []
if records:
record = records.pop()
collection = guess_collection_of_a_record(int(record))
dbcollids[collection] = 1
for tag_oai in record_get_field_values(rec, tag=oai_tag[0:3],
ind1=oai_tag[3],
ind2=oai_tag[4],
code=oai_tag[5]):
record = find_record_from_oaiid(tag_oai)
if record:
collection = guess_collection_of_a_record(int(record))
dbcollids[collection] = 1
return dbcollids.keys()
def _transform_input_to_marcxml(file_input=""):
"""
Takes text-marc as input and transforms it
to MARCXML.
"""
# Create temporary file to read from
tmp_fd, filename = tempfile.mkstemp(dir=CFG_TMPSHAREDDIR)
os.write(tmp_fd, file_input)
os.close(tmp_fd)
try:
# Redirect output, transform, restore old references
old_stdout = sys.stdout
new_stdout = StringIO()
sys.stdout = new_stdout
transform_file(filename)
finally:
sys.stdout = old_stdout
return new_stdout.getvalue()
def _log(msg, logfile="webupload.log"):
"""
Log MSG into LOGFILE with timestamp.
"""
filedesc = open(CFG_LOGDIR + "/" + logfile, "a")
filedesc.write(time.strftime("%Y-%m-%d %H:%M:%S") + " --> " + msg + "\n")
filedesc.close()
return
def _write(req, msg):
"""
Write MSG to the output stream for the end user.
"""
req.write(msg + "\n")
return
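`_transform_input_to_marcxml` above captures the converter's output by temporarily pointing `sys.stdout` at a `StringIO` and restoring the real stream in a `finally` block. A standalone Python 3 sketch of that capture pattern:

```python
import sys
from io import StringIO

def capture_stdout(fn, *args):
    """Run fn while sys.stdout points at a StringIO buffer, restoring
    the real stream in a finally block, and return what was printed."""
    old_stdout = sys.stdout
    buf = StringIO()
    sys.stdout = buf
    try:
        fn(*args)
    finally:
        sys.stdout = old_stdout  # always restore, even if fn raises
    return buf.getvalue()
```

The `finally` clause matters: without it, an exception inside the wrapped call would leave the process with its stdout still redirected.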
diff --git a/invenio/legacy/bibauthorid/backinterface.py b/invenio/legacy/bibauthorid/backinterface.py
index 8ac343561..9ca358d61 100644
--- a/invenio/legacy/bibauthorid/backinterface.py
+++ b/invenio/legacy/bibauthorid/backinterface.py
@@ -1,122 +1,122 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
bibauthorid_backinterface
This file aims to filter and modify the interface given by
bibauthorid_dbinterface in order to make it usable by the
backend so as to keep it as clean as possible.
'''
from itertools import groupby
from operator import itemgetter
#Well this is bad, BUT otherwise there must be 100+ lines
#of the form: from dbinterface import ...
-from invenio.bibauthorid_dbinterface import * #pylint: disable-msg=W0614
+from invenio.legacy.bibauthorid.dbinterface import * #pylint: disable-msg=W0614
import invenio.bibauthorid_dbinterface as dbinter
def group_personid(papers_table="aidPERSONID_PAPERS", data_table="aidPERSONID_DATA"):
'''
Extracts, groups and returns the whole personid.
'''
papers = dbinter.get_full_personid_papers(papers_table)
data = dbinter.get_full_personid_data(data_table)
group = lambda x: groupby(sorted(x, key=itemgetter(0)), key=itemgetter(0))
to_dict = lambda x: dict((pid, map(itemgetter(slice(1, None)), data)) for pid, data in x)
return (to_dict(group(papers)), to_dict(group(data)))
def compare_personid_tables(personIDold_papers, personIDold_data,
personIDnew_papers, personIDnew_data, fp):
"""
Compares how personIDnew is different to personIDold.
The two arguments must be generated with group_personid.
fp must be a valid file object.
"""
header_new = "+++ "
# header_old = " "
header_removed = "--- "
def write_new_personid(pid):
fp.write(" Personid %d\n" % pid)
def write_end_personid():
fp.write("\n")
def write_paper(row, header):
fp.write("%s[PAPER] %s, signature %s %d %d, flag: %d, lcul: %d\n" % (header, row[3], row[0], row[1], row[2], row[4], row[5]))
def write_data(row, header):
tag = "[%s]" % row[0].upper()
fp.write("%s%s %s, opt: (%s %s %s)\n" % (header, tag, row[1], row[2], row[3], row[4]))
all_pids = (frozenset(personIDold_data.keys())
| frozenset(personIDnew_data.keys())
| frozenset(personIDold_papers.keys())
| frozenset(personIDnew_papers.keys()))
for pid in all_pids:
data_old = frozenset(personIDold_data.get(pid, frozenset()))
data_new = frozenset(personIDnew_data.get(pid, frozenset()))
# old_data = data_new & data_old
new_data = data_new - data_old
del_data = data_old - data_new
papers_old = frozenset(personIDold_papers.get(pid, frozenset()))
papers_new = frozenset(personIDnew_papers.get(pid, frozenset()))
# old_papers = papers_new & papers_old
new_papers = papers_new - papers_old
del_papers = papers_old - papers_new
if new_data or del_data or new_papers or del_papers:
write_new_personid(pid)
for arr, header in zip([new_data, del_data],
[header_new, header_removed]):
for row in arr:
write_data(row, header)
for arr, header in zip([new_papers, del_papers],
[header_new, header_removed]):
for row in arr:
write_paper(row, header)
write_end_personid()
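Stripped of the report formatting, the per-pid comparison above is just two set differences; a standalone sketch with hypothetical row tuples, not the real table layout:

```python
def diff_rows(old_rows, new_rows):
    """Return (added, removed) between two snapshots of rows."""
    old, new = frozenset(old_rows), frozenset(new_rows)
    return new - old, old - new

added, removed = diff_rows([('x', 1), ('y', 2)], [('y', 2), ('z', 3)])
# ('z', 3) is new, ('x', 1) was removed, ('y', 2) is unchanged
```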
def compare_personid_tables_easy(suffix='_copy', filename='/tmp/pid_comparison'):
f = open(filename, 'w')
oldPap, oldDat = group_personid('aidPERSONIDPAPERS' + suffix, 'aidPERSONIDDATA' + suffix)
pap, dat = group_personid('aidPERSONIDPAPERS', 'aidPERSONIDDATA')
compare_personid_tables(oldPap, oldDat, pap, dat, f)
f.close()
def filter_bibrecs_outside(all_papers):
all_bibrecs = get_all_bibrecs()
to_remove = list(frozenset(all_bibrecs) - frozenset(all_papers))
chunk = 1000
separated = [to_remove[i: i + chunk] for i in range(0, len(to_remove), chunk)]
for sep in separated:
remove_all_bibrecs(sep)
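The deletion above is batched via plain list slicing; a tiny sketch of the chunking pattern (the chunk size of 1000 in the original is arbitrary):

```python
def chunked(seq, size):
    # split a list into consecutive slices of at most `size` items
    return [seq[i:i + size] for i in range(0, len(seq), size)]

print(chunked(list(range(5)), 2))  # [[0, 1], [2, 3], [4]]
```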
diff --git a/invenio/legacy/bibauthorid/cli.py b/invenio/legacy/bibauthorid/cli.py
index ef043d0e6..e969901ef 100644
--- a/invenio/legacy/bibauthorid/cli.py
+++ b/invenio/legacy/bibauthorid/cli.py
@@ -1,40 +1,40 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
bibauthorid_cli
This module provides a command-line interface for BibAuthorID.
"""
-from invenio.bibauthorid_general_utils import bibauthor_print
+from invenio.legacy.bibauthorid.general_utils import bibauthor_print
def main():
"""Main function """
try:
import invenio.bibauthorid_daemon as daemon
except ImportError:
bibauthor_print("Hmm...No Daemon process running.")
return
daemon.bibauthorid_daemon()
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibauthorid/cluster_set.py b/invenio/legacy/bibauthorid/cluster_set.py
index 66ca0a116..6c0e0dbf7 100644
--- a/invenio/legacy/bibauthorid/cluster_set.py
+++ b/invenio/legacy/bibauthorid/cluster_set.py
@@ -1,307 +1,307 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from itertools import chain, groupby, izip, cycle
from operator import itemgetter
import cPickle
from invenio.bibauthorid_matrix_optimization import maximized_mapping
from invenio.bibauthorid_backinterface import save_cluster
from invenio.bibauthorid_backinterface import get_all_papers_of_pids
from invenio.bibauthorid_backinterface import get_bib10x, get_bib70x
from invenio.bibauthorid_backinterface import get_all_modified_names_from_personid
from invenio.bibauthorid_backinterface import get_signatures_from_bibrefs
-from invenio.bibauthorid_name_utils import generate_last_name_cluster_str
-from invenio.bibauthorid_general_utils import bibauthor_print
+from invenio.legacy.bibauthorid.name_utils import generate_last_name_cluster_str
+from invenio.legacy.bibauthorid.general_utils import bibauthor_print
#python2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_all as all
+from invenio.legacy.bibauthorid.general_utils import bai_all as all
class Blob(object):
def __init__(self, personid_records):
'''
@param personid_records:
A list of tuples: (personid, bibrefrec, flag).
Notice that all bibrefrecs should be the same
since the Blob represents only one bibrefrec.
'''
self.bib = personid_records[0][1]
assert all(p[1] == self.bib for p in personid_records), \
"All cluster sets should share the bibrefrec"
self.claimed = set()
self.assigned = set()
self.rejected = set()
for pid, _, flag in personid_records:
if flag > 1:
self.claimed.add(pid)
elif flag >= -1:
self.assigned.add(pid)
else:
self.rejected.add(pid)
def create_blobs_by_pids(pids):
'''
Returns a list of blobs for a given set of personids.
Blob is an object which describes all information
for a bibrefrec in the personid table.
@type pids: iterable of integers
'''
all_bibs = get_all_papers_of_pids(pids)
all_bibs = ((x[0], (int(x[1]), x[2], x[3]), x[4]) for x in all_bibs)
bibs_dict = groupby(sorted(all_bibs, key=itemgetter(1)), key=itemgetter(1))
blobs = [Blob(list(bibs)) for _, bibs in bibs_dict]
return blobs
def group_blobs(blobs):
'''
Separates the blobs into two groups
of objects - those with claims and
those without.
'''
# created from blobs, which are claimed
# [(bibrefrec, personid)]
union = []
# created from blobs, which are not claimed
# [(bibrefrec, personid/None, [personid])]
independent = []
for blob in blobs:
assert len(blob.claimed) + len(blob.assigned) == 1, \
"Each blob must have exactly one associated signature"
if len(blob.claimed) > 0:
union.append((blob.bib, list(blob.claimed)[0]))
else:
independent.append((blob.bib, list(blob.assigned)[0], list(blob.rejected)))
return (union, independent)
class ClusterSet(object):
class Cluster(object):
def __init__(self, bibs, hate=None):
# hate is a symmetrical relation
self.bibs = set(bibs)
if hate:
self.hate = set(hate)
else:
self.hate = set([])
self.personid = None
def hates(self, other):
return other in self.hate
def quarrel(self, cl2):
self.hate.add(cl2)
cl2.hate.add(self)
def _debug_test_hate_relation(self):
for cl2 in self.hate:
if not self.hates(cl2) or not cl2.hates(self):
return False
return True
def __init__(self):
self.clusters = []
self.update_bibs()
self.num_all_bibs = None
self.last_name = None
def update_bibs(self):
self.num_all_bibs = sum(len(cl.bibs) for cl in self.clusters)
def all_bibs(self):
return chain.from_iterable(cl.bibs for cl in self.clusters)
def create_skeleton(self, personids, last_name):
blobs = create_blobs_by_pids(personids)
self.last_name = last_name
union, independent = group_blobs(blobs)
union_clusters = {}
for uni in union:
union_clusters[uni[1]] = union_clusters.get(uni[1], []) + [uni[0]]
cluster_dict = dict((personid, self.Cluster(bibs)) for personid, bibs in union_clusters.items())
self.clusters = cluster_dict.values()
for i, cl in enumerate(self.clusters):
cl.hate = set(chain(self.clusters[:i], self.clusters[i + 1:]))
for ind in independent:
bad_clusters = [cluster_dict[i] for i in ind[2] if i in cluster_dict]
cl = self.Cluster([ind[0]], bad_clusters)
for bcl in bad_clusters:
bcl.hate.add(cl)
self.clusters.append(cl)
self.update_bibs()
return self
# Creates a cluster set, ignoring the claims and the
# rejected papers.
def create_pure(self, personids, last_name):
blobs = create_blobs_by_pids(personids)
self.last_name = last_name
self.clusters = [self.Cluster((blob.bib,)) for blob in blobs]
self.update_bibs()
return self
# no longer used
def create_body(self, blobs):
union, independent = group_blobs(blobs)
arranged_clusters = {}
for cls in chain(union, independent):
arranged_clusters[cls[1]] = arranged_clusters.get(cls[1], []) + [cls[0]]
for pid, bibs in arranged_clusters.items():
cl = self.Cluster(bibs)
cl.personid = pid
self.clusters.append(cl)
self.update_bibs()
return self
def create_from_mark(self, bibrefs, last_name):
bibrecrefs = get_signatures_from_bibrefs(bibrefs)
self.clusters = [ClusterSet.Cluster([bib]) for bib in bibrecrefs]
self.last_name = last_name
self.update_bibs()
return self
# a *very* slow function checking whether the hate relation is still symmetrical
def _debug_test_hate_relation(self):
for cl1 in self.clusters:
if not cl1._debug_test_hate_relation():
return False
return True
# similar to the function above
def _debug_duplicated_recs(self, mapping=None):
for cl in self.clusters:
if mapping:
setty = set(mapping[x][2] for x in cl.bibs)
else:
setty = set(x[2] for x in cl.bibs)
if len(cl.bibs) != len(setty):
return False
return True
# No longer used but it might be handy.
@staticmethod
def match_cluster_sets(cs1, cs2):
"""
This function tries to generate the best matching
between cs1 and cs2 according to the shared bibrefrecs.
It returns a dictionary with keys being clusters in cs1
and values being clusters in cs2.
@param and type of cs1 and cs2: cluster_set
@return: dictionary with the matching clusters.
@return type: { cluster : cluster }
"""
matr = [[len(cl1.bibs & cl2.bibs) for cl2 in cs2.clusters] for cl1 in cs1.clusters]
mapping = maximized_mapping(matr)
return dict((cs1.clusters[mappy[0]], cs2.clusters[mappy[1]]) for mappy in mapping)
def store(self):
'''
Stores the cluster set in a special table.
This is used to store the results of
tortoise/wedge in a table and later merge them
with personid.
'''
named_clusters = (("%s.%d" % (self.last_name, idx), cl) for idx, cl in enumerate(self.clusters))
map(save_cluster, named_clusters)
def delayed_create_from_mark(bibrefs, last_name):
def ret():
return ClusterSet().create_from_mark(bibrefs, last_name)
return ret
def delayed_cluster_sets_from_marktables(limit_to_surnames=False):
# { name -> [(table, bibref)] }
bibauthor_print('Delayed_cluster_set_from_marktables limited to %s' % str(limit_to_surnames))
name_buket = {}
if limit_to_surnames:
limit_to_surnames = set([generate_last_name_cluster_str(s) for s in limit_to_surnames])
for tab, ref, name in chain(izip(cycle((100,)), *izip(*get_bib10x())),
izip(cycle((700,)), *izip(*get_bib70x()))):
name = generate_last_name_cluster_str(name)
if limit_to_surnames and not name in limit_to_surnames:
continue
name_buket[name] = name_buket.get(name, []) + [(tab, ref)]
bibauthor_print('Delayed_cluster_set_from_marktables going to get %s signatures....' % str(len(name_buket)))
all_refs = ((name, refs, len(list(get_signatures_from_bibrefs(refs))))
for name, refs in name_buket.items())
all_refs = sorted(all_refs, key=itemgetter(2))
return ([delayed_create_from_mark(set(refs), name) for name, refs, _ in all_refs],
map(itemgetter(0), all_refs),
map(itemgetter(2), all_refs))
def create_lastname_list_from_personid(last_modification):
'''
This function generates a dictionary from a last name
to list of personids which have this lastname.
'''
# ((personid, [full Name1], Nbibs) ... )
all_names = get_all_modified_names_from_personid(last_modification)
# ((personid, last_name, Nbibs) ... )
all_names = ((row[0], generate_last_name_cluster_str(iter(row[1]).next()), row[2])
for row in all_names)
# { (last_name, [(personid)... ], Nbibs) ... }
all_names = groupby(sorted(all_names, key=itemgetter(1)), key=itemgetter(1))
all_names = ((key, list(data)) for key, data in all_names)
all_names = ((key, map(itemgetter(0), data), sum(x[2] for x in data)) for key, data in all_names)
return all_names
def delayed_create(create_f, pids, lname):
def ret():
return create_f(ClusterSet(), pids, lname)
return ret
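The `delayed_*` helpers defer the expensive `ClusterSet` construction by returning a zero-argument closure; the pattern in isolation, with `list` standing in for the real factory:

```python
def delayed(factory, *args):
    # capture the arguments now, build the object only when called
    def build():
        return factory(*args)
    return build

make = delayed(list, range(3))
# nothing has been built yet; calling the closure does the work
print(make())  # [0, 1, 2]
```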
def delayed_cluster_sets_from_personid(pure, last_modification=None):
names = create_lastname_list_from_personid(last_modification)
names = sorted(names, key=itemgetter(2))
if pure:
create = ClusterSet.create_pure
else:
create = ClusterSet.create_skeleton
return ([delayed_create(create, name[1], name[0]) for name in names],
map(itemgetter(0), names),
map(itemgetter(2), names))
diff --git a/invenio/legacy/bibauthorid/comparison.py b/invenio/legacy/bibauthorid/comparison.py
index 1befa2b20..3d930477b 100644
--- a/invenio/legacy/bibauthorid/comparison.py
+++ b/invenio/legacy/bibauthorid/comparison.py
@@ -1,440 +1,440 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import re
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from itertools import starmap
from operator import mul, itemgetter
-from invenio.bibauthorid_name_utils import compare_names
-from invenio.bibauthorid_dbinterface import get_name_by_bibrecref
-from invenio.bibauthorid_dbinterface import get_grouped_records
-from invenio.bibauthorid_dbinterface import get_all_authors
-from invenio.bibauthorid_dbinterface import get_collaboration
-from invenio.bibauthorid_dbinterface import resolve_affiliation
+from invenio.legacy.bibauthorid.name_utils import compare_names
+from invenio.legacy.bibauthorid.dbinterface import get_name_by_bibrecref
+from invenio.legacy.bibauthorid.dbinterface import get_grouped_records
+from invenio.legacy.bibauthorid.dbinterface import get_all_authors
+from invenio.legacy.bibauthorid.dbinterface import get_collaboration
+from invenio.legacy.bibauthorid.dbinterface import resolve_affiliation
from invenio.bibauthorid_backinterface import get_key_words
#from invenio.legacy.bibrank.citation_searcher import get_citation_dict
#metadat_comparison_print commented everywhere to increase performances,
#import and calls left here to make future debug easier.
-from invenio.bibauthorid_general_utils import metadata_comparison_print
+from invenio.legacy.bibauthorid.general_utils import metadata_comparison_print
# This module is not thread safe!
# Be sure to use processes instead of
# threads if you need parallel
# computation!
# FIXME: hack for Python-2.4; switch to itemgetter() once Python-2.6 is default
# use_refrec = itemgetter(slice(None))
# use_ref = itemgetter(0, 1)
# use_rec = itemgetter(2)
try:
_ = itemgetter(2, 5, 3)(range(10))
use_refrec = lambda x : x
use_ref = itemgetter(0, 1)
use_rec = itemgetter(2)
except:
#python 2.4 compatibility, a bit slower than itemgetter
use_refrec = lambda x: x
use_ref = lambda x: x[0:2]
use_rec = lambda x: x[2]
# At first glance this may look silly.
# However, if we load the dictionaries
# unconditionally, there will be only
# one instance of them in memory after
# fork
cit_dict = None#get_citation_dict("citationdict")
recit_dict = None#get_citation_dict("reversedict")
caches = []
def create_new_cache():
ret = {}
caches.append(ret)
return ret
def clear_all_caches():
for c in caches:
c.clear()
_replacer = re.compile("[^a-zA-Z]")
def canonical_str(string):
return _replacer.sub('', string).lower()
def jaccard(set1, set2):
'''
This is no longer jaccard distance.
'''
metadata_comparison_print("Jaccard: Found %d items in the first set and %d in the second set" % (len(set1), len(set2)))
if not set1 or not set2:
return '?'
match = len(set1 & set2)
ret = match / float(len(set1) + len(set2) - match)
metadata_comparison_print("Jaccard: %d common items; returning %f" % (match, ret))
return ret
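Despite the docstring's caveat, the formula above is the classic Jaccard index, |A∩B| / |A∪B|, with '?' standing for "not comparable" when either set is empty. A Python 3 sketch:

```python
def jaccard_sketch(set1, set2):
    if not set1 or not set2:
        return '?'          # mirrors the '?' = "no answer" convention
    match = len(set1 & set2)
    return match / (len(set1) + len(set2) - match)  # == |A∩B| / |A∪B|

print(jaccard_sketch({1, 2, 3}, {2, 3, 4}))  # 0.5
print(jaccard_sketch(set(), {1}))            # ?
```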
def cached_sym(reducing):
'''
Memoizes a pure function with two symmetrical arguments.
'''
def deco(func):
cache = create_new_cache()
red = reducing
def ret(a, b):
ra, rb = red(a), red(b)
if ra > rb:
ra, rb = rb, ra
try:
return cache[(ra, rb)]
except KeyError:
val = func(a, b)
cache[(ra, rb)] = val
return val
return ret
return deco
def cached_arg(reducing):
'''
Memoizes a pure function.
'''
def deco(func):
cache = create_new_cache()
red = reducing
def ret(a):
ra = red(a)
try:
return cache[ra]
except KeyError:
val = func(a)
cache[ra] = val
return val
return ret
return deco
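`cached_sym` canonicalises the argument order before the cache lookup, so a symmetric comparison is computed only once per unordered pair. A Python 3 sketch of the same decorator:

```python
def cached_sym_sketch(reduce_key):
    """Memoize a pure, symmetric two-argument function."""
    def deco(func):
        cache = {}
        def wrapper(a, b):
            ka, kb = reduce_key(a), reduce_key(b)
            if ka > kb:
                ka, kb = kb, ka          # (a, b) and (b, a) share one entry
            if (ka, kb) not in cache:
                cache[(ka, kb)] = func(a, b)
            return cache[(ka, kb)]
        return wrapper
    return deco

calls = []

@cached_sym_sketch(lambda x: x)
def product(a, b):
    calls.append((a, b))
    return a * b

product(2, 5)
product(5, 2)          # cache hit: swapped arguments, same entry
print(calls)           # [(2, 5)]
```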
def check_comparison(fn):
allowed = ['+','-']
def checked(a,b):
val = fn(a,b)
if isinstance(val, tuple):
assert (val[0] >= 0 and val[0] <= 1), 'COMPARISON: Returned value not in range %s' % str(val)
assert (val[1] >= 0 and val[1] <= 1), 'COMPARISON: Returned compatibility not in range %s' % str(val)
else:
assert val in allowed, 'COMPARISON: returned non-tuple value not in allowed set %s' % str(val)
return val
return checked
# The main function of this module
@check_comparison
def compare_bibrefrecs(bibref1, bibref2):
'''
This function compares two bibrefrecs (100:123,456) using all metadata
and returns:
* a pair with two numbers in [0, 1] - the probability that the two belong
together and the ratio of the metadata functions used to the number of
all metadata functions.
* '+' - the metadata showed us that the two belong together for sure.
* '-' - the metadata showed us that the two do not belong together for sure.
Example:
'(0.7, 0.4)' - 2 out of 5 functions managed to compare the bibrefrecs and
using their computations the average value of 0.7 is returned.
'-' - the two bibrefrecs are in the same paper, so they do not belong together
for sure.
'(1, 0)' There was insufficient metadata to compare the bibrefrecs. (The
first value is ignored.)
'''
metadata_comparison_print("")
metadata_comparison_print("Started comparing %s vs %s" % (str(bibref1), str(bibref2)))
# try first the metrics, which might return + or -
papers = _compare_papers(bibref1, bibref2)
if papers != '?':
return papers
# if bconfig.CFG_INSPIRE_SITE:
# insp_ids = _compare_inspireid(bibref1, bibref2)
# if insp_ids != '?':
# return insp_ids
results = []
for func, weight, fname in cbrr_func_weight:
r = func(bibref1,bibref2)
assert r == '?' or (r <= 1 and r>=0), 'COMPARISON %s returned %s for %s' % (fname, str(r),str(len(results)))
results.append((r, weight))
total_weights = sum(res[1] for res in results)
metadata_comparison_print("Final comparison vector: %s." % str(results))
results = filter(lambda x: x[0] != '?', results)
if not results:
metadata_comparison_print("Final result: Skipped all tests, returning 0,0")
return 0, 0
cert = sum(starmap(mul, results))
prob = sum(res[1] for res in results)
vals = cert / prob, prob / total_weights
assert vals[0] >= 0 and vals[0] <= 1, 'COMPARISON: RETURNING VAL out of range'
assert vals[1] >= 0 and vals[1] <= 1, 'COMPARISON: RETURNING PROB out of range'
metadata_comparison_print("Final result: %s" % str(vals))
return vals
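The tail of `compare_bibrefrecs` combines the per-metric answers into the `(certainty, coverage)` pair described in the docstring: drop the '?' entries, take the weighted average of the rest, and report how much of the total weight actually answered. The same logic in isolation (the weights here are hypothetical, not the configured ones):

```python
def combine(results):
    # results: list of (value-or-'?', weight) pairs
    total_weight = sum(w for _, w in results)
    answered = [(v, w) for v, w in results if v != '?']
    if not answered:
        return 0, 0                      # nothing comparable at all
    prob = sum(w for _, w in answered)
    cert = sum(v * w for v, w in answered)
    return cert / prob, prob / total_weight

print(combine([(1.0, 2.0), ('?', 1.0), (0.5, 2.0)]))  # (0.75, 0.8)
```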
@cached_arg(use_refrec)
def _find_affiliation(bib):
aff = get_grouped_records(bib, str(bib[0]) + '__u').values()[0]
return set(canonical_str(a) for a in aff)
def _compare_affiliations(bib1, bib2):
metadata_comparison_print("Comparing affiliations.")
aff1 = _find_affiliation(bib1)
aff2 = _find_affiliation(bib2)
ret = jaccard(aff1, aff2)
metadata_comparison_print("Affiliations: %s %s %s" % (str(aff1), str(aff2), str(ret)))
return ret
@cached_arg(use_refrec)
def _find_unified_affiliation(bib):
aff = get_grouped_records(bib, str(bib[0]) + '__u').values()[0]
return set(x for x in list(canonical_str(resolve_affiliation(a)) for a in aff) if not x == "None")
def _compare_unified_affiliations(bib1, bib2):
metadata_comparison_print("Comparing unified affiliations.")
aff1 = _find_unified_affiliation(bib1)
aff2 = _find_unified_affiliation(bib2)
ret = jaccard(aff1, aff2)
metadata_comparison_print("Affiliations: %s %s %s" % (str(aff1), str(aff2), str(ret)))
return ret
@cached_arg(use_refrec)
def _find_inspireid(bib):
ids = get_grouped_records(bib, str(bib[0]) + '__i').values()[0]
return set(ids)
def _compare_inspireid(bib1, bib2):
metadata_comparison_print("Comparing inspire ids.")
iids1 = _find_inspireid(bib1)
iids2 = _find_inspireid(bib2)
metadata_comparison_print("Found %d, %d different inspire ids for the two sets." % (len(iids1), len(iids2)))
if (len(iids1) != 1 or
len(iids2) != 1):
return '?'
elif iids1 == iids2:
metadata_comparison_print("The ids are the same.")
return 1
else:
metadata_comparison_print("The ids are different.")
return 0
@cached_arg(use_refrec)
def _find_email(bib):
ids = get_grouped_records(bib, str(bib[0]) + '__m').values()[0]
return set(ids)
def _compare_email(bib1, bib2):
metadata_comparison_print("Comparing email addresses.")
iids1 = _find_email(bib1)
iids2 = _find_email(bib2)
metadata_comparison_print("Found %d, %d different email addresses for the two sets." % (len(iids1), len(iids2)))
if (len(iids1) != 1 or
len(iids2) != 1):
return '?'
elif iids1 == iids2:
metadata_comparison_print("The addresses are the same.")
return 1.0
else:
metadata_comparison_print("The addresses are there, but different.")
return 0.3
def _compare_papers(bib1, bib2):
metadata_comparison_print("Checking if the two bib refs are in the same paper...")
if bib1[2] == bib2[2]:
metadata_comparison_print(" ... Yes they are! Are you crazy, man?")
return '-'
return '?'
get_name_by_bibrecref = cached_arg(use_ref)(get_name_by_bibrecref)
@cached_sym(use_ref)
def _compare_names(bib1, bib2):
metadata_comparison_print("Comparing names.")
name1 = get_name_by_bibrecref(bib1)
name2 = get_name_by_bibrecref(bib2)
metadata_comparison_print(" Found %s and %s" % (name1,name2))
if name1 and name2:
cmpv = compare_names(name1, name2, False)
metadata_comparison_print(" cmp(%s,%s) = %s" % (name1, name2, str(cmpv)))
return cmpv
return '?'
@cached_arg(use_rec)
def _find_key_words(bib):
words = get_key_words(bib[2])
return set(canonical_str(word) for word in words)
@cached_sym(use_rec)
def _compare_key_words(bib1, bib2):
metadata_comparison_print("Comparing key words.")
words1 = _find_key_words(bib1)
words2 = _find_key_words(bib2)
cmpv = jaccard(words1, words2)
metadata_comparison_print(" key words got (%s vs %s) for %s"% (words1, words2, cmpv))
return cmpv
@cached_arg(use_rec)
def _find_collaboration(bib):
colls = get_collaboration(bib[2])
return set(canonical_str(coll) for coll in colls)
@cached_sym(use_rec)
def _compare_collaboration(bib1, bib2):
metadata_comparison_print("Comparing collaboration.")
colls1 = _find_collaboration(bib1)
colls2 = _find_collaboration(bib2)
metadata_comparison_print("Found %d, %d different collaborations for the two sets." % (len(colls1), len(colls2)))
if (len(colls1) != 1 or
len(colls2) != 1):
return '?'
elif colls1 == colls2:
return 1.
else:
return 0.
@cached_arg(use_rec)
def _find_coauthors(bib):
return set(canonical_str(a) for a in get_all_authors(bib[2]))
@cached_sym(use_rec)
def _compare_coauthors(bib1, bib2):
metadata_comparison_print("Comparing authors.")
aths1 = _find_coauthors(bib1)
aths2 = _find_coauthors(bib2)
cmpv = jaccard(aths1, aths2)
metadata_comparison_print(" coauthors lists as %s"% (cmpv))
return cmpv
@cached_arg(use_rec)
def _find_citations(bib):
return set(cit_dict.get(bib[2], ()))
@cached_sym(use_rec)
def _compare_citations(bib1, bib2):
metadata_comparison_print("Comparing citations.")
cites1 = _find_citations(bib1)
cites2 = _find_citations(bib2)
cmpv = jaccard(cites1, cites2)
metadata_comparison_print(" citations as %s" % cmpv)
return cmpv
@cached_arg(use_rec)
def _find_citations_by(bib):
return set(recit_dict.get(bib[2], ()))
@cached_sym(use_rec)
def _compare_citations_by(bib1, bib2):
metadata_comparison_print("Comparing citations by.")
cites1 = _find_citations_by(bib1)
cites2 = _find_citations_by(bib2)
cmpv = jaccard(cites1, cites2)
metadata_comparison_print(" citations by as %s" % cmpv)
return cmpv
# compare_bibrefrecs
# Unfortunately doing this assignment at every call of compare_bibrefrec is too expensive.
# Doing it here is much less elegant but much faster. Let's hope for better times to put it back
# where it belongs.
# unfortunately, we have to do all comparisons
if bconfig.CFG_INSPIRE_SITE:
cbrr_func_weight = (
(_compare_inspireid, 2., 'inspID'),
(_compare_affiliations, .3, 'aff'),
(_compare_names, .8, 'names'),
#(_compare_citations, .1, 'cit'),
#(_compare_citations_by, .1, 'citby'),
(_compare_key_words, .1, 'kw'),
(_compare_collaboration, .3, 'collab'),
(_compare_coauthors, .1,'coauth')
)
elif bconfig.CFG_ADS_SITE:
cbrr_func_weight = (
(_compare_email, 3.,'email'),
(_compare_unified_affiliations, 2., 'aff'),
(_compare_names, 5.,'names'),
# register(_compare_citations, .5)
# register(_compare_citations_by, .5)
(_compare_key_words, 2.,'kw')
)
else:
cbrr_func_weight = ((_compare_names, 5.,'names'),)
diff --git a/invenio/legacy/bibauthorid/daemon.py b/invenio/legacy/bibauthorid/daemon.py
index 192035557..b3935d97c 100644
--- a/invenio/legacy/bibauthorid/daemon.py
+++ b/invenio/legacy/bibauthorid/daemon.py
@@ -1,312 +1,312 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Bibauthorid Daemon
This module IS NOT standalone safe - it should never be run this way.
"""
import sys
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from invenio import bibtask
from invenio.bibauthorid_backinterface import get_recently_modified_record_ids
from invenio.bibauthorid_backinterface import get_user_log
from invenio.bibauthorid_backinterface import insert_user_log
from invenio.bibauthorid_backinterface import get_sql_time
from invenio.bibauthorid_backinterface import get_personids_from_bibrec
from invenio.bibauthorid_backinterface import get_claimed_papers_from_papers
from invenio.bibauthorid_backinterface import get_all_valid_bibrecs
#python 2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_any as any
+from invenio.legacy.bibauthorid.general_utils import bai_any as any
def bibauthorid_daemon():
"""Constructs the Bibauthorid bibtask."""
bibtask.task_init(authorization_action='runbibclassify',
authorization_msg="Bibauthorid Task Submission",
description="""
Purpose:
Disambiguate Authors and find their identities.
Examples:
- Process all records that hold an author with last name 'Ellis':
$ bibauthorid -u admin --update-personid --all-records
- Disambiguate all records on a fresh installation
$ bibauthorid -u admin --disambiguate --from-scratch
""",
help_specific_usage="""
bibauthorid [COMMAND] [OPTIONS]
COMMAND
You can choose only one from the following:
--update-personid Updates personid, adding not yet assigned papers
to the system on a fast, best-effort basis.
Cleans the table of stale records.
--disambiguate Disambiguates all signatures in the database
using the tortoise/wedge algorithm. This usually
takes a LOT of time so the results are stored in
a special table. Use --merge to use the results.
--merge Updates the personid tables with the results from
the --disambiguate algorithm.
OPTIONS
Options for update personid
(default) Will update only the modified records since last
run.
-i, --record-ids Force the procedure to work only on the specified
records. This option is exclusive with --all-records.
--all-records Force the procedure to work on all records. This
option is exclusive with --record-ids.
Options for disambiguate
(default) Performs full disambiguation of all records in the
current personid tables with respect to the user
decisions.
--from-scratch Ignores the current information in the personid
tables and disambiguates everything from scratch.
There are no options for the merger.
""",
version="Invenio Bibauthorid v%s" % bconfig.VERSION,
specific_params=("i:",
[
"record-ids=",
"disambiguate",
"merge",
"all-records",
"update-personid",
"from-scratch"
]),
task_submit_elaborate_specific_parameter_fnc=_task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=_task_submit_check_options,
task_run_fnc=_task_run_core)
def _task_submit_elaborate_specific_parameter(key, value, opts, args):
"""
Given the string key, it checks its meaning, possibly using the
value. Usually, it fills some key in the options dict.
It must return True if it has elaborated the key, and False if it
doesn't know that key.
"""
if key in ("--update-personid",):
bibtask.task_set_option("update_personid", True)
elif key in ("--record-ids", '-i'):
if value.count("="):
value = value[1:]
value = value.split(",")
bibtask.task_set_option("record_ids", value)
elif key in ("--all-records",):
bibtask.task_set_option("all_records", True)
elif key in ("--disambiguate",):
bibtask.task_set_option("disambiguate", True)
elif key in ("--merge",):
bibtask.task_set_option("merge", True)
elif key in ("--from-scratch",):
bibtask.task_set_option("from_scratch", True)
else:
return False
return True
def _task_run_core():
"""
Runs the requested task in the bibsched environment.
"""
if bibtask.task_get_option('update_personid'):
record_ids = bibtask.task_get_option('record_ids')
if record_ids:
record_ids = map(int, record_ids)
all_records = bibtask.task_get_option('all_records')
bibtask.task_update_progress('Updating personid...')
run_rabbit(record_ids, all_records)
bibtask.task_update_progress('PersonID update finished!')
if bibtask.task_get_option("disambiguate"):
bibtask.task_update_progress('Performing full disambiguation...')
run_tortoise(bool(bibtask.task_get_option("from_scratch")))
bibtask.task_update_progress('Full disambiguation finished!')
if bibtask.task_get_option("merge"):
bibtask.task_update_progress('Merging results...')
run_merge()
bibtask.task_update_progress('Merging finished!')
return 1
def _task_submit_check_options():
"""
Required by bibtask. Checks the options.
"""
update_personid = bibtask.task_get_option("update_personid")
disambiguate = bibtask.task_get_option("disambiguate")
merge = bibtask.task_get_option("merge")
record_ids = bibtask.task_get_option("record_ids")
all_records = bibtask.task_get_option("all_records")
from_scratch = bibtask.task_get_option("from_scratch")
commands = bool(update_personid) + bool(disambiguate) + bool(merge)
if commands == 0:
bibtask.write_message("ERROR: At least one command should be specified!"
, stream=sys.stdout, verbose=0)
return False
if commands > 1:
bibtask.write_message("ERROR: The options --update-personid, --disambiguate "
"and --merge are mutually exclusive."
, stream=sys.stdout, verbose=0)
return False
assert commands == 1
if update_personid:
if any((from_scratch,)):
bibtask.write_message("ERROR: The only options which can be specified "
"with --update-personid are --record-ids and "
"--all-records"
, stream=sys.stdout, verbose=0)
return False
options = bool(record_ids) + bool(all_records)
if options > 1:
bibtask.write_message("ERROR: conflicting options: --record-ids and "
"--all-records are mutually exclusive."
, stream=sys.stdout, verbose=0)
return False
if record_ids:
for iden in record_ids:
if not iden.isdigit():
bibtask.write_message("ERROR: Record_ids expects numbers. "
"Provided: %s." % iden)
return False
if disambiguate:
if any((record_ids, all_records)):
bibtask.write_message("ERROR: The only option which can be specified "
"with --disambiguate is --from-scratch"
, stream=sys.stdout, verbose=0)
return False
if merge:
if any((record_ids, all_records, from_scratch)):
bibtask.write_message("ERROR: There are no options which can be "
"specified along with --merge"
, stream=sys.stdout, verbose=0)
return False
return True
def _get_personids_to_update_extids(papers=None):
'''
Returns the set of personids whose external ids should be recalculated.
@param papers: papers
@type papers: set or None
@return: personids
@rtype: set
'''
last_log = get_user_log(userinfo='daemon', action='PID_UPDATE', only_most_recent=True)
if last_log:
daemon_last_time_run = last_log[0][2]
modified_bibrecs = get_recently_modified_record_ids(daemon_last_time_run)
else:
modified_bibrecs = get_all_valid_bibrecs()
if papers:
modified_bibrecs &= set(papers)
if not modified_bibrecs:
return None
if bconfig.LIMIT_EXTERNAL_IDS_COLLECTION_TO_CLAIMED_PAPERS:
modified_bibrecs = [rec[0] for rec in get_claimed_papers_from_papers(modified_bibrecs)]
personids_to_update_extids = set()
for bibrec in modified_bibrecs:
personids_to_update_extids |= set(get_personids_from_bibrec(bibrec))
return personids_to_update_extids
def rabbit_with_log(papers, check_invalid_papers, log_comment, partial=False):
from invenio.bibauthorid_rabbit import rabbit
personids_to_update_extids = _get_personids_to_update_extids(papers)
starting_time = get_sql_time()
rabbit(papers, check_invalid_papers, personids_to_update_extids)
if partial:
action = 'PID_UPDATE_PARTIAL'
else:
action = 'PID_UPDATE'
insert_user_log('daemon', '-1', action, 'bibsched', 'status', comment=log_comment, timestamp=starting_time)
def run_rabbit(paperslist, all_records=False):
if not paperslist and all_records:
rabbit_with_log(None, True, 'bibauthorid_daemon, update_personid on all papers')
elif not paperslist:
last_log = get_user_log(userinfo='daemon', action='PID_UPDATE', only_most_recent=True)
if len(last_log) >= 1:
#select only the most recent papers
recently_modified = get_recently_modified_record_ids(date=last_log[0][2])
if not recently_modified:
bibtask.write_message("update_personID_table_from_paper: "
"All person entities up to date.",
stream=sys.stdout, verbose=0)
else:
bibtask.write_message("update_personID_table_from_paper: Running on: " +
str(recently_modified), stream=sys.stdout, verbose=0)
rabbit_with_log(recently_modified, True, 'bibauthorid_daemon, run_personid_fast_assign_papers on '
+ str([paperslist, all_records, recently_modified]))
else:
rabbit_with_log(None, True, 'bibauthorid_daemon, update_personid on all papers')
else:
rabbit_with_log(paperslist, True, 'bibauthorid_daemon, personid_fast_assign_papers on ' + str(paperslist), partial=True)
def run_tortoise(from_scratch):
from invenio.bibauthorid_tortoise import tortoise, tortoise_from_scratch
if from_scratch:
tortoise_from_scratch()
else:
start_time = get_sql_time()
tortoise_db_name = 'tortoise'
last_run = get_user_log(userinfo=tortoise_db_name, only_most_recent=True)
if last_run:
modified = get_recently_modified_record_ids(last_run[0][2])
else:
modified = []
tortoise(modified)
insert_user_log(tortoise_db_name, '-1', '', '', '', timestamp=start_time)
def run_merge():
from invenio.bibauthorid_merge import merge_dynamic
merge_dynamic()
diff --git a/invenio/legacy/bibauthorid/dbinterface.py b/invenio/legacy/bibauthorid/dbinterface.py
index fdcee53e7..e26a15cc4 100644
--- a/invenio/legacy/bibauthorid/dbinterface.py
+++ b/invenio/legacy/bibauthorid/dbinterface.py
@@ -1,3044 +1,3044 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
bibauthorid_dbinterface
This is the only file in bibauthorid which should
use the database. It should provide an interface for
all other files in the module.
'''
-import invenio.bibauthorid_config as bconfig
+import invenio.legacy.bibauthorid.config as bconfig
import numpy
import cPickle
from cPickle import UnpicklingError
from invenio.utils.html import X
import os
import gc
#python2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_all as all
+from invenio.legacy.bibauthorid.general_utils import bai_all as all
from itertools import groupby, count, ifilter, chain, imap
from operator import itemgetter
from invenio.legacy.search_engine import perform_request_search
from invenio.modules.access.engine import acc_authorize_action
from invenio.config import CFG_SITE_URL
-from invenio.bibauthorid_name_utils import split_name_parts
-from invenio.bibauthorid_name_utils import create_canonical_name
-from invenio.bibauthorid_name_utils import create_normalized_name
-from invenio.bibauthorid_general_utils import bibauthor_print
-from invenio.bibauthorid_general_utils import update_status \
+from invenio.legacy.bibauthorid.name_utils import split_name_parts
+from invenio.legacy.bibauthorid.name_utils import create_canonical_name
+from invenio.legacy.bibauthorid.name_utils import create_normalized_name
+from invenio.legacy.bibauthorid.general_utils import bibauthor_print
+from invenio.legacy.bibauthorid.general_utils import update_status \
, update_status_final
from invenio.legacy.dbquery import run_sql
try:
from collections import defaultdict
except ImportError:
from invenio.utils.container import defaultdict
MARC_100_700_CACHE = None
COLLECT_INSPIRE_ID = bconfig.COLLECT_EXTERNAL_ID_INSPIREID
def get_sql_time():
'''
Returns the time according to the database. The type is datetime.datetime.
'''
return run_sql("select now()")[0][0]
def set_personid_row(person_id, tag, value, opt1=None, opt2=None, opt3=None):
'''
Inserts data and additional info into aidPERSONIDDATA
@param person_id:
@type person_id: int
@param tag:
@type tag: string
@param value:
@type value: string
@param opt1:
@type opt1: int
@param opt2:
@type opt2: int
@param opt3:
@type opt3: string
'''
run_sql("INSERT INTO aidPERSONIDDATA "
"(`personid`, `tag`, `data`, `opt1`, `opt2`, `opt3`) "
"VALUES (%s, %s, %s, %s, %s, %s)",
(person_id, tag, value, opt1, opt2, opt3))
def get_personid_row(person_id, tag):
'''
Returns all the records associated to a person and a tag.
@param person_id: id of the person to read the attribute from
@type person_id: int
@param tag: the tag to read.
@type tag: string
@return: the data associated with a virtual author
@rtype: tuple of tuples
'''
return run_sql("SELECT data, opt1, opt2, opt3 "
"data FROM aidPERSONIDDATA "
"WHERE personid = %s AND tag = %s",
(person_id, tag))
def del_personid_row(tag, person_id=None, value=None):
'''
Delete the value associated to the given tag for a certain person.
Can delete all tags regardless of person_id or value, or restrict the deletion using either
or both of them.
@param person_id: ID of the person
@type person_id: int
@param tag: tag to be updated
@type tag: string
@param value: value to be written for the tag
@type value: string
'''
if person_id:
if value:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag=%s and data=%s", (person_id, tag, value,))
else:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag=%s", (person_id, tag,))
else:
if value:
run_sql("delete from aidPERSONIDDATA where tag=%s and data=%s", (tag, value,))
else:
run_sql("delete from aidPERSONIDDATA where tag=%s", (tag,))
def get_all_papers_of_pids(personid_list):
'''
Get all papers of authors in a given list and sorts the results
by bibrefrec.
@param personid_list: list with the authors.
@type personid_list: iterable of integers.
'''
if personid_list:
plist = list_2_SQL_str(personid_list)
paps = run_sql("select personid, bibref_table, bibref_value, bibrec, flag "
"from aidPERSONIDPAPERS "
"where personid in %s "
% plist)
inner = set(row[1:4] for row in paps if row[4] > -2)
return (x for x in paps if x[1:4] in inner)
return ()
def del_person_not_manually_claimed_papers(pid):
'''
Deletes papers from a person which have not been manually claimed.
@param pid:
@type pid: int
'''
run_sql("delete from aidPERSONIDPAPERS "
"where and (flag <> '-2' and flag <> '2') and personid=%s", (pid,))
def get_personid_from_uid(uid):
'''
Returns the personID associated with the provided uid.
If the personID is already associated with the person, the second parameter is True, False otherwise.
@param uid: userID
@type uid: ((int,),)
'''
pid = run_sql("select personid from aidPERSONIDDATA where tag=%s and data=%s", ('uid', str(uid[0][0])))
if len(pid) == 1:
return (pid[0], True)
else:
return ([-1], False)
def get_uid_from_personid(pid):
'''
Get the invenio user id associated to a pid, if it exists.
@param pid: person_id
@type pid: int
'''
uid = run_sql("select data from aidPERSONIDDATA where tag='uid' and personid = %s", (pid,))
if uid:
return uid[0][0]
else:
return None
def get_new_personid():
'''
Get a free personid number
'''
pids = (run_sql("select max(personid) from aidPERSONIDDATA")[0][0],
run_sql("select max(personid) from aidPERSONIDPAPERS")[0][0])
pids = tuple(int(p) for p in pids if p != None)
if len(pids) == 2:
return max(*pids) + 1
elif len(pids) == 1:
return pids[0] + 1
else:
return 0
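get_new_personid() derives the next free id from the maxima of two tables, tolerating empty tables on either side. A standalone sketch of that selection logic (the helper name is made up for illustration):

```python
def next_free_pid(max_data, max_papers):
    # Mirrors get_new_personid(): take the larger of the two table maxima
    # (SQL NULL arrives as None when a table is empty) and add one.
    pids = tuple(int(p) for p in (max_data, max_papers) if p is not None)
    if len(pids) == 2:
        return max(pids) + 1
    elif len(pids) == 1:
        return pids[0] + 1
    else:
        return 0

# next_free_pid(None, None) == 0; next_free_pid(41, None) == 42; next_free_pid(41, 99) == 100
```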
def get_existing_personids(with_papers_only=False):
'''
Get a set of existing person_ids.
@param with_papers_only: if True, returns only ids holding papers, discarding ids holding only information in aidPERSONIDDATA
@type with_papers_only: Bool
'''
if not with_papers_only:
try:
pids_data = set(map(int, zip(*run_sql("select distinct personid from aidPERSONIDDATA"))[0]))
except IndexError:
pids_data = set()
else:
pids_data = set()
try:
pids_pap = set(map(int, zip(*run_sql("select distinct personid from aidPERSONIDPAPERS"))[0]))
except IndexError:
pids_pap = set()
return pids_data | pids_pap
def get_existing_result_clusters():
'''
Get existing result clusters, for private use of Tortoise and merger.
'''
return run_sql("select distinct personid from aidRESULTS")
def create_new_person(uid= -1, uid_is_owner=False):
'''
Create a new person. Set the uid as owner if requested.
@param uid: User id to associate to the newly created person
@type uid: int
@param uid_is_owner: If True, the person will hold the uid as owner, otherwise the id is only remembered as the creator
@type uid_is_owner: bool
'''
pid = get_new_personid()
if uid_is_owner:
set_personid_row(pid, 'uid', str(uid))
else:
set_personid_row(pid, 'user-created', str(uid))
return pid
def create_new_person_from_uid(uid):
'''
Convenience wrapper for create_new_person(...)
@param uid: user id
@type uid: int
'''
return create_new_person(uid, uid_is_owner=True)
def new_person_from_signature(sig, name=None):
'''
Creates a new person from a signature.
@param sig: signature tuple ([100|700],bibref,bibrec)
@type sig: tuple
@param name:
@type name: string
'''
pid = get_new_personid()
add_signature(sig, name, pid)
return pid
def add_signature(sig, name, pid):
'''
Inserts a signature in personid.
@param sig: signature tuple
@type sig: tuple
@param name: name string
@type name: string
@param pid: personid to which assign the signature
@type pid: int
'''
if not name:
name = get_name_by_bibrecref(sig)
name = create_normalized_name(split_name_parts(name))
run_sql("INSERT INTO aidPERSONIDPAPERS "
"(personid, bibref_table, bibref_value, bibrec, name) "
"VALUES (%s, %s, %s, %s, %s)"
, (pid, str(sig[0]), sig[1], sig[2], name))
def move_signature(sig, pid, force_claimed=False, unclaim=False):
'''
Moves a signature to a different person id
@param sig: signature tuple
@type sig: tuple
@param pid: personid
@type pid: int
'''
upd = "update aidPERSONIDPAPERS set personid=%s" % pid
if unclaim:
upd += ',flag=0 '
sel = " where bibref_table like '%s' and bibref_value=%s and bibrec=%s " % sig
sql = upd + sel
if not force_claimed:
sql += ' and flag <> 2 and flag <> -2'
run_sql(sql)
def find_conflicts(sig, pid):
'''
Helper for merger algorithm, find signature given personid
@param sig: signature tuple
@type sig: tuple
@param pid: personid id
@type pid: integer
'''
return run_sql("select bibref_table, bibref_value, bibrec, flag "
"from aidPERSONIDPAPERS where "
"personid = %s and "
"bibrec = %s and "
"flag <> -2"
, (pid, sig[2]))
def update_request_ticket(person_id, tag_data_tuple, ticket_id=None):
'''
Creates / updates a request ticket for a personID
@param: personid int
@param: tag_data_tuples 'image' of the ticket: (('paper', '700:316,10'), ('owner', 'admin'), ('external_id', 'ticket_18'))
@return: ticketid
'''
#tags: rt_owner (the owner of the ticket, associating the rt_number to the transaction)
# rt_external_id
# rt_paper_confirm, rt_paper_reject, rt_paper_forget, rt_name, rt_email, rt_whatever
#flag: rt_number
if not ticket_id:
last_id = run_sql("select max(opt1) from aidPERSONIDDATA where personid=%s and tag like %s", (str(person_id), 'rt_%'))[0][0]
if last_id:
ticket_id = last_id + 1
else:
ticket_id = 1
else:
delete_request_ticket(person_id, ticket_id)
for d in tag_data_tuple:
run_sql("insert into aidPERSONIDDATA (personid, tag, data, opt1) "
"values (%s,%s,%s,%s)",
(str(person_id), 'rt_' + str(d[0]), str(d[1]), str(ticket_id)))
return ticket_id
def delete_request_ticket(person_id, ticket_id=None):
'''
Removes a ticket from a person_id.
If ticket_id is not provided, removes all the tickets pending on a person.
'''
if ticket_id:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag like %s and opt1 =%s", (str(person_id), 'rt_%', str(ticket_id)))
else:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag like %s", (str(person_id), 'rt_%'))
def get_all_personids_by_name(regexpr):
'''
Search personids matching SQL expression in the name field
@param regexpr: string SQL regexp
@type regexpr: string
'''
return run_sql("select personid, name "
"from aidPERSONIDPAPERS "
"where name like %s "
"and flag > -2",
(regexpr,))
def get_personids_by_canonical_name(target):
'''
Find personids by canonical name
@param target:
@type target:
'''
pid = run_sql("select personid from aidPERSONIDDATA where "
"tag='canonical_name' and data like %s", (target,))
if pid:
return run_sql("select personid, name from aidPERSONIDPAPERS "
"where personid=%s and flag > -2", (pid[0][0],))
else:
return []
def get_bibref_modification_status(bibref):
'''
Determines if a record attached to a person has been touched by a human
by checking the flag.
@param bibref: The paper identifier to be checked (e.g. "100:12,144")
@type bibref: string
@return: [bool:human_modified, int:lcul]
'''
if not bibref:
raise ValueError("A bibref is expected!")
head, rec = bibref.split(',')
table, ref = head.split(':')
flags = run_sql("SELECT flag, lcul FROM aidPERSONIDPAPERS WHERE "
"bibref_table = %s and bibref_value = %s and bibrec = %s"
, (table, ref, rec))
if flags:
return flags[0]
else:
return (False, 0)
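The bibref string format used throughout this module ('table:ref,rec') is taken apart with two splits, as above. A minimal sketch with a made-up identifier:

```python
bibref = "100:12,144"            # made-up example identifier
head, rec = bibref.split(',')    # head == "100:12", rec == "144"
table, ref = head.split(':')     # table == "100", ref == "12"
```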
def get_canonical_id_from_personid(pid):
'''
Finds the person's canonical name (e.g. Ellis_J_R_1)
@param pid: person id
@type pid: int
@return: sql result of the request
@rtype: tuple of tuple
'''
return run_sql("SELECT data FROM aidPERSONIDDATA WHERE "
"tag = %s AND personid = %s", ('canonical_name', str(pid)))
def get_papers_status(paper):
'''
Gets the personID and flag associated to a paper
@param paper: paper identifier
@type paper: string, e.g. '100:7531,9024'
@return: (('data','personID','flag',),)
@rtype: tuple of tuples
'''
head, bibrec = paper.split(',')
_table, bibref = head.split(':')
rets = run_sql("select PersonID, flag "
"from aidPERSONIDPAPERS "
"where bibref_table = %s "
"and bibref_value = %s "
"and bibrec = %s"
% (head, bibrec, bibref))
return [[paper] + list(x) for x in rets]
def get_persons_from_recids(recids, return_alt_names=False,
return_all_person_papers=False):
'''
Helper for search engine indexing. Gives back a dictionary with important info about a person, for example:
get_persons_from_recids([1], True, True) returns
({1: [16591L]},
{16591L: {'alternatative_names': ['Wong, Yung Chow'],
'canonical_id': 'Y.C.Wong.1',
'person_records': [275304, 1, 51394, 128250, 311629]}})
@param recids:
@type recids:
@param return_alt_names:
@type return_alt_names:
@param return_all_person_papers:
@type return_all_person_papers:
'''
rec_2_pid = dict()
pid_2_data = dict()
all_pids = set()
def get_canonical_name(pid):
return run_sql("SELECT data "
"FROM aidPERSONIDDATA "
"WHERE tag = %s "
"AND personid = %s",
('canonical_name', pid))
for rec in recids:
pids = run_sql("SELECT personid "
"FROM aidPERSONIDPAPERS "
"WHERE bibrec = %s "
" and flag > -2 ",
(rec,))
# for some reason python's set is faster than a mysql distinct
pids = set(p[0] for p in pids)
all_pids |= pids
rec_2_pid[rec] = list(pids)
for pid in all_pids:
pid_data = {}
canonical = get_canonical_name(pid)
#We can suppose that this person didn't have a chance to get a canonical name yet
#because it was not fully processed by its creator. Anyway it's safe to try to create one
#before failing miserably
if not canonical:
update_personID_canonical_names([pid])
canonical = get_canonical_name(pid)
#assert len(canonical) == 1
#This condition cannot hold in case claims or update daemons are run in parallel
#with this, as it can happen that a person with papers exists for which a canonical name
#has not been computed yet. Hence, it will be indexed next time, so it learns.
#Each person should have at most one canonical name, so:
assert len(canonical) <= 1, "A person cannot have more than one canonical name"
if len(canonical) == 1:
pid_data = {'canonical_id' : canonical[0][0]}
if return_alt_names:
names = run_sql("SELECT name "
"FROM aidPERSONIDPAPERS "
"WHERE personid = %s "
" and flag > -2 ",
(pid,))
names = set(n[0] for n in names)
pid_data['alternatative_names'] = list(names)
if return_all_person_papers:
recs = run_sql("SELECT bibrec "
"FROM aidPERSONIDPAPERS "
"WHERE personid = %s "
" and flag > -2 ",
(pid,))
recs = set(r[0] for r in recs)
pid_data['person_records'] = list(recs)
pid_2_data[pid] = pid_data
return (rec_2_pid, pid_2_data)
def get_person_db_names_count(pid, sort_by_count=True):
'''
Returns the set of name strings and count associated to a person id.
The name strings are as found in the database.
@param pid: ID of the person
@type pid: ('2',)
'''
id_2_count = run_sql("select bibref_table, bibref_value "
"from aidPERSONIDPAPERS "
"where personid = %s "
"and flag > -2", (pid,))
ref100 = [refid[1] for refid in id_2_count if refid[0] == '100']
ref700 = [refid[1] for refid in id_2_count if refid[0] == '700']
ref100_count = dict((key, len(list(data))) for key, data in groupby(sorted(ref100)))
ref700_count = dict((key, len(list(data))) for key, data in groupby(sorted(ref700)))
if ref100:
ref100_s = list_2_SQL_str(ref100, str)
id100_2_str = run_sql("select id, value "
"from bib10x "
"where id in %s"
% ref100_s)
else:
id100_2_str = tuple()
if ref700:
ref700_s = list_2_SQL_str(ref700, str)
id700_2_str = run_sql("select id, value "
"from bib70x "
"where id in %s"
% ref700_s)
else:
id700_2_str = tuple()
ret100 = [(name, ref100_count[refid]) for refid, name in id100_2_str]
ret700 = [(name, ref700_count[refid]) for refid, name in id700_2_str]
ret = ret100 + ret700
if sort_by_count:
ret = sorted(ret, key=itemgetter(1), reverse=True)
return ret
def get_person_id_from_canonical_id(canonical_id):
'''
Finds the person id from a canonical name (e.g. Ellis_J_R_1)
@param canonical_id: the canonical ID
@type canonical_id: string
@return: sql result of the request
@rtype: tuple of tuple
'''
return run_sql("SELECT personid FROM aidPERSONIDDATA WHERE "
"tag='canonical_name' AND data = %s", (canonical_id,))
def get_person_names_count(pid):
'''
Returns the set of name strings and count associated to a person id
@param pid: ID of the person
@type pid: ('2',)
'''
return run_sql("select name, count(name) from aidPERSONIDPAPERS where "
"personid=%s and flag > -2 group by name", (pid,))
def get_person_db_names_set(pid):
'''
Returns the set of db_name strings associated to a person id
@param pid: ID of the person
@type pid: 2
'''
names = get_person_db_names_count(pid)
if names:
return zip(set(zip(*names)[0]))
else:
return []
def get_personids_from_bibrec(bibrec):
'''
Returns all the personids associated to a bibrec.
@param bibrec: record id
@type bibrec: int
'''
pids = run_sql("select distinct personid from aidPERSONIDPAPERS where bibrec=%s and flag > -2", (bibrec,))
if pids:
return zip(*pids)[0]
else:
return []
def get_personids_and_papers_from_bibrecs(bibrecs, limit_by_name=None):
'''
Gives back a list of tuples (personid, set_of_papers_owned_by) limited to the given list of bibrecs.
@param bibrecs:
@type bibrecs:
@param limit_by_name:
@type limit_by_name:
'''
if not bibrecs:
return []
else:
bibrecs = list_2_SQL_str(bibrecs)
if limit_by_name:
try:
surname = split_name_parts(limit_by_name)[0]
except IndexError:
surname = None
else:
surname = None
if not surname:
data = run_sql("select personid,bibrec from aidPERSONIDPAPERS where bibrec in %s" % (bibrecs,))
else:
surname = split_name_parts(limit_by_name)[0]
data = run_sql(("select personid,bibrec from aidPERSONIDPAPERS where bibrec in %s "
"and name like " % bibrecs) + ' %s ', (surname + '%',))
pidlist = [(k, set([s[1] for s in d]))
for k, d in groupby(sorted(data, key=lambda x:x[0]), key=lambda x:x[0])]
pidlist = sorted(pidlist, key=lambda x:len(x[1]), reverse=True)
return pidlist
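The grouping step in get_personids_and_papers_from_bibrecs() can be exercised without the database. The rows below are made up, standing in for the SQL result:

```python
from itertools import groupby

# Made-up (personid, bibrec) rows standing in for the SQL result.
data = [(5, 10), (7, 10), (5, 11)]

# Same assembly as in the function above: sort by personid, group,
# collect each person's bibrecs into a set, then order by how many
# papers each person owns.
pidlist = [(k, set(s[1] for s in d))
           for k, d in groupby(sorted(data, key=lambda x: x[0]),
                               key=lambda x: x[0])]
pidlist = sorted(pidlist, key=lambda x: len(x[1]), reverse=True)
# pidlist == [(5, set([10, 11])), (7, set([10]))]
```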
def get_person_bibrecs(pid):
'''
Returns bibrecs associated with a personid
@param pid: integer personid
@return [bibrec1,...,bibrecN]
'''
papers = run_sql("select bibrec from aidPERSONIDPAPERS where personid=%s and flag > -2", (str(pid),))
if papers:
return list(set(zip(*papers)[0]))
else:
return []
def get_person_papers(pid, flag,
show_author_name=False,
show_title=False,
show_rt_status=False,
show_affiliations=False,
show_date=False,
show_experiment=False):
'''
Get all papers of a person whose flag is greater than or equal to the given flag. Gives back a dictionary like:
get_person_papers(16591,-2,True,True,True,True,True,True) returns
[{'affiliation': ['Hong Kong U.'],
'authorname': 'Wong, Yung Chow',
'data': '100:1,1',
'date': ('1961',),
'experiment': [],
'flag': 0,
'rt_status': False,
'title': ('Isoclinic N planes in Euclidean 2N space, Clifford parallels in elliptic (2N-1) space, and the Hurwitz matrix equations',)},
...]
@param pid:
@type pid:
@param flag:
@type flag:
@param show_author_name:
@type show_author_name:
@param show_title:
@type show_title:
@param show_rt_status:
@type show_rt_status:
@param show_affiliations:
@type show_affiliations:
@param show_date:
@type show_date:
@param show_experiment:
@type show_experiment:
'''
query = "bibref_table, bibref_value, bibrec, flag"
if show_author_name:
query += ", name"
all_papers = run_sql("SELECT " + query + " "
"FROM aidPERSONIDPAPERS "
"WHERE personid = %s "
"AND flag >= %s",
(pid, flag))
def format_paper(paper):
bibrefrec = "%s:%d,%d" % paper[:3]
ret = {'data' : bibrefrec,
'flag' : paper[3]
}
if show_author_name:
ret['authorname'] = paper[4]
if show_title:
ret['title'] = ""
title = get_title_from_rec(paper[2])
if title:
ret['title'] = (title,)
if show_rt_status:
rt_count = run_sql("SELECT count(personid) "
"FROM aidPERSONIDDATA WHERE "
"tag like 'rt_%%' and data = %s"
, (bibrefrec,))
ret['rt_status'] = (rt_count[0][0] > 0)
if show_affiliations:
tag = '%s__u' % paper[0]
ret['affiliation'] = get_grouped_records(paper[:3], tag)[tag]
if show_date:
ret['date'] = []
date_id = run_sql("SELECT id_bibxxx "
"FROM bibrec_bib26x "
"WHERE id_bibrec = %s "
, (paper[2],))
if date_id:
date_id_s = list_2_SQL_str(date_id, lambda x: x[0])
date = run_sql("SELECT value "
"FROM bib26x "
"WHERE id in %s "
"AND tag = %s"
% (date_id_s, "'269__c'"))
if date:
ret['date'] = zip(*date)[0]
if show_experiment:
ret['experiment'] = []
experiment_id = run_sql("SELECT id_bibxxx "
"FROM bibrec_bib69x "
"WHERE id_bibrec = %s "
, (paper[2],))
if experiment_id:
experiment_id_s = list_2_SQL_str(experiment_id, lambda x: x[0])
experiment = run_sql("SELECT value "
"FROM bib69x "
"WHERE id in %s "
"AND tag = %s"
% (experiment_id_s, "'693__e'"))
if experiment:
ret['experiment'] = zip(*experiment)[0]
return ret
return [format_paper(paper) for paper in all_papers]
def get_persons_with_open_tickets_list():
'''
Finds all the persons with open tickets and returns pids and count of tickets
@return: [[pid, ticket_count]]
'''
return run_sql("select personid, count(distinct opt1) from "
"aidPERSONIDDATA where tag like 'rt_%' group by personid")
def get_request_ticket(person_id, ticket_id=None):
'''
Retrieves one or many requests tickets from a person
@param: person_id: person id integer
@param: ticket_id: ticket id (flag) value
@returns: [[[('tag', 'value')], ticket_id]]
[[[('a', 'va'), ('b', 'vb')], 1L], [[('b', 'daOEIaoe'), ('a', 'caaoOUIe')], 2L]]
'''
if ticket_id:
tstr = " and opt1='%s' " % ticket_id
else:
tstr = " "
tickets = run_sql("select tag,data,opt1 from aidPERSONIDDATA where personid=%s and "
" tag like 'rt_%%' " + tstr , (person_id,))
return [[[(s[0][3:], s[1]) for s in d], k] for k, d in groupby(sorted(tickets, key=lambda k: k[2]), key=lambda k: k[2])]
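The one-liner in get_request_ticket() turns flat (tag, data, opt1) rows into per-ticket bundles. A sketch with made-up rows showing the same sort/group/strip-prefix steps:

```python
from itertools import groupby

# Made-up (tag, data, opt1) rows as stored in aidPERSONIDDATA.
rows = [('rt_paper', '700:316,10', 1),
        ('rt_owner', 'admin', 1),
        ('rt_email', 'someone@example.org', 2)]

# Sort by ticket id (opt1), group, and strip the 'rt_' prefix from tags.
tickets = [[[(tag[3:], data) for tag, data, _ in group], tid]
           for tid, group in groupby(sorted(rows, key=lambda r: r[2]),
                                     key=lambda r: r[2])]
# tickets == [[[('paper', '700:316,10'), ('owner', 'admin')], 1],
#             [[('email', 'someone@example.org')], 2]]
```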
def insert_user_log(userinfo, personid, action, tag, value, comment='', transactionid=0, timestamp=None, userid=''):
'''
Insert log entries into the user log table.
For examples of entries look at the table generation script.
@param userinfo: username or user identifier
@type: string
@param personid: personid involved in the transaction
@type: longint
@param action: action type
@type: string
@param tag: tag
@type: string
@param value: value for the transaction
@type: string
@param comment: optional comment for the transaction
@type: string
@param transactionid: optional id for the transaction
@type: longint
@return: the transactionid
@rtype: longint
'''
if not timestamp:
timestamp = run_sql('select now()')[0][0]
run_sql('insert into aidUSERINPUTLOG '
'(transactionid,timestamp,userinfo,userid,personid,action,tag,value,comment) values '
'(%s,%s,%s,%s,%s,%s,%s,%s,%s)',
(transactionid, timestamp, userinfo, userid, personid,
action, tag, value, comment))
return transactionid
def person_bibref_is_touched_old(pid, bibref):
'''
Determines if a record attached to a person has been touched by a human
by checking the flag.
@param pid: The Person ID of the person to check the assignment from
@type pid: int
@param bibref: The paper identifier to be checked (e.g. "100:12,144")
@type bibref: string
'''
bibref, rec = bibref.split(",")
table, ref = bibref.split(":")
flag = run_sql("SELECT flag "
"FROM aidPERSONIDPAPERS "
"WHERE personid = %s "
"AND bibref_table = %s "
"AND bibref_value = %s "
"AND bibrec = %s"
, (pid, table, ref, rec))
try:
flag = flag[0][0]
except IndexError:
return False
if not flag:
return False
elif -2 < flag < 2:
return False
else:
return True
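The flag thresholds above encode the claiming convention: 0/None means untouched, values strictly between -2 and 2 are algorithmic assignments, and +/-2 are manual claims or rejections. A standalone sketch of that decision (the helper name is made up):

```python
def is_human_touched(flag):
    # Mirrors the decision in person_bibref_is_touched_old(): only flags
    # at or beyond +/-2 (manual claim/rejection) count as human-touched.
    if not flag:
        return False
    elif -2 < flag < 2:
        return False
    else:
        return True

# is_human_touched(2) and is_human_touched(-2) are True;
# is_human_touched(0) and is_human_touched(1) are False
```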
def confirm_papers_to_person(pid, papers, user_level=0):
'''
Confirms the relationship between pid and paper, as from user input.
@param pid: id of the person
@type pid: integer
@param papers: list of papers to confirm
@type papers: ((str,),) e.g. (('100:7531,9024',),)
@return: list of tuples: (status, message_key)
@rtype: [(bool, str), ]
'''
pids_to_update = set([pid])
res = []
for p in papers:
bibref, rec = p.split(",")
rec = int(rec)
table, ref = bibref.split(":")
ref = int(ref)
sig = (table, ref, rec)
#Check the status of pid: the paper should be present, either assigned or rejected
gen_papers = run_sql("select bibref_table, bibref_value, bibrec, personid, flag, name "
"from aidPERSONIDPAPERS "
"where bibrec=%s "
"and flag >= -2"
, (rec,))
paps = [el[0:3] for el in gen_papers if el[3] == pid and el[4] > -2]
#run_sql("select bibref_table, bibref_value, bibrec "
# "from aidPERSONIDPAPERS "
# "where personid=%s "
# "and bibrec=%s "
# "and flag > -2"
# , (pid, rec))
other_paps = [el[0:3] for el in gen_papers if el[3] != pid and el[4] > -2]
#other_paps = run_sql("select bibref_table, bibref_value, bibrec "
# "from aidPERSONIDPAPERS "
# "where personid <> %s "
# "and bibrec=%s "
# "and flag > -2"
# , (pid, rec))
rej_paps = [el[0:3] for el in gen_papers if el[3] == pid and el[4] == -2]
#rej_paps = run_sql("select bibref_table, bibref_value, bibrec "
# "from aidPERSONIDPAPERS "
# "where personid=%s "
# "and bibrec=%s "
# "and flag = -2"
# , (pid, rec))
bibref_exists = [el[0:3] for el in gen_papers if el[0] == table and el[1] == ref and el[4] > -2]
#bibref_exists = run_sql("select * "
# "from aidPERSONIDPAPERS "
# "and bibref_table=%s "
# "and bibref_value=%s "
# "and bibrec=%s "
# "and flag > -2"
# , (table, ref, rec))
# All papers that are being claimed should be present in aidPERSONIDPAPERS, thus:
# assert paps or rej_paps or other_paps, 'There should be at least something regarding this bibrec!'
# should always be valid.
# BUT, it usually happens that claims get done out of the browser/session cache which is hours/days old,
# hence it happens that papers are claimed which no longer exist in the system.
# For the sake of mental sanity, instead of crashing from now on we just ignore such cases.
if not (paps or other_paps or rej_paps) or not bibref_exists:
res.append((False, 'confirm_failure'))
continue
res.append((True, 'confirm_success'))
# It should not happen that a paper is assigned more than once to the same person.
# But sometimes it happens in rare unfortunate cases of bad concurrency circumstances,
# so we try to fix it directly instead of crashing here.
# Once a better solution for dealing with concurrency is found, the following asserts
# shall be re-enabled, to allow better control over what happens.
# assert len(paps) < 2, "This paper should not be assigned to this person more than once! %s" % paps
# assert len(other_paps) < 2, "There should not be more than one copy of this paper! %s" % other_paps
# if the bibrec is present with a different bibref, the present one must be moved somewhere
# else before we can claim the incoming one
if paps:
for pap in paps:
#kick out all unwanted signatures
if sig != pap:
new_pid = get_new_personid()
pids_to_update.add(new_pid)
move_signature(pap, new_pid)
# Make sure that the incoming claim is unique and get rid of all rejections, they are useless
# from now on
run_sql("delete from aidPERSONIDPAPERS where bibref_table like %s and "
" bibref_value = %s and bibrec=%s"
, sig)
add_signature(sig, None, pid)
run_sql("update aidPERSONIDPAPERS "
"set personid = %s "
", flag = %s "
", lcul = %s "
"where bibref_table = %s "
"and bibref_value = %s "
"and bibrec = %s"
, (pid, '2', user_level,
table, ref, rec))
update_personID_canonical_names(pids_to_update)
return res
def reject_papers_from_person(pid, papers, user_level=0):
'''
Confirms the negative relationship between pid and paper, as from user input.
@param pid: id of the person
@type pid: integer
@param papers: list of papers to confirm
@type papers: ((str,),) e.g. (('100:7531,9024',),)
@return: list of tuples: (status, message_key)
@rtype: [(bool, str), ]
'''
new_pid = get_new_personid()
pids_to_update = set([pid])
res = []
for p in papers:
brr, rec = p.split(",")
table, ref = brr.split(':')
sig = (table, ref, rec)
# To be rejected, a record should be present!
records = personid_name_from_signature(sig)
# For the sake of mental sanity (see comments in confirm_papers_to_person), just ignore the case where this paper no longer exists
# assert(records)
if not records:
res.append((False, 'reject_failure'))
continue
res.append((True, 'reject_success'))
fpid, name = records[0]
# If the record is assigned to a different person already, the rejection is meaningless
# Otherwise, we assign the paper to someone else (not important who it will eventually
# get moved by tortoise) and add the rejection to the current person
if fpid == pid:
move_signature(sig, new_pid, force_claimed=True, unclaim=True)
pids_to_update.add(new_pid)
run_sql("INSERT INTO aidPERSONIDPAPERS "
"(personid, bibref_table, bibref_value, bibrec, name, flag, lcul) "
"VALUES (%s, %s, %s, %s, %s, %s, %s)"
, (pid, table, ref, rec, name, -2, user_level))
update_personID_canonical_names(pids_to_update)
return res
def reset_papers_flag(pid, papers):
'''
Resets the flag associated with the papers to '0'
@param pid: id of the person
@type pid: integer
@param papers: list of papers to reset
@type papers: ((str,),) e.g. (('100:7531,9024',),)
@return: list of tuples: (status, message_key)
@rtype: [(bool, str), ]
'''
res = []
for p in papers:
bibref, rec = p.split(",")
table, ref = bibref.split(":")
ref = int(ref)
sig = (table, ref, rec)
gen_papers = run_sql("select bibref_table, bibref_value, bibrec, flag "
"from aidPERSONIDPAPERS "
"where bibrec=%s "
"and personid=%s"
, (rec, pid))
paps = [el[0:3] for el in gen_papers]
#run_sql("select bibref_table, bibref_value, bibrec "
# "from aidPERSONIDPAPERS "
# "where personid=%s "
# "and bibrec=%s "
# , (pid, rec))
rej_paps = [el[0:3] for el in gen_papers if el[3] == -2]
#rej_paps = run_sql("select bibref_table, bibref_value, bibrec "
# "from aidPERSONIDPAPERS "
# "where personid=%s "
# "and bibrec=%s "
# "and flag = -2"
# , (pid, rec))
pid_bibref_exists = [el[0:3] for el in gen_papers if el[0] == table and el[1] == ref and el[3] > -2]
#bibref_exists = run_sql("select * "
# "from aidPERSONIDPAPERS "
# "and bibref_table=%s "
# "and bibref_value=%s "
# "and personid=%s "
# "and bibrec=%s "
# "and flag > -2"
# , (table, ref, pid, rec))
# again, see confirm_papers_to_person for the sake of mental sanity
# assert paps or rej_paps
if rej_paps or not pid_bibref_exists:
res.append((False, 'reset_failure'))
continue
res.append((True, 'reset_success'))
assert len(paps) < 2
run_sql("delete from aidPERSONIDPAPERS where bibref_table like %s and "
"bibref_value = %s and bibrec = %s",
(sig))
add_signature(sig, None, pid)
return res
def user_can_modify_data(uid, pid):
'''
Return True if the uid can modify data of this personID, false otherwise.
@param uid: the user id
@type: int
@param pid: the person id
@type: int
@return: can user modify data?
@rtype: boolean
'''
pid_uid = run_sql("select data from aidPERSONIDDATA where tag = %s"
" and personid = %s", ('uid', str(pid)))
if len(pid_uid) >= 1 and str(uid) == str(pid_uid[0][0]):
rights = bconfig.CLAIMPAPER_CHANGE_OWN_DATA
else:
rights = bconfig.CLAIMPAPER_CHANGE_OTHERS_DATA
return acc_authorize_action(uid, rights)[0] == 0
def get_possible_bibrecref(names, bibrec, always_match=False):
'''
Returns a list of bibrefs for which the surname is matching
@param names: list of names strings
@param bibrec: bibrec number
@param always_match: match with all the names (full bibrefs list)
'''
splitted_names = [split_name_parts(n) for n in names]
bibrec_names_100 = run_sql("select o.id, o.value from bib10x o, "
"(select i.id_bibxxx as iid from bibrec_bib10x i "
"where id_bibrec=%s) as dummy "
"where o.tag='100__a' AND o.id = dummy.iid",
(str(bibrec),))
bibrec_names_700 = run_sql("select o.id, o.value from bib70x o, "
"(select i.id_bibxxx as iid from bibrec_bib70x i "
"where id_bibrec=%s) as dummy "
"where o.tag='700__a' AND o.id = dummy.iid",
(str(bibrec),))
# bibrec_names_100 = run_sql("select id,value from bib10x where tag='100__a' and id in "
# "(select id_bibxxx from bibrec_bib10x where id_bibrec=%s)",
# (str(bibrec),))
# bibrec_names_700 = run_sql("select id,value from bib70x where tag='700__a' and id in "
# "(select id_bibxxx from bibrec_bib70x where id_bibrec=%s)",
# (str(bibrec),))
bibreflist = []
for b in bibrec_names_100:
spb = split_name_parts(b[1])
for n in splitted_names:
if (n[0].lower() == spb[0].lower()) or always_match:
if ['100:' + str(b[0]), b[1]] not in bibreflist:
bibreflist.append(['100:' + str(b[0]), b[1]])
for b in bibrec_names_700:
spb = split_name_parts(b[1])
for n in splitted_names:
if (n[0].lower() == spb[0].lower()) or always_match:
if ['700:' + str(b[0]), b[1]] not in bibreflist:
bibreflist.append(['700:' + str(b[0]), b[1]])
return bibreflist
def user_can_modify_paper(uid, paper):
'''
Return True if the uid can modify this paper, false otherwise.
If the paper is assigned more than once (by the algorithms), consider the most
privileged assignment.
@param uid: the user id
@type: int
@param paper: the paper bibref,bibrec pair x00:1234,4321
@type: str
@return: can user modify paper attribution?
@rtype: boolean
'''
bibref, rec = paper.split(",")
table, ref = bibref.split(":")
prow = run_sql("select personid, lcul from aidPERSONIDPAPERS "
"where bibref_table = %s and bibref_value = %s and bibrec = %s "
"order by lcul desc limit 0,1",
(table, ref, rec))
if len(prow) == 0:
return ((acc_authorize_action(uid, bconfig.CLAIMPAPER_CLAIM_OWN_PAPERS)[0] == 0) or
(acc_authorize_action(uid, bconfig.CLAIMPAPER_CLAIM_OTHERS_PAPERS)[0] == 0))
min_req_acc_n = int(prow[0][1])
req_acc = resolve_paper_access_right(bconfig.CLAIMPAPER_CLAIM_OWN_PAPERS)
pid_uid = run_sql("select data from aidPERSONIDDATA where tag = %s and personid = %s", ('uid', str(prow[0][0])))
if len(pid_uid) > 0:
if (str(pid_uid[0][0]) != str(uid)) and min_req_acc_n > 0:
req_acc = resolve_paper_access_right(bconfig.CLAIMPAPER_CLAIM_OTHERS_PAPERS)
if min_req_acc_n < req_acc:
min_req_acc_n = req_acc
min_req_acc = resolve_paper_access_right(min_req_acc_n)
return (acc_authorize_action(uid, min_req_acc)[0] == 0) and (resolve_paper_access_right(min_req_acc) >= min_req_acc_n)
def resolve_paper_access_right(acc):
'''
Given a string or an integer, resolves to the corresponding integer or string.
If asked for a wrong/absent parameter, falls back to the minimum privilege.
'''
access_dict = {bconfig.CLAIMPAPER_VIEW_PID_UNIVERSE: 0,
bconfig.CLAIMPAPER_CLAIM_OWN_PAPERS: 25,
bconfig.CLAIMPAPER_CLAIM_OTHERS_PAPERS: 50}
if isinstance(acc, str):
try:
return access_dict[acc]
except KeyError:
return 0
inverse_dict = dict([[v, k] for k, v in access_dict.items()])
lower_accs = [a for a in inverse_dict.keys() if a <= acc]
try:
return inverse_dict[max(lower_accs)]
except ValueError:
return bconfig.CLAIMPAPER_VIEW_PID_UNIVERSE
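A minimal sketch of the two-way lookup implemented by resolve_paper_access_right, with hypothetical action names standing in for the real bconfig.* constants (only the 0/25/50 levels are taken from the code above):

```python
# Illustrative sketch, not part of the module: placeholder action names
# replace the bconfig constants used by resolve_paper_access_right.
ACCESS_DICT = {'view': 0, 'claim_own': 25, 'claim_others': 50}

def resolve_access(acc):
    # string -> numeric level (unknown strings fall back to 0)
    if isinstance(acc, str):
        return ACCESS_DICT.get(acc, 0)
    # numeric -> name of the highest level not exceeding acc
    inverse = dict((v, k) for k, v in ACCESS_DICT.items())
    lower = [v for v in inverse if v <= acc]
    if not lower:
        return 'view'  # fall back to the minimum privilege
    return inverse[max(lower)]
```

For example, resolve_access(30) maps down to the 'claim_own' level (25), mirroring how a user's numeric right is reduced to the closest defined action.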
def get_recently_modified_record_ids(date):
'''
Returns the bibrecs with modification date more recent than date.
@param date: date
'''
touched_papers = frozenset(p[0] for p in run_sql(
"select id from bibrec "
"where modification_date > %s"
, (date,)))
return touched_papers & frozenset(get_all_valid_bibrecs())
def filter_modified_record_ids(bibrecs, date):
'''
Filters the given bibrecs, keeping those whose modification date is before the given date.
@param bibrecs: iterable of (table, ref, rec) tuples
@param date: date
'''
return ifilter(
lambda x: run_sql("select count(*) from bibrec "
"where id = %s and "
"modification_date < %s"
, (x[2], date))[0][0]
, bibrecs)
def get_user_log(transactionid='', userinfo='', userid='', personID='', action='', tag='', value='', comment='', only_most_recent=False):
'''
Get user log table entry matching all the given parameters; all of them are optional.
If no parameters are given, returns the complete log table.
@param transactionid: id of the transaction
@param userinfo: user name or identifier
@param userid: id of the user
@param personID: id of the person involved
@param action: action
@param tag: tag
@param value: value
@param comment: comment
@param only_most_recent: if True, return only the most recent matching entry
'''
sql_query = ('select id,transactionid,timestamp,userinfo,personid,action,tag,value,comment ' +
'from aidUSERINPUTLOG where 1 ')
sql_params = []
if transactionid:
sql_query += ' and transactionid=%s'
sql_params.append(transactionid)
if userinfo:
sql_query += ' and userinfo=%s'
sql_params.append(userinfo)
if userid:
sql_query += ' and userid=%s'
sql_params.append(userid)
if personID:
sql_query += ' and personid=%s'
sql_params.append(personID)
if action:
sql_query += ' and action=%s'
sql_params.append(action)
if tag:
sql_query += ' and tag=%s'
sql_params.append(tag)
if value:
sql_query += ' and value=%s'
sql_params.append(value)
if comment:
sql_query += ' and comment=%s'
sql_params.append(comment)
if only_most_recent:
sql_query += ' order by timestamp desc limit 0,1'
if sql_params:
return run_sql(sql_query, tuple(sql_params))
return run_sql(sql_query)
def list_2_SQL_str(items, f=lambda x: x):
"""
Concatenates all items in items into an SQL string using f.
@param items: a set of items
@type items: X
@param f: a function which transforms each item from items to string
@type f: X -> str
@return: "(x1, x2, x3, ... xn)" for xi in items
@rtype: string
"""
strs = (str(f(x)) for x in items)
return "(%s)" % ", ".join(strs)
def _get_authors_from_paper_from_db(paper):
'''
selects all author bibrefs of a given paper
'''
fullbibrefs100 = run_sql("select id_bibxxx from bibrec_bib10x where id_bibrec=%s", (paper,))
if len(fullbibrefs100) > 0:
fullbibrefs100str = list_2_SQL_str(fullbibrefs100, lambda x: str(x[0]))
return run_sql("select id from bib10x where tag='100__a' and id in %s" % (fullbibrefs100str,))
return tuple()
def _get_authors_from_paper_from_cache(paper):
'''
selects all author bibrefs of a given paper
'''
try:
ids = MARC_100_700_CACHE['brb100'][paper]['id'].keys()
refs = [i for i in ids if '100__a' in MARC_100_700_CACHE['b100'][i][0]]
except KeyError:
return tuple()
return zip(refs)
def get_authors_from_paper(paper):
if MARC_100_700_CACHE:
if bconfig.DEBUG_CHECKS:
assert _get_authors_from_paper_from_cache(paper) == _get_authors_from_paper_from_db(paper)
return _get_authors_from_paper_from_cache(paper)
else:
return _get_authors_from_paper_from_db(paper)
def _get_coauthors_from_paper_from_db(paper):
'''
selects all coauthor bibrefs of a given paper
'''
fullbibrefs700 = run_sql("select id_bibxxx from bibrec_bib70x where id_bibrec=%s", (paper,))
if len(fullbibrefs700) > 0:
fullbibrefs700str = list_2_SQL_str(fullbibrefs700, lambda x: str(x[0]))
return run_sql("select id from bib70x where tag='700__a' and id in %s" % (fullbibrefs700str,))
return tuple()
def _get_coauthors_from_paper_from_cache(paper):
'''
selects all coauthor bibrefs of a given paper
'''
try:
ids = MARC_100_700_CACHE['brb700'][paper]['id'].keys()
refs = [i for i in ids if '700__a' in MARC_100_700_CACHE['b700'][i][0]]
except KeyError:
return tuple()
return zip(refs)
def get_coauthors_from_paper(paper):
if MARC_100_700_CACHE:
if bconfig.DEBUG_CHECKS:
assert _get_coauthors_from_paper_from_cache(paper) == _get_coauthors_from_paper_from_db(paper)
return _get_coauthors_from_paper_from_cache(paper)
else:
return _get_coauthors_from_paper_from_db(paper)
def get_bibrefrec_subset(table, papers, refs):
table = "bibrec_bib%sx" % str(table)[:-1]
contents = run_sql("select id_bibrec, id_bibxxx from %s" % table)
papers = set(papers)
refs = set(refs)
# yes, there are duplicates, so we deduplicate with a set
return set(ifilter(lambda x: x[0] in papers and x[1] in refs, contents))
def get_deleted_papers():
return run_sql("select o.id_bibrec from bibrec_bib98x o, "
"(select i.id as iid from bib98x i "
"where value = 'DELETED' "
"and tag like '980__a') as dummy "
"where o.id_bibxxx = dummy.iid")
def add_personID_external_id(personid, external_id_str, value):
run_sql("insert into aidPERSONIDDATA (personid,tag,data) values (%s,%s,%s)",
(personid, 'extid:%s' % external_id_str, value))
def remove_personID_external_id(personid, external_id_str, value=False):
if not value:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag=%s",
(personid, 'extid:%s' % external_id_str))
else:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag=%s and data=%s",
(personid, 'extid:%s' % external_id_str, value))
def get_personiID_external_ids(personid):
ids = run_sql("select tag,data from aidPERSONIDDATA where personid=%s and tag like 'extid:%%'",
(personid,))
extids = {}
for i in ids:
id_str = i[0].split(':')[1]
idd = i[1]
try:
extids[id_str].append(idd)
except KeyError:
extids[id_str] = [idd]
return extids
#bibauthorid_maintenance personid update private methods
def update_personID_canonical_names(persons_list=None, overwrite=False, suggested='', overwrite_not_claimed_only=False):
'''
Updates the personID table creating or updating canonical names for persons
@param persons_list: persons to consider for the update (('1'),)
@param overwrite: whether to touch already existing canonical names
@param suggested: string to suggest a canonical name for the person
@param overwrite_not_claimed_only: only overwrite canonical names of persons with no claimed papers
'''
if not persons_list and overwrite:
persons_list = set([x[0] for x in run_sql('select personid from aidPERSONIDPAPERS')])
elif not persons_list:
persons_list = set([x[0] for x in run_sql('select personid from aidPERSONIDPAPERS')])
existing_cnamed_pids = set(
[x[0] for x in run_sql('select personid from aidPERSONIDDATA where tag=%s',
('canonical_name',))])
persons_list = persons_list - existing_cnamed_pids
for idx, pid in enumerate(persons_list):
update_status(float(idx) / float(len(persons_list)), "Updating canonical_names...")
if overwrite_not_claimed_only:
has_claims = run_sql("select personid from aidPERSONIDPAPERS where personid = %s and flag = 2", (pid,))
if has_claims:
continue
current_canonical = run_sql("select data from aidPERSONIDDATA where "
"personid=%s and tag=%s", (pid, 'canonical_name'))
if overwrite or len(current_canonical) == 0:
run_sql("delete from aidPERSONIDDATA where personid=%s and tag=%s",
(pid, 'canonical_name'))
names = get_person_names_count(pid)
names = sorted(names, key=lambda k: k[1], reverse=True)
if len(names) < 1 and not suggested:
continue
else:
if suggested:
canonical_name = suggested
else:
canonical_name = create_canonical_name(names[0][0])
existing_cnames = run_sql("select data from aidPERSONIDDATA "
"where tag=%s and data like %s",
('canonical_name', str(canonical_name) + '%'))
existing_cnames = set(name[0].lower() for name in existing_cnames)
for i in count(1):
cur_try = canonical_name + '.' + str(i)
if cur_try.lower() not in existing_cnames:
canonical_name = cur_try
break
run_sql("insert into aidPERSONIDDATA (personid, tag, data) values (%s,%s,%s) ",
(pid, 'canonical_name', canonical_name))
update_status_final("Updating canonical_names finished.")
def personid_get_recids_affected_since(last_timestamp):
'''
Returns a list of recids which have been manually changed since the given timestamp.
@param: last_timestamp: last update, datetime.datetime
'''
vset = set(int(v[0].split(',')[1]) for v in run_sql(
"select distinct value from aidUSERINPUTLOG "
"where timestamp > %s", (last_timestamp,))
if ',' in v[0] and ':' in v[0])
pids = set(int(p[0]) for p in run_sql(
"select distinct personid from aidUSERINPUTLOG "
"where timestamp > %s", (last_timestamp,))
if p[0] > 0)
if pids:
pids_s = list_2_SQL_str(pids)
vset |= set(int(b[0]) for b in run_sql(
"select bibrec from aidPERSONIDPAPERS "
"where personid in %s" % pids_s))
return list(vset) # I'm not sure about this cast. It might work without it.
def get_all_paper_records(pid, claimed_only=False):
if not claimed_only:
return run_sql("SELECT distinct bibrec FROM aidPERSONIDPAPERS WHERE personid = %s", (str(pid),))
else:
return run_sql("SELECT distinct bibrec FROM aidPERSONIDPAPERS WHERE "
"personid = %s and flag=2 or flag=-2", (str(pid),))
def get_all_modified_names_from_personid(since=None):
if since:
all_pids = run_sql("SELECT DISTINCT personid "
"FROM aidPERSONIDPAPERS "
"WHERE flag > -2 "
"AND last_updated > %s"
% since)
else:
all_pids = run_sql("SELECT DISTINCT personid "
"FROM aidPERSONIDPAPERS "
"WHERE flag > -2 ")
return ((name[0][0], set(n[1] for n in name), len(name))
for name in (run_sql(
"SELECT personid, name "
"FROM aidPERSONIDPAPERS "
"WHERE personid = %s "
"AND flag > -2", p)
for p in all_pids))
def destroy_partial_marc_caches():
global MARC_100_700_CACHE
MARC_100_700_CACHE = None
gc.collect()
def populate_partial_marc_caches():
global MARC_100_700_CACHE
if MARC_100_700_CACHE:
return
def br_dictionarize(maptable):
gc.disable()
md = defaultdict(dict)
maxiters = len(set(map(itemgetter(0), maptable)))
for i, v in enumerate(groupby(maptable, itemgetter(0))):
if i % 1000 == 0:
update_status(float(i) / maxiters, 'br_dictionarizing...')
if i % 1000000 == 0:
update_status(float(i) / maxiters, 'br_dictionarizing...GC')
gc.collect()
idx = defaultdict(list)
fn = defaultdict(list)
for _, k, z in v[1]:
idx[k].append(z)
fn[z].append(k)
md[v[0]]['id'] = idx
md[v[0]]['fn'] = fn
update_status_final('br_dictionarizing done')
gc.enable()
return md
def bib_dictionarize(bibtable):
return dict((i[0], (i[1], i[2])) for i in bibtable)
update_status(.0, 'Populating get_grouped_records_table_cache')
bibrec_bib10x = sorted(run_sql("select id_bibrec,id_bibxxx,field_number from bibrec_bib10x"))
update_status(.125, 'Populating get_grouped_records_table_cache')
brd_b10x = br_dictionarize(bibrec_bib10x)
del bibrec_bib10x
update_status(.25, 'Populating get_grouped_records_table_cache')
bibrec_bib70x = sorted(run_sql("select id_bibrec,id_bibxxx,field_number from bibrec_bib70x"))
update_status(.375, 'Populating get_grouped_records_table_cache')
brd_b70x = br_dictionarize(bibrec_bib70x)
del bibrec_bib70x
update_status(.5, 'Populating get_grouped_records_table_cache')
bib10x = (run_sql("select id,tag,value from bib10x"))
update_status(.625, 'Populating get_grouped_records_table_cache')
bibd_10x = bib_dictionarize(bib10x)
del bib10x
update_status(.75, 'Populating get_grouped_records_table_cache')
bib70x = (run_sql("select id,tag,value from bib70x"))
update_status(.875, 'Populating get_grouped_records_table_cache')
bibd_70x = bib_dictionarize(bib70x)
del bib70x
update_status_final('Finished populating get_grouped_records_table_cache')
MARC_100_700_CACHE = {'brb100':brd_b10x, 'brb700':brd_b70x, 'b100':bibd_10x, 'b700':bibd_70x}
def _get_grouped_records_using_caches(brr, *args):
try:
c = MARC_100_700_CACHE['brb%s' % str(brr[0])][brr[2]]
fn = c['id'][brr[1]]
except KeyError:
return dict((arg, []) for arg in args)
if not fn or len(fn) > 1:
# If len(fn) > 1 it's BAD: the same signature appears at least twice on the same paper.
# Default to nothing, to be on the safe side.
return dict((arg, []) for arg in args)
ids = set(chain(*(c['fn'][i] for i in fn)))
tuples = [MARC_100_700_CACHE['b%s' % str(brr[0])][i] for i in ids]
results = {}
for t in tuples:
present = [x for x in args if x in t[0]]
assert len(present) <= 1
if present:
arg = present[0]
try:
results[arg].append(t[1])
except KeyError:
results[arg] = [t[1]]
for arg in args:
if arg not in results.keys():
results[arg] = []
return results
def _get_grouped_records_from_db(bibrefrec, *args):
'''
By a given bibrefrec mark:ref,rec this function will scan the
bibmarkx table and extract all records with a tag in args which
are grouped together with this bibrec.
Returns a dictionary { tag : [extracted_values] };
tags with no values map to empty lists.
@type bibrefrec: (mark(int), ref(int), rec(int))
'''
table, ref, rec = bibrefrec
target_table = "bib%sx" % (str(table)[:-1])
mapping_table = "bibrec_%s" % target_table
group_id = run_sql("SELECT field_number "
"FROM %s "
"WHERE id_bibrec = %d "
"AND id_bibxxx = %d" %
(mapping_table, rec, ref))
if len(group_id) == 0:
# unfortunately the mapping is not found, so
# we cannot find anything
return dict((arg, []) for arg in args)
elif len(group_id) == 1:
# All is fine
field_number = group_id[0][0]
else:
# sounds bad, but ignore the error
field_number = min(x[0] for x in group_id)
grouped = run_sql("SELECT id_bibxxx "
"FROM %s "
"WHERE id_bibrec = %d "
"AND field_number = %d" %
(mapping_table, rec, int(field_number)))
assert len(grouped) > 0, "There should be at least one grouped value per tag."
grouped_s = list_2_SQL_str(grouped, lambda x: str(x[0]))
ret = {}
for arg in args:
qry = run_sql("SELECT value "
"FROM %s "
"WHERE tag LIKE '%%%s%%' "
"AND id IN %s" %
(target_table, arg, grouped_s))
ret[arg] = [q[0] for q in qry]
return ret
def get_grouped_records(bibrefrec, *args):
if MARC_100_700_CACHE:
if bconfig.DEBUG_CHECKS:
assert _get_grouped_records_using_caches(bibrefrec, *args) == _get_grouped_records_from_db(bibrefrec, *args)
return _get_grouped_records_using_caches(bibrefrec, *args)
else:
return _get_grouped_records_from_db(bibrefrec, *args)
def get_person_with_extid(idd, match_tag=False):
if match_tag:
mtag = " and tag = '%s'" % ('extid:' + match_tag)
else:
mtag = ''
pids = run_sql("select personid from aidPERSONIDDATA where data=%s" + mtag, (idd,))
return set(pids)
def get_inspire_id(p):
'''
Gets inspire id for a signature (bibref_table, bibref_value, bibrec)
'''
return get_grouped_records((str(p[0]), p[1], p[2]), str(p[0]) + '__i').values()[0]
def get_claimed_papers_from_papers(papers):
'''
Given a set of papers it returns the subset of claimed papers
@param papers: set of papers
@type papers: frozenset
@return: tuple
'''
papers_s = list_2_SQL_str(papers)
claimed_papers = run_sql("select bibrec from aidPERSONIDPAPERS "
"where bibrec in %s and flag = 1" % papers_s)
return claimed_papers
def collect_personID_external_ids_from_papers(personid, limit_to_claimed_papers=False):
gathered_ids = {}
if limit_to_claimed_papers:
flag = 1
else:
flag = -2
person_papers = run_sql("select bibref_table,bibref_value,bibrec from aidPERSONIDPAPERS where "
"personid=%s and flag > %s", (personid, flag))
if COLLECT_INSPIRE_ID:
inspireids = []
for p in person_papers:
extid = get_inspire_id(p)
if extid:
inspireids.append(extid)
inspireids = set((i[0] for i in inspireids))
gathered_ids['INSPIREID'] = inspireids
# if COLLECT_ORCID:
# orcids = []
# for p in person_papers:
# extid = get_orcid(p)
# if extid:
# orcids.append(extid)
# orcids = set((i[0] for i in orcids))
# gathered_ids['ORCID'] = orcids
# if COLLECT_ARXIV_ID:
# arxivids = []
# for p in person_papers:
# extid = get_arxiv_id(p)
# if extid:
# arxivids.append(extid)
# arxivids = set((i[0] for i in arxivids))
# gathered_ids['ARXIVID'] = arxivids
return gathered_ids
def update_personID_external_ids(persons_list=None, overwrite=False,
limit_to_claimed_papers=False, force_cache_tables=False):
if force_cache_tables:
populate_partial_marc_caches()
if not persons_list:
persons_list = set([x[0] for x in run_sql('select personid from aidPERSONIDPAPERS')])
for idx, pid in enumerate(persons_list):
update_status(float(idx) / float(len(persons_list)), "Updating external ids...")
collected = collect_personID_external_ids_from_papers(pid, limit_to_claimed_papers=limit_to_claimed_papers)
present = get_personiID_external_ids(pid)
if overwrite:
for idd in present.keys():
for k in present[idd]:
remove_personID_external_id(pid, idd, value=k)
present = {}
for idd in collected.keys():
for k in collected[idd]:
if idd not in present or k not in present[idd]:
add_personID_external_id(pid, idd, k)
if force_cache_tables:
destroy_partial_marc_caches()
update_status_final("Updating external ids finished.")
def _get_name_by_bibrecref_from_db(bib):
'''
@param bib: bibrefrec or bibref
@type bib: (mark, bibref, bibrec) OR (mark, bibref)
'''
table = "bib%sx" % str(bib[0])[:-1]
refid = bib[1]
tag = "%s__a" % str(bib[0])
ret = run_sql("select value from %s where id = '%s' and tag = '%s'" % (table, refid, tag))
assert len(ret) == 1, "A bibrefrec must have exactly one name(%s)" % str(bib)
return ret[0][0]
def _get_name_by_bibrecref_from_cache(bib):
'''
@param bib: bibrefrec or bibref
@type bib: (mark, bibref, bibrec) OR (mark, bibref)
'''
table = "b%s" % bib[0]
refid = bib[1]
tag = "%s__a" % str(bib[0])
ret = None
try:
if tag in MARC_100_700_CACHE[table][refid][0]:
ret = MARC_100_700_CACHE[table][refid][1]
except (KeyError, IndexError), e:
#The GC did run and the table is not clean?
#We might want to allow empty response here
raise Exception(str(bib) + str(e))
if bconfig.DEBUG_CHECKS:
assert ret == _get_name_by_bibrecref_from_db(bib)
return ret
def get_name_by_bibrecref(bib):
if MARC_100_700_CACHE:
if bconfig.DEBUG_CHECKS:
assert _get_name_by_bibrecref_from_cache(bib) == _get_name_by_bibrecref_from_db(bib)
return _get_name_by_bibrecref_from_cache(bib)
else:
return _get_name_by_bibrecref_from_db(bib)
def get_collaboration(bibrec):
bibxxx = run_sql("select id_bibxxx from bibrec_bib71x where id_bibrec = %s", (str(bibrec),))
if len(bibxxx) == 0:
return ()
bibxxx = list_2_SQL_str(bibxxx, lambda x: str(x[0]))
ret = run_sql("select value from bib71x where id in %s and tag like '%s'" % (bibxxx, "710__g"))
return [r[0] for r in ret]
def get_key_words(bibrec):
if bconfig.CFG_ADS_SITE:
bibxxx = run_sql("select id_bibxxx from bibrec_bib65x where id_bibrec = %s", (str(bibrec),))
else:
bibxxx = run_sql("select id_bibxxx from bibrec_bib69x where id_bibrec = %s", (str(bibrec),))
if len(bibxxx) == 0:
return ()
bibxxx = list_2_SQL_str(bibxxx, lambda x: str(x[0]))
if bconfig.CFG_ADS_SITE:
ret = run_sql("select value from bib69x where id in %s and tag like '%s'" % (bibxxx, "6531_a"))
else:
ret = run_sql("select value from bib69x where id in %s and tag like '%s'" % (bibxxx, "695__a"))
return [r[0] for r in ret]
def get_all_authors(bibrec):
bibxxx_1 = run_sql("select id_bibxxx from bibrec_bib10x where id_bibrec = %s", (str(bibrec),))
bibxxx_7 = run_sql("select id_bibxxx from bibrec_bib70x where id_bibrec = %s", (str(bibrec),))
if bibxxx_1:
bibxxxs_1 = list_2_SQL_str(bibxxx_1, lambda x: str(x[0]))
authors_1 = run_sql("select value from bib10x where tag = '%s' and id in %s" % ('100__a', bibxxxs_1,))
else:
authors_1 = []
if bibxxx_7:
bibxxxs_7 = list_2_SQL_str(bibxxx_7, lambda x: str(x[0]))
authors_7 = run_sql("select value from bib70x where tag = '%s' and id in %s" % ('700__a', bibxxxs_7,))
else:
authors_7 = []
return [a[0] for a in authors_1] + [a[0] for a in authors_7]
def get_title_from_rec(rec):
"""
Returns the name of the paper like str if found.
Otherwise returns None.
"""
title_id = run_sql("SELECT id_bibxxx "
"FROM bibrec_bib24x "
"WHERE id_bibrec = %s",
(rec,))
if title_id:
title_id_s = list_2_SQL_str(title_id, lambda x: x[0])
title = run_sql("SELECT value "
"FROM bib24x "
"WHERE id in %s "
"AND tag = '245__a'"
% title_id_s)
if title:
return title[0][0]
def get_bib10x():
return run_sql("select id, value from bib10x where tag like %s", ("100__a",))
def get_bib70x():
return run_sql("select id, value from bib70x where tag like %s", ("700__a",))
class Bib_matrix(object):
'''
This small class contains the sparse matrix
and encapsulates it.
'''
# please increment this value every time you
# change the output of the comparison functions
current_comparison_version = 10
__special_items = ((None, -3.), ('+', -2.), ('-', -1.))
special_symbols = dict((x[0], x[1]) for x in __special_items)
special_numbers = dict((x[1], x[0]) for x in __special_items)
def __init__(self, cluster_set=None):
if cluster_set:
self._bibmap = dict((b[1], b[0]) for b in enumerate(cluster_set.all_bibs()))
width = len(self._bibmap)
size = ((width - 1) * width) / 2
self._matrix = Bib_matrix.create_empty_matrix(size)
else:
self._bibmap = dict()
self.creation_time = get_sql_time()
@staticmethod
def create_empty_matrix(length):
ret = numpy.ndarray(shape=(length, 2), dtype=float, order='C')
ret.fill(Bib_matrix.special_symbols[None])
return ret
def _resolve_entry(self, bibs):
assert len(bibs) == 2
first = self._bibmap[bibs[0]]
second = self._bibmap[bibs[1]]
if first > second:
first, second = second, first
assert first < second
return first + ((second - 1) * second) / 2
def __setitem__(self, bibs, val):
entry = self._resolve_entry(bibs)
self._matrix[entry] = Bib_matrix.special_symbols.get(val, val)
def __getitem__(self, bibs):
entry = self._resolve_entry(bibs)
ret = tuple(self._matrix[entry])
return Bib_matrix.special_numbers.get(ret[0], ret)
def __contains__(self, bib):
return bib in self._bibmap
def get_keys(self):
return self._bibmap.keys()
@staticmethod
def get_file_dir(name):
sub_dir = name[:2]
if not sub_dir:
sub_dir = "empty_last_name"
return "%s%s/" % (bconfig.TORTOISE_FILES_PATH, sub_dir)
@staticmethod
def get_map_path(dir_path, name):
return "%s%s.bibmap" % (dir_path, name)
@staticmethod
def get_matrix_path(dir_path, name):
return "%s%s.npy" % (dir_path, name)
def load(self, name, load_map=True, load_matrix=True):
files_dir = Bib_matrix.get_file_dir(name)
if not os.path.isdir(files_dir):
self._bibmap = dict()
return False
try:
if load_map:
bibmap_v = cPickle.load(open(Bib_matrix.get_map_path(files_dir, name), 'r'))
rec_v, self.creation_time, self._bibmap = bibmap_v
if (rec_v != Bib_matrix.current_comparison_version or
Bib_matrix.current_comparison_version < 0): # you can use negative
# version to recalculate
self._bibmap = dict()
return False
if load_matrix:
self._matrix = numpy.load(Bib_matrix.get_matrix_path(files_dir, name))
except (IOError, UnpicklingError):
if load_map:
self._bibmap = dict()
self.creation_time = get_sql_time()
return False
return True
def store(self, name):
files_dir = Bib_matrix.get_file_dir(name)
if not os.path.isdir(files_dir):
try:
os.mkdir(files_dir)
except OSError, e:
if e.errno == 17 or 'file exists' in str(e.strerror).lower():
pass
else:
raise e
bibmap_v = (Bib_matrix.current_comparison_version, self.creation_time, self._bibmap)
cPickle.dump(bibmap_v, open(Bib_matrix.get_map_path(files_dir, name), 'w'))
numpy.save(open(Bib_matrix.get_matrix_path(files_dir, name), "w"), self._matrix)
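The _resolve_entry arithmetic in Bib_matrix packs the strictly lower triangle of a symmetric pair matrix into a flat array; here is a standalone sketch of the same index formula (the helper name is illustrative, not part of the class):

```python
# Triangular packing as used by Bib_matrix: a symmetric matrix over
# `width` items needs only width*(width-1)/2 cells; the unordered pair
# (first, second) with first < second lands at first + second*(second-1)/2.
def triangular_index(first, second):
    if first > second:
        first, second = second, first
    assert first < second, "diagonal entries are not stored"
    return first + ((second - 1) * second) // 2

width = 5
size = (width - 1) * width // 2          # 10 cells for 5 items
indices = sorted(triangular_index(i, j)
                 for i in range(width) for j in range(i + 1, width))
# every unordered pair maps to a distinct cell, filling 0..size-1
```

The swap at the top is what makes the mapping symmetric: (a, b) and (b, a) resolve to the same cell.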
def delete_paper_from_personid(rec):
'''
Deletes all information in PERSONID about a given paper
'''
run_sql("delete from aidPERSONIDPAPERS where bibrec = %s", (rec,))
def get_signatures_from_rec(bibrec):
'''
Retrieves all information in PERSONID
about a given bibrec.
'''
return run_sql("select personid, bibref_table, bibref_value, bibrec, name "
"from aidPERSONIDPAPERS where bibrec = %s"
, (bibrec,))
def modify_signature(oldref, oldrec, newref, newname):
'''
Modifies a signature in aidPERSONIDpapers.
'''
return run_sql("UPDATE aidPERSONIDPAPERS "
"SET bibref_table = %s, bibref_value = %s, name = %s "
"WHERE bibref_table = %s AND bibref_value = %s AND bibrec = %s"
, (str(newref[0]), newref[1], newname,
str(oldref[0]), oldref[1], oldrec))
def find_pids_by_name(name):
'''
Finds names and personids by a prefix name.
'''
return set(run_sql("SELECT personid, name "
"FROM aidPERSONIDPAPERS "
"WHERE name like %s"
, (name + ',%',)))
def find_pids_by_exact_name(name):
"""
Finds names and personids by a name.
"""
return set(run_sql("SELECT personid "
"FROM aidPERSONIDPAPERS "
"WHERE name = %s"
, (name,)))
def remove_sigs(signatures):
'''
Removes records from aidPERSONIDPAPERS
'''
for sig in signatures:
run_sql("DELETE FROM aidPERSONIDPAPERS "
"WHERE bibref_table like %s AND bibref_value = %s AND bibrec = %s"
, (str(sig[0]), sig[1], sig[2]))
def remove_personid_papers(pids):
'''
Removes all signatures from aidPERSONIDPAPERS with pid in pids
'''
if pids:
run_sql("delete from aidPERSONIDPAPERS where personid in %s"
% list_2_SQL_str(pids))
def get_full_personid_papers(table_name="`aidPERSONIDPAPERS`"):
'''
Get all columns and rows from aidPERSONIDPAPERS
or any other table with the same structure.
'''
return run_sql("select personid, bibref_table, "
"bibref_value, bibrec, name, flag, "
"lcul from %s" % table_name)
def get_full_results():
'''
Deprecated. Should be removed soon.
'''
return run_sql("select personid, bibref_table, bibref_value, bibrec "
"from aidRESULTS")
def get_lastname_results(last_name):
'''
Returns rows from aidRESULTS which share a common last name.
'''
return run_sql("select personid, bibref_table, bibref_value, bibrec "
"from aidRESULTS "
"where personid like '" + last_name + ".%'")
def get_full_personid_data(table_name="`aidPERSONIDDATA`"):
'''
Get all columns and rows from aidPERSONIDDATA
or any other table with the same structure.
'''
return run_sql("select personid, tag, data, "
"opt1, opt2, opt3 from %s" % table_name)
def get_specific_personid_full_data(pid):
'''
Get all columns and rows from aidPERSONIDDATA
'''
return run_sql("select personid, tag, data, "
"opt1, opt2, opt3 from aidPERSONIDDATA where personid=%s "
, (pid,))
def get_canonical_names_by_pid(pid):
'''
Get all data that has as a tag canonical_name from aidPERSONIDDATA
'''
return run_sql("select data "
"from aidPERSONIDDATA where personid=%s and tag=%s"
, (pid, "canonical_name"))
def get_orcids_by_pids(pid):
'''
Get all data that has as a tag extid:ORCID from aidPERSONIDDATA
'''
return run_sql("select data "
"from aidPERSONIDDATA where personid=%s and tag=%s"
, (pid, "extid:ORCID"))
def get_inspire_ids_by_pids(pid):
'''
Get all data that has as a tag extid:INSPIREID from aidPERSONIDDATA
'''
return run_sql("select data "
"from aidPERSONIDDATA where personid=%s and tag=%s"
, (pid, "extid:INSPIREID"))
def get_uids_by_pids(pid):
'''
Get all data that has as a tag uid from aidPERSONIDDATA
'''
return run_sql("select data "
"from aidPERSONIDDATA where personid=%s and tag=%s"
, (pid, "uid"))
def get_name_string_to_pid_dictionary():
'''
Get a dictionary which maps name strings to person ids
'''
namesdict = {}
all_names = run_sql("select personid,name from aidPERSONIDPAPERS")
for x in all_names:
namesdict.setdefault(x[1], set()).add(x[0])
return namesdict
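# The dict.setdefault inversion above turns (personid, name) rows into a
# name-to-pids map; a minimal, database-free sketch of the same idiom
# (the rows are made up):

```python
# Hypothetical rows, as run_sql("select personid,name ...") would return:
rows = [(1, 'Ellis, J.'), (2, 'Ellis, J.'), (3, 'Smith, A.')]

namesdict = {}
for pid, name in rows:
    # setdefault creates the empty set the first time a name is seen,
    # so homonyms (one name shared by several persons) accumulate
    namesdict.setdefault(name, set()).add(pid)
```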
#could be useful to optimize rabbit, still unused and untested, watch out.
def get_bibrecref_to_pid_dictionary():
'''
Get a dictionary which maps signatures (bibref_table, bibref_value, bibrec) to person ids.
'''
brr2pid = {}
all_brr = run_sql("select personid,bibref_table,bibref_value,bibrec from aidPERSONIDPAPERS")
for x in all_brr:
brr2pid.setdefault(tuple(x[1:]), set()).add(x[0])
return brr2pid
def check_personid_papers(output_file=None):
'''
Checks all invariants of personid.
Writes to output_file if given, otherwise to stdout.
'''
if output_file:
fp = open(output_file, "w")
printer = lambda x: fp.write(x + '\n')
else:
printer = bibauthor_print
checkers = (
check_wrong_names,
check_duplicated_papers,
check_duplicated_signatures,
check_wrong_rejection,
check_canonical_names,
check_empty_personids
#check_claim_inspireid_contradiction
)
# Avoid writing f(a) or g(a), because one of the calls
# might be optimized.
return all([check(printer) for check in checkers])
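# The list comprehension inside all() above is deliberate: with a bare
# generator, all() would short-circuit at the first failing checker and
# the remaining diagnostics would never run. A small sketch of the
# difference, using two toy checkers:

```python
def failing(log):
    # toy checker: records that it ran, then reports a problem
    log.append('failing')
    return False

def passing(log):
    log.append('passing')
    return True

checks = (failing, passing)

log_gen = []
gen_result = all(c(log_gen) for c in checks)      # stops after 'failing'

log_list = []
list_result = all([c(log_list) for c in checks])  # runs every checker
```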
def repair_personid(output_file=None):
'''
Runs all checkers with repair enabled, so that a subsequent check_personid_papers() returns True.
'''
if output_file:
fp = open(output_file, "w")
printer = lambda x: fp.write(x + '\n')
else:
printer = bibauthor_print
checkers = (
check_wrong_names,
check_duplicated_papers,
check_duplicated_signatures,
check_wrong_rejection,
check_canonical_names,
check_empty_personids
#check_claim_inspireid_contradiction
)
first_check = [check(printer) for check in checkers]
repair_pass = [check(printer, repair=True) for check in checkers]
last_check = [check(printer) for check in checkers]
if not all(first_check):
assert not(all(repair_pass))
assert all(last_check)
return all(last_check)
def check_duplicated_papers(printer, repair=False):
all_ok = True
bibrecs_to_reassign = []
recs = run_sql("select personid,bibrec from aidPERSONIDPAPERS where flag <> %s", (-2,))
d = {}
for x, y in recs:
d.setdefault(x, []).append(y)
for pid, bibrec in d.iteritems():
if not len(bibrec) == len(set(bibrec)):
all_ok = False
dups = sorted(bibrec)
dups = [x for i, x in enumerate(dups[0:len(dups) - 1]) if x == dups[i + 1]]
printer("Person %d has duplicated papers: %s" % (pid, dups))
if repair:
for dupbibrec in dups:
printer("Repairing duplicated bibrec %s" % str(dupbibrec))
involved_claimed = run_sql("select personid,bibref_table,bibref_value,bibrec,flag "
"from aidPERSONIDPAPERS where personid=%s and bibrec=%s "
"and flag >= 2", (pid, dupbibrec))
if len(involved_claimed) != 1:
bibrecs_to_reassign.append(dupbibrec)
run_sql("delete from aidPERSONIDPAPERS where personid=%s and bibrec=%s", (pid, dupbibrec))
else:
involved_not_claimed = run_sql("select personid,bibref_table,bibref_value,bibrec,flag "
"from aidPERSONIDPAPERS where personid=%s and bibrec=%s "
"and flag < 2", (pid, dupbibrec))
for v in involved_not_claimed:
run_sql("delete from aidPERSONIDPAPERS where personid=%s and bibref_table=%s "
"and bibref_value=%s and bibrec=%s and flag=%s", (v))
if repair and bibrecs_to_reassign:
printer("Reassigning deleted bibrecs %s" % str(bibrecs_to_reassign))
from invenio.bibauthorid_rabbit import rabbit
rabbit(bibrecs_to_reassign)
return all_ok
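# The duplicate detection above sorts the bibrecs and scans adjacent
# pairs: any value equal to its right neighbour is duplicated. A
# self-contained sketch with invented ids (note a value occurring n
# times is reported n-1 times):

```python
bibrecs = [7, 3, 5, 3, 9, 5, 3]  # made-up record ids
dups_sorted = sorted(bibrecs)    # [3, 3, 3, 5, 5, 7, 9]
# keep each element that equals its successor in the sorted list
dups = [x for i, x in enumerate(dups_sorted[0:len(dups_sorted) - 1])
        if x == dups_sorted[i + 1]]
```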
def check_duplicated_signatures(printer, repair=False):
all_ok = True
brr = run_sql("select bibref_table, bibref_value, bibrec from aidPERSONIDPAPERS where flag > %s", ("-2",))
bibrecs_to_reassign = []
d = {}
for x, y, z in brr:
d.setdefault(z, []).append((x, y))
for bibrec, bibrefs in d.iteritems():
if not len(bibrefs) == len(set(bibrefs)):
all_ok = False
dups = sorted(bibrefs)
dups = [x for i, x in enumerate(dups[0:len(dups) - 1]) if x == dups[i + 1]]
printer("Paper %d has duplicated signatures: %s" % (bibrec, dups))
if repair:
for dup in set(dups):
printer("Repairing duplicate %s" % str(dup))
claimed = run_sql("select personid,bibref_table,bibref_value,bibrec from "
"aidPERSONIDPAPERS where bibref_table=%s and bibref_value=%s "
"and bibrec=%s and flag=2", (dup[0], dup[1], bibrec))
if len(claimed) != 1:
bibrecs_to_reassign.append(bibrec)
run_sql("delete from aidPERSONIDPAPERS where bibref_table=%s and "
"bibref_value = %s and bibrec = %s", (dup[0], dup[1], bibrec))
else:
run_sql("delete from aidPERSONIDPAPERS where bibref_table=%s and bibref_value=%s "
"and bibrec=%s and flag<2", (dup[0], dup[1], bibrec))
if repair and bibrecs_to_reassign:
printer("Reassigning deleted duplicates %s" % str(bibrecs_to_reassign))
from invenio.bibauthorid_rabbit import rabbit
rabbit(bibrecs_to_reassign)
return all_ok
def get_wrong_names():
'''
Returns a generator with all wrong names in aidPERSONIDPAPERS.
Every element is (table, ref, correct_name).
'''
bib100 = dict(((x[0], create_normalized_name(split_name_parts(x[1]))) for x in get_bib10x()))
bib700 = dict(((x[0], create_normalized_name(split_name_parts(x[1]))) for x in get_bib70x()))
pidnames100 = set(run_sql("select bibref_value, name from aidPERSONIDPAPERS "
" where bibref_table='100'"))
pidnames700 = set(run_sql("select bibref_value, name from aidPERSONIDPAPERS "
" where bibref_table='700'"))
wrong100 = set(('100', x[0], bib100.get(x[0], None)) for x in pidnames100 if x[1] != bib100.get(x[0], None))
wrong700 = set(('700', x[0], bib700.get(x[0], None)) for x in pidnames700 if x[1] != bib700.get(x[0], None))
total = len(wrong100) + len(wrong700)
return chain(wrong100, wrong700), total
def check_wrong_names(printer, repair=False):
ret = True
wrong_names, number = get_wrong_names()
if number > 0:
ret = False
printer("%d corrupted names in aidPERSONIDPAPERS." % number)
for wrong_name in wrong_names:
if wrong_name[2]:
printer("Outdated name, '%s'(%s:%d)." % (wrong_name[2], wrong_name[0], wrong_name[1]))
else:
printer("Invalid id(%s:%d)." % (wrong_name[0], wrong_name[1]))
if repair:
printer("Fixing wrong name: %s" % str(wrong_name))
if wrong_name[2]:
run_sql("update aidPERSONIDPAPERS set name=%s where bibref_table=%s and bibref_value=%s",
(wrong_name[2], wrong_name[0], wrong_name[1]))
else:
run_sql("delete from aidPERSONIDPAPERS where bibref_table=%s and bibref_value=%s",
(wrong_name[0], wrong_name[1]))
return ret
def check_canonical_names(printer, repair=False):
ret = True
pid_cn = run_sql("select personid, data from aidPERSONIDDATA where tag = %s", ('canonical_name',))
pid_2_cn = dict((k, len(list(d))) for k, d in groupby(sorted(pid_cn, key=itemgetter(0)), key=itemgetter(0)))
pid_to_repair = []
for pid in get_existing_personids():
canon = pid_2_cn.get(pid, 0)
if canon != 1:
if canon == 0:
papers = run_sql("select count(*) from aidPERSONIDPAPERS where personid = %s", (pid,))[0][0]
if papers != 0:
printer("Personid %d does not have a canonical name, but has %d papers." % (pid, papers))
ret = False
pid_to_repair.append(pid)
else:
printer("Personid %d has %d canonical names." % (pid, canon))
pid_to_repair.append(pid)
ret = False
if repair and not ret:
printer("Repairing canonical names for pids: %s" % str(pid_to_repair))
update_personID_canonical_names(pid_to_repair, overwrite=True)
return ret
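# The pid_2_cn mapping above counts canonical names per person with
# itertools.groupby; since groupby only groups consecutive equal keys,
# the rows must be sorted on the key first. A database-free sketch with
# invented rows:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (personid, canonical_name) rows: pid 2 has two names
pid_cn = [(2, 'J.Ellis.1'), (1, 'A.Smith.1'), (2, 'J.Ellis.2')]
pid_2_cn = dict((k, len(list(d)))
                for k, d in groupby(sorted(pid_cn, key=itemgetter(0)),
                                    key=itemgetter(0)))
```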
def check_empty_personids(printer, repair=False):
ret = True
paper_pids = set(p[0] for p in run_sql("select personid from aidPERSONIDPAPERS"))
data_pids = set(p[0] for p in run_sql("select personid from aidPERSONIDDATA"))
for p in data_pids - paper_pids:
fields = run_sql("select count(*) from aidPERSONIDDATA where personid = %s and tag <> %s", (p, "canonical_name",))[0][0]
if fields == 0:
printer("Personid %d has no papers and no data other than a canonical_name." % p)
if repair:
printer("Deleting empty person %s" % str(p))
run_sql("delete from aidPERSONIDDATA where personid=%s", (p,))
ret = False
return ret
def check_wrong_rejection(printer, repair=False):
all_ok = True
to_reassign = []
to_deal_with = []
all_rejections = set(run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where flag = %s",
('-2',)))
all_confirmed = set(run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where flag > %s",
('-2',)))
not_assigned = all_rejections - all_confirmed
if not_assigned:
all_ok = False
for s in not_assigned:
printer('Paper (%s:%s,%s) was rejected but never reassigned' % s)
to_reassign.append(s)
all_rejections = set(run_sql("select personid, bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where flag = %s",
('-2',)))
all_confirmed = set(run_sql("select personid, bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where flag > %s",
('-2',)))
both_confirmed_and_rejected = all_rejections & all_confirmed
if both_confirmed_and_rejected:
all_ok = False
for i in both_confirmed_and_rejected:
printer("Conflicting assignment/rejection: %s" % str(i))
to_deal_with.append(i)
if repair and (to_reassign or to_deal_with):
from invenio.bibauthorid_rabbit import rabbit
if to_reassign:
#Rabbit is not designed to reassign signatures which are rejected but not assigned:
#all signatures should stay assigned. If a rejection occurs, the signature should get
#moved to a new place and the rejection entry added, but it should never exist as a rejection only.
#Hence, to force rabbit to reassign it, we have to delete the rejection.
printer("Reassigning bibrecs with missing entries: %s" % str(to_reassign))
for sig in to_reassign:
run_sql("delete from aidPERSONIDPAPERS where bibref_table=%s and "
"bibref_value=%s and bibrec = %s and flag=-2", (sig))
rabbit([s[2] for s in to_reassign])
if to_deal_with:
#We got claims and rejections on the same person for the same paper. Let's forget about
#it and reassign it automatically, they'll make up their minds sooner or later.
printer("Deleting and reassigning bibrefrecs with conflicts %s" % str(to_deal_with))
for sig in to_deal_with:
run_sql("delete from aidPERSONIDPAPERS where personid=%s and bibref_table=%s and "
"bibref_value=%s and bibrec = %s", (sig))
rabbit(map(itemgetter(3), to_deal_with))
return all_ok
def check_merger():
'''
This function presumes that copy_personid was
called before the merger.
'''
is_ok = True
old_claims = set(run_sql("select personid, bibref_table, bibref_value, bibrec, flag "
"from aidPERSONIDPAPERS_copy "
"where flag = -2 or flag = 2"))
cur_claims = set(run_sql("select personid, bibref_table, bibref_value, bibrec, flag "
"from aidPERSONIDPAPERS "
"where flag = -2 or flag = 2"))
errors = ((old_claims - cur_claims, "Some claims were lost during the merge."),
(cur_claims - old_claims, "Some new claims appeared after the merge."))
act = { -2 : 'Rejection', 2 : 'Claim' }
for err_set, err_msg in errors:
if err_set:
is_ok = False
bibauthor_print(err_msg)
bibauthor_print("".join(" %s: personid %d %d:%d,%d\n" %
(act[cl[4]], cl[0], int(cl[1]), cl[2], cl[3]) for cl in err_set))
old_assigned = set(run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS_copy"))
#"where flag <> -2 and flag <> 2"))
cur_assigned = set(run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS"))
#"where flag <> -2 and flag <> 2"))
errors = ((old_assigned - cur_assigned, "Some signatures were lost during the merge."),
(cur_assigned - old_assigned, "Some new signatures appeared after the merge."))
for err_sig, err_msg in errors:
if err_sig:
is_ok = False
bibauthor_print(err_msg)
bibauthor_print("".join(" %s:%d,%d\n" % sig for sig in err_sig))
return is_ok
def check_results():
is_ok = True
all_result_rows = run_sql("select personid,bibref_table,bibref_value,bibrec from aidRESULTS")
keyfunc = lambda x: x[1:]
duplicated = (d for d in (list(d) for k, d in groupby(sorted(all_result_rows, key=keyfunc), key=keyfunc)) if len(d) > 1)
for dd in duplicated:
is_ok = False
for d in dd:
print "Duplicated row in aidRESULTS"
print "%s %s %s %s" % d
print
clusters = {}
for rr in all_result_rows:
clusters[rr[0]] = clusters.get(rr[0], []) + [rr[3]]
faulty_clusters = dict((cid, len(recs) - len(set(recs)))
for cid, recs in clusters.items()
if not len(recs) == len(set(recs)))
if faulty_clusters:
is_ok = False
print "Recids NOT unique in clusters!"
print ("A total of %s clusters hold an average of %.2f duplicates" %
(len(faulty_clusters), (sum(faulty_clusters.values()) / float(len(faulty_clusters)))))
for c in faulty_clusters:
print "Name: %-20s Size: %4d Faulty: %2d" % (c, len(clusters[c]), faulty_clusters[c])
return is_ok
def check_claim_inspireid_contradiction():
iids10x = run_sql("select id from bib10x where tag = '100__i'")
iids70x = run_sql("select id from bib70x where tag = '700__i'")
refs10x = set(x[0] for x in run_sql("select id from bib10x where tag = '100__a'"))
refs70x = set(x[0] for x in run_sql("select id from bib70x where tag = '700__a'"))
if iids10x:
iids10x = list_2_SQL_str(iids10x, lambda x: str(x[0]))
iids10x = run_sql("select id_bibxxx, id_bibrec, field_number "
"from bibrec_bib10x "
"where id_bibxxx in %s"
% iids10x)
iids10x = ((row[0], [(ref, rec) for ref, rec in run_sql(
"select id_bibxxx, id_bibrec "
"from bibrec_bib10x "
"where id_bibrec = '%s' "
"and field_number = '%s'"
% row[1:])
if ref in refs10x])
for row in iids10x)
else:
iids10x = ()
if iids70x:
iids70x = list_2_SQL_str(iids70x, lambda x: str(x[0]))
iids70x = run_sql("select id_bibxxx, id_bibrec, field_number "
"from bibrec_bib70x "
"where id_bibxxx in %s"
% iids70x)
iids70x = ((row[0], [(ref, rec) for ref, rec in run_sql(
"select id_bibxxx, id_bibrec "
"from bibrec_bib70x "
"where id_bibrec = '%s' "
"and field_number = '%s'"
% (row[1:]))
if ref in refs70x])
for row in iids70x)
else:
iids70x = ()
# [(iids, [bibs])]
inspired = list(chain(((iid, list(set(('100',) + bib for bib in bibs))) for iid, bibs in iids10x),
((iid, list(set(('700',) + bib for bib in bibs))) for iid, bibs in iids70x)))
assert all(len(x[1]) == 1 for x in inspired)
inspired = ((k, map(itemgetter(0), map(itemgetter(1), d)))
for k, d in groupby(sorted(inspired, key=itemgetter(0)), key=itemgetter(0)))
# [(inspireid, [bibs])]
inspired = [([(run_sql("select personid "
"from aidPERSONIDPAPERS "
"where bibref_table = %s "
"and bibref_value = %s "
"and bibrec = %s "
"and flag = '2'"
, bib), bib)
for bib in cluster[1]], cluster[0])
for cluster in inspired]
# [([([pid], bibs)], inspireid)]
for cluster, iid in inspired:
pids = set(chain.from_iterable(imap(itemgetter(0), cluster)))
if len(pids) > 1:
print "InspireID: %s links the following papers:" % iid
print map(itemgetter(1), cluster)
print "More than one personid claimed them:"
print list(pids)
print
continue
if len(pids) == 0:
# not even one paper with this inspireid has been
# claimed, screw it
continue
pid = list(pids)[0][0]
# The last step is to check all non-claimed papers for being
# claimed by the person on some different signature.
problem = (run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where bibrec = %s "
"and personid = %s "
"and flag = %s"
, (bib[2], pid, 2))
for bib in (bib for lpid, bib in cluster if not lpid))
problem = list(chain.from_iterable(problem))
if problem:
print "A personid has claimed a paper from an inspireid cluster and a contradictory paper."
print "Personid %d" % pid
print "Inspireid cluster %s" % str(map(itemgetter(1), cluster))
print "Contradicting claims: %s" % str(problem)
print
def get_all_bibrecs():
'''
Get all record ids present in aidPERSONIDPAPERS
'''
return set([x[0] for x in run_sql("select bibrec from aidPERSONIDPAPERS")])
def get_bibrefrec_to_pid_flag_mapping():
'''
create a map between signatures and personid/flag
'''
whole_table = run_sql("select bibref_table,bibref_value,bibrec,personid,flag from aidPERSONIDPAPERS")
gc.disable()
ret = {}
for x in whole_table:
sig = (x[0], x[1], x[2])
pid_flag = (x[3], x[4])
ret[sig] = ret.get(sig , []) + [pid_flag]
gc.collect()
gc.enable()
return ret
def remove_all_bibrecs(bibrecs):
'''
Remove given record ids from the aidPERSONIDPAPERS table
@param bibrecs: record ids to remove
@type bibrecs: iterable of int
'''
bibrecs_s = list_2_SQL_str(bibrecs)
run_sql("delete from aidPERSONIDPAPERS where bibrec in %s" % bibrecs_s)
def empty_results_table():
'''
Get rid of all tortoise results
'''
run_sql("TRUNCATE aidRESULTS")
def save_cluster(named_cluster):
'''
Save a cluster in aidRESULTS
@param named_cluster:
@type named_cluster:
'''
name, cluster = named_cluster
for bib in cluster.bibs:
run_sql("INSERT INTO aidRESULTS "
"(personid, bibref_table, bibref_value, bibrec) "
"VALUES (%s, %s, %s, %s) "
, (name, str(bib[0]), bib[1], bib[2]))
def remove_result_cluster(name):
'''
Remove result cluster using name string
@param name:
@type name:
'''
run_sql("DELETE FROM aidRESULTS "
"WHERE personid like '%s.%%'"
% name)
def personid_name_from_signature(sig):
'''
Find personid and name string of a signature
@param sig:
@type sig:
'''
ret = run_sql("select personid, name "
"from aidPERSONIDPAPERS "
"where bibref_table = %s and bibref_value = %s and bibrec = %s "
"and flag > '-2'"
, sig)
assert len(ret) < 2, ret
return ret
def personid_from_signature(sig):
'''
Find personid owner of a signature
@param sig:
@type sig:
'''
ret = run_sql("select personid, flag "
"from aidPERSONIDPAPERS "
"where bibref_table = %s and bibref_value = %s and bibrec = %s "
"and flag > '-2'"
, sig)
assert len(ret) < 2, ret
return ret
def get_signature_info(sig):
'''
Get personid and flag relative to a signature
@param sig:
@type sig:
'''
ret = run_sql("select personid, flag "
"from aidPERSONIDPAPERS "
"where bibref_table = %s and bibref_value = %s and bibrec = %s "
"order by flag"
, sig)
return ret
def get_claimed_papers(pid):
'''
Find all papers which have been manually claimed
@param pid:
@type pid:
'''
return run_sql("select bibref_table, bibref_value, bibrec "
"from aidPERSONIDPAPERS "
"where personid = %s "
"and flag > %s",
(pid, 1))
def copy_personids():
'''
Make a copy of aidPERSONID tables to aidPERSONID*_copy tables for later comparison/restore
'''
run_sql("DROP TABLE IF EXISTS `aidPERSONIDDATA_copy`")
run_sql("CREATE TABLE `aidPERSONIDDATA_copy` ( "
"`personid` BIGINT( 8 ) UNSIGNED NOT NULL , "
"`tag` VARCHAR( 64 ) NOT NULL , "
"`data` VARCHAR( 256 ) NOT NULL , "
"`opt1` MEDIUMINT( 8 ) DEFAULT NULL , "
"`opt2` MEDIUMINT( 8 ) DEFAULT NULL , "
"`opt3` VARCHAR( 256 ) DEFAULT NULL , "
"KEY `personid-b` ( `personid` ) , "
"KEY `tag-b` ( `tag` ) , "
"KEY `data-b` ( `data` ) , "
"KEY `opt1` ( `opt1` ) "
") ENGINE = MYISAM DEFAULT CHARSET = utf8")
run_sql("INSERT INTO `aidPERSONIDDATA_copy` "
"SELECT * "
"FROM `aidPERSONIDDATA`")
run_sql("DROP TABLE IF EXISTS `aidPERSONIDPAPERS_copy`")
run_sql("CREATE TABLE `aidPERSONIDPAPERS_copy` ( "
"`personid` bigint( 8 ) unsigned NOT NULL , "
"`bibref_table` enum( '100', '700' ) NOT NULL , "
"`bibref_value` mediumint( 8 ) unsigned NOT NULL , "
"`bibrec` mediumint( 8 ) unsigned NOT NULL , "
"`name` varchar( 256 ) NOT NULL , "
"`flag` smallint( 2 ) NOT NULL DEFAULT '0', "
"`lcul` smallint( 2 ) NOT NULL DEFAULT '0', "
"`last_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP , "
"KEY `personid-b` ( `personid` ) , "
"KEY `reftable-b` ( `bibref_table` ) , "
"KEY `refvalue-b` ( `bibref_value` ) , "
"KEY `rec-b` ( `bibrec` ) , "
"KEY `name-b` ( `name` ) , "
"KEY `timestamp-b` ( `last_updated` ) , "
"KEY `ptvrf-b` ( `personid` , `bibref_table` , `bibref_value` , `bibrec` , `flag` ) "
") ENGINE = MyISAM DEFAULT CHARSET = utf8")
run_sql("INSERT INTO `aidPERSONIDPAPERS_copy` "
"SELECT * "
"FROM `aidPERSONIDPAPERS`")
def delete_empty_persons():
'''
Find and delete empty persons (holding no papers and no data other than a canonical_name)
'''
pp = run_sql("select personid from aidPERSONIDPAPERS")
pp = set(p[0] for p in pp)
pd = run_sql("select personid from aidPERSONIDDATA")
pd = set(p[0] for p in pd)
fpd = run_sql("select personid from aidPERSONIDDATA where tag <> 'canonical_name'")
fpd = set(p[0] for p in fpd)
to_delete = pd - (pp | fpd)
if to_delete:
run_sql("delete from aidPERSONIDDATA where personid in %s" % list_2_SQL_str(to_delete))
def restore_personids():
'''
Restore personid tables from last copy saved with copy_personids
'''
run_sql("TRUNCATE `aidPERSONIDDATA`")
run_sql("INSERT INTO `aidPERSONIDDATA` "
"SELECT * "
"FROM `aidPERSONIDDATA_copy`")
run_sql("TRUNCATE `aidPERSONIDPAPERS`")
run_sql("INSERT INTO `aidPERSONIDPAPERS` "
"SELECT * "
"FROM `aidPERSONIDPAPERS_copy`")
def resolve_affiliation(ambiguous_aff_string):
"""
This is a method available in the context of author disambiguation in ADS
only. No other platform provides the db table used by this function.
@warning: to be used in an ADS context only.
@param ambiguous_aff_string: Ambiguous affiliation string
@type ambiguous_aff_string: str
@return: The normalized version of the name string as presented in the database
@rtype: str
"""
if not ambiguous_aff_string or not bconfig.CFG_ADS_SITE:
return "None"
aff_id = run_sql("select aff_id from ads_affiliations where affstring=%s", (ambiguous_aff_string,))
if aff_id:
return aff_id[0][0]
else:
return "None"
def get_free_pids():
'''
Returns an iterator over all free personids.
It fills holes: ids freed earlier are yielded before new, higher ids.
'''
all_pids = frozenset(x[0] for x in chain(
run_sql("select personid from aidPERSONIDPAPERS") ,
run_sql("select personid from aidPERSONIDDATA")))
return ifilter(lambda x: x not in all_pids, count())
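# The hole-filling behaviour comes from filtering an unbounded
# itertools.count() against the taken ids: freed gaps are yielded before
# any id past the current maximum. A sketch with made-up ids (ifilter is
# Python 2; filter behaves the same on Python 3):

```python
from itertools import count, islice
try:
    from itertools import ifilter  # Python 2, as in the code above
except ImportError:
    ifilter = filter               # Python 3 equivalent

taken = frozenset([0, 1, 3])       # hypothetical occupied personids
free = ifilter(lambda x: x not in taken, count())
first_three = list(islice(free, 3))  # 2 fills the hole, then 4, 5
```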
def remove_results_outside(many_names):
'''
Delete rows from aidRESULTS whose last-name cluster is not in many_names
@param many_names: cluster name strings to keep
@type many_names: iterable of str
'''
many_names = frozenset(many_names)
res_names = frozenset(x[0].split(".")[0] for x in run_sql("select personid from aidRESULTS"))
for name in res_names - many_names:
run_sql("delete from aidRESULTS where personid like '%s.%%'" % name)
def get_signatures_from_bibrefs(bibrefs):
'''
Expand (bibref_table, bibref_value) pairs into full signatures
(table, ref, rec), keeping only signatures on valid records.
@param bibrefs: (bibref_table, bibref_value) pairs
@type bibrefs: iterable of tuples
'''
bib10x = ifilter(lambda x: x[0] == 100, bibrefs)
bib10x_s = list_2_SQL_str(bib10x, lambda x: x[1])
bib70x = ifilter(lambda x: x[0] == 700, bibrefs)
bib70x_s = list_2_SQL_str(bib70x, lambda x: x[1])
valid_recs = set(get_all_valid_bibrecs())
if bib10x_s != '()':
sig10x = run_sql("select 100, id_bibxxx, id_bibrec "
"from bibrec_bib10x "
"where id_bibxxx in %s"
% (bib10x_s,))
else:
sig10x = ()
if bib70x_s != '()':
sig70x = run_sql("select 700, id_bibxxx, id_bibrec "
"from bibrec_bib70x "
"where id_bibxxx in %s"
% (bib70x_s,))
else:
sig70x = ()
return ifilter(lambda x: x[2] in valid_recs, chain(set(sig10x), set(sig70x)))
def get_all_valid_bibrecs():
'''
Returns a list of valid record ids
'''
collection_restriction_pattern = " or ".join(["980__a:\"%s\"" % x for x in bconfig.LIMIT_TO_COLLECTIONS])
return perform_request_search(p="%s" % collection_restriction_pattern, rg=0)
def get_coauthor_pids(pid, exclude_bibrecs=None):
'''
Find personids sharing bibrecs with the given pid, optionally excluding a given set of bibrecs.
@param pid:
@type pid:
@param exclude_bibrecs:
@type exclude_bibrecs:
'''
papers = get_person_bibrecs(pid)
if exclude_bibrecs:
papers = set(papers) - set(exclude_bibrecs)
if not papers:
return []
papers_s = list_2_SQL_str(papers)
pids = run_sql("select personid,bibrec from aidPERSONIDPAPERS "
"where bibrec in %s and flag > -2" % papers_s)
pids = set((int(p[0]), int(p[1])) for p in pids)
pids = sorted([p[0] for p in pids])
pids = groupby(pids)
pids = [(key, len(list(val))) for key, val in pids if key != pid]
pids = sorted(pids, key=lambda x: x[1], reverse=True)
return pids
def get_doi_from_rec(recid):
"""
Returns the DOI of the paper as a str, if found.
Otherwise returns None.
0247 $2 DOI $a id
"""
idx = run_sql("SELECT id_bibxxx, field_number FROM bibrec_bib02x WHERE id_bibrec = %s", (recid,))
if idx:
doi_id_s = list_2_SQL_str(idx, lambda x: x[0])
doi = run_sql("SELECT id, tag, value FROM bib02x WHERE id in %s " % doi_id_s)
if doi:
grouped = groupby(idx, lambda x: x[1])
doi_dict = dict((x[0],x[1:]) for x in doi)
for group in grouped:
elms = [x[0] for x in list(group[1])]
found = False
code = None
for el in elms:
if doi_dict[el][0] == '0247_2' and doi_dict[el][1] == 'DOI':
found = True
elif doi_dict[el][0] == '0247_a':
code = doi_dict[el][1]
if found and code:
return code
return None
def export_person(person_id):
'''
Exports all available information on a person (from aidPERSONIDPAPERS and
aidPERSONIDDATA) as a structure of dictionaries holding tuples of strings,
with repeatable fields as tuples of dictionaries, e.g.:
{'name':('namestring',),
'repeatable_field':({'field1':('val1',)},{'field1':'val2'})}
'''
person_info = defaultdict(defaultdict)
full_names = get_person_db_names_set(person_id)
if full_names:
splitted_names = [split_name_parts(n[0]) for n in full_names]
splitted_names = [x+[len(x[2])] for x in splitted_names]
max_first_names = max([x[4] for x in splitted_names])
full_name_candidates = filter(lambda x: x[4] == max_first_names, splitted_names)
full_name = create_normalized_name(full_name_candidates[0])
person_info['names']['full_name'] = (full_name,)
person_info['names']['surname'] = (full_name_candidates[0][0],)
if full_name_candidates[0][2]:
person_info['names']['first_names'] = (' '.join(full_name_candidates[0][2]),)
person_info['names']['name_variants'] = ('; '.join([create_normalized_name(x) for x in splitted_names]),)
bibrecs = get_person_bibrecs(person_id)
recids_data = []
for recid in bibrecs:
recid_dict = defaultdict(defaultdict)
recid_dict['INSPIRE-record-id'] = (str(recid),)
recid_dict['INSPIRE-record-url'] = ('%s/record/%s' % (CFG_SITE_URL, str(recid)),)
rec_doi = get_doi_from_rec(recid)
if rec_doi:
recid_dict['DOI']= (str(rec_doi),)
recids_data.append(recid_dict)
person_info['records']['record'] = tuple(recids_data)
person_info['identifiers']['INSPIRE_person_ID'] = (str(person_id),)
canonical_names = get_canonical_names_by_pid(person_id)
if canonical_names:
person_info['identifiers']['INSPIRE_canonical_name'] = (str(canonical_names[0][0]),)
person_info['profile_page']['INSPIRE_profile_page'] = ('%s/author/%s' % (CFG_SITE_URL,canonical_names[0][0]),)
else:
person_info['profile_page']['INSPIRE_profile_page'] = ('%s/author/%s' % (CFG_SITE_URL,str(person_id)),)
orcids = get_orcids_by_pids(person_id)
if orcids:
person_info['identifiers']['ORCID'] = tuple(str(x[0]) for x in orcids)
inspire_ids = get_inspire_ids_by_pids(person_id)
if inspire_ids:
person_info['identifiers']['INSPIREID'] = tuple(str(x[0]) for x in inspire_ids)
return person_info
def export_person_to_foaf(person_id):
'''
Exports to FOAF XML a dictionary of dictionaries/tuples of strings, as returned by export_person
'''
infodict = export_person(person_id)
def export(val, indent=0):
if isinstance(val, dict):
contents = list()
for k,v in val.iteritems():
if isinstance(v,tuple):
contents.append( ''.join( [ X[str(k)](indent=indent, body=export(c)) for c in v] ))
else:
contents.append( X[str(k)](indent=indent,body=export(v, indent=indent+1)) )
return ''.join(contents)
elif isinstance(val, str):
return str(X.escaper(val))
else:
raise Exception('Unexpected value type in export: %s' % str(val))
return X['person'](body=export(infodict, indent=1))
diff --git a/invenio/legacy/bibauthorid/frontinterface.py b/invenio/legacy/bibauthorid/frontinterface.py
index dba0b8ed2..bf41bac93 100644
--- a/invenio/legacy/bibauthorid/frontinterface.py
+++ b/invenio/legacy/bibauthorid/frontinterface.py
@@ -1,252 +1,252 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
bibauthorid_frontinterface
This file aims to filter and modify the interface given by
bibauthorid_dbinterface in order to make it usable by the
frontend, so as to keep it as clean as possible.
'''
-from invenio.bibauthorid_name_utils import split_name_parts #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_name_utils import soft_compare_names
-from invenio.bibauthorid_name_utils import create_normalized_name #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.name_utils import split_name_parts #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.name_utils import soft_compare_names
+from invenio.legacy.bibauthorid.name_utils import create_normalized_name #emitting #pylint: disable-msg=W0611
from invenio import bibauthorid_dbinterface as dbinter
from cgi import escape
#Well this is bad, BUT otherwise there would be 100+ lines
#of the form from dbinterface import ... # emitting
-from invenio.bibauthorid_dbinterface import * #pylint: disable-msg=W0614
+from invenio.legacy.bibauthorid.dbinterface import * #pylint: disable-msg=W0614
def set_person_data(person_id, tag, value, user_level=None):
'''
@param person_id:
@type person_id: int
@param tag:
@type tag: string
@param value:
@type value: string
@param user_level:
@type user_level: int
'''
old = dbinter.get_personid_row(person_id, tag)
old_data = [tup[0] for tup in old]
if value not in old_data:
dbinter.set_personid_row(person_id, tag, value, opt2=user_level)
def get_person_data(person_id, tag):
res = dbinter.get_personid_row(person_id, tag)
if res:
return (res[1], res[0])
else:
return []
def del_person_data(tag, person_id=None, value=None):
dbinter.del_personid_row(tag, person_id, value)
def get_bibrefrec_name_string(bibref):
'''
Returns the name string associated with a bibrefrec
@param bibref: bibrefrec '100:123,123'
@return: string
'''
name = ""
ref = ""
if not ((bibref and isinstance(bibref, str) and bibref.count(":"))):
return name
if bibref.count(","):
try:
ref = bibref.split(",")[0]
except (ValueError, TypeError, IndexError):
return name
else:
ref = bibref
table, ref = ref.split(":")
dbname = get_name_by_bibrecref((int(table), int(ref)))
if dbname:
name = dbname
return name
def add_person_paper_needs_manual_review(pid, bibrec):
'''
Adds to a person a paper which needs manual review before bibref assignment
@param pid: personid, int
@param bibrec: the bibrec, int
'''
set_person_data(pid, 'paper_needs_bibref_manual_confirm', bibrec)
def get_person_papers_to_be_manually_reviewed(pid):
'''
Returns the set of papers awaiting manual review for a person before bibref assignment
@param pid: the personid, int
'''
return get_person_data(pid, 'paper_needs_bibref_manual_confirm')
def del_person_papers_needs_manual_review(pid, bibrec):
'''
Deletes a paper from the set of papers awaiting manual review for a person
@param pid: personid, int
@param bibrec: the bibrec, int
'''
del_person_data(person_id=pid, tag='paper_needs_bibref_manual_confirm', value=str(bibrec))
def set_processed_external_recids(pid, recid_list_str):
'''
Set processed external recids
@param pid: pid
@param recid_list_str: str
'''
del_person_data(person_id=pid, tag='processed_external_recids')
set_person_data(pid, "processed_external_recids", recid_list_str)
def assign_person_to_uid(uid, pid):
'''
Assigns a person to a userid. If the person is already assigned to another user, a new person is created.
Returns the person id assigned.
@param uid: user id, int
@param pid: person id, int, if -1 creates new person.
@return: pid int
'''
if pid == -1:
pid = dbinter.create_new_person_from_uid(uid)
return pid
else:
current_uid = get_person_data(pid, 'uid')
if len(current_uid) == 0:
set_person_data(pid, 'uid', str(uid))
return pid
else:
pid = dbinter.create_new_person_from_uid(uid)
return pid
def get_processed_external_recids(pid):
'''
Returns processed external recids
@param pid: pid
@return: [str]
'''
db_data = get_person_data(pid, "processed_external_recids")
recid_list_str = ''
if db_data and db_data[0] and db_data[0][1]:
recid_list_str = db_data[0][1]
return recid_list_str
def get_all_personids_recs(pid, claimed_only=False):
return dbinter.get_all_paper_records(pid, claimed_only)
def find_personIDs_by_name_string(target):
'''
Search engine to find persons matching the given string.
The matching is done on the surname first, and on the names if present.
An ordered list (by compatibility) of pids and found names is returned.
@param target: name string, 'surname, names I.'
@type target: string
@return: list of lists
[pid, [[name string, occur count, compatibility]]]
'''
splitted_name = split_name_parts(target)
family = splitted_name[0]
levels = (#target + '%', #this introduces a weird problem: different results for mele, salvatore and salvatore mele
family + ',%',
family[:-2] + '%',
'%' + family + ',%',
'%' + family[1:-1] + '%')
if len(family) <= 4:
levels = [levels[0], levels[2]]
for lev in levels:
names = dbinter.get_all_personids_by_name(lev)
if names:
break
is_canonical = False
if not names:
names = dbinter.get_personids_by_canonical_name(target)
is_canonical = True
names = groupby(sorted(names))
names = [(key[0], key[1], len(list(data)), soft_compare_names(target, key[1])) for key, data in names]
names = groupby(names, itemgetter(0))
names = [(key, sorted([(d[1], d[2], d[3]) for d in data if (d[3] > 0.5 or is_canonical)],
key=itemgetter(2), reverse=True)) for key, data in names]
names = [name for name in names if name[1]]
names = sorted(names, key=lambda x: (x[1][0][2], x[1][0][0], x[1][0][1]), reverse=True)
return names
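The cascade of increasingly fuzzy surname patterns built above can be sketched in isolation. This is a simplified restatement, not the module's API: `surname_search_levels` is a hypothetical name, and the real function derives the surname via split_name_parts rather than a raw comma split.

```python
def surname_search_levels(target):
    """Build the escalating SQL LIKE patterns used to widen a surname search.

    Simplified sketch: the surname is taken as everything before the first
    comma, mirroring the `levels` tuple in find_personIDs_by_name_string.
    """
    family = target.split(',')[0].strip()
    levels = (
        family + ',%',              # exact surname, any first names
        family[:-2] + '%',          # tolerate a truncated ending
        '%' + family + ',%',        # surname embedded in a longer string
        '%' + family[1:-1] + '%',   # tolerate a changed first/last letter
    )
    # Short surnames keep only the two safest patterns to limit noise.
    if len(family) <= 4:
        levels = (levels[0], levels[2])
    return levels
```

The patterns are tried in order, stopping at the first level that yields any candidate names, so an exact surname match always wins over a fuzzy one.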
def find_top5_personid_for_new_arXiv_user(bibrecs, name):
top5_list = []
pidlist = get_personids_and_papers_from_bibrecs(bibrecs, limit_by_name=name)
for p in pidlist:
if not get_uid_from_personid(p[0]):
top5_list.append(p[0])
if len(top5_list) > 4:
break
escaped_name = ""
if name:
escaped_name = escape(name, quote=True)
else:
return top5_list
pidlist = find_personIDs_by_name_string(escaped_name)
for p in pidlist:
if not get_uid_from_personid(p[0]) and not p[0] in top5_list:
top5_list.append(p[0])
if len(top5_list) > 4:
break
return top5_list
def check_personids_availability(picked_profile, uid):
if picked_profile == -1:
return create_new_person(uid, uid_is_owner=True)
else:
if not get_uid_from_personid(picked_profile):
dbinter.set_personid_row(picked_profile, 'uid', uid)
return picked_profile
else:
return create_new_person(uid, uid_is_owner=True)
diff --git a/invenio/legacy/bibauthorid/general_utils.py b/invenio/legacy/bibauthorid/general_utils.py
index 6fb428970..93d0c6d4a 100644
--- a/invenio/legacy/bibauthorid/general_utils.py
+++ b/invenio/legacy/bibauthorid/general_utils.py
@@ -1,168 +1,168 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
bibauthorid_general_utils
Bibauthorid utilities used by many parts of the framework
'''
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from datetime import datetime
import sys
PRINT_TS = bconfig.DEBUG_TIMESTAMPS
PRINT_TS_US = bconfig.DEBUG_TIMESTAMPS_UPDATE_STATUS and PRINT_TS
NEWLINE = bconfig.DEBUG_UPDATE_STATUS_THREAD_SAFE
FO = bconfig.DEBUG_LOG_TO_PIDFILE
TERMINATOR = '\r'
if NEWLINE or FO:
TERMINATOR = '\n'
import os
PID = os.getpid
pidfiles = dict()
def override_stdout_config(fileout=False, stdout=True):
global FO
assert fileout^stdout
if fileout:
FO = True
if stdout:
FO = False
def set_stdout():
if FO:
try:
sys.stdout = pidfiles[PID()]
except KeyError:
pidfiles[PID()] = open('/tmp/bibauthorid_log_pid_'+str(PID()),'w')
sys.stdout = pidfiles[PID()]
print 'REDIRECTING TO PIDFILE '
#python2.4 compatibility layer.
try:
any([True])
except:
def any(x):
for element in x:
if element:
return True
return False
bai_any = any
try:
all([True])
except:
def all(x):
for element in x:
if not element:
return False
return True
bai_all = all
#end of python2.4 compatibility. Please remove this horror as soon as all systems will have
#been ported to python2.6+
def __print_func(*args):
set_stdout()
if PRINT_TS:
print datetime.now(),
for arg in args:
print arg,
print ""
sys.stdout.flush()
def __dummy_print(*args):
pass
def __create_conditional_print(cond):
if cond:
return __print_func
else:
return __dummy_print
bibauthor_print = __create_conditional_print(bconfig.DEBUG_OUTPUT)
name_comparison_print = __create_conditional_print(bconfig.DEBUG_NAME_COMPARISON_OUTPUT)
metadata_comparison_print = __create_conditional_print(bconfig.DEBUG_METADATA_COMPARISON_OUTPUT)
wedge_print = __create_conditional_print(bconfig.DEBUG_WEDGE_OUTPUT)
if bconfig.DEBUG_OUTPUT:
status_len = 20
comment_len = 40
def padd(stry, l):
return stry[:l].ljust(l)
def update_status(percent, comment="", print_ts=False):
set_stdout()
filled = int(percent * (status_len - 2))
bar = "[%s%s] " % ("#" * filled, "-" * (status_len-2 - filled))
percent = ("%.2f%% done" % (percent * 100))
progress = padd(bar + percent, status_len)
comment = padd(comment, comment_len)
if print_ts or PRINT_TS_US:
print datetime.now(),
print 'pid:',PID(),
print progress, comment, TERMINATOR,
sys.stdout.flush()
def update_status_final(comment=""):
set_stdout()
update_status(1., comment, print_ts=PRINT_TS)
print ""
sys.stdout.flush()
else:
def update_status(percent, comment=""):
pass
def update_status_final(comment=""):
pass
def print_tortoise_memory_log(summary, fp):
stry = "PID:\t%s\tPEAK:\t%s,%s\tEST:\t%s\tBIBS:\t%s\n" % (summary['pid'], summary['peak1'], summary['peak2'], summary['est'], summary['bibs'])
fp.write(stry)
def parse_tortoise_memory_log(memfile_path):
f = open(memfile_path)
lines = f.readlines()
f.close()
def line_2_dict(line):
line = line.split('\t')
ret = { 'mem1' : int(line[3].split(",")[0]),
'mem2' : int(line[3].split(",")[1]),
'est' : float(line[5]),
'bibs' : int(line[7])
}
return ret
return map(line_2_dict, lines)
eps = 1e-6
def is_eq(v1, v2):
return v1 + eps > v2 and v2 + eps > v1
diff --git a/invenio/legacy/bibauthorid/least_squares.py b/invenio/legacy/bibauthorid/least_squares.py
index c5d302476..f48adfbed 100644
--- a/invenio/legacy/bibauthorid/least_squares.py
+++ b/invenio/legacy/bibauthorid/least_squares.py
@@ -1,93 +1,93 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import operator
from itertools import izip, starmap, repeat
#python2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_all as all
+from invenio.legacy.bibauthorid.general_utils import bai_all as all
def approximate(xs, ys, power):
assert len(xs) == len(ys)
matrix_size = power + 1
variables = 2 * power + 1
xs = map(float, xs)
ys = map(float, ys)
xs = reduce(lambda x, y: x + [list(starmap(operator.mul, izip(x[-1], y)))], repeat(xs, variables - 1), [[1] * len(xs)])
assert len(xs) == variables
s = map(sum, xs)
assert s[0] == len(ys)
b = [sum(starmap(operator.mul, izip(ys, x))) for x in xs[:matrix_size]]
a = [s[i:i + matrix_size] for i in xrange(matrix_size)]
# So, we have a*x = b and we are looking for x
matr = [ai + [bi] for ai, bi in izip(a, b)]
def unify_row(i, j):
matr[i] = [cell / matr[i][j] for cell in matr[i]]
assert matr[i][j] == 1
def subtract_row(i, j, row):
assert matr[i][j] == 1
matr[row] = [matr[row][k] - matr[i][k] * matr[row][j] for k in xrange(len(matr[i]))]
assert matr[row][j] == 0
# NOTE: Example for matrix_size = 3
# unify_row(0, 0)
# subtract_row(0, 0, 1)
# subtract_row(0, 0, 2)
# unify_row(1, 1)
# subtract_row(1, 1, 2)
# unify_row(2, 2)
# subtract_row(2, 2, 1)
# subtract_row(2, 2, 0)
# subtract_row(1, 1, 0)
for i in xrange(matrix_size):
unify_row(i, i)
for j in xrange(matrix_size - i - 1):
subtract_row(i, i, i + j + 1)
for i in xrange(matrix_size):
for j in xrange(matrix_size - i - 1):
subtract_row(matrix_size - i - 1, matrix_size - i - 1, j)
assert all(matr[i][:matrix_size] == ([0] * i) + [1] + ([0] * (matrix_size - 1 - i)) for i in xrange(matrix_size))
ret = map(operator.itemgetter(matrix_size), matr)
return ret
def to_function(poly):
power = len(poly) - 1
def func(x):
arr = [1.]
for _ in xrange(power):
arr.append(arr[-1] * x)
assert len(arr) == len(poly)
return sum(p * x for p, x in izip(poly, arr))
return func
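The normal-equations construction in approximate() can be restated compactly in Python 3 for experimentation. This is a sketch of the same math under a hypothetical name (`fit_polynomial`), not the module's API; it uses Gauss-Jordan elimination with partial pivoting instead of the fixed elimination order above.

```python
def fit_polynomial(xs, ys, power):
    """Least-squares polynomial fit via the normal equations.

    Build the Vandermonde power sums, then solve the
    (power+1) x (power+1) system with Gauss-Jordan elimination.
    Returns coefficients, lowest power first.
    """
    n = power + 1
    # s[k] = sum of x**k, needed for k = 0 .. 2*power
    s = [sum(x ** k for x in xs) for k in range(2 * power + 1)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(n)]
    # Augmented matrix of the normal equations a * coeffs = b
    m = [s[i:i + n] + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for row in range(n):
            if row != col:
                factor = m[row][col]
                m[row] = [rv - factor * cv for rv, cv in zip(m[row], m[col])]
    return [row[n] for row in m]
```

Fitting points that lie exactly on a polynomial recovers its coefficients, which makes this handy for checking the elimination logic.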
diff --git a/invenio/legacy/bibauthorid/merge.py b/invenio/legacy/bibauthorid/merge.py
index 26931e05c..d9034e799 100644
--- a/invenio/legacy/bibauthorid/merge.py
+++ b/invenio/legacy/bibauthorid/merge.py
@@ -1,394 +1,394 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from operator import itemgetter
from itertools import groupby, chain, imap, izip
-from invenio.bibauthorid_general_utils import update_status \
+from invenio.legacy.bibauthorid.general_utils import update_status \
, update_status_final
from invenio.bibauthorid_matrix_optimization import maximized_mapping
from invenio.bibauthorid_backinterface import update_personID_canonical_names
from invenio.bibauthorid_backinterface import get_existing_result_clusters
from invenio.bibauthorid_backinterface import get_lastname_results
from invenio.bibauthorid_backinterface import personid_name_from_signature
from invenio.bibauthorid_backinterface import personid_from_signature
from invenio.bibauthorid_backinterface import move_signature
from invenio.bibauthorid_backinterface import get_claimed_papers
from invenio.bibauthorid_backinterface import get_new_personid
from invenio.bibauthorid_backinterface import find_conflicts
from invenio.bibauthorid_backinterface import get_free_pids as backinterface_get_free_pids
from invenio.bibauthorid_backinterface import get_signature_info
from invenio.bibauthorid_backinterface import delete_empty_persons
from invenio.bibauthorid_backinterface import get_bibrefrec_to_pid_flag_mapping
def merge_static_classy():
'''
This function merges aidPERSONIDPAPERS with aidRESULTS.
Use it after tortoise.
This function is static: if aid* tables are changed while it's running,
probably everything will crash and a black hole will open, eating all your data.
NOTE: this is more elegant than merge_static but much slower. It will have to be improved
before it can replace it.
'''
class Sig(object):
def __init__(self, bibrefrec, pid_flag):
self.rejected = dict(filter(lambda p: p[1] <= -2, pid_flag))
self.assigned = filter(lambda p:-2 < p[1] and p[1] < 2, pid_flag)
self.claimed = filter(lambda p: 2 <= p[1], pid_flag)
self.bibrefrec = bibrefrec
assert self.invariant()
def invariant(self):
return len(self.assigned) + len(self.claimed) <= 1
def empty(self):
return not self.isclaimed() and not self.isassigned()
def isclaimed(self):
return len(self.claimed) == 1
def get_claimed(self):
return self.claimed[0][0]
def get_assigned(self):
return self.assigned[0][0]
def isassigned(self):
return len(self.assigned) == 1
def isrejected(self, pid):
return pid in self.rejected
def change_pid(self, pid):
assert self.invariant()
assert self.isassigned()
self.assigned = [(pid, 0)]
move_signature(self.bibrefrec, pid)
class Cluster(object):
def __init__(self, pid, sigs):
self.pid = pid
self.sigs = dict((sig.bibrefrec[2], sig) for sig in sigs if not sig.empty())
def send_sig(self, other, sig):
paper = sig.bibrefrec[2]
assert paper in self.sigs and paper not in other.sigs
del self.sigs[paper]
other.sigs[paper] = sig
if sig.isassigned():
sig.change_pid(other.pid)
last_names = frozenset(name[0].split('.')[0] for name in get_existing_result_clusters())
personid = get_bibrefrec_to_pid_flag_mapping()
free_pids = backinterface_get_free_pids()
for idx, last in enumerate(last_names):
update_status(float(idx) / len(last_names), "Merging, %d/%d current: %s" % (idx, len(last_names), last))
results = ((int(row[0].split(".")[1]), row[1:4]) for row in get_lastname_results(last))
# [(last name number, [bibrefrecs])]
results = [(k, map(itemgetter(1), d)) for k, d in groupby(sorted(results, key=itemgetter(0)), key=itemgetter(0))]
# List of dictionaries.
# [{new_pid -> N}]
matr = []
# Set of all old pids.
old_pids = set()
for k, ds in results:
pids = []
for d in ds:
pid_flag = filter(lambda x: x[1] > -2, personid.get(d, []))
if pid_flag:
assert len(pid_flag) == 1
pid = pid_flag[0][0]
pids.append(pid)
old_pids.add(pid)
matr.append(dict((k, len(list(d))) for k, d in groupby(sorted(pids))))
old_pids = list(old_pids)
best_match = maximized_mapping([[row.get(old, 0) for old in old_pids] for row in matr])
# [[bibrefrecs] -> pid]
matched_clusters = [(results[new_idx][1], old_pids[old_idx]) for new_idx, old_idx, _ in best_match]
not_matched_clusters = frozenset(xrange(len(results))) - frozenset(imap(itemgetter(0), best_match))
not_matched_clusters = izip((results[i][1] for i in not_matched_clusters), free_pids)
# pid -> Cluster
clusters = dict((pid, Cluster(pid, [Sig(bib, personid.get(bib, [])) for bib in sigs]))
for sigs, pid in chain(matched_clusters, not_matched_clusters))
todo = clusters.items()
for pid, clus in todo:
assert clus.pid == pid
for paper, sig in clus.sigs.items():
if sig.isclaimed():
if sig.get_claimed() != pid:
target_clus = clusters[sig.get_claimed()]
if paper in target_clus.sigs:
new_clus = Cluster(free_pids.next(), [])
target_clus.send_sig(new_clus, target_clus.sigs[paper])
todo.append(new_clus)
clusters[new_clus.pid] = new_clus
assert paper not in target_clus.sigs
clus.send_sig(target_clus, sig)
elif sig.get_assigned() != pid:
if not sig.isrejected(pid):
move_signature(sig.bibrefrec, pid)
else:
move_signature(sig.bibrefrec, free_pids.next())
else:
assert not sig.isrejected(pid)
update_status_final("Merging done.")
update_status_final()
delete_empty_persons()
update_personID_canonical_names()
def merge_static():
'''
This function merges aidPERSONIDPAPERS with aidRESULTS.
Use it after tortoise.
This function is static: if aid* tables are changed while it's running,
probably everything will crash and a black hole will open, eating all your data.
'''
last_names = frozenset(name[0].split('.')[0] for name in get_existing_result_clusters())
def get_free_pids():
while True:
yield get_new_personid()
free_pids = get_free_pids()
current_mapping = get_bibrefrec_to_pid_flag_mapping()
def move_sig_and_update_mapping(sig, old_pid_flag, new_pid_flag):
move_signature(sig, new_pid_flag[0])
current_mapping[sig].remove(old_pid_flag)
current_mapping[sig].append(new_pid_flag)
def try_move_signature(sig, target_pid):
"""
Move sig to target_pid unless it is claimed elsewhere, rejected,
or in conflict with a claimed signature.
"""
paps = current_mapping[sig]
rejected = filter(lambda p: p[1] <= -2, paps)
assigned = filter(lambda p:-2 < p[1] and p[1] < 2, paps)
claimed = filter(lambda p: 2 <= p[1] and p[0] == target_pid, paps)
if claimed or not assigned or assigned[0] == target_pid:
return
assert len(assigned) == 1
if rejected:
newpid = free_pids.next()
move_sig_and_update_mapping(sig, assigned[0], (newpid, assigned[0][1]))
else:
conflicts = find_conflicts(sig, target_pid)
if not conflicts:
move_sig_and_update_mapping(sig, assigned[0], (target_pid, assigned[0][1]))
else:
assert len(conflicts) == 1
if conflicts[0][3] == 2:
newpid = free_pids.next()
move_sig_and_update_mapping(sig, assigned[0], (newpid, assigned[0][1]))
else:
newpid = free_pids.next()
csig = tuple(conflicts[0][:3])
move_sig_and_update_mapping(csig, (target_pid, conflicts[0][3]), (newpid, conflicts[0][3]))
move_sig_and_update_mapping(sig, assigned[0], (target_pid, assigned[0][1]))
for idx, last in enumerate(last_names):
update_status(float(idx) / len(last_names), "%d/%d current: %s" % (idx, len(last_names), last))
results = ((int(row[0].split(".")[1]), row[1:4]) for row in get_lastname_results(last))
# [(last name number, [bibrefrecs])]
results = [(k, map(itemgetter(1), d)) for k, d in groupby(sorted(results, key=itemgetter(0)), key=itemgetter(0))]
# List of dictionaries.
# [{new_pid -> N}]
matr = []
# Set of all old pids.
old_pids = set()
for k, ds in results:
pids = []
claim = []
for d in ds:
pid_flag = current_mapping.get(d, [])
if pid_flag:
pid, flag = pid_flag[0]
pids.append(pid)
old_pids.add(pid)
if flag > 1:
claim.append((d, pid))
matr.append(dict((k, len(list(d))) for k, d in groupby(sorted(pids))))
# We cast it to list in order to ensure the order persistence.
old_pids = list(old_pids)
best_match = maximized_mapping([[row.get(old, 0) for old in old_pids] for row in matr])
matched_clusters = [(results[new_idx][1], old_pids[old_idx]) for new_idx, old_idx, _ in best_match]
not_matched_clusters = frozenset(xrange(len(results))) - frozenset(imap(itemgetter(0), best_match))
not_matched_clusters = izip((results[i][1] for i in not_matched_clusters), free_pids)
for sigs, pid in chain(matched_clusters, not_matched_clusters):
for sig in sigs:
if sig in current_mapping:
if not pid in map(itemgetter(0), filter(lambda x: x[1] > -2, current_mapping[sig])):
try_move_signature(sig, pid)
update_status_final()
delete_empty_persons()
update_personID_canonical_names()
def merge_dynamic():
'''
This function merges aidPERSONIDPAPERS with aidRESULTS.
Use it after tortoise.
This function is dynamic: it allows aid* tables to be changed while it is still running,
hence the claiming facility, for example, can stay online during the merge. This comfort,
however, comes at the cost of speed.
'''
last_names = frozenset(name[0].split('.')[0] for name in get_existing_result_clusters())
def get_free_pids():
while True:
yield get_new_personid()
free_pids = get_free_pids()
def try_move_signature(sig, target_pid):
"""
Move sig to target_pid unless it is claimed elsewhere, rejected,
or in conflict with a claimed signature.
"""
paps = get_signature_info(sig)
rejected = filter(lambda p: p[1] <= -2, paps)
assigned = filter(lambda p:-2 < p[1] and p[1] < 2, paps)
claimed = filter(lambda p: 2 <= p[1] and p[0] == target_pid, paps)
if claimed or not assigned or assigned[0] == target_pid:
return
assert len(assigned) == 1
if rejected:
move_signature(sig, free_pids.next())
else:
conflicts = find_conflicts(sig, target_pid)
if not conflicts:
move_signature(sig, target_pid)
else:
assert len(conflicts) == 1
if conflicts[0][3] == 2:
move_signature(sig, free_pids.next())
else:
move_signature(conflicts[0][:3], free_pids.next())
move_signature(sig, target_pid)
for idx, last in enumerate(last_names):
update_status(float(idx) / len(last_names), "%d/%d current: %s" % (idx, len(last_names), last))
results = ((int(row[0].split(".")[1]), row[1:4]) for row in get_lastname_results(last))
# [(last name number, [bibrefrecs])]
results = [(k, map(itemgetter(1), d)) for k, d in groupby(sorted(results, key=itemgetter(0)), key=itemgetter(0))]
# List of dictionaries.
# [{new_pid -> N}]
matr = []
# Set of all old pids.
old_pids = set()
for k, ds in results:
pids = []
claim = []
for d in ds:
pid_flag = personid_from_signature(d)
if pid_flag:
pid, flag = pid_flag[0]
pids.append(pid)
old_pids.add(pid)
if flag > 1:
claim.append((d, pid))
matr.append(dict((k, len(list(d))) for k, d in groupby(sorted(pids))))
# We cast it to list in order to ensure the order persistence.
old_pids = list(old_pids)
best_match = maximized_mapping([[row.get(old, 0) for old in old_pids] for row in matr])
matched_clusters = [(results[new_idx][1], old_pids[old_idx]) for new_idx, old_idx, _ in best_match]
not_matched_clusters = frozenset(xrange(len(results))) - frozenset(imap(itemgetter(0), best_match))
not_matched_clusters = izip((results[i][1] for i in not_matched_clusters), free_pids)
for sigs, pid in chain(matched_clusters, not_matched_clusters):
for sig in sigs:
try_move_signature(sig, pid)
update_status_final()
delete_empty_persons()
update_personID_canonical_names()
def matched_claims(inspect=None):
'''
Checks how many claims are violated in aidRESULTS.
Returns the number of preserved claims and the total number of claims.
'''
last_names = frozenset(name[0].split('.')[0] for name in get_existing_result_clusters())
r_match = 0
r_total = 0
for lname in last_names:
if inspect and lname != inspect:
continue
results_dict = dict(((row[1], row[2], row[3]), int(row[0].split(".")[1]))
for row in get_lastname_results(lname))
results_clusters = max(results_dict.values()) + 1
assert frozenset(results_dict.values()) == frozenset(range(results_clusters))
pids = frozenset(x[0] for x in chain.from_iterable(personid_name_from_signature(r) for r in results_dict.keys()))
matr = ((results_dict[x] for x in get_claimed_papers(pid) if x in results_dict) for pid in pids)
matr = (dict((k, len(list(d))) for k, d in groupby(sorted(row))) for row in matr)
matr = [[row.get(i, 0) for i in xrange(results_clusters)] for row in matr]
r_match += sum(m[2] for m in maximized_mapping(matr))
r_total += sum(sum(row) for row in matr)
return r_match, r_total
diff --git a/invenio/legacy/bibauthorid/name_utils.py b/invenio/legacy/bibauthorid/name_utils.py
index e9f160394..16b029d8c 100644
--- a/invenio/legacy/bibauthorid/name_utils.py
+++ b/invenio/legacy/bibauthorid/name_utils.py
@@ -1,794 +1,794 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
bibauthorid_name_utils
Bibauthorid utilities used by many parts of the framework
'''
import re
-import invenio.bibauthorid_config as bconfig
-from invenio.bibauthorid_string_utils import string_partition
+import invenio.legacy.bibauthorid.config as bconfig
+from invenio.legacy.bibauthorid.string_utils import string_partition
from copy import deepcopy
from invenio.utils.text import translate_to_ascii
-from invenio.bibauthorid_general_utils import name_comparison_print
+from invenio.legacy.bibauthorid.general_utils import name_comparison_print
try:
from invenio.config import CFG_ETCDIR
NO_CFG_ETCDIR = False
except ImportError:
NO_CFG_ETCDIR = True
try:
from editdist import distance
except ImportError:
try:
from Levenshtein import distance
except ImportError:
name_comparison_print("Levenshtein Module not available!")
def distance(s1, s2):
d = {}
lenstr1 = len(s1)
lenstr2 = len(s2)
for i in xrange(-1, lenstr1 + 1):
d[(i, -1)] = i + 1
for j in xrange(-1, lenstr2 + 1):
d[(-1, j)] = j + 1
for i in xrange(0, lenstr1):
for j in xrange(0, lenstr2):
if s1[i] == s2[j]:
cost = 0
else:
cost = 1
d[(i, j)] = min(
d[(i - 1, j)] + 1, # deletion
d[(i, j - 1)] + 1, # insertion
d[(i - 1, j - 1)] + cost, # substitution
)
if i > 0 and j > 0 and s1[i] == s2[j - 1] and s1[i - 1] == s2[j]:
d[(i, j)] = min(d[(i, j)], d[i - 2, j - 2] + cost) # transposition
return d[lenstr1 - 1, lenstr2 - 1]
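For a quick sanity check of the fallback above, here is a self-contained Python 3 restatement of the same optimal-string-alignment distance (Levenshtein plus adjacent transpositions), using a plain 2-D list instead of a dict keyed by index pairs; `osa_distance` is a hypothetical name used only for this sketch.

```python
def osa_distance(s1, s2):
    """Optimal string alignment distance: minimum number of insertions,
    deletions, substitutions, and adjacent transpositions."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete everything from s1
    for j in range(cols):
        d[0][j] = j  # insert everything from s2
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if (i > 1 and j > 1
                    and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + cost)  # transposition
    return d[rows - 1][cols - 1]
```

Note this is cheaper than true Damerau-Levenshtein (which allows edits inside a transposed pair), but the variant is customary for name comparison.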
artifact_removal = re.compile("[^a-zA-Z0-9]")
# Gender name and name variation files are loaded upon module import to improve performance
def split_name_parts(name_string, delete_name_additions=True,
override_surname_sep='', return_all_lower=False):
'''
Splits name_string into surname, initials (without trailing dot),
names, and the positions of the full names within the initials list.
delete_name_additions defines whether extensions
e.g. Jr., (Ed.) or (spokesperson)
will be ignored.
@param name_string: the name to be split
@type name_string: string
@param delete_name_additions: determines whether to delete name additions
@type delete_name_additions: boolean
@param override_surname_sep: define an alternative surname separator
@type override_surname_sep: string
@param return_all_lower: if true, return all parts lower-cased
@type return_all_lower: boolean
@return: list of [surname string, initials list, names list, positions list]
e.g. split_name_parts("Ellis, John R.")
--> ['Ellis', ['J', 'R'], ['John'], [0]]
e.g. split_name_parts("Ellis, K. John Rob")
--> ['Ellis', ['K', 'J', 'R'], ['John', 'Rob'], [1, 2]]
@rtype: list of lists
'''
if not override_surname_sep:
surname_separators = bconfig.SURNAMES_SEPARATOR_CHARACTER_LIST
else:
surname_separators = ','
name_separators = bconfig.NAMES_SEPARATOR_CHARACTER_LIST
if name_separators == "-1":
name_separators = ',;.=\-\(\)'
if delete_name_additions:
name_additions = re.findall('\([.]*[^\)]*\)', name_string)
for name_addition in name_additions:
name_string = name_string.replace(name_addition, '')
surname = ""
rest_of_name = ""
found_sep = ''
name_string = name_string.strip()
for sep in surname_separators:
if name_string.count(sep) >= 1:
found_sep = sep
surname, rest_of_name = string_partition(name_string, sep)[0::2]
surname = surname.strip().capitalize()
# Fix for dashes
surname = re.sub('-([a-z])', lambda n:'-' + n.group(1).upper(), surname)
break
if not found_sep:
if name_string.count(" ") > 0:
rest_of_name, surname = string_partition(name_string, ' ', direc='r')[0::2]
surname = surname.strip().capitalize()
# Fix for dashes
surname = re.sub('-([a-z])', lambda n:'-' + n.group(1).upper(), surname)
else:
if not return_all_lower:
return [name_string.strip().capitalize(), [], [], []]
else:
return [name_string.strip().lower(), [], [], []]
if rest_of_name.count(","):
rest_of_name = string_partition(rest_of_name, ",")[0]
substitution_regexp = re.compile('[%s]' % (name_separators))
initials_names_list = substitution_regexp.sub(' ', rest_of_name).split()
names = []
initials = []
positions = []
pos_counter = 0
for i in initials_names_list:
if len(i) == 1:
initials.append(i.capitalize())
pos_counter += 1
else:
names.append(i.strip().capitalize())
initials.append(i[0].capitalize())
positions.append(pos_counter)
pos_counter += 1
retval = [surname, initials, names, positions]
if return_all_lower:
retval = [surname.lower(), [i.lower() for i in initials], [n.lower() for n in names], positions]
return retval
def create_canonical_name(name):
canonical_name = create_unified_name(name, reverse=True)
artifact_removal_re = re.compile("[^a-zA-Z0-9]")
whitespace_removal = re.compile("[ ]{1,10}")
canonical_name = artifact_removal_re.sub(" ", canonical_name)
canonical_name = whitespace_removal.sub(" ", canonical_name)
canonical_name = canonical_name.strip().replace(" ", ".")
return canonical_name
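The regex pipeline inside create_canonical_name can be demonstrated on its own: strip punctuation to spaces, collapse runs of spaces, then join the tokens with dots. `canonicalize` below is a hypothetical helper that takes an already-unified name (what create_unified_name(..., reverse=True) would produce), skipping the name-splitting step of the original.

```python
import re

def canonicalize(unified_name):
    """Sketch of the cleanup stage of create_canonical_name."""
    # Replace every non-alphanumeric character with a space.
    name = re.sub(r"[^a-zA-Z0-9]", " ", unified_name)
    # Collapse runs of spaces into a single space.
    name = re.sub(r" {1,10}", " ", name)
    # Trim and join the remaining tokens with dots.
    return name.strip().replace(" ", ".")
```

So a unified name like "J. R. Ellis" becomes the canonical-name stem "J.R.Ellis", to which the framework appends a numeric suffix to guarantee uniqueness.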
def create_normalized_name(splitted_name):
'''
Creates a normalized name from a given name array. A normalized name
looks like "Lastname, Firstnames and Initials"
@param splitted_name: name array from split_name_parts
@type splitted_name: list in form [string, list, list]
@return: normalized name
@rtype: string
'''
name = splitted_name[0] + ','
if not splitted_name[1] and not splitted_name[2]:
return name
for i in range(len(splitted_name[1])):
try:
fname = splitted_name[2][splitted_name[3].index(i)]
name = name + ' ' + fname
except (IndexError, ValueError):
name = name + ' ' + splitted_name[1][i] + '.'
return name
def create_unified_name(name, reverse=False):
'''
Creates unified name. E.g. Ellis, John Richard T. (Jr.)
will become Ellis, J. R. T.
@param name: The name to be unified
@type name: string
@param reverse: if true, names come first
@return: The unified name
@rtype: string
'''
split_name = split_name_parts(name)
if reverse:
unified_name = ''
for i in split_name[1]:
unified_name += "%s. " % (i)
unified_name += "%s" % (split_name[0])
else:
unified_name = "%s, " % (split_name[0])
for i in split_name[1]:
unified_name += "%s. " % (i)
if unified_name.count("ollabo"):
unified_name = unified_name.replace("ollaborations", "ollaboration")
unified_name = unified_name.replace("The ", "")
unified_name = unified_name.replace("the ", "")
unified_name = unified_name.replace("For ", "")
unified_name = unified_name.replace("for ", "")
return unified_name
def clean_name_string(namestring, replacement=" ", keep_whitespace=True,
trim_whitespaces=False):
'''
remove specific artifacts from the names in order to be able to
compare them. E.g. 't Hooft, G. and t'Hooft, G.
@param namestring: the string to be cleaned
@type namestring: string
'''
# artifact_removal = re.compile("['`\-\[\]\_\"]")
artifact_removal_re = None
if trim_whitespaces:
namestring = namestring.strip()
if keep_whitespace:
artifact_removal_re = re.compile("[^a-zA-Z0-9,.\s]")
else:
artifact_removal_re = re.compile("[^a-zA-Z0-9,.]")
whitespace_removal = re.compile("[\s]{2,100}")
tmp = artifact_removal_re.sub(replacement, namestring)
tmp = whitespace_removal.sub(" ", tmp).strip()
return tmp
def soft_compare_names(origin_name, target_name):
'''
Soft comparison of names, to be used in search engines and similar.
Base results:
If surname is equal in [0.6,1.0]
If surname similar in [0.4,0.8]
If surname differs in [0.0,0.4]
all depending on average compatibility of names and initials.
'''
jaro_fctn = distance
# try:
# from Levenshtein import jaro_winkler
# jaro_fctn = jaro_winkler
# except ImportError:
# jaro_fctn = jaro_winkler_str_similarity
score = 0.0
oname = deepcopy(origin_name)
tname = deepcopy(target_name)
oname = translate_to_ascii(oname)[0]
tname = translate_to_ascii(tname)[0]
orig_name = split_name_parts(oname.lower())
targ_name = split_name_parts(tname.lower())
orig_name[0] = clean_name_string(orig_name[0],
replacement="",
keep_whitespace=False)
targ_name[0] = clean_name_string(targ_name[0],
replacement="",
keep_whitespace=False)
if orig_name[0] == targ_name[0]:
score += 0.6
else:
if ((jaro_fctn(orig_name[0].lower(), targ_name[0].lower()) < .95)
or min(len(orig_name[0]), len(targ_name[0])) <= 4):
score += 0.0
else:
score += 0.4
if orig_name[1] and targ_name[1]:
max_initials = max(len(orig_name[1]), len(targ_name[1]))
matching_i = 0
if len(orig_name[1]) >= 1 and len(targ_name[1]) >= 1:
for i in orig_name[1]:
if i in targ_name[1]:
matching_i += 1
max_names = max(len(orig_name[2]), len(targ_name[2]))
matching_n = 0
if len(orig_name[2]) >= 1 and len(targ_name[2]) >= 1:
cleaned_targ_name = [clean_name_string(i, replacement="", keep_whitespace=False) for i in targ_name[2]]
for i in orig_name[2]:
if clean_name_string(i, replacement="", keep_whitespace=False) in cleaned_targ_name:
matching_n += 1
name_score = (matching_i + matching_n) * 0.4 / (max_names + max_initials)
score += name_score
return score
def create_name_tuples(names):
'''
Generate name variants by joining contiguous names of the list into
capitalized composites (e.g. 'guang sheng' -> 'Guangsheng')
@param names: a list of names
@type names: list of string
@return: the combinations of the names given
@rtype: list of lists of strings
'''
length = float(len(names))
max_tuples = int((length / 2) * (length - 1))
current_tuple = 1
pos = 0
off = 1
variants = [" ".join(names)]
for i in range(max_tuples):
variant = "%s %s %s" % (' '.join(names[0:pos]),
''.join(names[pos:off + 1]).capitalize(),
' '.join(names[off + 1::]))
variants.append(variant.strip())
pos += 1
off += 1
if off >= length:
pos = i * 0
off = current_tuple + 1
current_tuple += 1
return variants
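The combination logic above can be run in isolation; the following self-contained copy of the function (with hypothetical input names) shows the variants it produces:

```python
def create_name_tuples(names):
    # join contiguous runs of names into capitalized composites,
    # always keeping the original space-separated form first
    length = float(len(names))
    max_tuples = int((length / 2) * (length - 1))
    current_tuple = 1
    pos = 0
    off = 1
    variants = [" ".join(names)]
    for i in range(max_tuples):
        variant = "%s %s %s" % (' '.join(names[0:pos]),
                                ''.join(names[pos:off + 1]).capitalize(),
                                ' '.join(names[off + 1:]))
        variants.append(variant.strip())
        pos += 1
        off += 1
        if off >= length:
            pos = 0
            off = current_tuple + 1
            current_tuple += 1
    return variants

print(create_name_tuples(["guang", "sheng"]))
# -> ['guang sheng', 'Guangsheng']
```

Two names yield one composite; three names yield every contiguous grouping, up to the fully joined form.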
def full_names_are_equal_composites(name1, name2):
'''
Checks if names are equal composites; e.g. "guangsheng" vs. "guang sheng"
@param name1: Full Name string of the first name (w/ last name)
@type name1: string
@param name2: Full Name string of the second name (w/ last name)
@type name2: string
@return: Are the names equal composites?
@rtype: boolean
'''
if not isinstance(name1, list):
name1 = split_name_parts(name1)
if not isinstance(name2, list):
name2 = split_name_parts(name2)
is_equal_composite = False
oname_variations = create_name_tuples(name1[2])
tname_variations = create_name_tuples(name2[2])
for oname_variation in oname_variations:
for tname_variation in tname_variations:
oname = clean_name_string(oname_variation.lower(), "", False, True)
tname = clean_name_string(tname_variation.lower(), "", False, True)
if oname == tname:
is_equal_composite = True
break
return is_equal_composite
def full_names_are_equal_gender(name1, name2, gendernames):
'''
Checks the gender compatibility of two first names based on a word list
@param name1: Full Name string of the first name (w/ last name)
@type name1: string
@param name2: Full Name string of the second name (w/ last name)
@type name2: string
@param gendernames: dictionary of male/female names
@type gendernames: dict
@return: Are names gender-equal?
@rtype: boolean
'''
if not isinstance(name1, list):
name1 = split_name_parts(name1)
if not isinstance(name2, list):
name2 = split_name_parts(name2)
names_are_equal_gender_b = True
ogender = None
tgender = None
# oname = name1[2][0].lower()
# tname = name2[2][0].lower()
# oname = clean_name_string(oname, "", False, True)
# tname = clean_name_string(tname, "", False, True)
onames = [clean_name_string(n.lower(), "", False, True) for n in name1[2]]
tnames = [clean_name_string(n.lower(), "", False, True) for n in name2[2]]
for oname in onames:
if oname in gendernames['boys']:
if ogender != 'Conflict':
if ogender != 'Female':
ogender = 'Male'
else:
ogender = 'Conflict'
elif oname in gendernames['girls']:
if ogender != 'Conflict':
if ogender != 'Male':
ogender = 'Female'
else:
ogender = 'Conflict'
for tname in tnames:
if tname in gendernames['boys']:
if tgender != 'Conflict':
if tgender != 'Female':
tgender = 'Male'
else:
tgender = 'Conflict'
elif tname in gendernames['girls']:
if tgender != 'Conflict':
if tgender != 'Male':
tgender = 'Female'
else:
tgender = 'Conflict'
if ogender and tgender:
if ogender != tgender or ogender == 'Conflict' or tgender == 'Conflict':
names_are_equal_gender_b = False
return names_are_equal_gender_b
def names_are_synonymous(name1, name2, name_variations):
'''
Checks if names are synonyms
@param name_variations: name variations list
@type name_variations: list of lists
'''
return any(name1 in nvar and name2 in nvar for nvar in name_variations)
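A toy illustration of the lookup above, with a hypothetical `name_variations` list (the real list is loaded from name_variants.txt):

```python
def names_are_synonymous(name1, name2, name_variations):
    # names are synonymous when both appear in the same variation group
    return any(name1 in nvar and name2 in nvar for nvar in name_variations)

variations = [["rick", "richard", "dick"], ["john", "johnny"]]
print(names_are_synonymous("rick", "dick", variations))   # True
print(names_are_synonymous("rick", "john", variations))   # False
```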
def full_names_are_synonymous(name1, name2, name_variations):
'''
Checks if two names are synonymous; e.g. "Robert" vs. "Bob"
@param name1: Full Name string of the first name (w/ last name)
@type name1: string
@param name2: Full Name string of the second name (w/ last name)
@type name2: string
@param name_variations: name variations list
@type name_variations: list of lists
@return: are names synonymous
@rtype: boolean
'''
if not isinstance(name1, list):
name1 = split_name_parts(name1)
if not isinstance(name2, list):
name2 = split_name_parts(name2)
names_are_synonymous_b = False
max_matches = min(len(name1[2]), len(name2[2]))
matches = []
for i in xrange(max_matches):
matches.append(False)
for nvar in name_variations:
for i in xrange(max_matches):
oname = name1[2][i].lower()
tname = name2[2][i].lower()
oname = clean_name_string(oname, "", False, True)
tname = clean_name_string(tname, "", False, True)
if (oname in nvar and tname in nvar) or oname == tname:
name_comparison_print(' ', oname, ' and ', tname, ' are synonyms!')
matches[i] = True
if sum(matches) == max_matches:
names_are_synonymous_b = True
break
return names_are_synonymous_b
def names_are_substrings(name1, name2):
'''
Checks if the names are substrings of each other, left to right
@return: bool
'''
return name1.startswith(name2) or name2.startswith(name1)
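A quick illustration of the left-anchored check above (example names are hypothetical):

```python
def names_are_substrings(name1, name2):
    # left-anchored: one name must be a prefix of the other
    return name1.startswith(name2) or name2.startswith(name1)

print(names_are_substrings("christoph", "ch"))   # True
print(names_are_substrings("christoph", "ph"))   # False, suffixes do not count
```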
def full_names_are_substrings(name1, name2):
'''
Checks if two names are substrings of each other; e.g. "Christoph" vs. "Ch"
Only checks for the beginning of the names.
@param name1: Full Name string of the first name (w/ last name)
@type name1: string
@param name2: Full Name string of the second name (w/ last name)
@type name2: string
@return: are the names substrings of each other
@rtype: boolean
'''
if not isinstance(name1, list):
name1 = split_name_parts(name1)
if not isinstance(name2, list):
name2 = split_name_parts(name2)
onames = name1[2]
tnames = name2[2]
# oname = "".join(onames).lower()
# tname = "".join(tnames).lower()
names_are_substrings_b = False
for o in onames:
oname = clean_name_string(o.lower(), "", False, True)
for t in tnames:
tname = clean_name_string(t.lower(), "", False, True)
if (oname.startswith(tname)
or tname.startswith(oname)):
names_are_substrings_b = True
return names_are_substrings_b
def _load_gender_firstnames_dict(files=''):
if not NO_CFG_ETCDIR and not files:
files = {'boy': CFG_ETCDIR + '/bibauthorid/name_authority_files/male_firstnames.txt',
'girl': CFG_ETCDIR + '/bibauthorid/name_authority_files/female_firstnames.txt'}
elif NO_CFG_ETCDIR and not files:
files = {'boy': '../etc/name_authority_files/male_firstnames.txt',
'girl': '../etc/name_authority_files/female_firstnames.txt'}
boyf = open(files['boy'], 'r')
boyn = set([x.strip().lower() for x in boyf.readlines()])
boyf.close()
girlf = open(files['girl'], 'r')
girln = set([x.strip().lower() for x in girlf.readlines()])
girlf.close()
return {'boys':(boyn - girln), 'girls':(girln - boyn)}
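The set differences in the return value drop first names that appear on both lists, so ambiguous names never vote for a gender. A sketch with toy word lists (the real ones come from the name_authority_files):

```python
# hypothetical word lists; the real loader reads them from text files
boys = {"andrea", "john", "luca"}
girls = {"andrea", "mary"}

gendernames = {'boys': boys - girls, 'girls': girls - boys}
# "andrea" appears on both lists, so it is dropped from each side
print(sorted(gendernames['boys']))   # ['john', 'luca']
print(sorted(gendernames['girls']))  # ['mary']
```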
def _load_firstname_variations(filename=''):
#will load an array of arrays: [['rick','richard','dick'],['john','jhonny']]
if not NO_CFG_ETCDIR and not filename:
filename = CFG_ETCDIR + '/bibauthorid/name_authority_files/name_variants.txt'
elif NO_CFG_ETCDIR and not filename:
filename = '../etc/name_authority_files/name_variants.txt'
retval = []
fp = open(filename)
for l in fp.readlines():
lr = l.rstrip("\n")
retval.append([clean_name_string(name.lower(), "", False, True)
for name in lr.split(";") if name])
fp.close()
return retval
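The parsing above reduces to splitting each line on ';'. A sketch on a hypothetical two-line sample (omitting the clean_name_string normalization the real loader applies):

```python
# hypothetical file contents; each line groups synonyms separated by ';'
sample = "Rick;Richard;Dick\nJohn;Johnny\n"

retval = []
for line in sample.splitlines():
    # lowercase each entry and skip empty fields
    retval.append([name.lower() for name in line.split(";") if name])

print(retval)  # [['rick', 'richard', 'dick'], ['john', 'johnny']]
```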
def compare_names(origin_name, target_name, initials_penalty=False):
'''
Compare two names.
'''
MAX_ALLOWED_SURNAME_DISTANCE = 2
name_comparison_print("\nComparing: ", origin_name, ' ', target_name)
gendernames = GLOBAL_gendernames
name_variations = GLOBAL_name_variations
origin_name = translate_to_ascii(origin_name)[0]
target_name = translate_to_ascii(target_name)[0]
no = split_name_parts(origin_name, True, "", True)
nt = split_name_parts(target_name, True, "", True)
name_comparison_print("|- split no: ", no)
name_comparison_print("|- split nt: ", nt)
score = 0.0
surname_dist = distance(no[0], nt[0])
name_comparison_print("|- surname distance: ", surname_dist)
if surname_dist > 0:
l_artifact_removal = re.compile("[^a-zA-Z0-9]")
fn1 = l_artifact_removal.sub("", no[0])
fn2 = l_artifact_removal.sub("", nt[0])
if fn1 == fn2:
score = 1.0
else:
score = max(0.0, 0.5 - (float(surname_dist) / float(MAX_ALLOWED_SURNAME_DISTANCE)))
else:
score = 1.0
name_comparison_print('||- surname score: ', score)
initials_only = ((min(len(no[2]), len(nt[2]))) == 0)
only_initials_available = False
if len(no[2]) == len(nt[2]) and initials_only:
only_initials_available = True
name_comparison_print('|- initials only: ', initials_only)
name_comparison_print('|- only initials available: ', only_initials_available)
names_are_equal_composites = False
if not initials_only:
names_are_equal_composites = full_names_are_equal_composites(origin_name, target_name)
name_comparison_print("|- equal composites: ", names_are_equal_composites)
max_n_initials = max(len(no[1]), len(nt[1]))
initials_intersection = set(no[1]).intersection(set(nt[1]))
n_initials_intersection = len(initials_intersection)
initials_union = set(no[1]).union(set(nt[1]))
n_initials_union = len(initials_union)
initials_distance = distance("".join(no[1]), "".join(nt[1]))
if n_initials_union > 0:
initials_c = float(n_initials_intersection) / float(n_initials_union)
else:
initials_c = 1
if len(no[1]) > len(nt[1]):
alo = no[1]
alt = nt[1]
else:
alo = nt[1]
alt = no[1]
lo = len(alo)
lt = len(alt)
if max_n_initials > 0:
initials_screwup = sum([i + 1 for i, k in enumerate(reversed(alo))
if lo - 1 - i < lt and k != alt[lo - 1 - i] ]) / \
float(float(max_n_initials * (max_n_initials + 1)) / 2)
initials_distance = initials_distance / max_n_initials
else:
initials_screwup = 0
initials_distance = 0
score = max((score - ((0.75 * initials_screwup + 0.10 * (1. - initials_c)\
+ 0.15 * initials_distance) * score)), 0.0)
name_comparison_print("|- initials sets: ", no[1], " ", nt[1])
name_comparison_print("|- initials distance: ", initials_distance)
name_comparison_print("|- initials c: ", initials_c)
name_comparison_print("|- initials screwup: ", initials_screwup)
name_comparison_print("||- initials score: ", score)
composits_eq = full_names_are_equal_composites(no, nt)
if len(no[2]) > 0 and len(nt[2]) > 0:
gender_eq = full_names_are_equal_gender(no, nt, gendernames)
else:
gender_eq = True
vars_eq = full_names_are_synonymous(no, nt, name_variations)
substr_eq = full_names_are_substrings(no, nt)
if not initials_only:
if len(no[2]) > len(nt[2]):
nalo = no[2]
nalt = nt[2]
else:
nalo = nt[2]
nalt = no[2]
nlo = len(nalo)
nlt = len(nalt)
names_screwup_list = [(distance(k, nalt[nlo - 1 - i]), max(len(k), len(nalt[nlo - 1 - i])))
for i, k in enumerate(reversed(nalo)) \
if nlo - 1 - i < nlt]
max_names_screwup = max([float(i[0]) / i[1] for i in names_screwup_list])
avg_names_screwup = sum([float(i[0]) / i[1] for i in names_screwup_list])\
/ len(names_screwup_list)
else:
max_names_screwup = 0
avg_names_screwup = 0
score = max(score - score * ( 0.75 * max_names_screwup + 0.25 * avg_names_screwup), 0.0)
name_comparison_print("|- max names screwup: ", max_names_screwup)
name_comparison_print("|- avg screwup: ", avg_names_screwup)
name_comparison_print("||- names score: ", score)
name_comparison_print("|- names composites: ", composits_eq)
name_comparison_print("|- same gender: ", gender_eq)
name_comparison_print("|- synonyms: ", vars_eq)
name_comparison_print("|- substrings: ", substr_eq)
if vars_eq:
synmap = [[i, j, names_are_synonymous(i, j, name_variations)] for i in no[2] for j in nt[2]]
synmap = [i for i in synmap if i[2] == True]
name_comparison_print("|-- synmap: ", synmap)
for i in synmap:
if no[2].index(i[0]) == nt[2].index(i[1]):
score = score + (1 - score) * 0.5
else:
score = score + (1 - score) * 0.15
else:
name_comparison_print("|-- synmap: empty")
name_comparison_print("|-- synmap score: ", score)
if substr_eq and not initials_only:
ssmap = [[i, j, names_are_substrings(i, j)] for i in no[2] for j in nt[2]]
ssmap = [i for i in ssmap if i[2] == True]
name_comparison_print("|-- substr map: ", ssmap)
for i in ssmap:
if no[2].index(i[0]) == nt[2].index(i[1]):
score = score + (1 - score) * 0.2
else:
score = score + (1 - score) * 0.05
else:
name_comparison_print("|-- substr map: empty")
name_comparison_print("|-- substring score: ", score)
if composits_eq and not initials_only:
name_comparison_print("|-- composite names")
score = score + (1 - score) * 0.2
else:
name_comparison_print("|-- not composite names")
name_comparison_print("|-- composite score: ", score)
if not gender_eq:
score = score / 3.
name_comparison_print("|-- apply gender penalty")
else:
name_comparison_print("|-- no gender penalty")
name_comparison_print("|-- gender score: ", score)
if surname_dist > MAX_ALLOWED_SURNAME_DISTANCE:
score = 0.0
name_comparison_print("|- surname trim: ", score)
else:
name_comparison_print("|- no surname trim: ", score)
if initials_only and (not only_initials_available or initials_penalty):
score = score * .9
name_comparison_print("|- initials only penalty: ", score, initials_only, only_initials_available)
else:
name_comparison_print("|- no initials only penalty", initials_only, only_initials_available)
name_comparison_print("||- final score: ", score)
return score
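The surname branch of compare_names scores 1.0 when the punctuation-stripped surnames match, and otherwise max(0, 0.5 - dist / MAX_ALLOWED_SURNAME_DISTANCE), which already reaches 0.0 at distance 1. A standalone sketch with a textbook Levenshtein distance (the module's own distance() may be implemented differently):

```python
def levenshtein(a, b):
    # classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

MAX_ALLOWED_SURNAME_DISTANCE = 2

def surname_score(s1, s2):
    # mirrors the surname component of compare_names (after cleaning)
    dist = levenshtein(s1, s2)
    if dist == 0:
        return 1.0
    return max(0.0, 0.5 - float(dist) / MAX_ALLOWED_SURNAME_DISTANCE)

print(surname_score("smith", "smith"))   # 1.0
print(surname_score("smith", "smythe"))  # 0.0
```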
def generate_last_name_cluster_str(name):
'''
Use this function to find the last name cluster
this name should be associated with.
'''
family = split_name_parts(name.decode('utf-8'))[0]
return artifact_removal.sub("", family).lower()
from invenio.utils.datastructures import LazyDict
GLOBAL_gendernames = LazyDict(_load_gender_firstnames_dict)
GLOBAL_name_variations = [] #_load_firstname_variations()
diff --git a/invenio/legacy/bibauthorid/personid_maintenance.py b/invenio/legacy/bibauthorid/personid_maintenance.py
index a0ceda231..8598c0b38 100644
--- a/invenio/legacy/bibauthorid/personid_maintenance.py
+++ b/invenio/legacy/bibauthorid/personid_maintenance.py
@@ -1,108 +1,108 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
aidPersonID maintenance algorithms.
"""
-from invenio.bibauthorid_name_utils import split_name_parts
-from invenio.bibauthorid_name_utils import create_normalized_name
+from invenio.legacy.bibauthorid.name_utils import split_name_parts
+from invenio.legacy.bibauthorid.name_utils import create_normalized_name
from invenio.bibauthorid_backinterface import get_name_by_bibrecref
from invenio.bibauthorid_backinterface import copy_personids #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import compare_personid_tables #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import group_personid
from invenio.bibauthorid_backinterface import check_personid_papers #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import personid_get_recids_affected_since as get_recids_affected_since #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import repair_personid #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import check_results #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import check_merger #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import restore_personids #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import get_full_personid_papers #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_backinterface import get_full_results #emitting #pylint: disable-msg=W0611
def convert_personid():
from invenio.legacy.dbquery import run_sql # oh come on, the whole function will be removed soon
from itertools import repeat
chunk = 1000
old_personid = run_sql("SELECT `personid`, `tag`, `data`, `flag`, `lcul` FROM `aidPERSONID`")
def flush_papers(args):
run_sql("INSERT INTO `aidPERSONIDPAPERS` "
"(`personid`, "
" `bibref_table`, "
" `bibref_value`, "
" `bibrec`, "
" `name`, "
" `flag`, "
" `lcul`) "
"VALUES " + " , ".join(repeat("(%s, %s, %s, %s, %s, %s, %s)", len(args) / 7))
, tuple(args))
def flush_data(args):
run_sql("INSERT INTO `aidPERSONIDDATA` "
"(`personid`, "
" `tag`, "
" `data`, "
" `opt1`, "
" `opt2`) "
"VALUES " + " , ".join(repeat("(%s, %s, %s, %s, %s)", len(args) / 5))
, tuple(args))
paper_args = []
data_args = []
for row in old_personid:
if row[1] == 'paper':
bibref, rec = row[2].split(',')
tab, ref = bibref.split(':')
try:
name = get_name_by_bibrecref((int(tab), int(ref), int(rec)))
except:
continue
name = split_name_parts(name)
name = create_normalized_name(name)
paper_args += [row[0], tab, ref, rec, name, row[3], row[4]]
if len(paper_args) > chunk:
flush_papers(paper_args)
paper_args = []
elif row[1] == 'gathered_name':
continue
else:
data_args += list(row)
if len(data_args) > chunk:
flush_data(data_args)
data_args = []
if paper_args:
flush_papers(paper_args)
if data_args:
flush_data(data_args)
def compare_personids(path):
'''
Use this function with copy_personids() to diff personids.
'''
fp = open(path, "w")
pid1_p, pid1_d = group_personid("aidPERSONIDPAPERS_copy", "aidPERSONIDDATA_copy")
pid2_p, pid2_d = group_personid("aidPERSONIDPAPERS", "aidPERSONIDDATA")
compare_personid_tables(pid1_p, pid1_d, pid2_p, pid2_d, fp)
diff --git a/invenio/legacy/bibauthorid/prob_matrix.py b/invenio/legacy/bibauthorid/prob_matrix.py
index c87fbea0b..195aa726a 100644
--- a/invenio/legacy/bibauthorid/prob_matrix.py
+++ b/invenio/legacy/bibauthorid/prob_matrix.py
@@ -1,142 +1,142 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from invenio.bibauthorid_comparison import compare_bibrefrecs
from invenio.bibauthorid_comparison import clear_all_caches as clear_comparison_caches
from invenio.bibauthorid_backinterface import Bib_matrix
from invenio.bibauthorid_backinterface import filter_modified_record_ids
-from invenio.bibauthorid_general_utils import bibauthor_print \
+from invenio.legacy.bibauthorid.general_utils import bibauthor_print \
, update_status \
, update_status_final \
, is_eq
if bconfig.DEBUG_CHECKS:
def _debug_is_eq_v(vl1, vl2):
if isinstance(vl1, str) and isinstance(vl2, str):
return vl1 == vl2
if isinstance(vl1, tuple) and isinstance(vl2, tuple):
return is_eq(vl1[0], vl2[0]) and is_eq(vl1[1], vl2[1])
return False
class ProbabilityMatrix(object):
'''
This class contains and maintains the comparison
between all virtual authors. It is able to write
and read from the database and update the results.
'''
def __init__(self):
self._bib_matrix = Bib_matrix()
def load(self, lname, load_map=True, load_matrix=True):
update_status(0., "Loading probability matrix...")
self._bib_matrix.load(lname, load_map, load_matrix)
update_status_final("Probability matrix loaded.")
def store(self, name):
update_status(0., "Saving probability matrix...")
self._bib_matrix.store(name)
update_status_final("Probability matrix saved.")
def __getitem__(self, bibs):
return self._bib_matrix[bibs[0], bibs[1]]
def __get_up_to_date_bibs(self):
return frozenset(filter_modified_record_ids(
self._bib_matrix.get_keys(),
self._bib_matrix.creation_time))
def is_up_to_date(self, cluster_set):
return self.__get_up_to_date_bibs() >= frozenset(cluster_set.all_bibs())
def recalculate(self, cluster_set):
'''
Constructs the probability matrix, reusing values from the
previously stored matrix for record pairs that have not been
modified since its creation time.
@param cluster_set: A cluster set object, used to initialize
the matrix.
'''
last_cleaned = 0
old_matrix = self._bib_matrix
cached_bibs = self.__get_up_to_date_bibs()
have_cached_bibs = bool(cached_bibs)
self._bib_matrix = Bib_matrix(cluster_set)
ncl = cluster_set.num_all_bibs
expected = ((ncl * (ncl - 1)) / 2)
if expected == 0:
expected = 1
cur_calc, opti, prints_counter = 0, 0, 0
for cl1 in cluster_set.clusters:
if cur_calc+opti - prints_counter > 100000:
update_status((float(opti) + cur_calc) / expected, "Prob matrix: calc %d, opti %d." % (cur_calc, opti))
prints_counter = cur_calc+opti
#clean caches
if cur_calc - last_cleaned > 2000000:
clear_comparison_caches()
last_cleaned = cur_calc
for cl2 in cluster_set.clusters:
if id(cl1) < id(cl2) and not cl1.hates(cl2):
for bib1 in cl1.bibs:
for bib2 in cl2.bibs:
if have_cached_bibs and bib1 in cached_bibs and bib2 in cached_bibs:
val = old_matrix[bib1, bib2]
if not val:
cur_calc += 1
val = compare_bibrefrecs(bib1, bib2)
else:
opti += 1
if bconfig.DEBUG_CHECKS:
assert _debug_is_eq_v(val, compare_bibrefrecs(bib1, bib2))
else:
cur_calc += 1
val = compare_bibrefrecs(bib1, bib2)
self._bib_matrix[bib1, bib2] = val
clear_comparison_caches()
update_status_final("Matrix done. %d calc, %d opt." % (cur_calc, opti))
def prepare_matirx(cluster_set, force):
if bconfig.DEBUG_CHECKS:
assert cluster_set._debug_test_hate_relation()
assert cluster_set._debug_duplicated_recs()
matr = ProbabilityMatrix()
matr.load(cluster_set.last_name, load_map=True, load_matrix=False)
if not force and matr.is_up_to_date(cluster_set):
bibauthor_print("Cluster %s is up-to-date and therefore will not be computed."
% cluster_set.last_name)
# nothing to do
return False
matr.load(cluster_set.last_name, load_map=False, load_matrix=True)
matr.recalculate(cluster_set)
matr.store(cluster_set.last_name)
return True
diff --git a/invenio/legacy/bibauthorid/rabbit.py b/invenio/legacy/bibauthorid/rabbit.py
index 2ace70f82..d78647bda 100644
--- a/invenio/legacy/bibauthorid/rabbit.py
+++ b/invenio/legacy/bibauthorid/rabbit.py
@@ -1,189 +1,189 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from itertools import cycle, imap, chain, izip
from operator import itemgetter
-from invenio.bibtask import task_sleep_now_if_required
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibsched.bibtask import task_sleep_now_if_required
+from invenio.legacy.bibauthorid import config as bconfig
from invenio.bibauthorid_comparison import cached_sym
-from invenio.bibauthorid_name_utils import compare_names as comp_names
-from invenio.bibauthorid_name_utils import split_name_parts
-from invenio.bibauthorid_name_utils import create_normalized_name
-from invenio.bibauthorid_general_utils import update_status \
+from invenio.legacy.bibauthorid.name_utils import compare_names as comp_names
+from invenio.legacy.bibauthorid.name_utils import split_name_parts
+from invenio.legacy.bibauthorid.name_utils import create_normalized_name
+from invenio.legacy.bibauthorid.general_utils import update_status \
, update_status_final
from invenio.bibauthorid_matrix_optimization import maximized_mapping
from invenio.bibauthorid_backinterface import get_all_valid_bibrecs
from invenio.bibauthorid_backinterface import filter_bibrecs_outside
from invenio.bibauthorid_backinterface import get_deleted_papers
from invenio.bibauthorid_backinterface import delete_paper_from_personid
from invenio.bibauthorid_backinterface import get_authors_from_paper
from invenio.bibauthorid_backinterface import get_coauthors_from_paper
from invenio.bibauthorid_backinterface import get_signatures_from_rec
from invenio.bibauthorid_backinterface import modify_signature
from invenio.bibauthorid_backinterface import remove_sigs
from invenio.bibauthorid_backinterface import find_pids_by_exact_name as _find_pids_by_exact_name
from invenio.bibauthorid_backinterface import new_person_from_signature as _new_person_from_signature
from invenio.bibauthorid_backinterface import add_signature as _add_signature
from invenio.bibauthorid_backinterface import update_personID_canonical_names
from invenio.bibauthorid_backinterface import update_personID_external_ids
from invenio.bibauthorid_backinterface import get_name_by_bibrecref
from invenio.bibauthorid_backinterface import populate_partial_marc_caches
from invenio.bibauthorid_backinterface import destroy_partial_marc_caches
from invenio.bibauthorid_backinterface import get_inspire_id
from invenio.bibauthorid_backinterface import get_person_with_extid
from invenio.bibauthorid_backinterface import get_name_string_to_pid_dictionary
from invenio.bibauthorid_backinterface import get_new_personid
USE_EXT_IDS = bconfig.RABBIT_USE_EXTERNAL_IDS
USE_INSPIREID = bconfig.RABBIT_USE_EXTERNAL_ID_INSPIREID
def rabbit(bibrecs, check_invalid_papers=False, personids_to_update_extids=None):
'''
@param bibrecs: an iterable full of bibrecs
@type bibrecs: an iterable of ints
@return: none
'''
if bconfig.RABBIT_USE_CACHED_PID:
PID_NAMES_CACHE = get_name_string_to_pid_dictionary()
def find_pids_by_exact_names_cache(name):
try:
return zip(PID_NAMES_CACHE[name])
except KeyError:
return []
def add_signature_using_names_cache(sig, name, pid):
try:
PID_NAMES_CACHE[name].add(pid)
except KeyError:
PID_NAMES_CACHE[name] = set([pid])
_add_signature(sig, name, pid)
def new_person_from_signature_using_names_cache(sig, name):
pid = get_new_personid()
add_signature_using_names_cache(sig, name, pid)
return pid
add_signature = add_signature_using_names_cache
new_person_from_signature = new_person_from_signature_using_names_cache
find_pids_by_exact_name = find_pids_by_exact_names_cache
else:
add_signature = _add_signature
new_person_from_signature = _new_person_from_signature
find_pids_by_exact_name = _find_pids_by_exact_name
compare_names = cached_sym(lambda x: x)(comp_names)
# fast assign threshold
threshold = 0.80
if not bibrecs or check_invalid_papers:
all_bibrecs = get_all_valid_bibrecs()
if not bibrecs:
bibrecs = all_bibrecs
if check_invalid_papers:
filter_bibrecs_outside(all_bibrecs)
if (bconfig.RABBIT_USE_CACHED_GET_GROUPED_RECORDS and
len(bibrecs) > bconfig.RABBIT_USE_CACHED_GET_GROUPED_RECORDS_THRESHOLD):
populate_partial_marc_caches()
SWAPPED_GET_GROUPED_RECORDS = True
else:
SWAPPED_GET_GROUPED_RECORDS = False
updated_pids = set()
deleted = frozenset(p[0] for p in get_deleted_papers())
for idx, rec in enumerate(bibrecs):
task_sleep_now_if_required(True)
update_status(float(idx) / len(bibrecs), "%d/%d current: %d" % (idx, len(bibrecs), rec))
if rec in deleted:
delete_paper_from_personid(rec)
continue
markrefs = frozenset(chain(izip(cycle([100]), imap(itemgetter(0), get_authors_from_paper(rec))),
izip(cycle([700]), imap(itemgetter(0), get_coauthors_from_paper(rec)))))
personid_rows = [map(int, row[:3]) + [row[4]] for row in get_signatures_from_rec(rec)]
personidrefs_names = dict(((row[1], row[2]), row[3]) for row in personid_rows)
personidrefs = frozenset(personidrefs_names.keys())
new_signatures = list(markrefs - personidrefs)
old_signatures = list(personidrefs - markrefs)
new_signatures_names = dict((new, create_normalized_name(split_name_parts(get_name_by_bibrecref(new))))
for new in new_signatures)
# matrix |new_signatures| X |old_signatures|
matrix = [[compare_names(new_signatures_names[new], personidrefs_names[old])
for old in old_signatures] for new in new_signatures]
# [(new_signatures, old_signatures)]
best_match = [(new_signatures[new], old_signatures[old])
for new, old, score in maximized_mapping(matrix) if score > threshold]
for new, old in best_match:
modify_signature(old, rec, new, new_signatures_names[new])
remove_sigs(tuple(list(old) + [rec]) for old in old_signatures)
not_matched = frozenset(new_signatures) - frozenset(map(itemgetter(0), best_match))
if not_matched:
used_pids = set(r[0] for r in personid_rows)
for sig in not_matched:
name = new_signatures_names[sig]
matched_pids = []
if USE_EXT_IDS:
if USE_INSPIREID:
inspire_id = get_inspire_id(sig + (rec,))
if inspire_id:
matched_pids = list(get_person_with_extid(inspire_id[0]))
if matched_pids:
add_signature(list(sig) + [rec], name, matched_pids[0][0])
updated_pids.add(matched_pids[0][0])
continue
matched_pids = find_pids_by_exact_name(name)
matched_pids = [p for p in matched_pids if int(p[0]) not in used_pids]
if not matched_pids:
new_pid = new_person_from_signature(list(sig) + [rec], name)
used_pids.add(new_pid)
updated_pids.add(new_pid)
else:
add_signature(list(sig) + [rec], name, matched_pids[0][0])
used_pids.add(matched_pids[0][0])
updated_pids.add(matched_pids[0][0])
update_status_final()
if personids_to_update_extids:
updated_pids |= personids_to_update_extids
if updated_pids: # an empty set will update all canonical_names
update_personID_canonical_names(updated_pids)
update_personID_external_ids(updated_pids, limit_to_claimed_papers=bconfig.LIMIT_EXTERNAL_IDS_COLLECTION_TO_CLAIMED_PAPERS)
if SWAPPED_GET_GROUPED_RECORDS:
destroy_partial_marc_caches()
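The matching step in rabbit() builds a |new| x |old| score matrix and keeps assignments above the 0.80 threshold via maximized_mapping, which solves an assignment problem. A simple greedy stand-in (not the actual algorithm, which maximizes the total score) illustrates the thresholded one-to-one matching idea on hypothetical scores:

```python
def greedy_threshold_match(matrix, threshold=0.80):
    # pick the highest-scoring pairs first, never reusing a row or column
    pairs = sorted(((score, i, j)
                    for i, row in enumerate(matrix)
                    for j, score in enumerate(row)),
                   reverse=True)
    used_rows, used_cols, matches = set(), set(), []
    for score, i, j in pairs:
        if score > threshold and i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            matches.append((i, j, score))
    return matches

# rows: new signatures, columns: old signatures (scores are made up)
scores = [[0.95, 0.10],
          [0.20, 0.85]]
print(greedy_threshold_match(scores))  # [(0, 0, 0.95), (1, 1, 0.85)]
```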
diff --git a/invenio/legacy/bibauthorid/scheduler.py b/invenio/legacy/bibauthorid/scheduler.py
index 55ee8505a..119bec1b9 100644
--- a/invenio/legacy/bibauthorid/scheduler.py
+++ b/invenio/legacy/bibauthorid/scheduler.py
@@ -1,166 +1,166 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import re
import os
import sys
from itertools import dropwhile, chain
-from invenio.bibauthorid_general_utils import print_tortoise_memory_log
-from invenio import bibauthorid_config as bconfig
-from invenio.bibauthorid_general_utils import is_eq, update_status, update_status_final
+from invenio.legacy.bibauthorid.general_utils import print_tortoise_memory_log
+from invenio.legacy.bibauthorid import config as bconfig
+from invenio.legacy.bibauthorid.general_utils import is_eq, update_status, update_status_final
#python2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_all as all
+from invenio.legacy.bibauthorid.general_utils import bai_all as all
def to_number(stry):
return int(re.sub(r"\D", "", stry))
def dict_by_file(fpath):
fp = open(fpath)
content = fp.read()
fp.close()
return dict(x.split(':') for x in content.split("\n")[:-1])
def get_free_memory():
mem = dict_by_file("/proc/meminfo")
return sum(map(to_number, (mem['MemFree'], mem['Buffers'], mem['Cached'])))
def get_total_memory():
mem = dict_by_file("/proc/meminfo")
return to_number(mem['MemTotal'])
def get_peak_mem():
pid = os.getpid()
mem = dict_by_file("/proc/%d/status" % pid)
return map(to_number, (mem["VmPeak"], mem["VmHWM"]))
#matrix_coefs = [1133088., 0., 1.5]
#wedge_coefs = [800000., 0., 2.]
matrix_coefs = [1000., 500., 0.01]
wedge_coefs = [1000., 500., 0.02]
def get_biggest_job_below(lim, arr):
return dropwhile(lambda x: x[1] < lim, enumerate(chain(arr, [lim]))).next()[0] - 1
def get_cores_count():
import multiprocessing
return multiprocessing.cpu_count()
def schedule(jobs, sizs, estimator, memfile_path=None):
if bconfig.DEBUG_PROCESS_PEAK_MEMORY and memfile_path:
def register_memory_usage():
pid = os.getpid()
peak = get_peak_mem()
fp = open(memfile_path, 'a')
print_tortoise_memory_log(
{'pid' : pid,
'peak1': peak[0],
'peak2': peak[1],
'est' : sizs[idx],
'bibs' : bibs[idx]
},
fp
)
fp.close()
else:
def register_memory_usage():
pass
def run_job(idx):
try:
sys.stdout = output_killer
jobs[idx]()
register_memory_usage()
os._exit(os.EX_OK)
except Exception, e:
f = open('/tmp/exception-%s' % str(os.getpid()), "w")
f.write(str(e) + '\n')
f.close()
os._exit(os.EX_SOFTWARE)
max_workers = get_cores_count()
pid_2_idx = {}
#free = get_free_memory()
initial = get_total_memory()
free = initial
output_killer = open(os.devnull, 'w')
ret_status = [None] * len(jobs)
bibs = sizs
sizs = map(estimator, sizs)
free_idxs = range(len(jobs))
assert len(jobs) == len(sizs) == len(ret_status) == len(bibs) == len(free_idxs)
done = 0.
total = sum(sizs)
biggest = max(sizs)
update_status(0., "0 / %d" % len(jobs))
too_big = [idx for idx in free_idxs if sizs[idx] > free]
for idx in too_big:
pid = os.fork()
if pid == 0: # child
run_job(idx)
else: # parent
done += sizs[idx]
free_idxs.remove(idx) # remove by value: positions shift after each deletion
cpid, status = os.wait()
update_status(done / total, "%d / %d" % (len(jobs) - len(free_idxs), len(jobs)))
ret_status[idx] = status
assert cpid == pid
while free_idxs or pid_2_idx:
while len(pid_2_idx) < max_workers:
idx = get_biggest_job_below(free, (sizs[idx] for idx in free_idxs))
if idx != -1:
job_idx = free_idxs[idx]
pid = os.fork()
if pid == 0: # child
os.nice(int(float(sizs[job_idx]) * 20.0 / biggest))
run_job(job_idx)
else: # parent
pid_2_idx[pid] = job_idx
assert free > sizs[job_idx]
free -= sizs[job_idx]
del free_idxs[idx]
else:
break
pid, status = os.wait()
assert pid in pid_2_idx
idx = pid_2_idx[pid]
freed = sizs[idx]
done += freed
ret_status[idx] = status
free += freed
del pid_2_idx[pid]
update_status(done / total, "%d / %d" % (len(jobs) - len(free_idxs) - len(pid_2_idx), len(jobs)))
update_status_final("%d / %d" % (len(jobs), len(jobs)))
assert is_eq(free, initial)
assert not pid_2_idx
assert not free_idxs
assert len(jobs) == len(sizs) == len(ret_status) == len(bibs)
assert all(stat is not None for stat in ret_status)
return ret_status
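The core of `schedule()` is the greedy pick in `get_biggest_job_below`: among the pending jobs (fed smallest-first), choose the largest one whose estimated memory footprint still fits in `free`. A self-contained Python 3 restatement of that helper (the original uses the Python 2 `.next()` spelling):

```python
from itertools import chain, dropwhile

def biggest_job_below(limit, sizes):
    # Index of the last size strictly below `limit`, or -1 if none fits.
    # Appending `limit` itself guarantees dropwhile stops, so the element
    # just before the stopping point is the biggest job that still fits
    # (assuming `sizes` is sorted ascending, as in the scheduler).
    return next(dropwhile(lambda x: x[1] < limit,
                          enumerate(chain(sizes, [limit]))))[0] - 1
```

For example, with sizes `[1, 2, 3]` and a limit of `2`, only the job at index 0 fits, and with a limit of `1` nothing does (`-1`).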
diff --git a/invenio/legacy/bibauthorid/searchinterface.py b/invenio/legacy/bibauthorid/searchinterface.py
index ce24d4b62..ea0abe4bb 100644
--- a/invenio/legacy/bibauthorid/searchinterface.py
+++ b/invenio/legacy/bibauthorid/searchinterface.py
@@ -1,25 +1,25 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
This file contains the functions used by the search engine
to extract information about the authors.
'''
-from invenio.bibauthorid_dbinterface import get_person_bibrecs #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_personids_from_bibrec #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_person_bibrecs #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_personids_from_bibrec #emitting #pylint: disable-msg=W0611
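The scheduler module above obtains its memory figures by flattening `/proc/meminfo` into a dict and stripping units from each value. A sketch of the same parsing fed a literal string (hypothetical `dict_by_text` instead of the file-path-based `dict_by_file`, so it runs on any platform):

```python
import re

def to_number(stry):
    # /proc/meminfo values look like "MemFree:  123456 kB"; strip every
    # non-digit character and parse what remains.
    return int(re.sub(r"\D", "", stry))

def dict_by_text(content):
    # Same shape as the scheduler's dict_by_file(), but fed a string
    # instead of a file path.
    return dict(x.split(':') for x in content.split("\n")[:-1])

sample = "MemTotal: 2048 kB\nMemFree: 1024 kB\nBuffers: 512 kB\nCached: 256 kB\n"
mem = dict_by_text(sample)
# Mirrors get_free_memory(): free + buffers + cached.
free = sum(map(to_number, (mem['MemFree'], mem['Buffers'], mem['Cached'])))
```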
diff --git a/invenio/legacy/bibauthorid/templates.py b/invenio/legacy/bibauthorid/templates.py
index ce7428627..e2252379e 100644
--- a/invenio/legacy/bibauthorid/templates.py
+++ b/invenio/legacy/bibauthorid/templates.py
@@ -1,1938 +1,1938 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Bibauthorid HTML templates"""
# pylint: disable=W0105
# pylint: disable=C0301
#from cgi import escape
#from urllib import quote
#
-import invenio.bibauthorid_config as bconfig
+import invenio.legacy.bibauthorid.config as bconfig
from invenio.config import CFG_SITE_LANG
from invenio.config import CFG_SITE_URL
from invenio.config import CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL
from invenio.modules.formatter import format_record
from invenio.legacy.bibrecord import get_fieldvalues
-from invenio.bibauthorid_config import EXTERNAL_SYSTEMS_LIST
+from invenio.legacy.bibauthorid.config import EXTERNAL_SYSTEMS_LIST
from invenio.bibauthorid_webapi import get_person_redirect_link, get_canonical_id_from_person_id, get_person_names_from_id
from invenio.bibauthorid_webapi import get_personiID_external_ids
from invenio.bibauthorid_frontinterface import get_uid_from_personid
from invenio.bibauthorid_frontinterface import get_bibrefrec_name_string
from invenio.bibauthorid_frontinterface import get_canonical_id_from_personid
from invenio.base.i18n import gettext_set_language, wash_language
from invenio.legacy.webuser import get_email
from invenio.utils.html import escape_html
#from invenio.utils.text import encode_for_xml
from flask import session
class Template:
"""Templating functions used by aid"""
def __init__(self, language=CFG_SITE_LANG):
"""Set defaults for all aid template output"""
self.language = language
self._ = gettext_set_language(wash_language(language))
def tmpl_person_detail_layout(self, content):
'''
Writes HTML content into the person CSS container
@param content: HTML content
@type content: string
@return: HTML code
@rtype: string
'''
html = []
h = html.append
h('<div id="aid_person">')
h(content)
h('</div>')
return "\n".join(html)
def tmpl_transaction_box(self, teaser_key, messages, show_close_btn=True):
'''
Creates a notification box based on the jQuery UI style
@param teaser_key: key to a dict which returns the teaser
@type teaser_key: string
@param messages: list of keys to a dict which return the message to display in the box
@type messages: list of strings
@param show_close_btn: display close button [x]
@type show_close_btn: boolean
@return: HTML code
@rtype: string
'''
transaction_teaser_dict = { 'success': 'Success!',
'failure': 'Failure!' }
transaction_message_dict = { 'confirm_success': '%s transaction%s successfully executed.',
'confirm_failure': '%s transaction%s failed. The system may have been updating during your operation. Please try again or contact %s to obtain help.',
'reject_success': '%s transaction%s successfully executed.',
'reject_failure': '%s transaction%s failed. The system may have been updating during your operation. Please try again or contact %s to obtain help.',
'reset_success': '%s transaction%s successfully executed.',
'reset_failure': '%s transaction%s failed. The system may have been updating during your operation. Please try again or contact %s to obtain help.' }
teaser = self._(transaction_teaser_dict[teaser_key])
html = []
h = html.append
for key in transaction_message_dict.keys():
same_kind = [mes for mes in messages if mes == key]
trans_no = len(same_kind)
if trans_no == 0:
continue
elif trans_no == 1:
args = [trans_no, '']
else:
args = [trans_no, 's']
color = ''
if teaser_key == 'failure':
color = 'background: #FC2626;'
args.append(CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL)
message = self._(transaction_message_dict[key] % tuple(args))
h('<div id="aid_notification_' + key + '" class="ui-widget ui-alert">')
h(' <div style="%s margin-top: 20px; padding: 0pt 0.7em;" class="ui-state-highlight ui-corner-all">' % (color))
h(' <p><span style="float: left; margin-right: 0.3em;" class="ui-icon ui-icon-info"></span>')
h(' <strong>%s</strong> %s' % (teaser, message))
if show_close_btn:
h(' <span style="float:right; margin-right: 0.3em;"><a rel="nofollow" href="#" class="aid_close-notify" style="border-style: none;">X</a></span></p>')
h(' </div>')
h('</div>')
return "\n".join(html)
def tmpl_notification_box(self, teaser_key, message_key, bibrefs, show_close_btn=True):
'''
Creates a notification box based on the jQuery UI style
@param teaser_key: key to a dict which returns the teaser
@type teaser_key: string
@param message_key: key to a dict which returns the message to display in the box
@type message_key: string
@param bibrefs: bibrefs which are about to be assigned
@type bibrefs: list of strings
@param show_close_btn: display close button [x]
@type show_close_btn: boolean
@return: HTML code
@rtype: string
'''
notification_teaser_dict = {'info': 'Info!' }
notification_message_dict = {'attribute_papers': 'You are about to attribute the following paper%s:' }
teaser = self._(notification_teaser_dict[teaser_key])
arg = ''
if len(bibrefs) > 1:
arg = 's'
message = self._(notification_message_dict[message_key] % (arg) )
html = []
h = html.append
h('<div id="aid_notification_' + teaser_key + '" class="ui-widget ui-alert">')
h(' <div style="margin-top: 20px; padding: 0pt 0.7em;" class="ui-state-highlight ui-corner-all">')
h(' <p><span style="float: left; margin-right: 0.3em;" class="ui-icon ui-icon-info"></span>')
h(' <strong>%s</strong> %s' % (teaser, message))
h("<ul>")
for paper in bibrefs:
if ',' in paper:
pbibrec = paper.split(',')[1]
else:
pbibrec = paper
h("<li>%s</li>" % (format_record(int(pbibrec), "ha")))
h("</ul>")
if show_close_btn:
h(' <span style="float:right; margin-right: 0.3em;"><a rel="nofollow" href="#" class="aid_close-notify">X</a></span></p>')
h(' </div>')
h('</div>')
return "\n".join(html)
def tmpl_error_box(self, teaser_key, message_key, show_close_btn=True):
'''
Creates an error box based on the jQuery UI style
@param teaser_key: key to a dict which returns the teaser
@type teaser_key: string
@param message_key: key to a dict which returns the message to display in the box
@type message_key: string
@param show_close_btn: display close button [x]
@type show_close_btn: boolean
@return: HTML code
@rtype: string
'''
error_teaser_dict = {'sorry': 'Sorry.',
'error': 'Error:' }
error_message_dict = {'check_entries': 'Please check your entries.',
'provide_transaction': 'Please provide at least one transaction.' }
teaser = self._(error_teaser_dict[teaser_key])
message = self._(error_message_dict[message_key])
html = []
h = html.append
h('<div id="aid_notification_' + teaser_key + '" class="ui-widget ui-alert">')
h(' <div style="background: #FC2626; margin-top: 20px; padding: 0pt 0.7em;" class="ui-state-error ui-corner-all">')
h(' <p><span style="float: left; margin-right: 0.3em;" class="ui-icon ui-icon-alert"></span>')
h(' <strong>%s</strong> %s' % (teaser, message))
if show_close_btn:
h(' <span style="float:right; margin-right: 0.3em;"><a rel="nofollow" href="#" class="aid_close-notify">X</a></span></p>')
h(' </div>')
h('</div>')
return "\n".join(html)
def tmpl_ticket_box(self, teaser_key, message_key, trans_no, show_close_btn=True):
'''
Creates a semi-permanent box informing about ticket
status notifications
@param teaser_key: key to a dict which returns the teaser
@type teaser_key: string
@param message_key: key to a dict which returns the message to display in the box
@type message_key: string
@param trans_no: number of transactions in progress
@type trans_no: integer
@param show_close_btn: display close button [x]
@type show_close_btn: boolean
@return: HTML code
@rtype: string
'''
ticket_teaser_dict = {'in_process': 'Claim in process!' }
ticket_message_dict = {'transaction': 'There %s %s transaction%s in progress.' }
teaser = self._(ticket_teaser_dict[teaser_key])
if trans_no == 1:
args = ['is', trans_no, '']
else:
args = ['are', trans_no, 's']
message = self._(ticket_message_dict[message_key] % tuple(args))
html = []
h = html.append
h('<div id="aid_notification_' + teaser_key + '" class="ui-widget ui-alert">')
h(' <div style="margin-top: 20px; padding: 0pt 0.7em;" class="ui-state-highlight ui-corner-all">')
h(' <p><span style="float: left; margin-right: 0.3em;" class="ui-icon ui-icon-info"></span>')
h(' <strong>%s</strong> %s ' % (teaser, message))
h('<a rel="nofollow" id="checkout" href="action?checkout=True">' + self._('Click here to review the transactions.') + '</a>')
h('<br>')
if show_close_btn:
h(' <span style="float:right; margin-right: 0.3em;"><a rel="nofollow" href="#" class="aid_close-notify">X</a></span></p>')
h(' </div>')
h('</div>')
return "\n".join(html)
def tmpl_search_ticket_box(self, teaser_key, message_key, bibrefs, show_close_btn=False):
'''
Creates a box informing about a claim in progress for
the search.
@param teaser_key: key to a dict which returns the teaser
@type teaser_key: string
@param message_key: key to a dict which returns the message to display in the box
@type message_key: string
@param bibrefs: bibrefs which are about to be assigned
@type bibrefs: list of strings
@param show_close_btn: display close button [x]
@type show_close_btn: boolean
@return: HTML code
@rtype: string
'''
error_teaser_dict = {'person_search': 'Person search for assignment in progress!' }
error_message_dict = {'assign_papers': 'You are searching for a person to assign the following paper%s:' }
teaser = self._(error_teaser_dict[teaser_key])
arg = ''
if len(bibrefs) > 1:
arg = 's'
message = self._(error_message_dict[message_key] % (arg) )
html = []
h = html.append
h('<div id="aid_notification_' + teaser_key + '" class="ui-widget ui-alert">')
h(' <div style="margin-top: 20px; padding: 0pt 0.7em;" class="ui-state-highlight ui-corner-all">')
h(' <p><span style="float: left; margin-right: 0.3em;" class="ui-icon ui-icon-info"></span>')
h(' <strong>%s</strong> %s ' % (teaser, message))
h("<ul>")
for paper in bibrefs:
if ',' in paper:
pbibrec = paper.split(',')[1]
else:
pbibrec = paper
h("<li>%s</li>"
% (format_record(int(pbibrec), "ha")))
h("</ul>")
h('<a rel="nofollow" id="checkout" href="action?cancel_search_ticket=True">' + self._('Quit searching.') + '</a>')
if show_close_btn:
h(' <span style="float:right; margin-right: 0.3em;"><a rel="nofollow" href="#" class="aid_close-notify">X</a></span></p>')
h(' </div>')
h('</div>')
h('<p>&nbsp;</p>')
return "\n".join(html)
def tmpl_meta_includes(self, kill_browser_cache=False):
'''
Generates HTML code for the header section of the document:
META tags to kill browser caching,
JavaScript includes,
CSS definitions.
@param kill_browser_cache: Do we want to kill the browser cache?
@type kill_browser_cache: boolean
'''
js_path = "%s/js" % CFG_SITE_URL
imgcss_path = "%s/img" % CFG_SITE_URL
result = []
# Add browser cache killer so that stale notifications are not
# displayed outside of the session.
if kill_browser_cache:
result = [
'<META HTTP-EQUIV="Pragma" CONTENT="no-cache">',
'<META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">',
'<META HTTP-EQUIV="Pragma-directive" CONTENT="no-cache">',
'<META HTTP-EQUIV="Cache-Directive" CONTENT="no-cache">',
'<META HTTP-EQUIV="Expires" CONTENT="0">']
scripts = ["jquery-ui.min.js",
"jquery.form.js",
"jquery.dataTables.min.js",
"bibauthorid.js"]
result.append('<link rel="stylesheet" type="text/css" href='
'"%s/jquery-ui/themes/smoothness/jquery-ui.css" />'
% (imgcss_path))
result.append('<link rel="stylesheet" type="text/css" href='
'"%s/datatables_jquery-ui.css" />'
% (imgcss_path))
result.append('<link rel="stylesheet" type="text/css" href='
'"%s/bibauthorid.css" />'
% (imgcss_path))
for script in scripts:
result.append('<script type="text/javascript" src="%s/%s">'
'</script>' % (js_path, script))
return "\n".join(result)
def tmpl_author_confirmed(self, bibref, pid, verbiage_dict={'alt_confirm':'Confirmed.',
'confirm_text':'This record assignment has been confirmed.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Repeal!',
'repeal_text':'Repeal record assignment',
'to_other_text':'Assign to another person',
'alt_to_other':'To other person!'
},
show_reset_button=True):
'''
Generate per-paper action links for the table for the
status "confirmed"
@param bibref: construct of unique ID for this author on this paper
@type bibref: string
@param pid: the Person ID
@type pid: int
@param verbiage_dict: language for the link descriptions
@type verbiage_dict: dict
'''
stri = ('<!--2!--><span id="aid_status_details"> '
'<img src="%(url)s/img/aid_check.png" alt="%(alt_confirm)s" />'
'%(confirm_text)s <br>')
if show_reset_button:
stri = stri + (
'<a rel="nofollow" id="aid_reset_gr" class="aid_grey" href="%(url)s/person/action?reset=True&selection=%(ref)s&pid=%(pid)s">'
'<img src="%(url)s/img/aid_reset_gray.png" alt="%(alt_forget)s" style="margin-left:22px;" />'
'%(forget_text)s</a><br>')
stri = stri + (
'<a rel="nofollow" id="aid_repeal" class="aid_grey" href="%(url)s/person/action?repeal=True&selection=%(ref)s&pid=%(pid)s">'
'<img src="%(url)s/img/aid_reject_gray.png" alt="%(alt_repeal)s" style="margin-left:22px;"/>'
'%(repeal_text)s</a><br>'
'<a rel="nofollow" id="aid_to_other" class="aid_grey" href="%(url)s/person/action?to_other_person=True&selection=%(ref)s">'
'<img src="%(url)s/img/aid_to_other_gray.png" alt="%(alt_to_other)s" style="margin-left:22px;"/>'
'%(to_other_text)s</a> </span>')
return (stri
% ({'url': CFG_SITE_URL, 'ref': bibref, 'pid': pid,
'alt_confirm':verbiage_dict['alt_confirm'],
'confirm_text':verbiage_dict['confirm_text'],
'alt_forget':verbiage_dict['alt_forget'],
'forget_text':verbiage_dict['forget_text'],
'alt_repeal':verbiage_dict['alt_repeal'],
'repeal_text':verbiage_dict['repeal_text'],
'to_other_text':verbiage_dict['to_other_text'],
'alt_to_other':verbiage_dict['alt_to_other']}))
def tmpl_author_repealed(self, bibref, pid, verbiage_dict={'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Rejected!',
'repeal_text':'Repeal this record assignment.',
'to_other_text':'Assign to another person',
'alt_to_other':'To other person!'
} ):
'''
Generate per-paper action links for the table for the
status "repealed"
@param bibref: construct of unique ID for this author on this paper
@type bibref: string
@param pid: the Person ID
@type pid: int
@param verbiage_dict: language for the link descriptions
@type verbiage_dict: dict
'''
stri = ('<!---2!--><span id="aid_status_details"> '
'<img src="%(url)s/img/aid_reject.png" alt="%(alt_repeal)s" />'
'%(repeal_text)s <br>'
'<a rel="nofollow" id="aid_confirm" class="aid_grey" href="%(url)s/person/action?confirm=True&selection=%(ref)s&pid=%(pid)s">'
'<img src="%(url)s/img/aid_check_gray.png" alt="%(alt_confirm)s" style="margin-left: 22px;" />'
'%(confirm_text)s</a><br>'
'<a rel="nofollow" id="aid_to_other" class="aid_grey" href="%(url)s/person/action?to_other_person=True&selection=%(ref)s">'
'<img src="%(url)s/img/aid_to_other_gray.png" alt="%(alt_to_other)s" style="margin-left:22px;"/>'
'%(to_other_text)s</a> </span>')
return (stri
% ({'url': CFG_SITE_URL, 'ref': bibref, 'pid': pid,
'alt_confirm':verbiage_dict['alt_confirm'],
'confirm_text':verbiage_dict['confirm_text'],
'alt_forget':verbiage_dict['alt_forget'],
'forget_text':verbiage_dict['forget_text'],
'alt_repeal':verbiage_dict['alt_repeal'],
'repeal_text':verbiage_dict['repeal_text'],
'to_other_text':verbiage_dict['to_other_text'],
'alt_to_other':verbiage_dict['alt_to_other']}))
def tmpl_author_undecided(self, bibref, pid, verbiage_dict={'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_repeal':'Rejected!',
'repeal_text':'This record has been repealed.',
'to_other_text':'Assign to another person',
'alt_to_other':'To other person!'
},
show_reset_button=True):
'''
Generate per-paper action links for the table for the
status "no decision taken yet"
@param bibref: construct of unique ID for this author on this paper
@type bibref: string
@param pid: the Person ID
@type pid: int
@param verbiage_dict: language for the link descriptions
@type verbiage_dict: dict
'''
#batchprocess?mconfirm=True&bibrefs=['100:17,16']&pid=1
string = ('<!--0!--><span id="aid_status_details"> '
'<a rel="nofollow" id="aid_confirm" href="%(url)s/person/action?confirm=True&selection=%(ref)s&pid=%(pid)s">'
'<img src="%(url)s/img/aid_check.png" alt="%(alt_confirm)s" />'
'%(confirm_text)s</a><br />'
'<a rel="nofollow" id="aid_repeal" href="%(url)s/person/action?repeal=True&selection=%(ref)s&pid=%(pid)s">'
'<img src="%(url)s/img/aid_reject.png" alt="%(alt_repeal)s" />'
'%(repeal_text)s</a> <br />'
'<a rel="nofollow" id="aid_to_other" href="%(url)s/person/action?to_other_person=True&selection=%(ref)s">'
'<img src="%(url)s/img/aid_to_other.png" alt="%(alt_to_other)s" />'
'%(to_other_text)s</a> </span>')
return (string
% ({'url': CFG_SITE_URL, 'ref': bibref, 'pid': pid,
'alt_confirm':verbiage_dict['alt_confirm'],
'confirm_text':verbiage_dict['confirm_text'],
'alt_repeal':verbiage_dict['alt_repeal'],
'repeal_text':verbiage_dict['repeal_text'],
'to_other_text':verbiage_dict['to_other_text'],
'alt_to_other':verbiage_dict['alt_to_other']}))
def tmpl_open_claim(self, bibrefs, pid, last_viewed_pid,
search_enabled=True):
'''
Generate entry page for "claim or attribute this paper"
@param bibrefs: constructs of unique IDs for the authors on these papers
@type bibrefs: list of strings
@param pid: the Person ID
@type pid: int
@param last_viewed_pid: (ID, name) of the last person subject to an action
@type last_viewed_pid: tuple
'''
t_html = []
h = t_html.append
h(self.tmpl_notification_box('info', 'attribute_papers', bibrefs, show_close_btn=False))
h('<p> ' + self._('Your options') + ': </p>')
bibs = ''
for paper in bibrefs:
if bibs:
bibs = bibs + '&'
bibs = bibs + 'selection=' + str(paper)
if pid > -1:
h('<a rel="nofollow" id="clam_for_myself" href="%s/person/action?confirm=True&%s&pid=%s"> ' % (CFG_SITE_URL, bibs, str(pid)) )
h(self._('Claim for yourself') + ' </a> <br>')
if last_viewed_pid:
h('<a rel="nofollow" id="clam_for_last_viewed" href="%s/person/action?confirm=True&%s&pid=%s"> ' % (CFG_SITE_URL, bibs, str(last_viewed_pid[0])) )
h(self._('Attribute to') + ' %s </a> <br>' % (last_viewed_pid[1]) )
if search_enabled:
h('<a rel="nofollow" id="claim_search" href="%s/person/action?to_other_person=True&%s"> ' % (CFG_SITE_URL, bibs))
h(self._('Search for a person to attribute the paper to') + ' </a> <br>')
return "\n".join(t_html)
def __tmpl_admin_records_table(self, form_id, person_id, bibrecids, verbiage_dict={'no_doc_string':'Sorry, there are currently no documents to be found in this category.',
'b_confirm':'Confirm',
'b_repeal':'Repeal',
'b_to_others':'Assign to other person',
'b_forget':'Forget decision'},
buttons_verbiage_dict={'mass_buttons':{'no_doc_string':'Sorry, there are currently no documents to be found in this category.',
'b_confirm':'Confirm',
'b_repeal':'Repeal',
'b_to_others':'Assign to other person',
'b_forget':'Forget decision'},
'record_undecided':{'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_repeal':'Rejected!',
'repeal_text':'This record has been repealed.'},
'record_confirmed':{'alt_confirm':'Confirmed.',
'confirm_text':'This record assignment has been confirmed.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Repeal!',
'repeal_text':'Repeal record assignment'},
'record_repealed':{'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Rejected!',
'repeal_text':'Repeal this record assignment.'}},
show_reset_button=True):
'''
Generate the big tables for the person overview page
@param form_id: name of the form
@type form_id: string
@param person_id: Person ID
@type person_id: int
@param bibrecids: List of records to display
@type bibrecids: list
@param verbiage_dict: language for the elements
@type verbiage_dict: dict
@param buttons_verbiage_dict: language for the buttons
@type buttons_verbiage_dict: dict
'''
no_papers_html = ['<div style="text-align:left;margin-top:1em;"><strong>']
no_papers_html.append('%s' % self._(verbiage_dict['no_doc_string']) )
no_papers_html.append('</strong></div>')
if not bibrecids or not person_id:
return "\n".join(no_papers_html)
pp_html = []
h = pp_html.append
h('<form id="%s" action="/person/action" method="post">'
% (form_id))
h('<div class="aid_reclist_selector">') #+self._(' On all pages: '))
h('<a rel="nofollow" rel="group_1" href="#select_all">' + self._('Select All') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#select_none">' + self._('Select None') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#invert_selection">' + self._('Invert Selection') + '</a> | ')
h('<a rel="nofollow" id="toggle_claimed_rows" href="javascript:toggle_claimed_rows();" '
'alt="hide">' + self._('Hide successful claims') + '</a>')
h('</div>')
h('<div class="aid_reclist_buttons">')
h(('<img src="%s/img/aid_90low_right.png" alt="∟" />')
% (CFG_SITE_URL))
h('<input type="hidden" name="pid" value="%s" />' % (person_id))
h('<input type="submit" name="confirm" value="%s" class="aid_btn_blue" />' % self._(verbiage_dict['b_confirm']) )
h('<input type="submit" name="repeal" value="%s" class="aid_btn_blue" />' % self._(verbiage_dict['b_repeal']) )
h('<input type="submit" name="to_other_person" value="%s" class="aid_btn_blue" />' % self._(verbiage_dict['b_to_others']) )
#if show_reset_button:
# h('<input type="submit" name="reset" value="%s" class="aid_btn_blue" />' % verbiage_dict['b_forget'])
h(" </div>")
h('<table class="paperstable" cellpadding="3" width="100%">')
h("<thead>")
h(" <tr>")
h(' <th>&nbsp;</th>')
h(' <th>' + self._('Paper Short Info') + '</th>')
h(' <th>' + self._('Author Name') + '</th>')
h(' <th>' + self._('Affiliation') + '</th>')
h(' <th>' + self._('Date') + '</th>')
h(' <th>' + self._('Experiment') + '</th>')
h(' <th>' + self._('Actions') + '</th>')
h(' </tr>')
h('</thead>')
h('<tbody>')
for idx, paper in enumerate(bibrecids):
h(' <tr style="padding-top: 6px; padding-bottom: 6px;">')
h(' <td><input type="checkbox" name="selection" '
'value="%s" /> </td>' % (paper['bibref']))
rec_info = format_record(int(paper['recid']), "ha")
rec_info = str(idx + 1) + '. ' + rec_info
h(" <td>%s</td>" % (rec_info))
h(" <td>%s</td>" % (paper['authorname']))
aff = ""
if paper['authoraffiliation']:
aff = paper['authoraffiliation']
else:
aff = self._("Not assigned")
h(" <td>%s</td>" % (aff))
if paper['paperdate']:
pdate = paper['paperdate']
else:
pdate = 'N.A.'
h(" <td>%s</td>" % pdate)
if paper['paperexperiment']:
pexp = paper['paperexperiment']
else:
pexp = 'N.A.'
h(" <td>%s</td>" % pexp)
paper_status = self._("No status information found.")
if paper['flag'] == 2:
paper_status = self.tmpl_author_confirmed(paper['bibref'], person_id,
verbiage_dict=buttons_verbiage_dict['record_confirmed'],
show_reset_button=show_reset_button)
elif paper['flag'] == -2:
paper_status = self.tmpl_author_repealed(paper['bibref'], person_id,
verbiage_dict=buttons_verbiage_dict['record_repealed'])
else:
paper_status = self.tmpl_author_undecided(paper['bibref'], person_id,
verbiage_dict=buttons_verbiage_dict['record_undecided'],
show_reset_button=show_reset_button)
h(' <td><div id="bibref%s" style="float:left"><!--%s!-->%s &nbsp;</div>'
% (paper['bibref'], paper['flag'], paper_status))
if 'rt_status' in paper and paper['rt_status']:
h('<img src="%s/img/aid_operator.png" title="%s" '
'alt="actions pending" style="float:right" '
'height="24" width="24" />'
% (CFG_SITE_URL, self._("Operator review of user actions pending")))
h(' </td>')
h(" </tr>")
h(" </tbody>")
h("</table>")
h('<div class="aid_reclist_selector">') #+self._(' On all pages: '))
h('<a rel="nofollow" rel="group_1" href="#select_all">' + self._('Select All') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#select_none">' + self._('Select None') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#invert_selection">' + self._('Invert Selection') + '</a> | ')
h('<a rel="nofollow" id="toggle_claimed_rows" href="javascript:toggle_claimed_rows();" '
'alt="hide">' + self._('Hide successful claims') + '</a>')
h('</div>')
h('<div class="aid_reclist_buttons">')
h(('<img src="%s/img/aid_90low_right.png" alt="∟" />')
% (CFG_SITE_URL))
h('<input type="hidden" name="pid" value="%s" />' % (person_id))
h('<input type="submit" name="confirm" value="%s" class="aid_btn_blue" />' % verbiage_dict['b_confirm'])
h('<input type="submit" name="repeal" value="%s" class="aid_btn_blue" />' % verbiage_dict['b_repeal'])
h('<input type="submit" name="to_other_person" value="%s" class="aid_btn_blue" />' % verbiage_dict['b_to_others'])
#if show_reset_button:
# h('<input type="submit" name="reset" value="%s" class="aid_btn_blue" />' % verbiage_dict['b_forget'])
h(" </div>")
h("</form>")
return "\n".join(pp_html)
def __tmpl_reviews_table(self, person_id, bibrecids, admin=False):
'''
Generate the table for potential reviews.
@param person_id: Person ID
@type person_id: int
@param bibrecids: List of records to display
@type bibrecids: list
@param admin: Show admin functions
@type admin: boolean
'''
no_papers_html = ['<div style="text-align:left;margin-top:1em;"><strong>']
no_papers_html.append(self._('Sorry, there are currently no records to be found in this category.'))
no_papers_html.append('</strong></div>')
if not bibrecids or not person_id:
return "\n".join(no_papers_html)
pp_html = []
h = pp_html.append
h('<form id="review" action="/person/batchprocess" method="post">')
h('<table class="reviewstable" cellpadding="3" width="100%">')
h(' <thead>')
h(' <tr>')
h(' <th>&nbsp;</th>')
h(' <th>' + self._('Paper Short Info') + '</th>')
h(' <th>' + self._('Actions') + '</th>')
h(' </tr>')
h(' </thead>')
h(' <tbody>')
for paper in bibrecids:
h(' <tr>')
h(' <td><input type="checkbox" name="selected_bibrecs" '
'value="%s" /> </td>' % (paper))
rec_info = format_record(int(paper[0]), "ha")
if not admin:
rec_info = rec_info.replace("person/search?q=", "author/")
h(" <td>%s</td>" % (rec_info))
h((' <td><a rel="nofollow" href="/person/batchprocess?selected_bibrecs=%s&mfind_bibref=claim">'
% (paper)) + self._('Review Transaction') + '</a></td>')
h(" </tr>")
h(" </tbody>")
h("</table>")
h('<div style="text-align:left;"> ' + self._('On all pages') + ': ')
h('<a rel="nofollow" rel="group_1" href="#select_all">' + self._('Select All') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#select_none">' + self._('Select None') + '</a> | ')
h('<a rel="nofollow" rel="group_1" href="#invert_selection">' + self._('Invert Selection') + '</a>')
h('</div>')
h('<div style="vertical-align:middle;">')
h('∟ ' + self._('With selected do') + ': ')
h('<input type="hidden" name="pid" value="%s" />' % (person_id))
h('<input type="hidden" name="mfind_bibref" value="claim" />')
h('<input type="submit" name="submit" value="Review selected transactions" />')
h(" </div>")
h('</form>')
return "\n".join(pp_html)
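The templates in this class build markup by aliasing `list.append` to `h` and joining once at the end. A minimal standalone sketch of that pattern (the `render_rows` helper is illustrative, not part of this module), with escaping added for untrusted values:

```python
from xml.sax.saxutils import escape

def render_rows(titles):
    # Append fragments to a list and join once: cheaper than repeated
    # string concatenation, which is the idiom used throughout this file.
    html = []
    h = html.append
    h('<ul>')
    for title in titles:
        # escape() guards against markup injection from record data
        h('  <li>%s</li>' % escape(title))
    h('</ul>')
    return "\n".join(html)
```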
def tmpl_admin_person_info_box(self, ln, person_id= -1, names=[]):
'''
Generate the box showing names
@param ln: the language to use
@type ln: string
@param person_id: Person ID
@type person_id: int
@param names: List of names to display
@type names: list
'''
html = []
h = html.append
if not ln:
pass
#class="ui-tabs ui-widget ui-widget-content ui-corner-all">
h('<div id="aid_person_names">')
h('<p><strong>' + self._('Name variants') + ':</strong></p>')
h("<p>")
h('<!--<span class="aid_lowlight_text">Person ID: <span id="pid%s">%s</span></span><br />!-->'
% (person_id, person_id))
for name in names:
# h(("%s "+self._('as appeared on')+" %s"+self._(' records')+"<br />")
# % (name[0], name[1]))
h(("%s (%s); ")
% (name[0], name[1]))
h("</p>")
h("</div>")
return "\n".join(html)
def tmpl_admin_tabs(self, ln=CFG_SITE_LANG, person_id= -1,
rejected_papers=[],
rest_of_papers=[],
review_needed=[],
rt_tickets=[],
open_rt_tickets=[],
show_tabs=['records', 'repealed', 'review', 'comments', 'tickets', 'data'],
show_reset_button=True,
ticket_links=['delete', 'commit', 'del_entry', 'commit_entry'],
verbiage_dict={'confirmed':'Records', 'repealed':'Not this person\'s records',
'review':'Records in need of review',
'tickets':'Open Tickets', 'data':'Data',
'confirmed_ns':'Papers of this Person',
'repealed_ns':'Papers _not_ of this Person',
'review_ns':'Papers in need of review',
'tickets_ns':'Tickets for this Person',
'data_ns':'Additional Data for this Person'},
buttons_verbiage_dict={'mass_buttons':{'no_doc_string':'Sorry, there are currently no documents to be found in this category.',
'b_confirm':'Confirm',
'b_repeal':'Repeal',
'b_to_others':'Assign to other person',
'b_forget':'Forget decision'},
'record_undecided':{'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_repeal':'Rejected!',
'repeal_text':'This record has been repealed.'},
'record_confirmed':{'alt_confirm':'Confirmed.',
'confirm_text':'This record assignment has been confirmed.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Repeal!',
'repeal_text':'Repeal record assignment'},
'record_repealed':{'alt_confirm':'Confirm!',
'confirm_text':'Confirm record assignment.',
'alt_forget':'Forget decision!',
'forget_text':'Forget assignment decision',
'alt_repeal':'Rejected!',
'repeal_text':'Repeal this record assignment.'}}):
'''
Generate the tabs for the person overview page
@param ln: the language to use
@type ln: string
@param person_id: Person ID
@type person_id: int
@param rejected_papers: list of repealed papers
@type rejected_papers: list
@param rest_of_papers: list of attributed or undecided papers
@type rest_of_papers: list
@param review_needed: list of papers that need a review (choose name)
@type review_needed: list
@param rt_tickets: list of tickets for this Person
@type rt_tickets: list
@param open_rt_tickets: list of open request tickets
@type open_rt_tickets: list
@param show_tabs: list of tabs to display
@type show_tabs: list of strings
@param ticket_links: list of links to display
@type ticket_links: list of strings
@param verbiage_dict: language for the elements
@type verbiage_dict: dict
@param buttons_verbiage_dict: language for the buttons
@type buttons_verbiage_dict: dict
'''
html = []
h = html.append
h('<div id="aid_tabbing">')
h(' <ul>')
if 'records' in show_tabs:
r = verbiage_dict['confirmed']
h(' <li><a rel="nofollow" href="#tabRecords"><span>%(r)s (%(l)s)</span></a></li>' %
({'r':r, 'l':len(rest_of_papers)}))
if 'repealed' in show_tabs:
r = verbiage_dict['repealed']
h(' <li><a rel="nofollow" href="#tabNotRecords"><span>%(r)s (%(l)s)</span></a></li>' %
({'r':r, 'l':len(rejected_papers)}))
if 'review' in show_tabs:
r = verbiage_dict['review']
h(' <li><a rel="nofollow" href="#tabReviewNeeded"><span>%(r)s (%(l)s)</span></a></li>' %
({'r':r, 'l':len(review_needed)}))
if 'tickets' in show_tabs:
r = verbiage_dict['tickets']
h(' <li><a rel="nofollow" href="#tabTickets"><span>%(r)s (%(l)s)</span></a></li>' %
({'r':r, 'l':len(open_rt_tickets)}))
if 'data' in show_tabs:
r = verbiage_dict['data']
h(' <li><a rel="nofollow" href="#tabData"><span>%s</span></a></li>' % r)
h(' </ul>')
if 'records' in show_tabs:
h(' <div id="tabRecords">')
r = verbiage_dict['confirmed_ns']
h('<noscript><h5>%s</h5></noscript>' % r)
h(self.__tmpl_admin_records_table("massfunctions",
person_id, rest_of_papers,
verbiage_dict=buttons_verbiage_dict['mass_buttons'],
buttons_verbiage_dict=buttons_verbiage_dict,
show_reset_button=show_reset_button))
h(" </div>")
if 'repealed' in show_tabs:
h(' <div id="tabNotRecords">')
r = verbiage_dict['repealed_ns']
h('<noscript><h5>%s</h5></noscript>' % r)
h(self._('These records have been marked as not being from this person.'))
h('<br />' + self._('They will be reconsidered in the next run of the author ')
+ self._('disambiguation algorithm and might disappear from this listing.'))
h(self.__tmpl_admin_records_table("rmassfunctions",
person_id, rejected_papers,
verbiage_dict=buttons_verbiage_dict['mass_buttons'],
buttons_verbiage_dict=buttons_verbiage_dict,
show_reset_button=show_reset_button))
h(" </div>")
if 'review' in show_tabs:
h(' <div id="tabReviewNeeded">')
r = verbiage_dict['review_ns']
h('<noscript><h5>%s</h5></noscript>' % r)
h(self.__tmpl_reviews_table(person_id, review_needed, True))
h(' </div>')
if 'tickets' in show_tabs:
h(' <div id="tabTickets">')
r = verbiage_dict['tickets']
h('<noscript><h5>%s</h5></noscript>' % r)
r = verbiage_dict['tickets_ns']
h('<p>%s:</p>' % r)
if rt_tickets:
pass
# open_rt_tickets = [a for a in open_rt_tickets if a[1] == rt_tickets]
for t in open_rt_tickets:
name = self._('Not provided')
surname = self._('Not provided')
uidip = self._('Not available')
comments = self._('No comments')
email = self._('Not provided')
date = self._('Not Available')
actions = []
for info in t[0]:
if info[0] == 'firstname':
name = info[1]
elif info[0] == 'lastname':
surname = info[1]
elif info[0] == 'uid-ip':
uidip = info[1]
elif info[0] == 'comments':
comments = info[1]
elif info[0] == 'email':
email = info[1]
elif info[0] == 'date':
date = info[1]
elif info[0] in ['confirm', 'repeal']:
actions.append(info)
if 'delete' in ticket_links:
h(('<strong>Ticket number: %(tnum)s </strong> <a rel="nofollow" id="cancel" href="%(url)s/person/action?cancel_rt_ticket=True&selection=%(tnum)s&pid=%(pid)s">' + self._(' Delete this ticket') + ' </a>')
% ({'tnum':t[1], 'url':CFG_SITE_URL, 'pid':str(person_id)}))
if 'commit' in ticket_links:
h((' or <a rel="nofollow" id="commit" href="%(url)s/person/action?commit_rt_ticket=True&selection=%(tnum)s&pid=%(pid)s">' + self._(' Commit this entire ticket') + ' </a> <br>')
% ({'tnum':t[1], 'url':CFG_SITE_URL, 'pid':str(person_id)}))
h('<dd>')
h('Open from: %s, %s <br>' % (surname, name))
h('Date: %s <br>' % date)
h('identified by: %s <br>' % uidip)
h('email: %s <br>' % email)
h('comments: %s <br>' % comments)
h('Suggested actions: <br>')
h('<dd>')
for a in actions:
bibref, bibrec = a[1].split(',')
pname = get_bibrefrec_name_string(bibref)
title = ""
try:
title = get_fieldvalues(int(bibrec), "245__a")[0]
except IndexError:
title = self._("No title available")
title = escape_html(title)
if 'commit_entry' in ticket_links:
h('<a rel="nofollow" id="action" href="%(url)s/person/action?%(action)s=True&pid=%(pid)s&selection=%(bib)s&rt_id=%(rt)s">%(action)s - %(name)s on %(title)s </a>'
% ({'action': a[0], 'url': CFG_SITE_URL,
'pid': str(person_id), 'bib':a[1],
'name': pname, 'title': title, 'rt': t[1]}))
else:
h('%(action)s - %(name)s on %(title)s'
% ({'action': a[0], 'name': pname, 'title': title}))
if 'del_entry' in ticket_links:
h(' - <a rel="nofollow" id="action" href="%(url)s/person/action?cancel_rt_ticket=True&pid=%(pid)s&selection=%(bib)s&rt_id=%(rt)s&rt_action=%(action)s"> Delete this entry </a>'
% ({'action': a[0], 'url': CFG_SITE_URL,
'pid': str(person_id), 'bib': a[1], 'rt': t[1]}))
h(' - <a rel="nofollow" id="show_paper" target="_blank" href="%(url)s/record/%(record)s"> View record </a><br>' % ({'url':CFG_SITE_URL, 'record':str(bibrec)}))
h('</dd>')
h('</dd><br>')
# h(str(open_rt_tickets))
h(" </div>")
if 'data' in show_tabs:
h(' <div id="tabData">')
r = verbiage_dict['data_ns']
h('<noscript><h5>%s</h5></noscript>' % r)
canonical_name = str(get_canonical_id_from_person_id(person_id))
if '.' in canonical_name:
canonical_name = canonical_name[0:canonical_name.rindex('.')]
h('<div><div> <strong> Person id </strong> <br> %s <br>' % person_id)
h('<strong> <br> Canonical name setup </strong>')
h('<div style="margin-top: 15px;"> Current canonical name: %s <form method="GET" action="%s/person/action" rel="nofollow">' % (canonical_name, CFG_SITE_URL))
h('<input type="hidden" name="set_canonical_name" value="True" />')
h('<input name="canonical_name" id="canonical_name" type="text" style="border:1px solid #333; width:500px;" value="%s" /> ' % canonical_name)
h('<input type="hidden" name="pid" value="%s" />' % person_id)
h('<input type="submit" value="set canonical name" class="aid_btn_blue" />')
h('<br>NOTE: a number is appended automatically to the name displayed above. It cannot be set manually, so as to guarantee the uniqueness of IDs.')
h('To change a number greater than one, please change all the other names first; updating this one will then do the trick. </div>')
h('</form> </div></div>')
userid = get_uid_from_personid(person_id)
h('<div> <br>')
h('<strong> Internal IDs </strong> <br>')
if userid:
email = get_email(int(userid))
h('UserID: INSPIRE user %s is associated with this profile with email: %s' % (str(userid), str(email)))
else:
h('UserID: There is no INSPIRE user associated with this profile!')
h('<br></div>')
external_ids = get_personiID_external_ids(person_id)
h('<div> <br>')
h('<strong> External IDs </strong> <br>')
h('<form method="GET" action="%s/person/action" rel="nofollow">' % (CFG_SITE_URL) )
h('<input type="hidden" name="add_missing_external_ids" value="True">')
h('<input type="hidden" name="pid" value="%s">' % person_id)
h('<br> <input type="submit" value="add missing ids" class="aid_btn_blue"> </form>')
h('<form method="GET" action="%s/person/action" rel="nofollow">' % (CFG_SITE_URL) )
h('<input type="hidden" name="rewrite_all_external_ids" value="True">')
h('<input type="hidden" name="pid" value="%s">' % person_id)
h('<br> <input type="submit" value="rewrite all ids" class="aid_btn_blue"> </form> <br>')
if external_ids:
h('<form method="GET" action="%s/person/action" rel="nofollow">' % (CFG_SITE_URL) )
h(' <input type="hidden" name="delete_external_ids" value="True">')
h(' <input type="hidden" name="pid" value="%s">' % person_id)
for idx in external_ids:
try:
sys = [s for s in EXTERNAL_SYSTEMS_LIST if EXTERNAL_SYSTEMS_LIST[s] == idx][0]
except IndexError:
sys = ''
for k in external_ids[idx]:
h('<br> <input type="checkbox" name="existing_ext_ids" value="%s||%s"> <strong> %s: </strong> %s' % (idx, k, sys, k))
h(' <br> <br> <input type="submit" value="delete selected ids" class="aid_btn_blue"> <br> </form>')
else:
h('There are no external ids associated with this profile!')
h('<br> <br>')
h('<form method="GET" action="%s/person/action" rel="nofollow">' % (CFG_SITE_URL) )
h(' <input type="hidden" name="add_external_id" value="True">')
h(' <input type="hidden" name="pid" value="%s">' % person_id)
h(' <select name="ext_system">')
h(' <option value="" selected>-- ' + self._('Choose system') + ' --</option>')
for el in EXTERNAL_SYSTEMS_LIST:
h(' <option value="%s"> %s </option>' % (EXTERNAL_SYSTEMS_LIST[el], el))
h(' </select>')
h(' <input type="text" name="ext_id" id="ext_id" style="border:1px solid #333; width:350px;">')
h(' <input type="submit" value="add external id" class="aid_btn_blue">')
# h('<br>NOTE: please note that if you add an external id it will replace the previous one (if any).')
h('<br> </form> </div>')
h('</div> </div>')
h('</div>')
return "\n".join(html)
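The external-IDs panel above looks up a system name by value with a list comprehension over EXTERNAL_SYSTEMS_LIST guarded by IndexError. The same reverse lookup can be sketched as a small helper (hypothetical name, not part of this module):

```python
def reverse_lookup(mapping, value, default=''):
    # Return the first key whose value equals `value`; falls back to
    # `default`, mirroring the try/except around the comprehension above.
    for key, val in mapping.items():
        if val == value:
            return key
    return default
```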
def tmpl_bibref_check(self, bibrefs_auto_assigned, bibrefs_to_confirm):
'''
Generate overview to let the user choose the name on the paper that
resembles the person in question.
@param bibrefs_auto_assigned: list of auto-assigned papers
@type bibrefs_auto_assigned: list
@param bibrefs_to_confirm: list of unclear papers and names
@type bibrefs_to_confirm: list
'''
html = []
h = html.append
h('<form id="review" action="/person/action" method="post">')
h('<p><strong>' + self._("Make sure we match the right names!")
+ '</strong></p>')
h('<p>' + self._('Please select an author on each of the records that will be assigned.') + '<br/>')
h(self._('Papers without a name selected will be ignored in the process.'))
h('</p>')
for person in bibrefs_to_confirm:
if not "bibrecs" in bibrefs_to_confirm[person]:
continue
person_name = bibrefs_to_confirm[person]["person_name"]
if person_name.isspace():
h((self._('Claim for person with id') + ': %s. ') % person)
h(self._('This seems to be an empty profile without names associated with it yet'))
h(self._('(the names will be automatically gathered when the first paper is claimed to this profile).'))
else:
h((self._("Select name for") + " %s") % (person_name))
pid = person
for recid in bibrefs_to_confirm[person]["bibrecs"]:
h('<div id="aid_moreinfo">')
try:
fv = get_fieldvalues(int(recid), "245__a")[0]
except (ValueError, IndexError, TypeError):
fv = self._('Error retrieving record title')
fv = escape_html(fv)
h(self._("Paper title: ") + fv)
h('<select name="bibrecgroup%s">' % (recid))
h('<option value="" selected>-- Choose author name --</option>')
for bibref in bibrefs_to_confirm[person]["bibrecs"][recid]:
h('<option value="%s||%s">%s</option>'
% (pid, bibref[0], bibref[1]))
h('</select>')
h("</div>")
if bibrefs_auto_assigned:
h(self._('The following names have been automatically chosen:'))
for person in bibrefs_auto_assigned:
if not "bibrecs" in bibrefs_auto_assigned[person]:
continue
h((self._("For") + " %s:") % bibrefs_auto_assigned[person]["person_name"])
pid = person
for recid in bibrefs_auto_assigned[person]["bibrecs"]:
try:
fv = get_fieldvalues(int(recid), "245__a")[0]
except (ValueError, IndexError, TypeError):
fv = self._('Error retrieving record title')
fv = escape_html(fv)
h('<div id="aid_moreinfo">')
h(('%s' + self._(' -- With name: ')) % (fv) )
#, bibrefs_auto_assigned[person]["bibrecs"][recid][0][1]))
# asbibref = "%s||%s" % (person, bibrefs_auto_assigned[person]["bibrecs"][recid][0][0])
pbibref = bibrefs_auto_assigned[person]["bibrecs"][recid][0][0]
h('<select name="bibrecgroup%s">' % (recid))
h('<option value="" selected>-- ' + self._('Ignore') + ' --</option>')
for bibref in bibrefs_auto_assigned[person]["bibrecs"][recid]:
selector = ""
if bibref[0] == pbibref:
selector = ' selected="selected"'
h('<option value="%s||%s"%s>%s</option>'
% (pid, bibref[0], selector, bibref[1]))
h('</select>')
# h('<input type="hidden" name="bibrecgroup%s" value="%s" />'
# % (recid, asbibref))
h('</div>')
h('<div style="text-align:center;">')
h(' <input type="submit" class="aid_btn_green" name="bibref_check_submit" value="Accept" />')
h(' <input type="submit" class="aid_btn_blue" name="cancel_stage" value="Cancel" />')
h("</div>")
h('</form>')
return "\n".join(html)
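tmpl_bibref_check renders pick-lists of candidate names and preselects the auto-assigned bibref. The option-building step, sketched standalone (the helper name is illustrative), with attribute quoting added:

```python
from xml.sax.saxutils import escape, quoteattr

def options_html(choices, selected_value=None):
    # choices: list of (value, label) pairs; the entry matching
    # selected_value gets the selected attribute, as the auto-assigned
    # name lists above do.
    out = []
    for value, label in choices:
        sel = ' selected="selected"' if value == selected_value else ''
        out.append('<option value=%s%s>%s</option>'
                   % (quoteattr(value), sel, escape(label)))
    return "\n".join(out)
```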
def tmpl_invenio_search_box(self):
'''
Generate little search box for missing papers. Links to the main
Invenio search on the start page.
'''
html = []
h = html.append
h('<div style="margin-top: 15px;"> <strong>Search for missing papers:</strong> <form method="GET" action="%s/search">' % CFG_SITE_URL)
h('<input name="p" id="p" type="text" style="border:1px solid #333; width:500px;" /> ')
h('<input type="submit" name="action_search" value="search" '
'class="aid_btn_blue" />')
h('</form> </div>')
return "\n".join(html)
def tmpl_person_menu(self):
'''
Generate the menu bar
'''
html = []
h = html.append
h('<div id="aid_menu">')
h(' <ul>')
h(' <li>' + self._('Navigation:') + '</li>')
h((' <li><a rel="nofollow" href="%s/person/search">' + self._('Run paper attribution for another author') + '</a></li>') % CFG_SITE_URL)
h(' <!--<li><a rel="nofollow" href="#">' + self._('Person Interface FAQ') + '</a></li>!-->')
h(' </ul>')
h('</div>')
return "\n".join(html)
def tmpl_person_menu_admin(self):
'''
Generate the menu bar
'''
html = []
h = html.append
h('<div id="aid_menu">')
h(' <ul>')
h(' <li>' + self._('Navigation:') + '</li>')
h((' <li><a rel="nofollow" href="%s/person/search">' + self._('Person Search') + '</a></li>') % CFG_SITE_URL)
h((' <li><a rel="nofollow" href="%s/person/tickets_admin">' + self._('Open tickets') + '</a></li>') % CFG_SITE_URL)
h(' <!--<li><a rel="nofollow" href="#">' + self._('Person Interface FAQ') + '</a></li>!-->')
h(' </ul>')
h('</div>')
return "\n".join(html)
def tmpl_ticket_final_review(self, req, mark_yours=[], mark_not_yours=[],
mark_theirs=[], mark_not_theirs=[]):
'''
Generate final review page. Displaying transactions if they
need confirmation.
@param req: Apache request object
@type req: Apache request object
@param mark_yours: papers marked as 'yours'
@type mark_yours: list
@param mark_not_yours: papers marked as 'not yours'
@type mark_not_yours: list
@param mark_theirs: papers marked as being someone else's
@type mark_theirs: list
@param mark_not_theirs: papers marked as NOT being someone else's
@type mark_not_theirs: list
'''
def html_icon_legend():
html = []
h = html.append
h('<div id="legend">')
h("<p>")
h(self._("Symbols legend: "))
h("</p>")
h('<span style="margin-left:25px; vertical-align:middle;">')
h('<img src="%s/img/aid_granted.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("Everything is shiny, captain!")))
h(self._('The result of this request will be visible immediately'))
h('</span><br />')
h('<span style="margin-left:25px; vertical-align:middle;">')
h('<img src="%s/img/aid_warning_granted.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("Confirmation needed to continue")))
h(self._('The result of this request will be visible immediately, but we need your confirmation because this paper has been manually claimed before'))
h('</span><br />')
h('<span style="margin-left:25px; vertical-align:middle;">')
h('<img src="%s/img/aid_denied.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("This will create a change request for the operators")))
h(self._("The result of this request will be visible upon confirmation through an operator"))
h("</span>")
h("</div>")
return "\n".join(html)
def mk_ticket_row(ticket):
recid = -1
rectitle = ""
recauthor = "No Name Found."
personname = "No Name Found."
try:
recid = ticket['bibref'].split(",")[1]
except (ValueError, KeyError, IndexError):
return ""
try:
rectitle = get_fieldvalues(int(recid), "245__a")[0]
except (ValueError, IndexError, TypeError):
rectitle = self._('Error retrieving record title')
rectitle = escape_html(rectitle)
if "authorname_rec" in ticket:
recauthor = ticket['authorname_rec']
if "person_name" in ticket:
personname = ticket['person_name']
html = []
h = html.append
# h("Debug: " + str(ticket) + "<br />")
h('<td width="25">&nbsp;</td>')
h('<td>')
h(rectitle)
h('</td>')
h('<td>')
h((personname + " (" + self._("Selected name on paper") + ": %s)") % recauthor)
h('</td>')
h('<td>')
if ticket['status'] == "granted":
h('<img src="%s/img/aid_granted.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("Everything is shiny, captain!")))
elif ticket['status'] == "warning_granted":
h('<img src="%s/img/aid_warning_granted.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("Verification needed to continue")))
else:
h('<img src="%s/img/aid_denied.png" '
'alt="%s" width="30" height="30" />'
% (CFG_SITE_URL, self._("This will create a request for the operators")))
h('</td>')
h('<td>')
h('<a rel="nofollow" href="%s/person/action?checkout_remove_transaction=%s ">'
'Cancel'
'</a>' % (CFG_SITE_URL, ticket['bibref']))
h('</td>')
return "\n".join(html)
pinfo = session["personinfo"]
ulevel = pinfo["ulevel"]
html = []
h = html.append
# h(html_icon_legend())
if "checkout_faulty_fields" in pinfo and pinfo["checkout_faulty_fields"]:
h(self.tmpl_error_box('sorry', 'check_entries'))
if ("checkout_faulty_fields" in pinfo
and pinfo["checkout_faulty_fields"]
and "tickets" in pinfo["checkout_faulty_fields"]):
h(self.tmpl_error_box('error', 'provide_transaction'))
# h('<div id="aid_checkout_teaser">' +
# self._('Almost done! Please use the button "Confirm these changes" '
# 'at the end of the page to send this request to an operator '
# 'for review!') + '</div>')
h('<div id="aid_person_names" '
'class="ui-tabs ui-widget ui-widget-content ui-corner-all"'
'style="padding:10px;">')
h("<h4>" + self._('Please provide your information') + "</h4>")
h('<form id="final_review" action="%s/person/action" method="post">'
% (CFG_SITE_URL))
if ("checkout_faulty_fields" in pinfo
and pinfo["checkout_faulty_fields"]
and "user_first_name" in pinfo["checkout_faulty_fields"]):
h("<p class='aid_error_line'>" + self._('Please provide your first name') + "</p>")
h("<p>")
if "user_first_name_sys" in pinfo and pinfo["user_first_name_sys"]:
h((self._("Your first name:") + " %s") % pinfo["user_first_name"])
else:
h(self._('Your first name:') + ' <input type="text" name="user_first_name" value="%s" />'
% pinfo["user_first_name"])
if ("checkout_faulty_fields" in pinfo
and pinfo["checkout_faulty_fields"]
and "user_last_name" in pinfo["checkout_faulty_fields"]):
h("<p class='aid_error_line'>" + self._('Please provide your last name') + "</p>")
h("</p><p>")
if "user_last_name_sys" in pinfo and pinfo["user_last_name_sys"]:
h((self._("Your last name:") + " %s") % pinfo["user_last_name"])
else:
h(self._('Your last name:') + ' <input type="text" name="user_last_name" value="%s" />'
% pinfo["user_last_name"])
h("</p>")
if ("checkout_faulty_fields" in pinfo
and pinfo["checkout_faulty_fields"]
and "user_email" in pinfo["checkout_faulty_fields"]):
h("<p class='aid_error_line'>" + self._('Please provide your email address') + "</p>")
if ("checkout_faulty_fields" in pinfo
and pinfo["checkout_faulty_fields"]
and "user_email_taken" in pinfo["checkout_faulty_fields"]):
h("<p class='aid_error_line'>" +
self._('This email address is reserved by a user. Please log in or provide an alternative email address')
+ "</p>")
h("<p>")
if "user_email_sys" in pinfo and pinfo["user_email_sys"]:
h((self._("Your email:") + " %s") % pinfo["user_email"])
else:
h((self._('Your email:') + ' <input type="text" name="user_email" value="%s" />')
% pinfo["user_email"])
h("</p><p>")
h(self._("You may leave a comment (optional)") + ":<br>")
h('<textarea name="user_comments">')
if "user_ticket_comments" in pinfo:
h(pinfo["user_ticket_comments"])
h("</textarea>")
h("</p>")
h("<p>&nbsp;</p>")
h('<div style="text-align: center;">')
h((' <input type="submit" name="checkout_continue_claiming" class="aid_btn_green" value="%s" />')
% self._("Continue claiming*"))
h((' <input type="submit" name="checkout_submit" class="aid_btn_green" value="%s" />')
% self._("Confirm these changes**"))
h('<span style="margin-left:150px;">')
h((' <input type="submit" name="cancel" class="aid_btn_red" value="%s" />')
% self._("!Delete the entire request!"))
h('</span>')
h('</div>')
h("</form>")
h('</div>')
h('<div id="aid_person_names" '
'class="ui-tabs ui-widget ui-widget-content ui-corner-all"'
'style="padding:10px;">')
h('<table width="100%" border="0" cellspacing="0" cellpadding="4">')
if not ulevel == "guest":
h('<tr>')
h("<td colspan='5'><h4>" + self._('Mark as your documents') + "</h4></td>")
h('</tr>')
if mark_yours:
for idx, ticket in enumerate(mark_yours):
h('<tr id="aid_result%s">' % ((idx + 1) % 2))
h(mk_ticket_row(ticket))
h('</tr>')
else:
h('<tr>')
h('<td width="25">&nbsp;</td>')
h('<td colspan="4">' + self._('Nothing staged as yours') + '</td>')
h("</tr>")
h('<tr>')
h("<td colspan='5'><h4>" + self._("Mark as _not_ your documents") + "</h4></td>")
h('</tr>')
if mark_not_yours:
for idx, ticket in enumerate(mark_not_yours):
h('<tr id="aid_result%s">' % ((idx + 1) % 2))
h(mk_ticket_row(ticket))
h('</tr>')
else:
h('<tr>')
h('<td width="25">&nbsp;</td>')
h('<td colspan="4">' + self._('Nothing staged as not yours') + '</td>')
h("</tr>")
h('<tr>')
h("<td colspan='5'><h4>" + self._('Mark as their documents') + "</h4></td>")
h('</tr>')
if mark_theirs:
for idx, ticket in enumerate(mark_theirs):
h('<tr id="aid_result%s">' % ((idx + 1) % 2))
h(mk_ticket_row(ticket))
h('</tr>')
else:
h('<tr>')
h('<td width="25">&nbsp;</td>')
h('<td colspan="4">' + self._('Nothing staged in this category') + '</td>')
h("</tr>")
h('<tr>')
h("<td colspan='5'><h4>" + self._('Mark as _not_ their documents') + "</h4></td>")
h('</tr>')
if mark_not_theirs:
for idx, ticket in enumerate(mark_not_theirs):
h('<tr id="aid_result%s">' % ((idx + 1) % 2))
h(mk_ticket_row(ticket))
h('</tr>')
else:
h('<tr>')
h('<td width="25">&nbsp;</td>')
h('<td colspan="4">' + self._('Nothing staged in this category') + '</td>')
h("</tr>")
h('</table>')
h("</div>")
h("<p>")
h(self._(" * You can come back to this page later. Nothing will be lost. <br />"))
h(self._(" ** Performs all requested changes. Changes subject to permission restrictions "
"will be submitted to an operator for manual review."))
h("</p>")
h(html_icon_legend())
return "\n".join(html)
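The four ticket tables above stripe rows by alternating the id suffix with `(idx + 1) % 2`. A tiny sketch of the resulting sequence (the helper is illustrative):

```python
def stripe_ids(n):
    # Yields aid_result1, aid_result0, aid_result1, ... so that CSS can
    # style odd and even rows differently.
    return ['aid_result%s' % ((idx + 1) % 2) for idx in range(n)]
```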
def tmpl_author_search(self, query, results,
search_ticket=None, author_pages_mode=True,
fallback_mode=False, fallback_title='',
fallback_message='', new_person_link=False):
'''
Generates the search for Person entities.
@param query: the query a user issued to the search
@type query: string
@param results: list of results
@type results: list
@param search_ticket: search ticket object to inform about pending
claiming procedure
@type search_ticket: dict
@param author_pages_mode: link results to author pages instead of person pages
@type author_pages_mode: boolean
@param fallback_mode: show the fallback title and message instead of results
@type fallback_mode: boolean
@param fallback_title: title to display in fallback mode
@type fallback_title: string
@param fallback_message: message to display in fallback mode
@type fallback_message: string
@param new_person_link: display a link to create a new person
@type new_person_link: boolean
'''
linktarget = "person"
if author_pages_mode:
linktarget = "author"
if not query:
query = ""
html = []
h = html.append
h('<form id="searchform" action="/person/search" method="GET">')
h('Find author clusters by name, e.g. <i>Ellis, J</i>: <br>')
h('<input placeholder="Search for a name, e.g: Ellis, J" type="text" name="q" style="border:1px solid #333; width:500px;" '
'maxlength="250" value="%s" class="focus" />' % query)
h('<input type="submit" value="Search" />')
h('</form>')
if fallback_mode:
if fallback_title:
h('<div id="header">%s</div>' % fallback_title)
if fallback_message:
h('%s' % fallback_message)
if not results and not query:
h('</div>')
return "\n".join(html)
h("<p>&nbsp;</p>")
if query and not results:
authemail = CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL
h(('<strong>' + self._("We do not have a publication list for '%s'." +
" Try using a less specific author name, or check" +
" back in a few days as attributions are updated " +
"frequently. Alternatively, you can send us feedback at ") +
"<a rel='nofollow' href=\"mailto:%s\">%s</a>.</strong>") % (query, authemail, authemail))
h('</div>')
if new_person_link:
link = "%s/person/action?confirm=True&pid=%s" % (CFG_SITE_URL, '-3')
if search_ticket:
for r in search_ticket['bibrefs']:
link = link + '&selection=%s' % str(r)
h('<div>')
h('<a rel="nofollow" href="%s">' % (link))
h(self._("Create a new Person for your search"))
h('</a>')
h('</div>')
return "\n".join(html)
# base_color = 100
# row_color = 0
for index, result in enumerate(results):
# if len(results) > base_color:
# row_color += 1
# else:
# row_color = base_color - (base_color - index *
# (base_color / len(results)))
pid = result[0]
names = result[1]
papers = result[2]
try:
total_papers = result[3]
if total_papers > 1:
papers_string = '(%s Papers)' % str(total_papers)
elif total_papers == 1:
if (len(papers) == 1 and
len(papers[0]) == 1 and
papers[0][0] == 'Not retrieved to increase performances.'):
papers_string = ''
else:
papers_string = '(1 Paper)'
else:
papers_string = '(No papers)'
except IndexError:
papers_string = ''
h('<div id="aid_result%s">' % (index % 2))
h('<div style="padding-bottom:5px;">')
# h('<span style="color:rgb(%d,%d,%d);">%s. </span>'
# % (row_color, row_color, row_color, index + 1))
h('<span>%s. </span>' % (index + 1))
# for nindex, name in enumerate(names):
# color = row_color + nindex * 35
# color = min(color, base_color)
# h('<span style="color:rgb(%d,%d,%d);">%s; </span>'
# % (color, color, color, name[0]))
for name in names:
h('<span style="margin-right:20px;">%s </span>'
% (name[0]))
h('</div>')
h('<em style="padding-left:1.5em;">')
if index < bconfig.PERSON_SEARCH_RESULTS_SHOW_PAPERS_PERSON_LIMIT:
h(('<a rel="nofollow" href="#" id="aid_moreinfolink" class="mpid%s">'
'<img src="../img/aid_plus_16.png" '
'alt = "toggle additional information." '
'width="11" height="11"/> '
+ self._('Recent Papers') +
'</a></em>')
% (pid))
else:
h("</em>")
if search_ticket:
link = "%s/person/action?confirm=True&pid=%s" % (CFG_SITE_URL, pid)
for r in search_ticket['bibrefs']:
link = link + '&selection=%s' % str(r)
h(('<span style="margin-left: 120px;">'
'<em><a rel="nofollow" href="%s" id="confirmlink">'
'<strong>' + self._('YES!') + '</strong>'
+ self._(' Attribute Papers To ') +
'%s %s </a></em></span>')
% (link, get_person_redirect_link(pid), papers_string))
else:
h(('<span style="margin-left: 40px;">'
'<em><a rel="nofollow" href="%s/%s/%s" id="aid_moreinfolink">'
+ self._('Publication List ') + '(%s) %s </a></em></span>')
% (CFG_SITE_URL, linktarget,
get_person_redirect_link(pid),
get_person_redirect_link(pid), papers_string))
h('<div class="more-mpid%s" id="aid_moreinfo">' % (pid))
if papers and index < bconfig.PERSON_SEARCH_RESULTS_SHOW_PAPERS_PERSON_LIMIT:
h((self._('Showing the') + ' %d ' + self._('most recent documents:')) % len(papers))
h("<ul>")
for paper in papers:
h("<li>%s</li>"
% (format_record(int(paper[0]), "ha")))
h("</ul>")
elif not papers:
h("<p>" + self._('Sorry, there are no documents known for this person') + "</p>")
elif index >= bconfig.PERSON_SEARCH_RESULTS_SHOW_PAPERS_PERSON_LIMIT:
h("<p>" + self._('Information not shown to improve performance. Please refine your search.') + "</p>")
h(('<span style="margin-left: 40px;">'
'<em><a rel="nofollow" href="%s/%s/%s" target="_blank" id="aid_moreinfolink">'
+ self._('Publication List ') + '(%s)</a> (in a new window or tab)</em></span>')
% (CFG_SITE_URL, linktarget,
get_person_redirect_link(pid),
get_person_redirect_link(pid)))
h('</div>')
h('</div>')
if new_person_link:
link = "%s/person/action?confirm=True&pid=%s" % (CFG_SITE_URL, '-3')
if search_ticket:
for r in search_ticket['bibrefs']:
link = link + '&selection=%s' % str(r)
h('<div>')
h('<a rel="nofollow" href="%s">' % (link))
h(self._("Create a new Person for your search"))
h('</a>')
h('</div>')
return "\n".join(html)
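The claim links above are assembled by concatenating '&selection=...' fragments, which leaves bibref values unencoded. A hedged alternative using the stdlib urlencode (the function name is illustrative; the actual code keeps the manual concatenation):

```python
try:
    from urllib import urlencode        # Python 2, matching this codebase
except ImportError:
    from urllib.parse import urlencode  # Python 3

def claim_link(base_url, pid, bibrefs):
    # Same query string as the manual version, but each bibref value is
    # percent-encoded (e.g. '100:1,2' becomes '100%3A1%2C2').
    params = [('confirm', 'True'), ('pid', str(pid))]
    params.extend(('selection', str(r)) for r in bibrefs)
    return '%s/person/action?%s' % (base_url, urlencode(params))
```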
def tmpl_welcome_start(self):
'''
Shadows the behaviour of tmpl_search_pagestart
'''
return '<div class="pagebody"><div class="pagebodystripemiddle">'
def tmpl_welcome_arxiv(self):
'''
SSO landing/welcome page.
'''
html = []
h = html.append
h('<p><b>Congratulations! You have now successfully connected to INSPIRE via arXiv.org!</b></p>')
h('<p>Right now, you can verify your'
' publication records, which will help us to produce better publication lists and'
' citation statistics.'
'</p>')
h('<p>We are currently importing your publication list from arXiv.org. '
'When we\'re done, you\'ll see a link to verify your'
' publications below; please claim the papers that are yours '
' and remove the ones that are not. This information will be automatically processed'
' or be sent to our operator for approval if needed, usually within 24'
' hours.'
'</p>')
h('<p>If you have '
'any questions or encounter any problems, please contact us here: '
'<a rel="nofollow" href="mailto:%s">%s</a></p>'
% (CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL,
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL))
return "\n".join(html)
def tmpl_welcome(self):
'''
SSO landing/welcome page.
'''
html = []
h = html.append
h('<p><b>Congratulations! You have successfully logged in!</b></p>')
h('<p>We are currently creating your publication list. When we\'re done, you\'ll see a link to correct your '
'publications below.</p>')
h('<p>When the link appears we invite you to confirm the papers that are '
'yours and to reject the ones that you are not author of. If you have '
'any questions or encounter any problems, please contact us here: '
'<a rel="nofollow" href="mailto:%s">%s</a></p>'
% (CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL,
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL))
return "\n".join(html)
def tmpl_claim_profile(self):
'''
claim profile
'''
html = []
h = html.append
h('<p>Unfortunately it was not possible to automatically match your arXiv account to an INSPIRE person profile. Please choose the correct person profile from the list below.')
h('If your profile is not in the list or none of them represents you correctly, please select the one which fits you best or choose '
'to create a new one; keep in mind that no matter what your choice is, you will be able to correct your publication list until it contains all of your publications.'
' If you have any questions, please do not hesitate to contact us at <a rel="nofollow" href="mailto:%s">%s</a></p>' % (CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL,
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL))
return "\n".join(html)
def tmpl_profile_option(self, top5_list):
'''
show profile option
'''
html = []
h = html.append
h('<table border="0"> <tr>')
for pid in top5_list:
pid = int(pid)
canonical_id = get_canonical_id_from_personid(pid)
full_name = get_person_names_from_id(pid)
name_length = 0
most_common_name = ""
for name in full_name:
if len(name[0]) > name_length:
name_length = len(name[0])
most_common_name = name[0]
if len(full_name) > 0:
name_string = most_common_name
else:
name_string = "[No name available] "
if len(canonical_id) > 0:
canonical_name_string = "(" + canonical_id[0][0] + ")"
canonical_id = canonical_id[0][0]
else:
canonical_name_string = "(" + pid + ")"
canonical_id = pid
h('<td>')
h('%s ' % (name_string))
h('<a href="%s/author/%s" target="_blank"> %s </a>' % (CFG_SITE_URL, canonical_id, canonical_name_string))
h('</td>')
h('<td>')
h('<INPUT TYPE="BUTTON" VALUE="This is my profile" ONCLICK="window.location.href=\'welcome?chosen_profile=%s\'">' % (str(pid)))
h('</td>')
h('</tr>')
h('</table>')
h('<br>')
if top5_list:
h('If none of the above profiles is yours, it seems that you cannot be matched to any of the existing accounts.<br>Would you like to create a new one?')
h('<input type="button" value="Create an account" onclick="window.location.href=\'welcome?chosen_profile=%s\'">' % (str(-1)))
else:
h('It seems that you cannot be matched to any of the existing accounts.<br>Would you like to create a new one?')
h('<input type="button" value="Create an account" onclick="window.location.href=\'welcome?chosen_profile=%s\'">' % (str(-1)))
return "\n".join(html)
def tmpl_profile_not_available(self):
'''
show profile option
'''
html = []
h = html.append
h('<p> Unfortunately the profile that you previously chose is no longer available. A new empty profile has been created. You will be able to correct '
'your publication list until it contains all of your publications.</p>')
return "\n".join(html)
def tmpl_profile_assigned_by_user(self):
html = []
h = html.append
h('<p>Congratulations, you have successfully claimed the chosen profile.</p>')
return "\n".join(html)
def tmpl_claim_stub(self, person='-1'):
'''
claim stub page
'''
html = []
h = html.append
h(' <ul><li><a rel="nofollow" href="%s"> Login through arXiv.org </a> <small>' % bconfig.BIBAUTHORID_CFG_INSPIRE_LOGIN)
h(' - Use this option if you have an arXiv account and have claimed your papers in arXiv.')
h('(If you log in through arXiv.org, INSPIRE will immediately verify you as an author and process your claimed papers.) </small></li><br><br>')
h(' <li><a rel="nofollow" href="%s/person/%s?open_claim=True"> Continue as a guest </a> <small>' % (CFG_SITE_URL, person))
h(' - Use this option if you DON\'T have an arXiv account, or you have not claimed any paper in arXiv.')
h('(If you log in as a guest, INSPIRE will need to confirm you as an author before processing your claimed papers.) </small></li></ul><br><br>')
h('If you log in through arXiv.org we can verify that you are the author of these papers and accept your claims rapidly, '
'as well as add additional claims from arXiv. <br>If you choose not to log in via arXiv, your changes will '
'be publicly visible only after our editors check and confirm them, usually within a few days.<br> '
'Either way, claims made on behalf of another author will go through our staff and may take longer to display. '
'This also applies to papers which have been previously claimed, by yourself or someone else.')
return "\n".join(html)
def tmpl_welcome_link(self):
'''
Creates the link for the actual user action.
'''
return '<a rel="nofollow" href="action?checkout=True"><b>' + \
self._('Correct my publication lists!') + \
'</b></a>'
def tmpl_welcome_personid_association(self, pid):
"""
"""
canon_name = get_canonical_id_from_personid(pid)
head = "<br>"
if canon_name:
body = ("Your arXiv.org account is associated "
"with person %s." % canon_name[0][0])
else:
body = ("Warning: your arXiv.org account is associated with an empty profile. "
"This can happen if it is the first time you log in and you do not have any "
"paper directly claimed in arXiv.org."
" In this case, you are welcome to search and claim your papers to your"
" new profile manually, or please contact us to get help.")
body += ("<br>You are very welcome to contact us shall you need any help or explanation"
" about the management of"
" your profile page"
" in INSPIRE and it's connections with arXiv.org: "
'''<a href="mailto:authors@inspirehep.net?subject=Help on arXiv.org SSO login and paper claiming"> authors@inspirehep.net </a>''')
tail = "<br>"
return head + body + tail
def tmpl_welcome_arXiv_papers(self, paps):
'''
Creates the list of arXiv papers
'''
plist = "<br><br>"
if paps:
plist = plist + "We have got and we are about to automatically claim for You the following papers from arXiv.org: <br>"
for p in paps:
plist = plist + " " + str(p) + "<br>"
else:
plist = "We have got no papers from arXiv.org which we could claim automatically for You. <br>"
return plist
def tmpl_welcome_end(self):
'''
Shadows the behaviour of tmpl_search_pageend
'''
return '</div></div>'
def tmpl_tickets_admin(self, tickets=[]):
'''
Open tickets short overview for operators.
'''
html = []
h = html.append
if len(tickets) > 0:
h('List of open tickets: <br><br>')
for t in tickets:
h('<a rel="nofollow" href="%(cname)s#tabTickets"> %(longname)s - (%(cname)s - PersonID: %(pid)s): %(num)s open tickets. </a><br>'
% ({'cname': str(t[1]), 'longname': str(t[0]), 'pid': str(t[2]), 'num': str(t[3])}))
else:
h('There are currently no open tickets.')
return "\n".join(html)
# pylint: enable=C0301
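# The tmpl_* methods above all follow the same pattern: HTML fragments are
# accumulated in a list via a bound append, then joined once at the end, which
# avoids repeated string concatenation. A minimal self-contained sketch of the
# pattern (the function name and ticket format here are illustrative, not part
# of the module):

```python
def render_ticket_overview(tickets):
    # Accumulate fragments in a list; h is a bound append for brevity,
    # mirroring the "h = html.append" idiom used by the tmpl_* methods.
    html = []
    h = html.append
    if tickets:
        h('List of open tickets: <br><br>')
        for longname, num in tickets:
            h('%s: %s open tickets.<br>' % (longname, num))
    else:
        h('There are currently no open tickets.')
    # Join once at the end instead of concatenating inside the loop.
    return "\n".join(html)
```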
diff --git a/invenio/legacy/bibauthorid/tortoise.py b/invenio/legacy/bibauthorid/tortoise.py
index e5c60ff7a..34e9d77aa 100644
--- a/invenio/legacy/bibauthorid/tortoise.py
+++ b/invenio/legacy/bibauthorid/tortoise.py
@@ -1,428 +1,428 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from datetime import datetime
import os
#import cPickle as SER
import msgpack as SER
import gc
import matplotlib.pyplot as plt
import numpy as np
#This is supposed to defeat a bit of the python vm performance losses:
import sys
sys.setcheckinterval(1000000)
try:
from collections import defaultdict
except ImportError:
from invenio.utils.container import defaultdict
from itertools import groupby, chain, repeat
-from invenio.bibauthorid_general_utils import update_status, update_status_final, override_stdout_config
+from invenio.legacy.bibauthorid.general_utils import update_status, update_status_final, override_stdout_config
from invenio.bibauthorid_cluster_set import delayed_cluster_sets_from_marktables
from invenio.bibauthorid_cluster_set import delayed_cluster_sets_from_personid
from invenio.bibauthorid_wedge import wedge
-from invenio.bibauthorid_name_utils import generate_last_name_cluster_str
+from invenio.legacy.bibauthorid.name_utils import generate_last_name_cluster_str
from invenio.bibauthorid_backinterface import empty_results_table
from invenio.bibauthorid_backinterface import remove_result_cluster
-from invenio.bibauthorid_general_utils import bibauthor_print
+from invenio.legacy.bibauthorid.general_utils import bibauthor_print
from invenio.bibauthorid_prob_matrix import prepare_matirx
from invenio.bibauthorid_scheduler import schedule, matrix_coefs
from invenio.bibauthorid_least_squares import to_function as create_approx_func
from math import isnan
import multiprocessing as mp
#python2.4 compatibility
-from invenio.bibauthorid_general_utils import bai_all as all
+from invenio.legacy.bibauthorid.general_utils import bai_all as all
'''
There are three main entry points to tortoise
i) tortoise
Performs disambiguation iteration.
The argument pure indicates whether to use
the claims and the rejections or not.
Use pure=True only to test the accuracy of tortoise.
ii) tortoise_from_scratch
NOT RECOMMENDED!
Use this function only if you have just
installed invenio and this is your first
disambiguation or if personid is broken.
iii) tortoise_last_name
Computes the clusters for only one last name
group. It is primarily used for testing. It
may also be used to fix a broken last name
cluster. It does not involve multiprocessing
so it is convenient to debug with pdb.
'''
# Exit codes:
# The standard ones are not well documented
# so we are using random numbers.
def tortoise_from_scratch():
bibauthor_print("Preparing cluster sets.")
cluster_sets, _lnames, sizes = delayed_cluster_sets_from_marktables()
bibauthor_print("Building all matrices.")
exit_statuses = schedule_create_matrix(
cluster_sets,
sizes,
force=True)
assert len(exit_statuses) == len(cluster_sets)
assert all(stat == os.EX_OK for stat in exit_statuses)
empty_results_table()
bibauthor_print("Preparing cluster sets.")
cluster_sets, _lnames, sizes = delayed_cluster_sets_from_marktables()
bibauthor_print("Starting disambiguation.")
exit_statuses = schedule_wedge_and_store(
cluster_sets,
sizes)
assert len(exit_statuses) == len(cluster_sets)
assert all(stat == os.EX_OK for stat in exit_statuses)
def tortoise(pure=False,
force_matrix_creation=False,
skip_matrix_creation=False,
last_run=None):
assert not force_matrix_creation or not skip_matrix_creation
# The computation must be forced in case we want
# to compute pure results
force_matrix_creation = force_matrix_creation or pure
if not skip_matrix_creation:
bibauthor_print("Preparing cluster sets.")
clusters, _lnames, sizes = delayed_cluster_sets_from_personid(pure, last_run)
bibauthor_print("Building all matrices.")
exit_statuses = schedule_create_matrix(
clusters,
sizes,
force=force_matrix_creation)
assert len(exit_statuses) == len(clusters)
assert all(stat == os.EX_OK for stat in exit_statuses)
bibauthor_print("Preparing cluster sets.")
clusters, _lnames, sizes = delayed_cluster_sets_from_personid(pure, last_run)
bibauthor_print("Starting disambiguation.")
exit_statuses = schedule_wedge_and_store(
clusters,
sizes)
assert len(exit_statuses) == len(clusters)
assert all(stat == os.EX_OK for stat in exit_statuses)
def tortoise_last_name(name, from_mark=False, pure=False):
bibauthor_print('Start working on %s' % name)
assert not(from_mark and pure)
lname = generate_last_name_cluster_str(name)
if from_mark:
bibauthor_print(' ... from mark!')
clusters, lnames, sizes = delayed_cluster_sets_from_marktables([lname])
bibauthor_print(' ... delayed done')
else:
bibauthor_print(' ... from pid, pure')
clusters, lnames, sizes = delayed_cluster_sets_from_personid(pure)
bibauthor_print(' ... delayed pure done!')
# try:
idx = lnames.index(lname)
cluster = clusters[idx]
size = sizes[idx]
cluster_set = cluster()
bibauthor_print("Found, %s(%s). Total number of bibs: %d." % (name, lname, size))
create_matrix(cluster_set, True)
wedge_and_store(cluster_set)
# except IndexError:
# bibauthor_print("Sorry, %s(%s) not found in the last name clusters" % (name, lname))
def _collect_statistics_lname_coeff(params):
lname = params[0]
coeff = params[1]
clusters, lnames, sizes = delayed_cluster_sets_from_marktables([lname])
idx = lnames.index(lname)
cluster = clusters[idx]
size = sizes[idx]
bibauthor_print("Found, %s. Total number of bibs: %d." % (lname, size))
cluster_set = cluster()
create_matrix(cluster_set, False)
bibs = cluster_set.num_all_bibs
expected = bibs * (bibs - 1) / 2
bibauthor_print("Start working on %s. Total number of bibs: %d, "
"maximum number of comparisons: %d"
% (cluster_set.last_name, bibs, expected))
wedge(cluster_set, True, coeff)
remove_result_cluster(cluster_set.last_name)
def _create_matrix(lname):
clusters, lnames, sizes = delayed_cluster_sets_from_marktables([lname])
idx = lnames.index(lname)
cluster = clusters[idx]
size = sizes[idx]
bibauthor_print("Found, %s. Total number of bibs: %d." % (lname, size))
cluster_set = cluster()
create_matrix(cluster_set, True)
bibs = cluster_set.num_all_bibs
expected = bibs * (bibs - 1) / 2
bibauthor_print("Start working on %s. Total number of bibs: %d, "
"maximum number of comparisons: %d"
% (cluster_set.last_name, bibs, expected))
cluster_set.store()
def tortoise_tweak_coefficient(lastnames, min_coef, max_coef, stepping, create_matrix=True):
bibauthor_print('Coefficient tweaking!')
bibauthor_print('Cluster sets from mark...')
lnames = set([generate_last_name_cluster_str(n) for n in lastnames])
coefficients = [x/100. for x in range(int(min_coef*100),int(max_coef*100),int(stepping*100))]
pool = mp.Pool()
if create_matrix:
pool.map(_create_matrix, lnames)
pool.map(_collect_statistics_lname_coeff, ((x,y) for x in lnames for y in coefficients ))
def _gen_plot(data, filename):
plt.clf()
ax = plt.subplot(111)
ax.grid(visible=True)
x = sorted(data.keys())
w = [data[k][0] for k in x]
try:
wscf = max(w)
except ValueError:
wscf = 0
w = [float(i)/wscf for i in w]
y = [data[k][1] for k in x]
maxi = [data[k][3] for k in x]
mini = [data[k][2] for k in x]
lengs = [data[k][4] for k in x]
try:
ml = float(max(lengs))
except ValueError:
ml = 1
lengs = [k/ml for k in lengs]
normalengs = [data[k][5] for k in x]
ax.plot(x,y,'-o',label='avg')
ax.plot(x,maxi,'-o', label='max')
ax.plot(x,mini,'-o', label='min')
ax.plot(x,w, '-x', label='norm %s' % str(wscf))
ax.plot(x,lengs,'-o',label='acl %s' % str(int(ml)))
ax.plot(x,normalengs, '-o', label='ncl')
plt.ylim(ymax = 1., ymin = -0.01)
plt.xlim(xmax = 1., xmin = -0.01)
ax.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,ncol=6, mode="expand", borderaxespad=0.)
plt.savefig(filename)
def tortoise_coefficient_statistics(pickle_output=None, generate_graphs=True):
override_stdout_config(stdout=True)
files = ['/tmp/baistats/'+x for x in os.listdir('/tmp/baistats/') if x.startswith('cluster_status_report_pid')]
fnum = float(len(files))
quanta = .1/fnum
total_stats = 0
used_coeffs = set()
used_clusters = set()
#av_counter, avg, min, max, nclus, normalized_avg
cluster_stats = defaultdict(lambda : defaultdict(lambda : [0.,0.,0.,0.,0.,0.]))
coeff_stats = defaultdict(lambda : [0.,0.,0.,0.,0.,0.])
def gen_graphs(only_synthetic=False):
update_status(0, 'Generating coefficients graph...')
_gen_plot(coeff_stats, '/tmp/graphs/AAAAA-coefficients.svg')
if not only_synthetic:
cn = cluster_stats.keys()
l = float(len(cn))
for i,c in enumerate(cn):
update_status(i/l, 'Generating name graphs... %s' % str(c))
_gen_plot(cluster_stats[c], '/tmp/graphs/CS-%s.png' % str(c))
for i,fi in enumerate(files):
if generate_graphs:
if i%1000 ==0:
gen_graphs(True)
f = open(fi,'r')
status = i/fnum
update_status(status, 'Loading '+ fi[fi.find('lastname')+9:])
contents = SER.load(f)
f.close()
cur_coef = contents[0]
cur_clust = contents[1]
cur_maxlen = float(contents[3])
if cur_coef:
total_stats += 1
used_coeffs.add(cur_coef)
used_clusters.add(cur_clust)
update_status(status+0.2*quanta, ' Computing averages...')
cur_clen = len(contents[2])
cur_coeffs = [x[2] for x in contents[2]]
cur_clustnumber = float(len(set([x[0] for x in contents[2]])))
assert cur_clustnumber > 0 and cur_clustnumber < cur_maxlen, "Error, found log with strange clustnumber! %s %s %s %s" % (str(cur_clust), str(cur_coef), str(cur_maxlen),
str(cur_clustnumber))
if cur_coeffs:
assert len(cur_coeffs) == cur_clen and cur_coeffs, "Error, there is a cluster without coefficients! %s %s %s" % (str(cur_clust), str(cur_coef), str(cur_coeffs))
assert all([x >= 0 and x <= 1 for x in cur_coeffs]), "Error, a coefficient is wrong here! Check me! %s %s %s" % (str(cur_clust), str(cur_coef), str(cur_coeffs))
cur_min = min(cur_coeffs)
cur_max = max(cur_coeffs)
cur_avg = sum(cur_coeffs)/cur_clen
update_status(status+0.4*quanta, ' cumulative per coeff...')
avi = coeff_stats[cur_coef][0]
#number of points
coeff_stats[cur_coef][0] = avi+1
#average of coefficients
coeff_stats[cur_coef][1] = (coeff_stats[cur_coef][1]*avi + cur_avg)/(avi+1)
#min coeff
coeff_stats[cur_coef][2] = min(coeff_stats[cur_coef][2], cur_min)
#max coeff
coeff_stats[cur_coef][3] = max(coeff_stats[cur_coef][3], cur_max)
#avg number of clusters
coeff_stats[cur_coef][4] = (coeff_stats[cur_coef][4]*avi + cur_clustnumber)/(avi+1)
#normalized avg number of clusters
coeff_stats[cur_coef][5] = (coeff_stats[cur_coef][5]*avi + cur_clustnumber/cur_maxlen)/(avi+1)
update_status(status+0.6*quanta, ' cumulative per cluster per coeff...')
avi = cluster_stats[cur_clust][cur_coef][0]
cluster_stats[cur_clust][cur_coef][0] = avi+1
cluster_stats[cur_clust][cur_coef][1] = (cluster_stats[cur_clust][cur_coef][1]*avi + cur_avg)/(avi+1)
cluster_stats[cur_clust][cur_coef][2] = min(cluster_stats[cur_clust][cur_coef][2], cur_min)
cluster_stats[cur_clust][cur_coef][3] = max(cluster_stats[cur_clust][cur_coef][3], cur_max)
cluster_stats[cur_clust][cur_coef][4] = (cluster_stats[cur_clust][cur_coef][4]*avi + cur_clustnumber)/(avi+1)
cluster_stats[cur_clust][cur_coef][5] = (cluster_stats[cur_clust][cur_coef][5]*avi + cur_clustnumber/cur_maxlen)/(avi+1)
update_status_final('Done!')
if generate_graphs:
gen_graphs()
if pickle_output:
update_status(0,'Dumping to file...')
f = open(pickle_output,'w')
SER.dump({'cluster_stats': dict((x, dict(cluster_stats[x])) for x in cluster_stats.iterkeys()), 'coeff_stats': dict(coeff_stats)}, f)
f.close()
def create_matrix(cluster_set, force):
bibs = cluster_set.num_all_bibs
expected = bibs * (bibs - 1) / 2
bibauthor_print("Start building matrix for %s. Total number of bibs: %d, "
"maximum number of comparisons: %d"
% (cluster_set.last_name, bibs, expected))
return prepare_matirx(cluster_set, force)
def force_create_matrix(cluster_set, force):
bibauthor_print("Building a cluster set.")
return create_matrix(cluster_set(), force)
def wedge_and_store(cluster_set):
bibs = cluster_set.num_all_bibs
expected = bibs * (bibs - 1) / 2
bibauthor_print("Start working on %s. Total number of bibs: %d, "
"maximum number of comparisons: %d"
% (cluster_set.last_name, bibs, expected))
wedge(cluster_set)
remove_result_cluster(cluster_set.last_name)
cluster_set.store()
return True
def force_wedge_and_store(cluster_set):
bibauthor_print("Building a cluster set.")
return wedge_and_store(cluster_set())
def schedule_create_matrix(cluster_sets, sizes, force):
def create_job(cluster):
def ret():
return force_create_matrix(cluster, force)
return ret
memfile_path = None
if bconfig.DEBUG_PROCESS_PEAK_MEMORY:
tt = datetime.now()
tt = (tt.hour, tt.minute, tt.day, tt.month, tt.year)
memfile_path = ('%smatrix_memory_%d:%d_%d-%d-%d.log' %
((bconfig.TORTOISE_FILES_PATH,) + tt))
return schedule(map(create_job, cluster_sets),
sizes,
create_approx_func(matrix_coefs),
memfile_path)
def schedule_wedge_and_store(cluster_sets, sizes):
def create_job(cluster):
def ret():
return force_wedge_and_store(cluster)
return ret
memfile_path = None
if bconfig.DEBUG_PROCESS_PEAK_MEMORY:
tt = datetime.now()
tt = (tt.hour, tt.minute, tt.day, tt.month, tt.year)
memfile_path = ('%swedge_memory_%d:%d_%d-%d-%d.log' %
((bconfig.TORTOISE_FILES_PATH,) + tt))
return schedule(map(create_job, cluster_sets),
sizes,
create_approx_func(matrix_coefs),
memfile_path)
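# tortoise_coefficient_statistics above folds every sample into its
# [count, avg, min, max, ...] slots incrementally, using the running-average
# identity avg' = (avg * n + x) / (n + 1), so no sample list is ever kept.
# A sketch of that bookkeeping in isolation (the function name is illustrative;
# min/max are seeded with infinities here, rather than the module's defaultdict
# zero defaults):

```python
def fold_sample(stats, value):
    # stats = [count, avg, min, max]; updated in place using the
    # running-average identity avg' = (avg * n + x) / (n + 1).
    n = stats[0]
    stats[1] = (stats[1] * n + value) / (n + 1)
    stats[2] = min(stats[2], value)
    stats[3] = max(stats[3], value)
    stats[0] = n + 1
    return stats
```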
diff --git a/invenio/legacy/bibauthorid/webapi.py b/invenio/legacy/bibauthorid/webapi.py
index 80e84d47f..fa16c2470 100644
--- a/invenio/legacy/bibauthorid/webapi.py
+++ b/invenio/legacy/bibauthorid/webapi.py
@@ -1,1356 +1,1356 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
Bibauthorid_webapi
Point of access to the documents clustering facility.
Provides utilities to safely interact with stored data.
'''
-import invenio.bibauthorid_config as bconfig
+import invenio.legacy.bibauthorid.config as bconfig
import invenio.bibauthorid_frontinterface as dbapi
import invenio.bibauthorid_name_utils as nameapi
import invenio.webauthorprofile_interface as webauthorapi
import invenio.legacy.search_engine as search_engine
from invenio.legacy.search_engine import perform_request_search
from cgi import escape
from invenio.utils.date import strftime
from time import gmtime, ctime
from invenio.modules.access.control import acc_find_user_role_actions
from invenio.legacy.webuser import collect_user_info, getUid
from invenio.legacy.webuser import isUserSuperAdmin
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.control import acc_get_role_id, acc_get_user_roles
from invenio.modules.access.external_authentication_robot import ExternalAuthRobot
from invenio.modules.access.external_authentication_robot import load_robot_keys
from invenio.config import CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL
from invenio.config import CFG_SITE_URL
from invenio.ext.email import send_email
from operator import add
-from invenio.bibauthorid_dbinterface import get_personiID_external_ids #pylint: disable-msg=W0614
+from invenio.legacy.bibauthorid.dbinterface import get_personiID_external_ids #pylint: disable-msg=W0614
from flask import session
def get_person_redirect_link(pid):
'''
Returns the canonical name of a pid if found, the pid itself otherwise
@param pid: int
'''
cname = dbapi.get_canonical_id_from_personid(pid)
if len(cname) > 0:
return str(cname[0][0])
else:
return str(pid)
def update_person_canonical_name(person_id, canonical_name, userinfo=''):
'''
Updates a person's canonical name
@param person_id: person id
@param canonical_name: string
'''
if userinfo.count('||'):
uid = userinfo.split('||')[0]
else:
uid = ''
dbapi.update_personID_canonical_names([person_id], overwrite=True, suggested=canonical_name)
dbapi.insert_user_log(userinfo, person_id, 'data_update', 'CMPUI_changecanonicalname', '', 'Canonical name manually updated.', userid=uid)
def delete_person_external_ids(person_id, existing_ext_ids, userinfo=''):
'''
Deletes external ids of the person
@param person_id: person id
@type person_id: int
@param existing_ext_ids: external ids to delete
@type existing_ext_ids: list
'''
if userinfo.count('||'):
uid = userinfo.split('||')[0]
else:
uid = ''
deleted_ids = []
for el in existing_ext_ids:
if el.count('||'):
ext_sys = el.split('||')[0]
ext_id = el.split('||')[1]
else:
continue
tag = 'extid:%s' % ext_sys
dbapi.del_person_data(tag, person_id, ext_id)
deleted_ids.append((person_id, tag, ext_id))
dbapi.insert_user_log(userinfo, person_id, 'data_deletion', 'CMPUI_deleteextid', '', 'External ids manually deleted: ' + str(deleted_ids), userid=uid)
def add_person_external_id(person_id, ext_sys, ext_id, userinfo=''):
'''
Adds an external id for the person
@param person_id: person id
@type person_id: int
@param ext_sys: external system
@type ext_sys: str
@param ext_id: external id
@type ext_id: str
'''
if userinfo.count('||'):
uid = userinfo.split('||')[0]
else:
uid = ''
tag = 'extid:%s' % ext_sys
dbapi.set_person_data(person_id, tag, ext_id)
log_value = '%s %s %s' % (person_id, tag, ext_id)
dbapi.insert_user_log(userinfo, person_id, 'data_insertion', 'CMPUI_addexternalid', log_value, 'External id manually added.', userid=uid)
def get_canonical_id_from_person_id(person_id):
'''
Finds the person's canonical name from a person ID (e.g. 1)
@param person_id: the person ID
@type person_id: string or int
@return: the canonical name, or person_id on failure
@rtype: string
'''
if not person_id or not (isinstance(person_id, str) or isinstance(person_id, (int, long))):
return person_id
canonical_name = person_id
try:
canonical_name = dbapi.get_canonical_id_from_personid(person_id)[0][0]
except IndexError:
pass
return canonical_name
def get_person_id_from_canonical_id(canonical_id):
'''
Finds the person id from a canonical name (e.g. Ellis_J_R_1)
@param canonical_id: the canonical ID
@type canonical_id: string
@return: result from the request or -1 on failure
@rtype: int
'''
if not canonical_id or not isinstance(canonical_id, str):
return -1
pid = -1
try:
pid = dbapi.get_person_id_from_canonical_id(canonical_id)[0][0]
except IndexError:
pass
return pid
def get_bibrefs_from_bibrecs(bibreclist):
'''
Retrieve all bibrefs for all the recids in the list
@param bibreclist: list of record IDs
@type bibreclist: list of int
@return: a list of [record, bibrefs] pairs
@rtype: list of lists
'''
return [[bibrec, dbapi.get_possible_bibrecref([''], bibrec, always_match=True)]
for bibrec in bibreclist]
def get_possible_bibrefs_from_pid_bibrec(pid, bibreclist, always_match=False, additional_names=None):
'''
Returns for each bibrec a list of bibrefs for which the surname matches.
@param pid: person id to gather the names strings from
@param bibreclist: list of bibrecs on which to search
@param always_match: match all bibrefs no matter the name
@param additional_names: [n1,...,nn] names to match other than the one from personid
'''
pid = wash_integer_id(pid)
pid_names = dbapi.get_person_db_names_set(pid)
if additional_names:
pid_names += zip(additional_names)
lists = []
for bibrec in bibreclist:
lists.append([bibrec, dbapi.get_possible_bibrecref([n[0] for n in pid_names], bibrec,
always_match)])
return lists
def get_pid_from_uid(uid):
'''
Return the PID associated with the uid
@param uid: the internal ID of a user
@type uid: int
@return: the Person ID attached to the user or -1 if none found
'''
if not isinstance(uid, tuple):
uid = ((uid,),)
return dbapi.get_personid_from_uid(uid)
def get_user_level(uid):
'''
Finds and returns the aid-universe-internal numeric user level
@param uid: the user's id
@type uid: int
@return: A numerical representation of the maximum access level of a user
@rtype: int
'''
actions = [row[1] for row in acc_find_user_role_actions({'uid': uid})]
return max([dbapi.resolve_paper_access_right(acc) for acc in actions])
def get_person_id_from_paper(bibref=None):
'''
Returns the id of the person who wrote the paper
@param bibref: the bibref,bibrec pair that identifies the person
@type bibref: str
@return: the person id
@rtype: int
'''
if not is_valid_bibref(bibref):
return -1
person_id = -1
db_data = dbapi.get_papers_status(bibref)
try:
person_id = db_data[0][1]
except IndexError:
pass
return person_id
def get_papers_by_person_id(person_id= -1, rec_status= -2, ext_out=False):
'''
Returns all the papers written by the person
@param person_id: identifier of the person to retrieve papers from
@type person_id: int
@param rec_status: minimal flag status a record must have to be displayed
@type rec_status: int
@param ext_out: Extended output (w/ author aff and date)
@type ext_out: boolean
@return: list of record ids
@rtype: list of int
'''
if not isinstance(person_id, (int, long)):
try:
person_id = int(person_id)
except (ValueError, TypeError):
return []
if person_id < 0:
return []
if not isinstance(rec_status, int):
return []
records = []
db_data = dbapi.get_person_papers(person_id,
rec_status,
show_author_name=True,
show_title=False,
show_rt_status=True,
show_affiliations=ext_out,
show_date=ext_out,
show_experiment=ext_out)
if not ext_out:
records = [[row["data"].split(",")[1], row["data"], row["flag"],
row["authorname"]] for row in db_data]
else:
for row in db_data:
recid = row["data"].split(",")[1]
bibref = row["data"]
flag = row["flag"]
authorname = row["authorname"]
rt_status = row['rt_status']
authoraff = ", ".join(row['affiliation'])
try:
date = sorted(row['date'], key=len)[0]
except IndexError:
date = "Not available"
exp = ", ".join(row['experiment'])
#date = ""
records.append([recid, bibref, flag, authorname,
authoraff, date, rt_status, exp])
return records
def get_papers_cluster(bibref):
'''
Returns the cluster of documents connected with this one
@param bibref: the table:bibref,bibrec pair to look for
@type bibref: str
@return: a list of record IDs
@rtype: list of int
'''
papers = []
person_id = get_person_id_from_paper(bibref)
if person_id > -1:
papers = get_papers_by_person_id(person_id)
return papers
def get_person_request_ticket(pid= -1, tid=None):
'''
Returns the list of request tickets associated to a person.
@param pid: person id
@param tid: ticket id, to select if want to retrieve only a particular one
@return: tickets [[],[]]
'''
if pid < 0:
return []
else:
return dbapi.get_request_ticket(pid, ticket_id=tid)
def get_persons_with_open_tickets_list():
'''
Finds all the persons with open tickets and returns pids and count of tickets
@return: [[pid,ticket_count]]
'''
return dbapi.get_persons_with_open_tickets_list()
def get_person_names_from_id(person_id= -1):
'''
Finds and returns the names associated with this person along with the
frequency of occurrence (i.e. the number of papers)
@param person_id: an id to find the names for
@type person_id: int
@return: name and number of occurrences of the name
@rtype: tuple of tuple
'''
# #retrieve all rows for the person
if (not person_id > -1) or (not isinstance(person_id, (int, long))):
return []
return dbapi.get_person_names_count(person_id)
def get_person_db_names_from_id(person_id= -1):
'''
Finds and returns the names associated with this person as stored in the
meta data of the underlying data set along with the
frequency of occurrence (i.e. the number of papers)
@param person_id: an id to find the names for
@type person_id: int
@return: name and number of occurrences of the name
@rtype: tuple of tuple
'''
# #retrieve all rows for the person
if (not person_id > -1) or (not isinstance(person_id, (int, long))):
return []
return dbapi.get_person_db_names_count(person_id)
def get_longest_name_from_pid(person_id=-1):
'''
Finds the longest name of a person to be representative for this person.
@param person_id: the person ID to look at
@type person_id: int
@return: returns the longest normalized name of a person
@rtype: string
'''
if (not person_id > -1) or (not isinstance(person_id, (int, long))):
return "This doesn't look like a person ID!"
longest_name = ""
for name in dbapi.get_person_names_count(person_id):
if name and len(name[0]) > len(longest_name):
longest_name = name[0]
if longest_name:
return longest_name
else:
return "This person does not seem to have a name!"
def get_most_frequent_name_from_pid(person_id=-1, allow_none=False):
'''
Finds the most frequent name of a person to be
representative for this person.
@param person_id: the person ID to look at
@type person_id: int
@return: returns the most frequent normalized name of a person
@rtype: string
'''
pid = wash_integer_id(person_id)
if (not pid > -1) or (not isinstance(pid, int)):
if allow_none:
return None
else:
return "'%s' doesn't look like a person ID!" % person_id
person_id = pid
mf_name = ""
try:
nn = dbapi.get_person_names_count(person_id)
mf_name = sorted(nn, key=lambda k:k[1], reverse=True)[0][0]
except IndexError:
pass
if mf_name:
return mf_name
else:
if allow_none:
return None
else:
return "This person does not seem to have a name!"
def get_paper_status(bibref):
'''
Finds and returns the status of a bibrec-to-person assignment
@param bibref: the bibref-bibrec pair that unambiguously identifies a paper
@type bibref: string
'''
db_data = dbapi.get_papers_status(bibref)
#data,PersonID,flag
status = None
try:
status = db_data[0][2]
except IndexError:
status = -10
status = wash_integer_id(status)
return status
def wash_integer_id(param_id):
'''
Creates an int out of either int or string
@param param_id: the number to be washed
@type param_id: int or string
@return: The int representation of the param or -1
@rtype: int
'''
pid = -1
try:
pid = int(param_id)
except (ValueError, TypeError):
return (-1)
return pid
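The washing behaviour above can be exercised in isolation; this standalone sketch re-implements the same try/except logic outside Invenio:

```python
def wash_integer_id(param_id):
    # Coerce an int or numeric string to int; anything else becomes -1.
    try:
        return int(param_id)
    except (ValueError, TypeError):
        return -1

assert wash_integer_id("42") == 42
assert wash_integer_id(7) == 7
assert wash_integer_id("abc") == -1
assert wash_integer_id(None) == -1
```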
def is_valid_bibref(bibref):
'''
Determines if the provided string is a valid bibref-bibrec pair
@param bibref: the bibref-bibrec pair that unambiguously identifies a paper
@type bibref: string
@return: True if it is a bibref-bibrec pair and False if it's not
@rtype: boolean
'''
if (not isinstance(bibref, str)) or (not bibref):
return False
if not bibref.count(":"):
return False
if not bibref.count(","):
return False
try:
table = bibref.split(":")[0]
ref = bibref.split(":")[1].split(",")[0]
bibrec = bibref.split(":")[1].split(",")[1]
except IndexError:
return False
try:
table = int(table)
ref = int(ref)
bibrec = int(bibrec)
except (ValueError, TypeError):
return False
return True
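The validation above amounts to checking that the string has the shape table:ref,bibrec with three integer parts; a condensed standalone sketch (slightly stricter than the original, which ignores anything after a second comma):

```python
def is_valid_bibref(bibref):
    # A bibref-bibrec pair looks like 'table:ref,bibrec', e.g. '100:1442,155'.
    if not isinstance(bibref, str) or ":" not in bibref or "," not in bibref:
        return False
    try:
        table, rest = bibref.split(":", 1)
        ref, bibrec = rest.split(",", 1)
        int(table), int(ref), int(bibrec)
    except ValueError:
        return False
    return True

assert is_valid_bibref("100:1442,155") is True
assert is_valid_bibref("100:1442") is False     # no bibrec part
assert is_valid_bibref("tab:ref,rec") is False  # parts are not integers
```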
def is_valid_canonical_id(cid):
'''
Checks if presented canonical ID is valid in structure
Must be of structure: ([Initial|Name]\.)*Lastname\.Number
Example of valid cid: J.Ellis.1
@param cid: The canonical ID to check
@type cid: string
@return: Is it valid?
@rtype: boolean
'''
if not cid.count("."):
return False
xcheck = -1
sp = cid.split(".")
if not (len(sp) > 1 and sp[-1]):
return False
try:
xcheck = int(sp[-1])
except (ValueError, TypeError, IndexError):
return False
if xcheck and xcheck > -1:
return True
else:
return False
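In other words, a canonical ID is valid when it contains at least one dot and its last dot-separated component is a positive integer (zero is rejected because the original check requires `xcheck` to be truthy); a standalone sketch:

```python
def is_valid_canonical_id(cid):
    # e.g. 'J.Ellis.1' -> parts ['J', 'Ellis', '1']; last part must be >= 1.
    parts = cid.split(".")
    if len(parts) < 2 or not parts[-1]:
        return False
    try:
        return int(parts[-1]) >= 1
    except ValueError:
        return False

assert is_valid_canonical_id("J.Ellis.1") is True
assert is_valid_canonical_id("Ellis.0") is False   # zero is falsy, so rejected
assert is_valid_canonical_id("JEllis1") is False   # no dots at all
```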
def add_person_comment(person_id, message):
'''
Adds a comment to a person after enriching it with meta-data (date+time)
@param person_id: person id to assign the comment to
@type person_id: int
@param message: defines the comment to set
@type message: string
@return the message incl. the metadata if everything was fine, False on err
@rtype: string or boolean
'''
msg = ""
pid = -1
try:
msg = str(message)
pid = int(person_id)
except (ValueError, TypeError):
return False
strtimestamp = strftime("%Y-%m-%d %H:%M:%S", gmtime())
msg = escape(msg, quote=True)
dbmsg = "%s;;;%s" % (strtimestamp, msg)
dbapi.set_person_data(pid, "comment", dbmsg)
return dbmsg
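The stored comment is just 'timestamp;;;escaped-message'; a standalone sketch of building and parsing that format (using html.escape here, since cgi.escape from the original Python 2 code is gone in modern Python):

```python
from html import escape           # the original Python 2 code uses cgi.escape
from time import gmtime, strftime

def build_comment(message):
    # Enrich the message with a UTC timestamp, separated by ';;;'.
    timestamp = strftime("%Y-%m-%d %H:%M:%S", gmtime())
    return "%s;;;%s" % (timestamp, escape(str(message), quote=True))

stored = build_comment("Check <this> record")
timestamp, sep, text = stored.partition(";;;")
assert sep == ";;;"
assert text == "Check &lt;this&gt; record"
assert len(timestamp) == len("2010-09-30 19:30:00")
```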
def get_person_comments(person_id):
'''
Get all comments from a person
@param person_id: person id to get the comments from
@type person_id: int
@return the message incl. the metadata if everything was fine, False on err
@rtype: string or boolean
'''
pid = -1
comments = []
try:
pid = int(person_id)
except (ValueError, TypeError):
return False
for row in dbapi.get_person_data(pid, "comment"):
comments.append(row[1])
return comments
def search_person_ids_by_name(namequery):
'''
Prepares the search to search in the database
@param namequery: the search query entered by the user
@type namequery: string
@return: information about the result w/ probability and occurrence
@rtype: tuple of tuple
'''
query = ""
escaped_query = ""
try:
query = str(namequery)
except (ValueError, TypeError):
return []
if query:
escaped_query = escape(query, quote=True)
else:
return []
return dbapi.find_personIDs_by_name_string(escaped_query)
def insert_log(userinfo, personid, action, tag, value, comment='', transactionid=0):
'''
Log an action performed by a user
Examples (in the DB):
1 2010-09-30 19:30 admin||10.0.0.1 1 assign paper 1133:4442 'from 23'
1 2010-09-30 19:30 admin||10.0.0.1 1 assign paper 8147:4442
2 2010-09-30 19:35 admin||10.0.0.1 1 reject paper 72:4442
@param userinfo: information about the user [UID|IP]
@type userinfo: string
@param personid: ID of the person this action is targeting
@type personid: int
@param action: intended action
@type action: string
@param tag: A tag to describe the data entered
@type tag: string
@param value: The value of the action described by the tag
@type value: string
@param comment: Optional comment to describe the transaction
@type comment: string
@param transactionid: May group bulk operations together
@type transactionid: int
@return: Returns the current transactionid
@rtype: int
'''
userinfo = escape(str(userinfo))
action = escape(str(action))
tag = escape(str(tag))
value = escape(str(value))
comment = escape(str(comment))
if not isinstance(personid, int):
try:
personid = int(personid)
except (ValueError, TypeError):
return -1
if not isinstance(transactionid, int):
try:
transactionid = int(transactionid)
except (ValueError, TypeError):
return -1
if userinfo.count('||'):
uid = userinfo.split('||')[0]
else:
uid = ''
return dbapi.insert_user_log(userinfo, personid, action, tag,
value, comment, transactionid, userid=uid)
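The userinfo string is expected in the 'UID||IP' shape shown in the docstring examples; extracting the uid is a simple split, sketched standalone here (the helper name is hypothetical):

```python
def uid_from_userinfo(userinfo):
    # 'admin||10.0.0.1' -> 'admin'; strings without '||' carry no uid.
    if '||' in userinfo:
        return userinfo.split('||')[0]
    return ''

assert uid_from_userinfo('admin||10.0.0.1') == 'admin'
assert uid_from_userinfo('10.0.0.1') == ''
```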
def user_can_modify_data(uid, pid):
'''
Determines if a user may modify the data of a person
@param uid: the id of a user (invenio user id)
@type uid: int
@param pid: the id of a person
@type pid: int
@return: True if the user may modify data, False if not
@rtype: boolean
@raise ValueError: if the supplied parameters are invalid
'''
if not isinstance(uid, int):
try:
uid = int(uid)
except (ValueError, TypeError):
raise ValueError("User ID has to be a number!")
if not isinstance(pid, int):
try:
pid = int(pid)
except (ValueError, TypeError):
raise ValueError("Person ID has to be a number!")
return dbapi.user_can_modify_data(uid, pid)
def user_can_modify_paper(uid, paper):
'''
Determines if a user may modify the record assignments of a person
@param uid: the id of a user (invenio user id)
@type uid: int
@param pid: the id of a person
@type pid: int
@return: True if the user may modify data, False if not
@rtype: boolean
@raise ValueError: if the supplied parameters are invalid
'''
if not isinstance(uid, int):
try:
uid = int(uid)
except (ValueError, TypeError):
raise ValueError("User ID has to be a number!")
if not paper:
raise ValueError("A bibref is expected!")
return dbapi.user_can_modify_paper(uid, paper)
def person_bibref_is_touched_old(pid, bibref):
'''
Determines if an assignment has been touched by a user (i.e. check for
the flag of an assignment being 2 or -2)
@param pid: the id of the person to check against
@type pid: int
@param bibref: the bibref-bibrec pair that unambiguously identifies a paper
@type bibref: string
@raise ValueError: if the supplied parameters are invalid
'''
if not isinstance(pid, int):
try:
pid = int(pid)
except (ValueError, TypeError):
raise ValueError("Person ID has to be a number!")
if not bibref:
raise ValueError("A bibref is expected!")
return dbapi.person_bibref_is_touched_old(pid, bibref)
def get_review_needing_records(pid):
'''
Returns list of records associated to pid which are in need of review
(i.e. only the bibrec is known, no bibref selected)
@param pid: pid
'''
pid = wash_integer_id(pid)
db_data = dbapi.get_person_papers_to_be_manually_reviewed(pid)
return [int(row[1]) for row in db_data if row[1]]
def add_review_needing_record(pid, bibrec_id):
'''
Add record in need of review to a person
@param pid: pid
@param bibrec_id: bibrec
'''
pid = wash_integer_id(pid)
bibrec_id = wash_integer_id(bibrec_id)
dbapi.add_person_paper_needs_manual_review(pid, bibrec_id)
def del_review_needing_record(pid, bibrec_id):
'''
Removes a record in need of review from a person
@param pid: personid
@param bibrec_id: bibrec
'''
pid = wash_integer_id(pid)
bibrec_id = wash_integer_id(bibrec_id)
dbapi.del_person_papers_needs_manual_review(pid, bibrec_id)
def get_processed_external_recids(pid):
'''
Get list of records that have been processed from external identifiers
@param pid: Person ID to look up the info for
@type pid: int
@return: list of record IDs
@rtype: list of strings
'''
list_str = dbapi.get_processed_external_recids(pid)
return list_str.split(";")
def set_processed_external_recids(pid, recid_list):
'''
Set list of records that have been processed from external identifiers
@param pid: Person ID to set the info for
@type pid: int
@param recid_list: list of recids
@type recid_list: list of int
'''
if isinstance(recid_list, list):
recid_list_str = ";".join(recid_list)
dbapi.set_processed_external_recids(pid, recid_list_str)
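Together, these two functions round-trip the recid list through a semicolon-joined string. A standalone sketch of that encoding; note that the original `";".join(recid_list)` requires string elements, even though the docstring says list of int, so this sketch coerces explicitly:

```python
def encode_recids(recid_list):
    # Store recids as 'r1;r2;...'; coerce elements to strings for join().
    return ";".join(str(r) for r in recid_list)

def decode_recids(list_str):
    return list_str.split(";")

assert decode_recids(encode_recids([12, 34, 56])) == ["12", "34", "56"]
assert decode_recids(encode_recids([])) == [""]  # empty list round-trips to ['']
```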
def arxiv_login(req, picked_profile=None):
'''
Log in through arXiv. If the user is already associated with a personid, returns that personid.
If the user has no pid, tries to guess which personid to associate based on the surname and
the papers from arXiv. If no compatible person is found, creates a new person.
At the end of the process, opens a ticket for the user claiming the papers from arXiv.
Note: the user will find the open ticket, which requires them to go through the
final review before it gets committed.
@param req: Apache request object
@type req: Apache request object
@return: Returns the pid resulting in the process
@rtype: int
'''
def session_bareinit(req):
try:
pinfo = session["personinfo"]
if 'ticket' not in pinfo:
pinfo["ticket"] = []
except KeyError:
pinfo = dict()
session['personinfo'] = pinfo
pinfo["ticket"] = []
session.dirty = True
session_bareinit(req)
pinfo = session['personinfo']
ticket = session['personinfo']['ticket']
uinfo = collect_user_info(req)
pinfo['external_first_entry'] = False
try:
name = uinfo['external_firstname']
except KeyError:
name = ''
try:
surname = uinfo['external_familyname']
except KeyError:
surname = ''
if surname:
session['personinfo']['arxiv_name'] = nameapi.create_normalized_name(
nameapi.split_name_parts(surname + ', ' + name))
else:
session['personinfo']['arxiv_name'] = ''
session.dirty = True
try:
arxiv_p_ids = uinfo['external_arxivids'].split(';')
except KeyError:
arxiv_p_ids = []
#'external_arxivids': 'hep-th/0112017;hep-th/0112020',
#'external_familyname': 'Weiler',
#'external_firstname': 'Henning',
try:
found_bibrecs = set(reduce(add, [perform_request_search(p='037:' + str(arx), of='id', rg=0) for arx in arxiv_p_ids]))
except (IndexError, TypeError):
found_bibrecs = set()
#found_bibrecs = [567700, 567744]
uid = getUid(req)
pid, pid_found = dbapi.get_personid_from_uid([[uid]])
if pid_found:
pid = pid[0]
else:
if picked_profile is None:
top5_list = dbapi.find_top5_personid_for_new_arXiv_user(found_bibrecs,
nameapi.create_normalized_name(nameapi.split_name_parts(surname + ', ' + name)))
return ("top5_list", top5_list)
else:
pid = dbapi.check_personids_availability(picked_profile, uid)
pid_bibrecs = set([i[0] for i in dbapi.get_all_personids_recs(pid, claimed_only=True)])
missing_bibrecs = found_bibrecs - pid_bibrecs
#present_bibrecs = found_bibrecs.intersection(pid_bibrecs)
#assert len(found_bibrecs) == len(missing_bibrecs) + len(present_bibrecs)
tempticket = []
#now we have to open the tickets...
#person_papers contains the papers which are already assigned to the person and came from arXiv;
#they can be claimed regardless
for bibrec in missing_bibrecs:
tempticket.append({'pid':pid, 'bibref':str(bibrec), 'action':'confirm'})
#check if ticket targets (bibref for pid) are already in ticket
for t in list(tempticket):
for e in list(ticket):
if e['pid'] == t['pid'] and e['bibref'] == t['bibref']:
ticket.remove(e)
ticket.append(t)
session.dirty = True
if picked_profile is not None and picked_profile != pid and picked_profile != -1:
return ("chosen pid not available", pid)
elif picked_profile is not None and picked_profile == pid and picked_profile != -1:
return ("pid assigned by user", pid)
else:
return ("pid", pid)
def external_user_can_perform_action(uid):
'''
Checks for an SSO user and whether external claims will affect the
decision whether or not the user may use the Invenio claiming platform
@param uid: the user ID to check permissions for
@type uid: int
@return: is user allowed to perform actions?
@rtype: boolean
'''
#If no EXTERNAL_CLAIMED_RECORDS_KEY we bypass this check
if not bconfig.EXTERNAL_CLAIMED_RECORDS_KEY:
return True
uinfo = collect_user_info(uid)
keys = []
for k in bconfig.EXTERNAL_CLAIMED_RECORDS_KEY:
if k in uinfo:
keys.append(k)
full_key = False
for k in keys:
if uinfo[k]:
full_key = True
break
return full_key
def is_external_user(uid):
'''
Checks for an SSO user and whether external claims will affect the
decision whether or not the user may use the Invenio claiming platform
@param uid: the user ID to check permissions for
@type uid: int
@return: is user allowed to perform actions?
@rtype: boolean
'''
#If no EXTERNAL_CLAIMED_RECORDS_KEY we bypass this check
if not bconfig.EXTERNAL_CLAIMED_RECORDS_KEY:
return False
uinfo = collect_user_info(uid)
keys = []
for k in bconfig.EXTERNAL_CLAIMED_RECORDS_KEY:
if k in uinfo:
keys.append(k)
full_key = False
for k in keys:
if uinfo[k]:
full_key = True
break
return full_key
def check_transaction_permissions(uid, bibref, pid, action):
'''
Check if the user can perform the given action on the given pid, bibrefrec pair.
Returns one of: granted, denied, warning_granted, warning_denied
@param uid: The internal ID of a user
@type uid: int
@param bibref: the bibref pair to check permissions for
@type bibref: string
@param pid: the Person ID to check on
@type pid: int
@param action: the action that is to be performed
@type action: string
@return: granted, denied, warning_granted xor warning_denied
@rtype: string
'''
c_own = True
c_override = False
is_superadmin = isUserSuperAdmin({'uid': uid})
access_right = _resolve_maximum_acces_rights(uid)
bibref_status = dbapi.get_bibref_modification_status(bibref)
old_flag = bibref_status[0]
if old_flag == 2 or old_flag == -2:
new_flag = old_flag
if action in ['confirm', 'assign']:
new_flag = 2
elif action in ['repeal']:
new_flag = -2
elif action in ['reset']:
new_flag = 0
if old_flag != new_flag:
c_override = True
uid_pid = dbapi.get_personid_from_uid([[uid]])
if not uid_pid[1] or pid != uid_pid[0][0]:
c_own = False
#if we cannot override an already touched bibref, no need to go on checking
if c_override:
if is_superadmin:
return 'warning_granted'
if access_right[1] < bibref_status[1]:
return "warning_denied"
else:
if is_superadmin:
return 'granted'
#let's check if invenio is allowing us the action we want to perform
if c_own:
action = bconfig.CLAIMPAPER_CLAIM_OWN_PAPERS
else:
action = bconfig.CLAIMPAPER_CLAIM_OTHERS_PAPERS
auth = acc_authorize_action(uid, action)
if auth[0] != 0:
return "denied"
#now that we know whether we are claiming for ourselves, we can check the external permissions
if c_own:
action = 'claim_own_paper'
else:
action = 'claim_other_paper'
ext_permission = external_user_can_perform_action(uid)
#if we got here, invenio is allowing the action and we are not overriding a
#user with higher privileges; if the external checks pass, we go on
if ext_permission:
if not c_override:
return "granted"
else:
return "warning_granted"
return "denied"
def delete_request_ticket(pid, ticket):
'''
Delete a request ticket associated to a person
@param pid: pid (int)
@param ticket: ticket id (int)
'''
dbapi.delete_request_ticket(pid, ticket)
def delete_transaction_from_request_ticket(pid, tid, action, bibref):
'''
Deletes a transaction from a ticket. If the ticket becomes empty, deletes the ticket as well.
@param pid: pid
@param tid: ticket id
@param action: action
@param bibref: bibref
'''
rt = get_person_request_ticket(pid, tid)
if len(rt) > 0:
# rt_num = rt[0][1]
rt = rt[0][0]
else:
return
for t in list(rt):
if str(t[0]) == str(action) and str(t[1]) == str(bibref):
rt.remove(t)
action_present = False
for t in rt:
if str(t[0]) in ['confirm', 'repeal']:
action_present = True
if not action_present:
delete_request_ticket(pid, tid)
return
dbapi.update_request_ticket(pid, rt, tid)
def create_request_ticket(userinfo, ticket):
'''
Creates a request ticket
@param userinfo: dictionary of info about user
@param ticket: dictionary ticket
'''
# write ticket to DB
# send eMail to RT
udata = []
mailcontent = []
m = mailcontent.append
m("A user sent a change request through the web interface.")
m("User Information:")
for k, v in userinfo.iteritems():
if v:
m(" %s: %s" % (k, v))
m("\nLinks to all issued Person-based requests:\n")
for i in userinfo:
udata.append([i, userinfo[i]])
tic = {}
for t in ticket:
if t['action'] not in ['confirm', 'assign', 'repeal', 'reset']:
return False
elif t['pid'] < 0:
return False
elif not is_valid_bibref(t['bibref']):
return False
if t['action'] == 'reset':
#we ignore reset tickets
continue
else:
if t['pid'] not in tic:
tic[t['pid']] = []
if t['action'] == 'assign':
t['action'] = 'confirm'
tic[t['pid']].append([t['action'], t['bibref']])
for pid in tic:
data = []
for i in udata:
data.append(i)
data.append(['date', ctime()])
for i in tic[pid]:
data.append(i)
dbapi.update_request_ticket(pid, data)
pidlink = get_person_redirect_link(pid)
m("%s/person/%s?open_claim=True#tabTickets" % (CFG_SITE_URL, pidlink))
m("\nPlease remember that you have to be logged in "
"in order to see the ticket of a person.\n")
if ticket and tic and mailcontent:
sender = CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL
if bconfig.TICKET_SENDING_FROM_USER_EMAIL and userinfo['email']:
sender = userinfo['email']
send_email(sender,
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL,
subject="[Author] Change Request",
content="\n".join(mailcontent))
return True
def send_user_commit_notification_email(userinfo, ticket):
'''
Sends commit notification email to RT system
'''
# send eMail to RT
mailcontent = []
m = mailcontent.append
m("A user committed a change through the web interface.")
m("User Information:")
for k, v in userinfo.iteritems():
if v:
m(" %s: %s" % (k, v))
m("\nChanges:\n")
for t in ticket:
m(" --- <start> --- \n")
for k, v in t.iteritems():
m(" %s: %s \n" % (str(k), str(v)))
if k == 'bibref':
try:
br = int(v.split(',')[1])
m(" Title: %s\n" % search_engine.get_fieldvalues(br, "245__a"))
except (TypeError, ValueError, IndexError):
pass
m(" --- <end> --- \n")
if ticket and mailcontent:
sender = CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL
send_email(sender,
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL,
subject="[Author] NO ACTIONS NEEDED. Changes performed by SSO user.",
content="\n".join(mailcontent))
return True
def user_can_view_CMP(uid):
action = bconfig.CLAIMPAPER_VIEW_PID_UNIVERSE
auth = acc_authorize_action(uid, action)
if auth[0] == 0:
return True
else:
return False
def _resolve_maximum_acces_rights(uid):
'''
returns [max_role, lcul] to use in execute_action and check_transaction_permissions.
Defaults to ['guest',0] if user has no roles assigned.
Always returns the maximum privilege.
'''
roles = {bconfig.CLAIMPAPER_ADMIN_ROLE: acc_get_role_id(bconfig.CLAIMPAPER_ADMIN_ROLE),
bconfig.CLAIMPAPER_USER_ROLE: acc_get_role_id(bconfig.CLAIMPAPER_USER_ROLE)}
uroles = acc_get_user_roles(uid)
max_role = ['guest', 0]
for r in roles:
if roles[r] in uroles:
rright = bconfig.CMPROLESLCUL[r]
if rright >= max_role[1]:
max_role = [r, rright]
return max_role
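The role resolution above reduces to taking the maximum over the user's configured role levels; a standalone sketch with hypothetical role names and levels (the real values come from bconfig.CMPROLESLCUL):

```python
def resolve_max_rights(user_roles, role_levels):
    # Pick the highest-privileged role the user holds; default to guest/0.
    max_role = ['guest', 0]
    for role, level in role_levels.items():
        if role in user_roles and level >= max_role[1]:
            max_role = [role, level]
    return max_role

levels = {'claimpaper_user': 25, 'claimpaper_admin': 50}  # hypothetical values
assert resolve_max_rights({'claimpaper_admin', 'claimpaper_user'}, levels) == ['claimpaper_admin', 50]
assert resolve_max_rights(set(), levels) == ['guest', 0]
```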
def create_new_person(uid, uid_is_owner=False):
'''
Create a new person.
@param uid: User ID to attach to the person
@type uid: int
@param uid_is_owner: Is the uid provided owner of the new person?
@type uid_is_owner: bool
@return: the resulting person ID of the new person
@rtype: int
'''
pid = dbapi.create_new_person(uid, uid_is_owner=uid_is_owner)
return pid
def execute_action(action, pid, bibref, uid, userinfo='', comment=''):
'''
Executes the action, setting the last user right according to uid
@param action: the action to perform
@type action: string
@param pid: the Person ID to perform the action on
@type pid: int
@param bibref: the bibref pair to perform the action for
@type bibref: string
@param uid: the internal user ID of the currently logged in user
@type uid: int
@return: list of a tuple: [(status, message), ] or None if something went wrong
@rtype: [(bool, str), ]
'''
pid = wash_integer_id(pid)
if action not in ['confirm', 'assign', 'repeal', 'reset']:
return None
elif pid == -3:
pid = dbapi.create_new_person(uid, uid_is_owner=False)
elif pid < 0:
return None
elif not is_valid_bibref(bibref):
return None
if userinfo.count('||'):
uid = userinfo.split('||')[0]
else:
uid = ''
user_level = _resolve_maximum_acces_rights(uid)[1]
res = None
if action in ['confirm', 'assign']:
dbapi.insert_user_log(userinfo, pid, 'assign', 'CMPUI_ticketcommit', bibref, comment, userid=uid)
res = dbapi.confirm_papers_to_person(pid, [bibref], user_level)
elif action in ['repeal']:
dbapi.insert_user_log(userinfo, pid, 'repeal', 'CMPUI_ticketcommit', bibref, comment, userid=uid)
res = dbapi.reject_papers_from_person(pid, [bibref], user_level)
elif action in ['reset']:
dbapi.insert_user_log(userinfo, pid, 'reset', 'CMPUI_ticketcommit', bibref, comment, userid=uid)
res = dbapi.reset_papers_flag(pid, [bibref])
#This is the only point which modifies a person, so this can trigger the
#deletion of a cached page
webauthorapi.expire_all_cache_for_personid(pid)
return res
def sign_assertion(robotname, assertion):
'''
Sign an assertion for the export of IDs
@param robotname: name of the robot. E.g. 'arxivz'
@type robotname: string
@param assertion: JSONized object to sign
@type assertion: string
@return: The signature
@rtype: string
'''
secr = ""
if not robotname:
return ""
robot = ExternalAuthRobot()
keys = load_robot_keys()
try:
secr = keys["Robot"][robotname]
except KeyError:
secr = ""
return robot.sign(secr, assertion)
diff --git a/invenio/legacy/bibauthorid/webauthorprofileinterface.py b/invenio/legacy/bibauthorid/webauthorprofileinterface.py
index 0db888f9a..add4f1b75 100644
--- a/invenio/legacy/bibauthorid/webauthorprofileinterface.py
+++ b/invenio/legacy/bibauthorid/webauthorprofileinterface.py
@@ -1,49 +1,49 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio.bibauthorid_config import CLAIMPAPER_ADMIN_ROLE #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_config import CLAIMPAPER_USER_ROLE #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.config import CLAIMPAPER_ADMIN_ROLE #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.config import CLAIMPAPER_USER_ROLE #emitting #pylint: disable-msg=W0611
#import invenio.bibauthorid_webapi as webapi
-#import invenio.bibauthorid_config as bconfig
+#import invenio.legacy.bibauthorid.config as bconfig
from invenio.bibauthorid_frontinterface import get_bibrefrec_name_string #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import search_person_ids_by_name #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import get_papers_by_person_id #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_person_db_names_count #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_existing_personids #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_person_db_names_count #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_existing_personids #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import get_person_redirect_link #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import is_valid_canonical_id #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import get_person_id_from_paper #emitting #pylint: disable-msg=W0611
from invenio.bibauthorid_webapi import get_person_id_from_canonical_id #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_person_names_count #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_canonical_id_from_personid #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_dbinterface import get_coauthor_pids #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_person_names_count #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_canonical_id_from_personid #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.dbinterface import get_coauthor_pids #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_name_utils import create_normalized_name #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_name_utils import split_name_parts #emitting #pylint: disable-msg=W0611
-#from invenio.bibauthorid_config import CLAIMPAPER_CLAIM_OTHERS_PAPERS
-from invenio.bibauthorid_config import AID_ENABLED #emitting #pylint: disable-msg=W0611
-from invenio.bibauthorid_config import AID_ON_AUTHORPAGES #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.name_utils import create_normalized_name #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.name_utils import split_name_parts #emitting #pylint: disable-msg=W0611
+#from invenio.legacy.bibauthorid.config import CLAIMPAPER_CLAIM_OTHERS_PAPERS
+from invenio.legacy.bibauthorid.config import AID_ENABLED #emitting #pylint: disable-msg=W0611
+from invenio.legacy.bibauthorid.config import AID_ON_AUTHORPAGES #emitting #pylint: disable-msg=W0611
from invenio import bibauthorid_searchinterface as pt #emitting #pylint: disable-msg=W0611
def gathered_names_by_personid(pid):
return [p[0] for p in get_person_names_count(pid)]
diff --git a/invenio/legacy/bibauthorid/webinterface.py b/invenio/legacy/bibauthorid/webinterface.py
index e226d1db3..e7549e002 100644
--- a/invenio/legacy/bibauthorid/webinterface.py
+++ b/invenio/legacy/bibauthorid/webinterface.py
@@ -1,2622 +1,2622 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
""" Bibauthorid Web Interface Logic and URL handler. """
# pylint: disable=W0105
# pylint: disable=C0301
# pylint: disable=W0613
from cgi import escape
from copy import deepcopy
from pprint import pformat
from operator import itemgetter
try:
from invenio.utils.json import json, CFG_JSON_AVAILABLE
except:
CFG_JSON_AVAILABLE = False
json = None
-from invenio.bibauthorid_config import AID_ENABLED, CLAIMPAPER_ADMIN_ROLE, CLAIMPAPER_USER_ROLE, \
+from invenio.legacy.bibauthorid.config import AID_ENABLED, CLAIMPAPER_ADMIN_ROLE, CLAIMPAPER_USER_ROLE, \
PERSON_SEARCH_RESULTS_SHOW_PAPERS_PERSON_LIMIT, \
BIBAUTHORID_UI_SKIP_ARXIV_STUB_PAGE, VALID_EXPORT_FILTERS
from invenio.config import CFG_SITE_LANG, CFG_SITE_URL, CFG_SITE_NAME, CFG_INSPIRE_SITE #, CFG_SITE_SECURE_URL
from invenio.legacy.webpage import page, pageheaderonly, pagefooteronly
from invenio.base.i18n import gettext_set_language #, wash_language
from invenio.template import load
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import redirect_to_url
from invenio.legacy.webuser import getUid, page_not_authorized, collect_user_info, set_user_preferences, \
email_valid_p, emailUnique, get_email_from_username, get_uid_from_email, \
isUserSuperAdmin
from invenio.modules.access.control import acc_find_user_role_actions, acc_get_user_roles, acc_get_role_id
from invenio.legacy.search_engine import perform_request_search
from invenio.legacy.bibrecord import get_fieldvalues
import invenio.bibauthorid_webapi as webapi
from invenio.bibauthorid_frontinterface import get_bibrefrec_name_string
from flask import session
from invenio.bibauthorid_backinterface import update_personID_external_ids
TEMPLATE = load('bibauthorid')
class WebInterfaceBibAuthorIDPages(WebInterfaceDirectory):
"""
Handle /person pages and AJAX requests
Supplies the methods
/person/<string>
/person/action
/person/welcome
/person/search
/person/you -> /person/<string>
/person/export
/person/claimstub
"""
_exports = ['', 'action', 'welcome', 'search', 'you', 'export', 'tickets_admin', 'claimstub']
def __init__(self, person_id=None):
"""
Constructor of the web interface.
@param person_id: The identifier of a user. Can be one of:
- a bibref: e.g. "100:1442,155"
- a person id: e.g. "14"
- a canonical id: e.g. "Ellis_J_1"
@type person_id: string
@return: will return an empty object if the identifier is of wrong type
@rtype: None (if something is not right)
"""
pid = -1
is_bibref = False
is_canonical_id = False
self.adf = self.__init_call_dispatcher()
if (not isinstance(person_id, str)) or (not person_id):
self.person_id = pid
return None
if person_id.count(":") and person_id.count(","):
is_bibref = True
elif webapi.is_valid_canonical_id(person_id):
is_canonical_id = True
if is_bibref and pid > -2:
bibref = person_id
table, ref, bibrec = None, None, None
if not bibref.count(":"):
pid = -2
if not bibref.count(","):
pid = -2
try:
table = bibref.split(":")[0]
ref = bibref.split(":")[1].split(",")[0]
bibrec = bibref.split(":")[1].split(",")[1]
except IndexError:
pid = -2
try:
table = int(table)
ref = int(ref)
bibrec = int(bibrec)
except (ValueError, TypeError):
pid = -2
if pid == -1:
try:
pid = int(webapi.get_person_id_from_paper(person_id))
except (ValueError, TypeError):
pid = -1
else:
pid = -1
elif is_canonical_id:
try:
pid = int(webapi.get_person_id_from_canonical_id(person_id))
except (ValueError, TypeError):
pid = -1
else:
try:
pid = int(person_id)
except ValueError:
pid = -1
self.person_id = pid
def __call__(self, req, form):
'''
Serve the main person page.
Will use the object's person id to get a person's information.
@param req: Apache Request Object
@type req: Apache Request Object
@param form: Parameters sent via GET or POST request
@type form: dict
@return: a full page formatted in HTML
@rtype: string
'''
self._session_bareinit(req)
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0),
'ticketid': (int, -1),
'open_claim': (str, None)})
ln = argd['ln']
# ln = wash_language(argd['ln'])
rt_ticket_id = argd['ticketid']
req.argd = argd #needed for perform_req_search
ulevel = self.__get_user_role(req)
uid = getUid(req)
if self.person_id < 0:
return redirect_to_url(req, "%s/person/search" % (CFG_SITE_URL))
if isUserSuperAdmin({'uid': uid}):
ulevel = 'admin'
no_access = self._page_access_permission_wall(req, [self.person_id])
if no_access:
return no_access
try:
pinfo = session["personinfo"]
except KeyError:
pinfo = dict()
session['personinfo'] = pinfo
if 'open_claim' in argd and argd['open_claim']:
pinfo['claim_in_process'] = True
elif "claim_in_process" in pinfo and pinfo["claim_in_process"]:
pinfo['claim_in_process'] = True
else:
pinfo['claim_in_process'] = False
uinfo = collect_user_info(req)
uinfo['precached_viewclaimlink'] = pinfo['claim_in_process']
set_user_preferences(uid, uinfo)
pinfo['ulevel'] = ulevel
if self.person_id != -1:
pinfo["claimpaper_admin_last_viewed_pid"] = self.person_id
pinfo["ln"] = ln
if "ticket" not in pinfo:
pinfo["ticket"] = []
if rt_ticket_id:
pinfo["admin_requested_ticket_id"] = rt_ticket_id
session.dirty = True
content = ''
for part in ['optional_menu', 'ticket_box', 'personid_info', 'tabs', 'footer']:
content += self.adf[part][ulevel](req, form, ln)
title = self.adf['title'][ulevel](req, form, ln)
body = TEMPLATE.tmpl_person_detail_layout(content)
metaheaderadd = self._scripts()
metaheaderadd += '\n <meta name="robots" content="nofollow" />'
self._clean_ticket(req)
return page(title=title,
metaheaderadd=metaheaderadd,
body=body,
req=req,
language=ln)
def _page_access_permission_wall(self, req, req_pid=None, req_level=None):
'''
Display an error page if user not authorized to use the interface.
@param req: Apache Request Object for session management
@type req: Apache Request Object
@param req_pid: Requested person id
@type req_pid: int
@param req_level: Request level required for the page
@type req_level: string
'''
uid = getUid(req)
pinfo = session["personinfo"]
uinfo = collect_user_info(req)
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
is_authorized = True
pids_to_check = []
if not AID_ENABLED:
return page_not_authorized(req, text=_("Fatal: Author ID capabilities are disabled on this system."))
if req_level and 'ulevel' in pinfo and pinfo["ulevel"] != req_level:
return page_not_authorized(req, text=_("Fatal: You are not allowed to access this functionality."))
if req_pid and not isinstance(req_pid, list):
pids_to_check = [req_pid]
elif req_pid and isinstance(req_pid, list):
pids_to_check = req_pid
if (not (uinfo['precached_usepaperclaim']
or uinfo['precached_usepaperattribution'])
and 'ulevel' in pinfo
and not pinfo["ulevel"] == "admin"):
is_authorized = False
if is_authorized and not webapi.user_can_view_CMP(uid):
is_authorized = False
if is_authorized and 'ticket' in pinfo:
for tic in pinfo["ticket"]:
if 'pid' in tic:
pids_to_check.append(tic['pid'])
if pids_to_check and is_authorized:
user_pid = webapi.get_pid_from_uid(uid)
if not uinfo['precached_usepaperattribution']:
if user_pid[1]:
user_pid = user_pid[0][0]
else:
user_pid = -1
if (user_pid not in pids_to_check
and 'ulevel' in pinfo
and not pinfo["ulevel"] == "admin"):
is_authorized = False
elif (user_pid in pids_to_check
and 'ulevel' in pinfo
and not pinfo["ulevel"] == "admin"):
for tic in list(pinfo["ticket"]):
if tic["pid"] != user_pid:
pinfo['ticket'].remove(tic)
if not is_authorized:
return page_not_authorized(req, text=_("Fatal: You are not allowed to access this functionality."))
else:
return ""
def _session_bareinit(self, req):
'''
Initializes session personinfo entry if none exists
@param req: Apache Request Object
@type req: Apache Request Object
'''
uid = getUid(req)
ulevel = self.__get_user_role(req)
if isUserSuperAdmin({'uid': uid}):
ulevel = 'admin'
try:
pinfo = session["personinfo"]
pinfo['ulevel'] = ulevel
if "claimpaper_admin_last_viewed_pid" not in pinfo:
pinfo["claimpaper_admin_last_viewed_pid"] = -2
if 'ln' not in pinfo:
pinfo["ln"] = 'en'
if 'ticket' not in pinfo:
pinfo["ticket"] = []
session.dirty = True
except KeyError:
pinfo = dict()
session['personinfo'] = pinfo
pinfo['ulevel'] = ulevel
pinfo["claimpaper_admin_last_viewed_pid"] = -2
pinfo["ln"] = 'en'
pinfo["ticket"] = []
session.dirty = True
def _lookup(self, component, path):
"""
This handler parses dynamic URLs:
- /person/1332 shows the page of person 1332
- /person/100:5522,1431 shows the page of the person
identified by the table:bibref,bibrec pair
"""
if component not in self._exports:
return WebInterfaceBibAuthorIDPages(component), path
def __init_call_dispatcher(self):
'''
Initialization of the call dispatcher dictionary
@return: call dispatcher dictionary
@rtype: dict
'''
#author_detail_functions
adf = dict()
adf['title'] = dict()
adf['optional_menu'] = dict()
adf['ticket_box'] = dict()
adf['tabs'] = dict()
adf['footer'] = dict()
adf['personid_info'] = dict()
adf['ticket_dispatch'] = dict()
adf['ticket_commit'] = dict()
adf['title']['guest'] = self._generate_title_guest
adf['title']['user'] = self._generate_title_user
adf['title']['admin'] = self._generate_title_admin
adf['optional_menu']['guest'] = self._generate_optional_menu_guest
adf['optional_menu']['user'] = self._generate_optional_menu_user
adf['optional_menu']['admin'] = self._generate_optional_menu_admin
adf['ticket_box']['guest'] = self._generate_ticket_box_guest
adf['ticket_box']['user'] = self._generate_ticket_box_user
adf['ticket_box']['admin'] = self._generate_ticket_box_admin
adf['personid_info']['guest'] = self._generate_person_info_box_guest
adf['personid_info']['user'] = self._generate_person_info_box_user
adf['personid_info']['admin'] = self._generate_person_info_box_admin
adf['tabs']['guest'] = self._generate_tabs_guest
adf['tabs']['user'] = self._generate_tabs_user
adf['tabs']['admin'] = self._generate_tabs_admin
adf['footer']['guest'] = self._generate_footer_guest
adf['footer']['user'] = self._generate_footer_user
adf['footer']['admin'] = self._generate_footer_admin
adf['ticket_dispatch']['guest'] = self._ticket_dispatch_user
adf['ticket_dispatch']['user'] = self._ticket_dispatch_user
adf['ticket_dispatch']['admin'] = self._ticket_dispatch_admin
adf['ticket_commit']['guest'] = self._ticket_commit_guest
adf['ticket_commit']['user'] = self._ticket_commit_user
adf['ticket_commit']['admin'] = self._ticket_commit_admin
return adf
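# Usage sketch (illustrative only): the dispatcher is keyed first by page
# part, then by user role, and each entry is a bound renderer method:
#
#   adf = self.__init_call_dispatcher()
#   html = adf['tabs']['admin'](req, form, ln)    # renders the admin tabs
#   title = adf['title']['guest'](req, form, ln)  # guest page title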
def _generate_title_guest(self, req, form, ln):
'''
Generate the title for a guest user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
if self.person_id:
return 'Attribute papers for: ' + str(webapi.get_person_redirect_link(self.person_id))
else:
return 'Attribute papers'
def _generate_title_user(self, req, form, ln):
'''
Generate the title for a regular user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
if self.person_id:
return 'Attribute papers (user interface) for: ' + str(webapi.get_person_redirect_link(self.person_id))
else:
return 'Attribute papers'
def _generate_title_admin(self, req, form, ln):
'''
Generate the title for an admin user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
if self.person_id:
return 'Attribute papers (administrator interface) for: ' + str(webapi.get_person_redirect_link(self.person_id))
else:
return 'Attribute papers'
def _generate_optional_menu_guest(self, req, form, ln):
'''
Generate the menu for a guest user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0)})
menu = TEMPLATE.tmpl_person_menu()
if "verbose" in argd and argd["verbose"] > 0:
pinfo = session['personinfo']
menu += "\n<pre>" + pformat(pinfo) + "</pre>\n"
return menu
def _generate_optional_menu_user(self, req, form, ln):
'''
Generate the menu for a regular user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0)})
menu = TEMPLATE.tmpl_person_menu()
if "verbose" in argd and argd["verbose"] > 0:
pinfo = session['personinfo']
menu += "\n<pre>" + pformat(pinfo) + "</pre>\n"
return menu
def _generate_optional_menu_admin(self, req, form, ln):
'''
Generate the menu for an admin user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0)})
menu = TEMPLATE.tmpl_person_menu_admin()
if "verbose" in argd and argd["verbose"] > 0:
pinfo = session['personinfo']
menu += "\n<pre>" + pformat(pinfo) + "</pre>\n"
return menu
def _generate_ticket_box_guest(self, req, form, ln):
'''
Generate the semi-permanent info box for a guest user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
pinfo = session['personinfo']
ticket = pinfo['ticket']
results = []
pendingt = []
for t in ticket:
if 'execution_result' in t:
if t['execution_result']:
for res in t['execution_result']:
results.append(res)
else:
pendingt.append(t)
box = ''
if pendingt:
box = box + TEMPLATE.tmpl_ticket_box('in_process', 'transaction', len(pendingt))
if results:
failed = [messages for status, messages in results if not status]
if failed:
box = box + TEMPLATE.tmpl_transaction_box('failure', failed)
successful = [messages for status, messages in results if status]
if successful:
box = box + TEMPLATE.tmpl_transaction_box('success', successful)
return box
def _generate_ticket_box_user(self, req, form, ln):
'''
Generate the semi-permanent info box for a regular user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
return self._generate_ticket_box_guest(req, form, ln)
def _generate_ticket_box_admin(self, req, form, ln):
'''
Generate the semi-permanent info box for an admin user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
return self._generate_ticket_box_guest(req, form, ln)
def _generate_person_info_box_guest(self, req, form, ln):
'''
Generate the name info box for a guest user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
return self._generate_person_info_box_admin(req, form, ln)
def _generate_person_info_box_user(self, req, form, ln):
'''
Generate the name info box for a regular user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
return self._generate_person_info_box_admin(req, form, ln)
def _generate_person_info_box_admin(self, req, form, ln):
'''
Generate the name info box for an admin user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
names = webapi.get_person_names_from_id(self.person_id)
box = TEMPLATE.tmpl_admin_person_info_box(ln, person_id=self.person_id,
names=names)
return box
def _generate_tabs_guest(self, req, form, ln):
'''
Generate the tabs content for a guest user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
# uid = getUid(req)
pinfo = session["personinfo"]
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
links = [] # ['delete', 'commit','del_entry','commit_entry']
tabs = ['records', 'repealed', 'review']
verbiage_dict = {'confirmed': 'Papers', 'repealed': _('Papers removed from this profile'),
'review': _('Papers in need of review'),
'tickets': _('Open Tickets'), 'data': _('Data'),
'confirmed_ns': _('Papers of this Person'),
'repealed_ns': _('Papers _not_ of this Person'),
'review_ns': _('Papers in need of review'),
'tickets_ns': _('Tickets for this Person'),
'data_ns': _('Additional Data for this Person')}
buttons_verbiage_dict = {'mass_buttons': {'no_doc_string': _('Sorry, there are currently no documents to be found in this category.'),
'b_confirm': _('Yes, those papers are by this person.'),
'b_repeal': _('No, those papers are not by this person'),
'b_to_others': _('Assign to other person'),
'b_forget': _('Forget decision')},
'record_undecided': {'alt_confirm': _('Confirm!'),
'confirm_text': _('Yes, this paper is by this person.'),
'alt_repeal': _('Rejected!'),
'repeal_text': _('No, this paper is <i>not</i> by this person'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_confirmed': {'alt_confirm': _('Confirmed.'),
'confirm_text': _('Marked as this person\'s paper'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repeal!'),
'repeal_text': _('But it\'s <i>not</i> this person\'s paper.'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_repealed': {'alt_confirm': _('Confirm!'),
'confirm_text': _('But it <i>is</i> this person\'s paper.'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repealed'),
'repeal_text': _('Marked as not this person\'s paper'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')}}
return self._generate_tabs_admin(req, form, ln, show_tabs=tabs, ticket_links=links,
show_reset_button=False,
open_tickets=[], verbiage_dict=verbiage_dict,
buttons_verbiage_dict=buttons_verbiage_dict)
def _generate_tabs_user(self, req, form, ln):
'''
Generate the tabs content for a regular user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
'''
uid = getUid(req)
pinfo = session['personinfo']
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
links = ['delete', 'del_entry']
tabs = ['records', 'repealed', 'review', 'tickets']
if pinfo["claimpaper_admin_last_viewed_pid"] == webapi.get_pid_from_uid(uid)[0][0]:
verbiage_dict = {'confirmed': _('Your papers'), 'repealed': _('Not your papers'),
'review': _('Papers in need of review'),
'tickets': _('Your tickets'), 'data': _('Data'),
'confirmed_ns': _('Your papers'),
'repealed_ns': _('Not your papers'),
'review_ns': _('Papers in need of review'),
'tickets_ns': _('Your tickets'),
'data_ns': _('Additional Data for this Person')}
buttons_verbiage_dict = {'mass_buttons': {'no_doc_string': _('Sorry, there are currently no documents to be found in this category.'),
'b_confirm': _('These are mine!'),
'b_repeal': _('These are not mine!'),
'b_to_others': _('It\'s not mine, but I know whose it is!'),
'b_forget': _('Forget decision')},
'record_undecided': {'alt_confirm': _('Mine!'),
'confirm_text': _('This is my paper!'),
'alt_repeal': _('Not mine!'),
'repeal_text': _('This is not my paper!'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_confirmed': {'alt_confirm': _('Not Mine.'),
'confirm_text': _('Marked as my paper!'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget assignment decision'),
'alt_repeal': _('Not Mine!'),
'repeal_text': _('But this is mine!'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_repealed': {'alt_confirm': _('Mine!'),
'confirm_text': _('But this is my paper!'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision!'),
'alt_repeal': _('Not Mine!'),
'repeal_text': _('Marked as not your paper.'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')}}
else:
verbiage_dict = {'confirmed': _('Papers'), 'repealed': _('Papers removed from this profile'),
'review': _('Papers in need of review'),
'tickets': _('Your tickets'), 'data': _('Data'),
'confirmed_ns': _('Papers of this Person'),
'repealed_ns': _('Papers _not_ of this Person'),
'review_ns': _('Papers in need of review'),
'tickets_ns': _('Tickets you created about this person'),
'data_ns': _('Additional Data for this Person')}
buttons_verbiage_dict = {'mass_buttons': {'no_doc_string': _('Sorry, there are currently no documents to be found in this category.'),
'b_confirm': _('Yes, those papers are by this person.'),
'b_repeal': _('No, those papers are not by this person'),
'b_to_others': _('Assign to other person'),
'b_forget': _('Forget decision')},
'record_undecided': {'alt_confirm': _('Confirm!'),
'confirm_text': _('Yes, this paper is by this person.'),
'alt_repeal': _('Rejected!'),
'repeal_text': _('No, this paper is <i>not</i> by this person'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_confirmed': {'alt_confirm': _('Confirmed.'),
'confirm_text': _('Marked as this person\'s paper'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repeal!'),
'repeal_text': _('But it\'s <i>not</i> this person\'s paper.'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_repealed': {'alt_confirm': _('Confirm!'),
'confirm_text': _('But it <i>is</i> this person\'s paper.'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repealed'),
'repeal_text': _('Marked as not this person\'s paper'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')}}
uid = getUid(req)
open_tickets = webapi.get_person_request_ticket(self.person_id)
tickets = []
for t in open_tickets:
owns = False
for row in t[0]:
if row[0] == 'uid-ip' and row[1].split('||')[0] == str(uid):
owns = True
if owns:
tickets.append(t)
return self._generate_tabs_admin(req, form, ln, show_tabs=tabs, ticket_links=links,
open_tickets=tickets, verbiage_dict=verbiage_dict,
buttons_verbiage_dict=buttons_verbiage_dict)
def _generate_tabs_admin(self, req, form, ln,
show_tabs=['records', 'repealed', 'review', 'comments', 'tickets', 'data'],
open_tickets=None, ticket_links=['delete', 'commit', 'del_entry', 'commit_entry'],
verbiage_dict=None, buttons_verbiage_dict=None, show_reset_button=True):
'''
Generate the tabs content for an admin user
@param req: Apache Request Object
@type req: Apache Request Object
@param form: POST/GET variables of the request
@type form: dict
@param ln: language to show this page in
@type ln: string
@param show_tabs: list of tabs to display
@type show_tabs: list of strings
@param ticket_links: list of links to display
@type ticket_links: list of strings
@param verbiage_dict: language for the elements
@type verbiage_dict: dict
@param buttons_verbiage_dict: language for the buttons
@type buttons_verbiage_dict: dict
'''
personinfo = {}
try:
personinfo = session["personinfo"]
except KeyError:
return ""
if 'ln' in personinfo:
ln = personinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
if not verbiage_dict:
verbiage_dict = self._get_default_verbiage_dicts_for_admin(req)
if not buttons_verbiage_dict:
buttons_verbiage_dict = self._get_default_buttons_verbiage_dicts_for_admin(req)
all_papers = webapi.get_papers_by_person_id(self.person_id,
ext_out=True)
records = [{'recid': paper[0],
'bibref': paper[1],
'flag': paper[2],
'authorname': paper[3],
'authoraffiliation': paper[4],
'paperdate': paper[5],
'rt_status': paper[6],
'paperexperiment': paper[7]}
for paper in all_papers]
rejected_papers = [row for row in records if row['flag'] < -1]
rest_of_papers = [row for row in records if row['flag'] >= -1]
review_needed = webapi.get_review_needing_records(self.person_id)
if len(review_needed) < 1:
if 'review' in show_tabs:
show_tabs.remove('review')
rt_tickets = None
if open_tickets is None:
open_tickets = webapi.get_person_request_ticket(self.person_id)
else:
if len(open_tickets) < 1:
if 'tickets' in show_tabs:
show_tabs.remove('tickets')
if "admin_requested_ticket_id" in personinfo:
rt_tickets = personinfo["admin_requested_ticket_id"]
# Send data to template function
tabs = TEMPLATE.tmpl_admin_tabs(ln, person_id=self.person_id,
rejected_papers=rejected_papers,
rest_of_papers=rest_of_papers,
review_needed=review_needed,
rt_tickets=rt_tickets,
open_rt_tickets=open_tickets,
show_tabs=show_tabs,
ticket_links=ticket_links,
verbiage_dict=verbiage_dict,
buttons_verbiage_dict=buttons_verbiage_dict,
show_reset_button=show_reset_button)
return tabs
def _get_default_verbiage_dicts_for_admin(self, req):
personinfo = {}
try:
personinfo = session["personinfo"]
except KeyError:
return ""
if 'ln' in personinfo:
ln = personinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
verbiage_dict = {'confirmed': _('Papers'), 'repealed': _('Papers removed from this profile'),
'review': _('Papers in need of review'),
'tickets': _('Tickets'), 'data': _('Data'),
'confirmed_ns': _('Papers of this Person'),
'repealed_ns': _('Papers _not_ of this Person'),
'review_ns': _('Papers in need of review'),
'tickets_ns': _('Request Tickets'),
'data_ns': _('Additional Data for this Person')}
return verbiage_dict
def _get_default_buttons_verbiage_dicts_for_admin(self, req):
personinfo = {}
try:
personinfo = session["personinfo"]
except KeyError:
return ""
if 'ln' in personinfo:
ln = personinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
buttons_verbiage_dict = {'mass_buttons': {'no_doc_string': _('Sorry, there are currently no documents to be found in this category.'),
'b_confirm': _('Yes, those papers are by this person.'),
'b_repeal': _('No, those papers are not by this person'),
'b_to_others': _('Assign to other person'),
'b_forget': _('Forget decision')},
'record_undecided': {'alt_confirm': _('Confirm!'),
'confirm_text': _('Yes, this paper is by this person.'),
'alt_repeal': _('Rejected!'),
'repeal_text': _('No, this paper is <i>not</i> by this person'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_confirmed': {'alt_confirm': _('Confirmed.'),
'confirm_text': _('Marked as this person\'s paper'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repeal!'),
'repeal_text': _('But it\'s <i>not</i> this person\'s paper.'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')},
'record_repealed': {'alt_confirm': _('Confirm!'),
'confirm_text': _('But it <i>is</i> this person\'s paper.'),
'alt_forget': _('Forget decision!'),
'forget_text': _('Forget decision.'),
'alt_repeal': _('Repealed'),
'repeal_text': _('Marked as not this person\'s paper'),
'to_other_text': _('Assign to another person'),
'alt_to_other': _('To other person!')}}
return buttons_verbiage_dict
def _generate_footer_guest(self, req, form, ln):
return self._generate_footer_admin(req, form, ln)
def _generate_footer_user(self, req, form, ln):
return self._generate_footer_admin(req, form, ln)
def _generate_footer_admin(self, req, form, ln):
return TEMPLATE.tmpl_invenio_search_box()
def _ticket_dispatch_guest(self, req):
'''
Takes care of the ticket when in guest mode
'''
return self._ticket_dispatch_user(req)
def _ticket_dispatch_user(self, req):
'''
Takes care of the ticket when in user and guest mode
'''
uid = getUid(req)
pinfo = session["personinfo"]
# ulevel = pinfo["ulevel"]
ticket = pinfo["ticket"]
bibref_check_required = self._ticket_review_bibref_check(req)
if bibref_check_required:
return bibref_check_required
for t in ticket:
t['status'] = webapi.check_transaction_permissions(uid,
t['bibref'],
t['pid'],
t['action'])
session.dirty = True
return self._ticket_final_review(req)
def _ticket_dispatch_admin(self, req):
'''
Takes care of the ticket when in administrator mode
'''
return self._ticket_dispatch_user(req)
def _ticket_review_bibref_check(self, req):
'''
Checks whether any of the transactions on the ticket need a review.
If so, prompts the user to select the right bibref.
'''
pinfo = session["personinfo"]
ticket = pinfo["ticket"]
if 'arxiv_name' in pinfo:
arxiv_name = [pinfo['arxiv_name']]
else:
arxiv_name = None
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
if ("bibref_check_required" in pinfo and pinfo["bibref_check_required"]
and "bibref_check_reviewed_bibrefs" in pinfo):
for rbibreft in pinfo["bibref_check_reviewed_bibrefs"]:
if not rbibreft.count("||") or not rbibreft.count(","):
continue
rpid, rbibref = rbibreft.split("||")
rrecid = rbibref.split(",")[1]
rpid = webapi.wash_integer_id(rpid)
for ticket_update in [row for row in ticket
if (row['bibref'] == str(rrecid) and
row['pid'] == rpid)]:
ticket_update["bibref"] = rbibref
if "incomplete" in ticket_update:
del(ticket_update["incomplete"])
for ticket_remove in [row for row in ticket
if ('incomplete' in row)]:
ticket.remove(ticket_remove)
if ("bibrefs_auto_assigned" in pinfo):
del(pinfo["bibrefs_auto_assigned"])
if ("bibrefs_to_confirm" in pinfo):
del(pinfo["bibrefs_to_confirm"])
del(pinfo["bibref_check_reviewed_bibrefs"])
pinfo["bibref_check_required"] = False
session.dirty = True
return ""
else:
bibrefs_auto_assigned = {}
bibrefs_to_confirm = {}
needs_review = []
# if ("bibrefs_auto_assigned" in pinfo
# and pinfo["bibrefs_auto_assigned"]):
# bibrefs_auto_assigned = pinfo["bibrefs_auto_assigned"]
#
# if ("bibrefs_to_confirm" in pinfo
# and pinfo["bibrefs_to_confirm"]):
# bibrefs_to_confirm = pinfo["bibrefs_to_confirm"]
for transaction in ticket:
if not webapi.is_valid_bibref(transaction['bibref']):
transaction['incomplete'] = True
needs_review.append(transaction)
if not needs_review:
pinfo["bibref_check_required"] = False
session.dirty = True
return ""
for transaction in needs_review:
recid = webapi.wash_integer_id(transaction['bibref'])
if recid < 0:
continue #this doesn't look like a recid--discard!
pid = transaction['pid']
if ((pid in bibrefs_auto_assigned
and 'bibrecs' in bibrefs_auto_assigned[pid]
and recid in bibrefs_auto_assigned[pid]['bibrecs'])
or
(pid in bibrefs_to_confirm
and 'bibrecs' in bibrefs_to_confirm[pid]
and recid in bibrefs_to_confirm[pid]['bibrecs'])):
continue # we already assessed those bibrefs.
fctptr = webapi.get_possible_bibrefs_from_pid_bibrec
bibrec_refs = fctptr(pid, [recid], additional_names=arxiv_name)
person_name = webapi.get_most_frequent_name_from_pid(pid, allow_none=True)
if not person_name:
if arxiv_name:
person_name = ''.join(arxiv_name)
else:
person_name = " "
for brr in bibrec_refs:
if len(brr[1]) == 1:
if pid not in bibrefs_auto_assigned:
bibrefs_auto_assigned[pid] = {
'person_name': person_name,
'canonical_id': "TBA",
'bibrecs': {brr[0]: brr[1]}}
else:
bibrefs_auto_assigned[pid]['bibrecs'][brr[0]] = brr[1]
else:
if not brr[1]:
tmp = webapi.get_bibrefs_from_bibrecs([brr[0]])
try:
brr[1] = tmp[0][1]
except IndexError:
continue # No bibrefs on record--discard
if pid not in bibrefs_to_confirm:
bibrefs_to_confirm[pid] = {
'person_name': person_name,
'canonical_id': "TBA",
'bibrecs': {brr[0]: brr[1]}}
else:
bibrefs_to_confirm[pid]['bibrecs'][brr[0]] = brr[1]
if bibrefs_to_confirm or bibrefs_auto_assigned:
pinfo["bibref_check_required"] = True
baa = deepcopy(bibrefs_auto_assigned)
btc = deepcopy(bibrefs_to_confirm)
for pid in baa:
for rid in baa[pid]['bibrecs']:
baa[pid]['bibrecs'][rid] = []
for pid in btc:
for rid in btc[pid]['bibrecs']:
btc[pid]['bibrecs'][rid] = []
pinfo["bibrefs_auto_assigned"] = baa
pinfo["bibrefs_to_confirm"] = btc
else:
pinfo["bibref_check_required"] = False
session.dirty = True
if 'external_first_entry' in pinfo and pinfo['external_first_entry']:
del(pinfo["external_first_entry"])
pinfo['external_first_entry_skip_review'] = True
session.dirty = True
return "" # don't bother the user the first time
body = TEMPLATE.tmpl_bibref_check(bibrefs_auto_assigned,
bibrefs_to_confirm)
body = TEMPLATE.tmpl_person_detail_layout(body)
metaheaderadd = self._scripts(kill_browser_cache=True)
title = _("Submit Attribution Information")
return page(title=title,
metaheaderadd=metaheaderadd,
body=body,
req=req,
language=ln)
def _ticket_final_review(self, req):
'''
Displays to the user what can and cannot finally be done, leaving the
option of removing some transactions from the ticket before the commit
'''
uid = getUid(req)
userinfo = collect_user_info(uid)
pinfo = session["personinfo"]
ulevel = pinfo["ulevel"]
ticket = pinfo["ticket"]
ticket = [row for row in ticket if not "execution_result" in row]
skip_checkout_page = True
upid = -1
user_first_name = ""
user_first_name_sys = False
user_last_name = ""
user_last_name_sys = False
user_email = ""
user_email_sys = False
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
if ("external_firstname" in userinfo
and userinfo["external_firstname"]):
user_first_name = userinfo["external_firstname"]
user_first_name_sys = True
elif "user_first_name" in pinfo and pinfo["user_first_name"]:
user_first_name = pinfo["user_first_name"]
if ("external_familyname" in userinfo
and userinfo["external_familyname"]):
user_last_name = userinfo["external_familyname"]
user_last_name_sys = True
elif "user_last_name" in pinfo and pinfo["user_last_name"]:
user_last_name = pinfo["user_last_name"]
if ("email" in userinfo
and not userinfo["email"] == "guest"):
user_email = userinfo["email"]
user_email_sys = True
elif "user_email" in pinfo and pinfo["user_email"]:
user_email = pinfo["user_email"]
pinfo["user_first_name"] = user_first_name
pinfo["user_first_name_sys"] = user_first_name_sys
pinfo["user_last_name"] = user_last_name
pinfo["user_last_name_sys"] = user_last_name_sys
pinfo["user_email"] = user_email
pinfo["user_email_sys"] = user_email_sys
if "upid" in pinfo and pinfo["upid"]:
upid = pinfo["upid"]
else:
dbpid = webapi.get_pid_from_uid(uid)
if dbpid and dbpid[1]:
if dbpid[0] and not dbpid[0] == -1:
upid = dbpid[0][0]
pinfo["upid"] = upid
session.dirty = True
if not (user_first_name or user_last_name or user_email):
skip_checkout_page = False
if [row for row in ticket
if row["status"] in ["denied", "warning_granted",
"warning_denied"]]:
skip_checkout_page = False
if 'external_first_entry_skip_review' in pinfo and pinfo['external_first_entry_skip_review']:
del(pinfo["external_first_entry_skip_review"])
skip_checkout_page = True
session.dirty = True
if (not ticket or skip_checkout_page
or ("checkout_confirmed" in pinfo
and pinfo["checkout_confirmed"]
and "checkout_faulty_fields" in pinfo
and not pinfo["checkout_faulty_fields"])):
self.adf['ticket_commit'][ulevel](req)
if "checkout_confirmed" in pinfo:
del(pinfo["checkout_confirmed"])
if "checkout_faulty_fields" in pinfo:
del(pinfo["checkout_faulty_fields"])
if "bibref_check_required" in pinfo:
del(pinfo["bibref_check_required"])
# if "user_ticket_comments" in pinfo:
# del(pinfo["user_ticket_comments"])
session.dirty = True
return self._ticket_dispatch_end(req)
for tt in list(ticket):
if 'bibref' not in tt or 'pid' not in tt:
del(ticket[tt])
continue
tt['authorname_rec'] = get_bibrefrec_name_string(tt['bibref'])
tt['person_name'] = webapi.get_most_frequent_name_from_pid(tt['pid'])
mark_yours = []
mark_not_yours = []
if upid >= 0:
mark_yours = [row for row in ticket
if (str(row["pid"]) == str(upid) and
row["action"] in ["to_other_person", "confirm"])]
mark_not_yours = [row for row in ticket
if (str(row["pid"]) == str(upid) and
row["action"] in ["repeal", "reset"])]
mark_theirs = [row for row in ticket
if ((not str(row["pid"]) == str(upid)) and
row["action"] in ["to_other_person", "confirm"])]
mark_not_theirs = [row for row in ticket
if ((not str(row["pid"]) == str(upid)) and
row["action"] in ["repeal", "reset"])]
session.dirty = True
body = TEMPLATE.tmpl_ticket_final_review(req, mark_yours,
mark_not_yours,
mark_theirs,
mark_not_theirs)
body = TEMPLATE.tmpl_person_detail_layout(body)
metaheaderadd = self._scripts(kill_browser_cache=True)
title = _("Please review your actions")
#body = body + '<pre>' + pformat(pinfo) + '</pre>'
return page(title=title,
metaheaderadd=metaheaderadd,
body=body,
req=req,
language=ln)
def _ticket_commit_admin(self, req):
'''
Actual execution of the ticket transactions
'''
self._clean_ticket(req)
uid = getUid(req)
pinfo = session["personinfo"]
ticket = pinfo["ticket"]
userinfo = {'uid-ip': "%s||%s" % (uid, req.remote_ip)}
if "user_ticket_comments" in pinfo:
userinfo['comments'] = pinfo["user_ticket_comments"]
if "user_first_name" in pinfo:
userinfo['firstname'] = pinfo["user_first_name"]
if "user_last_name" in pinfo:
userinfo['lastname'] = pinfo["user_last_name"]
if "user_email" in pinfo:
userinfo['email'] = pinfo["user_email"]
for t in ticket:
t['execution_result'] = webapi.execute_action(t['action'], t['pid'], t['bibref'], uid,
userinfo['uid-ip'], str(userinfo))
session.dirty = True
def _ticket_commit_user(self, req):
'''
Actual execution of the ticket transactions
'''
self._clean_ticket(req)
uid = getUid(req)
pinfo = session["personinfo"]
ticket = pinfo["ticket"]
ok_tickets = []
userinfo = {'uid-ip': "%s||%s" % (uid, req.remote_ip)}
if "user_ticket_comments" in pinfo:
userinfo['comments'] = pinfo["user_ticket_comments"]
if "user_first_name" in pinfo:
userinfo['firstname'] = pinfo["user_first_name"]
if "user_last_name" in pinfo:
userinfo['lastname'] = pinfo["user_last_name"]
if "user_email" in pinfo:
userinfo['email'] = pinfo["user_email"]
for t in list(ticket):
if t['status'] in ['granted', 'warning_granted']:
t['execution_result'] = webapi.execute_action(t['action'],
t['pid'], t['bibref'], uid,
userinfo['uid-ip'], str(userinfo))
ok_tickets.append(t)
ticket.remove(t)
if ticket:
webapi.create_request_ticket(userinfo, ticket)
if CFG_INSPIRE_SITE and ok_tickets:
webapi.send_user_commit_notification_email(userinfo, ok_tickets)
for t in ticket:
t['execution_result'] = [(True, ''),]
ticket[:] = ok_tickets
session.dirty = True
def _ticket_commit_guest(self, req):
'''
Actual execution of the ticket transactions
'''
self._clean_ticket(req)
pinfo = session["personinfo"]
uid = getUid(req)
userinfo = {'uid-ip': "userid: %s (from %s)" % (uid, req.remote_ip)}
if "user_ticket_comments" in pinfo:
if pinfo["user_ticket_comments"]:
userinfo['comments'] = pinfo["user_ticket_comments"]
else:
userinfo['comments'] = "No comments submitted."
if "user_first_name" in pinfo:
userinfo['firstname'] = pinfo["user_first_name"]
if "user_last_name" in pinfo:
userinfo['lastname'] = pinfo["user_last_name"]
if "user_email" in pinfo:
userinfo['email'] = pinfo["user_email"]
ticket = pinfo['ticket']
webapi.create_request_ticket(userinfo, ticket)
for t in ticket:
t['execution_result'] = [(True, ''),]
session.dirty = True
def _ticket_dispatch_end(self, req):
'''
The ticket dispatch is finished; redirect to the page of origin
or to the last_viewed_pid.
'''
pinfo = session["personinfo"]
if 'claim_in_process' in pinfo:
pinfo['claim_in_process'] = False
uinfo = collect_user_info(req)
uinfo['precached_viewclaimlink'] = True
uid = getUid(req)
set_user_preferences(uid, uinfo)
if "referer" in pinfo and pinfo["referer"]:
referer = pinfo["referer"]
del(pinfo["referer"])
session.dirty = True
return redirect_to_url(req, referer)
return redirect_to_url(req, "%s/person/%s?open_claim=True" % (CFG_SITE_URL,
webapi.get_person_redirect_link(
pinfo["claimpaper_admin_last_viewed_pid"])))
def _clean_ticket(self, req):
'''
Removes from the ticket all transactions that carry an execution_result flag.
'''
pinfo = session["personinfo"]
ticket = pinfo["ticket"]
for t in list(ticket):
if 'execution_result' in t:
ticket.remove(t)
session.dirty = True
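# Illustration (the ticket shape is assumed from surrounding usage, not part
# of the original code): after a commit pass, a ticket list such as
#     [{'pid': 7, 'bibref': '100:4,123', 'action': 'confirm',
#       'execution_result': [(True, '')]},
#      {'pid': 7, 'bibref': '700:2,456', 'action': 'repeal'}]
# would be reduced by _clean_ticket to the second entry only, since
# transactions carrying an 'execution_result' flag were already executed.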
def __get_user_role(self, req):
'''
Determines whether a user is guest, user or admin
'''
minrole = 'guest'
role = 'guest'
if not req:
return minrole
uid = getUid(req)
if not isinstance(uid, int):
return minrole
admin_role_id = acc_get_role_id(CLAIMPAPER_ADMIN_ROLE)
user_role_id = acc_get_role_id(CLAIMPAPER_USER_ROLE)
user_roles = acc_get_user_roles(uid)
if admin_role_id in user_roles:
role = 'admin'
elif user_role_id in user_roles:
role = 'user'
if role == 'guest' and webapi.is_external_user(uid):
role = 'user'
return role
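# Resolution order of the method above, sketched (role ids hypothetical): if
# acc_get_user_roles(uid) contains the CLAIMPAPER_ADMIN_ROLE id the result is
# 'admin'; else if it contains the CLAIMPAPER_USER_ROLE id, 'user'; a
# remaining 'guest' is still promoted to 'user' when
# webapi.is_external_user(uid) holds.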
def __user_is_authorized(self, req, action):
'''
Determines if a given user is authorized to perform a specified action
@param req: Apache Request Object
@type req: Apache Request Object
@param action: the action the user wants to perform
@type action: string
@return: True if user is allowed to perform the action, False if not
@rtype: boolean
'''
if not req:
return False
if not action:
return False
else:
action = escape(action)
uid = getUid(req)
if not isinstance(uid, int):
return False
if uid == 0:
return False
allowance = [i[1] for i in acc_find_user_role_actions({'uid': uid})
if i[1] == action]
if allowance:
return True
return False
def _scripts(self, kill_browser_cache=False):
'''
Returns html code to be included in the meta header of the html page.
The actual code is stored in the template.
@return: html formatted Javascript and CSS inclusions for the <head>
@rtype: string
'''
return TEMPLATE.tmpl_meta_includes(kill_browser_cache)
def _check_user_fields(self, req, form):
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'user_first_name': (str, None),
'user_last_name': (str, None),
'user_email': (str, None),
'user_comments': (str, None)})
pinfo = session["personinfo"]
ulevel = pinfo["ulevel"]
skip_checkout_faulty_fields = False
if ulevel in ['user', 'admin']:
skip_checkout_faulty_fields = True
if not ("user_first_name_sys" in pinfo and pinfo["user_first_name_sys"]):
if "user_first_name" in argd and argd['user_first_name']:
if not argd["user_first_name"] and not skip_checkout_faulty_fields:
pinfo["checkout_faulty_fields"].append("user_first_name")
else:
pinfo["user_first_name"] = escape(argd["user_first_name"])
if not ("user_last_name_sys" in pinfo and pinfo["user_last_name_sys"]):
if "user_last_name" in argd and argd['user_last_name']:
if not argd["user_last_name"] and not skip_checkout_faulty_fields:
pinfo["checkout_faulty_fields"].append("user_last_name")
else:
pinfo["user_last_name"] = escape(argd["user_last_name"])
if not ("user_email_sys" in pinfo and pinfo["user_email_sys"]):
if "user_email" in argd and argd['user_email']:
if (not argd["user_email"]
or not email_valid_p(argd["user_email"])):
pinfo["checkout_faulty_fields"].append("user_email")
else:
pinfo["user_email"] = escape(argd["user_email"])
if (ulevel == "guest"
and emailUnique(argd["user_email"]) > 0):
pinfo["checkout_faulty_fields"].append("user_email_taken")
if "user_comments" in argd:
if argd["user_comments"]:
pinfo["user_ticket_comments"] = escape(argd["user_comments"])
else:
pinfo["user_ticket_comments"] = ""
session.dirty = True
def action(self, req, form):
'''
Initial step in processing of requests: ticket generation/update.
Also acts as action dispatcher for interface mass action requests
Valid mass actions are:
- confirm: confirm assignments to a person
- repeal: repeal assignments from a person
- reset: reset assignments of a person
- cancel: clean the session (erase tickets and so on)
- to_other_person: assign a document from a person to another person
@param req: Apache Request Object
@type req: Apache Request Object
@param form: Parameters sent via GET or POST request
@type form: dict
@return: a full page formatted in HTML
@rtype: string
'''
self._session_bareinit(req)
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'pid': (int, None),
'confirm': (str, None),
'repeal': (str, None),
'reset': (str, None),
'cancel': (str, None),
'cancel_stage': (str, None),
'bibref_check_submit': (str, None),
'checkout': (str, None),
'checkout_continue_claiming': (str, None),
'checkout_submit': (str, None),
'checkout_remove_transaction': (str, None),
'to_other_person': (str, None),
'cancel_search_ticket': (str, None),
'user_first_name': (str, None),
'user_last_name': (str, None),
'user_email': (str, None),
'user_comments': (str, None),
'claim': (str, None),
'cancel_rt_ticket': (str, None),
'commit_rt_ticket': (str, None),
'rt_id': (int, None),
'rt_action': (str, None),
'selection': (list, []),
'set_canonical_name': (str, None),
'canonical_name': (str, None),
'add_missing_external_ids': (str, None),
'rewrite_all_external_ids': (str, None),
'delete_external_ids': (str, None),
'existing_ext_ids': (list, None),
'add_external_id': (str, None),
'ext_system': (str, None),
'ext_id': (str, None) })
ln = argd['ln']
# ln = wash_language(argd['ln'])
pid = None
action = None
bibrefs = None
uid = getUid(req)
pinfo = session["personinfo"]
ulevel = pinfo["ulevel"]
ticket = pinfo["ticket"]
tempticket = []
if not "ln" in pinfo:
pinfo["ln"] = ln
session.dirty = True
if 'confirm' in argd and argd['confirm']:
action = 'confirm'
elif 'repeal' in argd and argd['repeal']:
action = 'repeal'
elif 'reset' in argd and argd['reset']:
action = 'reset'
elif 'bibref_check_submit' in argd and argd['bibref_check_submit']:
action = 'bibref_check_submit'
elif 'cancel' in argd and argd['cancel']:
action = 'cancel'
elif 'cancel_stage' in argd and argd['cancel_stage']:
action = 'cancel_stage'
elif 'cancel_search_ticket' in argd and argd['cancel_search_ticket']:
action = 'cancel_search_ticket'
elif 'checkout' in argd and argd['checkout']:
action = 'checkout'
elif 'checkout_submit' in argd and argd['checkout_submit']:
action = 'checkout_submit'
elif ('checkout_remove_transaction' in argd
and argd['checkout_remove_transaction']):
action = 'checkout_remove_transaction'
elif ('checkout_continue_claiming' in argd
and argd['checkout_continue_claiming']):
action = "checkout_continue_claiming"
elif 'cancel_rt_ticket' in argd and argd['cancel_rt_ticket']:
action = 'cancel_rt_ticket'
elif 'commit_rt_ticket' in argd and argd['commit_rt_ticket']:
action = 'commit_rt_ticket'
elif 'to_other_person' in argd and argd['to_other_person']:
action = 'to_other_person'
elif 'claim' in argd and argd['claim']:
action = 'claim'
elif 'set_canonical_name' in argd and argd['set_canonical_name']:
action = 'set_canonical_name'
elif 'add_missing_external_ids' in argd and argd['add_missing_external_ids']:
action = 'add_missing_external_ids'
elif 'rewrite_all_external_ids' in argd and argd['rewrite_all_external_ids']:
action = 'rewrite_all_external_ids'
elif 'delete_external_ids' in argd and argd['delete_external_ids']:
action = 'delete_external_ids'
elif 'add_external_id' in argd and argd['add_external_id']:
action = 'add_external_id'
no_access = self._page_access_permission_wall(req, pid)
if no_access and action not in ["claim"]:
return no_access
if action in ['to_other_person', 'claim']:
if 'selection' in argd and len(argd['selection']) > 0:
bibrefs = argd['selection']
else:
return self._error_page(req, ln,
"Fatal: cannot create ticket without any bibrefrec")
if action == 'claim':
return self._ticket_open_claim(req, bibrefs, ln)
else:
return self._ticket_open_assign_to_other_person(req, bibrefs, form)
if action in ["cancel_stage"]:
if 'bibref_check_required' in pinfo:
del(pinfo['bibref_check_required'])
if 'bibrefs_auto_assigned' in pinfo:
del(pinfo['bibrefs_auto_assigned'])
if 'bibrefs_to_confirm' in pinfo:
del(pinfo['bibrefs_to_confirm'])
for tt in [row for row in ticket if 'incomplete' in row]:
ticket.remove(tt)
session.dirty = True
return self._ticket_dispatch_end(req)
if action in ["checkout_submit"]:
pinfo["checkout_faulty_fields"] = []
self._check_user_fields(req, form)
if not ticket:
pinfo["checkout_faulty_fields"].append("tickets")
if pinfo["checkout_faulty_fields"]:
pinfo["checkout_confirmed"] = False
else:
pinfo["checkout_confirmed"] = True
session.dirty = True
return self.adf['ticket_dispatch'][ulevel](req)
#return self._ticket_final_review(req)
if action in ["checkout_remove_transaction"]:
bibref = argd['checkout_remove_transaction']
if webapi.is_valid_bibref(bibref):
for rmt in [row for row in ticket
if row["bibref"] == bibref]:
ticket.remove(rmt)
pinfo["checkout_confirmed"] = False
session.dirty = True
return self.adf['ticket_dispatch'][ulevel](req)
#return self._ticket_final_review(req)
if action in ["checkout_continue_claiming"]:
pinfo["checkout_faulty_fields"] = []
self._check_user_fields(req, form)
return self._ticket_dispatch_end(req)
if (action in ['bibref_check_submit']
or (not action
and "bibref_check_required" in pinfo
and pinfo["bibref_check_required"])):
if action not in ['bibref_check_submit']:
if "bibref_check_reviewed_bibrefs" in pinfo:
del(pinfo["bibref_check_reviewed_bibrefs"])
session.dirty = True
return self.adf['ticket_dispatch'][ulevel](req)
pinfo["bibref_check_reviewed_bibrefs"] = []
add_rev = pinfo["bibref_check_reviewed_bibrefs"].append
if ("bibrefs_auto_assigned" in pinfo
or "bibrefs_to_confirm" in pinfo):
person_reviews = []
if ("bibrefs_auto_assigned" in pinfo
and pinfo["bibrefs_auto_assigned"]):
person_reviews.append(pinfo["bibrefs_auto_assigned"])
if ("bibrefs_to_confirm" in pinfo
and pinfo["bibrefs_to_confirm"]):
person_reviews.append(pinfo["bibrefs_to_confirm"])
for ref_review in person_reviews:
for person_id in ref_review:
for bibrec in ref_review[person_id]["bibrecs"]:
rec_grp = "bibrecgroup%s" % bibrec
elements = []
if rec_grp in form:
if isinstance(form[rec_grp], str):
elements.append(form[rec_grp])
elif isinstance(form[rec_grp], list):
elements += form[rec_grp]
else:
continue
for element in elements:
test = element.split("||")
if test and len(test) > 1 and test[1]:
tref = test[1] + "," + str(bibrec)
tpid = webapi.wash_integer_id(test[0])
if (webapi.is_valid_bibref(tref) and
tpid > -1):
add_rev(element + "," + str(bibrec))
session.dirty = True
return self.adf['ticket_dispatch'][ulevel](req)
if not action:
return self._error_page(req, ln,
"Fatal: cannot create ticket if no action selected.")
if action in ['confirm', 'repeal', 'reset']:
if 'pid' in argd:
pid = argd['pid']
else:
return self._error_page(req, ln,
"Fatal: cannot create ticket without a person id!")
if 'selection' in argd and len(argd['selection']) > 0:
bibrefs = argd['selection']
else:
if pid == -3:
return self._error_page(req, ln,
"Fatal: Please select a paper to assign to the new person first!")
else:
return self._error_page(req, ln,
"Fatal: cannot create ticket without any paper selected!")
if 'rt_id' in argd and argd['rt_id']:
rt_id = argd['rt_id']
for b in bibrefs:
self._cancel_transaction_from_rt_ticket(rt_id, pid, action, b)
#create temporary ticket
if pid == -3:
pid = webapi.create_new_person(uid)
for bibref in bibrefs:
tempticket.append({'pid': pid, 'bibref': bibref, 'action': action})
#check if ticket targets (bibref for pid) are already in ticket
for t in tempticket:
for e in list(ticket):
if e['bibref'] == t['bibref']:
ticket.remove(e)
ticket.append(t)
if 'search_ticket' in pinfo:
del(pinfo['search_ticket'])
#start ticket processing chain
pinfo["claimpaper_admin_last_viewed_pid"] = pid
session.dirty = True
return self.adf['ticket_dispatch'][ulevel](req)
# return self.perform(req, form)
elif action in ['cancel']:
self.__session_cleanup(req)
# return self._error_page(req, ln,
# "Not an error! Session cleaned! but "
# "redirect to be implemented")
return self._ticket_dispatch_end(req)
elif action in ['cancel_search_ticket']:
if 'search_ticket' in pinfo:
del(pinfo['search_ticket'])
session.dirty = True
if "claimpaper_admin_last_viewed_pid" in pinfo:
pid = pinfo["claimpaper_admin_last_viewed_pid"]
return redirect_to_url(req, "/person/%s" % webapi.get_person_redirect_link(pid))
return self.search(req, form)
elif action in ['checkout']:
return self.adf['ticket_dispatch'][ulevel](req)
#return self._ticket_final_review(req)
elif action in ['cancel_rt_ticket', 'commit_rt_ticket']:
if 'selection' in argd and len(argd['selection']) > 0:
bibref = argd['selection']
else:
return self._error_page(req, ln,
"Fatal: cannot cancel unknown ticket")
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln,
"Fatal: cannot cancel unknown ticket")
if action == 'cancel_rt_ticket':
if 'rt_id' in argd and argd['rt_id'] and 'rt_action' in argd and argd['rt_action']:
rt_id = argd['rt_id']
rt_action = argd['rt_action']
if 'selection' in argd and len(argd['selection']) > 0:
bibrefs = argd['selection']
else:
return self._error_page(req, ln,
"Fatal: no bibref")
for b in bibrefs:
self._cancel_transaction_from_rt_ticket(rt_id, pid, rt_action, b)
return redirect_to_url(req, "/person/%s" % webapi.get_person_redirect_link(pid))
return self._cancel_rt_ticket(req, bibref[0], pid)
elif action == 'commit_rt_ticket':
return self._commit_rt_ticket(req, bibref[0], pid)
elif action == 'set_canonical_name':
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln,
"Fatal: cannot set canonical name to unknown person")
if 'canonical_name' in argd and argd['canonical_name']:
cname = argd['canonical_name']
else:
return self._error_page(req, ln,
"Fatal: cannot set a custom canonical name without a suggestion")
uid = getUid(req)
userinfo = "%s||%s" % (uid, req.remote_ip)
webapi.update_person_canonical_name(pid, cname, userinfo)
return redirect_to_url(req, "/person/%s%s" % (webapi.get_person_redirect_link(pid), '#tabData'))
elif action == 'add_missing_external_ids':
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln, "Fatal: cannot recompute external ids for an unknown person")
update_personID_external_ids([pid], overwrite=False)
return redirect_to_url(req, "/person/%s%s" % (webapi.get_person_redirect_link(pid), '#tabData'))
elif action == 'rewrite_all_external_ids':
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln, "Fatal: cannot recompute external ids for an unknown person")
update_personID_external_ids([pid], overwrite=True)
return redirect_to_url(req, "/person/%s%s" % (webapi.get_person_redirect_link(pid), '#tabData'))
elif action == 'delete_external_ids':
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln, "Fatal: cannot delete external ids from an unknown person")
if 'existing_ext_ids' in argd and argd['existing_ext_ids']:
existing_ext_ids = argd['existing_ext_ids']
else:
return self._error_page(req, ln, "Fatal: you must select at least one external id in order to delete it!")
uid = getUid(req)
userinfo = "%s||%s" % (uid, req.remote_ip)
webapi.delete_person_external_ids(pid, existing_ext_ids, userinfo)
return redirect_to_url(req, "/person/%s%s" % (webapi.get_person_redirect_link(pid), '#tabData'))
elif action == 'add_external_id':
if 'pid' in argd and argd['pid'] > -1:
pid = argd['pid']
else:
return self._error_page(req, ln, "Fatal: cannot add external id to unknown person")
if 'ext_system' in argd and argd['ext_system']:
ext_sys = argd['ext_system']
else:
return self._error_page(req, ln, "Fatal: cannot add an external id without specifying the system")
if 'ext_id' in argd and argd['ext_id']:
ext_id = argd['ext_id']
else:
return self._error_page(req, ln, "Fatal: cannot add a custom external id without a suggestion")
uid = getUid(req)
userinfo = "%s||%s" % (uid, req.remote_ip)
webapi.add_person_external_id(pid, ext_sys, ext_id, userinfo)
return redirect_to_url(req, "/person/%s%s" % (webapi.get_person_redirect_link(pid), '#tabData'))
else:
return self._error_page(req, ln,
"Fatal: What were I supposed to do?")
def _ticket_open_claim(self, req, bibrefs, ln):
'''
Generate page to let user choose how to proceed
@param req: Apache Request Object
@type req: Apache Request Object
@param bibrefs: list of record IDs to perform an action on
@type bibrefs: list of int
@param ln: language to display the page in
@type ln: string
'''
uid = getUid(req)
uinfo = collect_user_info(req)
pinfo = session["personinfo"]
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
no_access = self._page_access_permission_wall(req)
session.dirty = True
pid = -1
search_enabled = True
if not no_access and uinfo["precached_usepaperclaim"]:
tpid = webapi.get_pid_from_uid(uid)
if tpid and tpid[0] and tpid[1] and tpid[0][0]:
pid = tpid[0][0]
if (not no_access
and "claimpaper_admin_last_viewed_pid" in pinfo
and pinfo["claimpaper_admin_last_viewed_pid"]):
names = webapi.get_person_names_from_id(pinfo["claimpaper_admin_last_viewed_pid"])
names = sorted([i for i in names], key=lambda k: k[1], reverse=True)
if len(names) > 0:
if len(names[0]) > 0:
last_viewed_pid = [pinfo["claimpaper_admin_last_viewed_pid"], names[0][0]]
else:
last_viewed_pid = False
else:
last_viewed_pid = False
else:
last_viewed_pid = False
if no_access:
search_enabled = False
pinfo["referer"] = uinfo["referer"]
session.dirty = True
body = TEMPLATE.tmpl_open_claim(bibrefs, pid, last_viewed_pid,
search_enabled=search_enabled)
body = TEMPLATE.tmpl_person_detail_layout(body)
title = _('Claim this paper')
metaheaderadd = self._scripts(kill_browser_cache=True)
return page(title=title,
metaheaderadd=metaheaderadd,
body=body,
req=req,
language=ln)
def _ticket_open_assign_to_other_person(self, req, bibrefs, form):
'''
Initializes search to find a person to attach the selected records to
@param req: Apache request object
@type req: Apache request object
@param bibrefs: list of record IDs to consider
@type bibrefs: list of int
@param form: GET/POST request parameters
@type form: dict
'''
pinfo = session["personinfo"]
pinfo["search_ticket"] = dict()
search_ticket = pinfo["search_ticket"]
search_ticket['action'] = 'confirm'
search_ticket['bibrefs'] = bibrefs
session.dirty = True
return self.search(req, form)
def comments(self, req, form):
return ""
def _cancel_rt_ticket(self, req, tid, pid):
'''
Deletes an RT ticket.
'''
webapi.delete_request_ticket(pid, tid)
return redirect_to_url(req, "/person/%s" %
webapi.get_person_redirect_link(str(pid)))
def _cancel_transaction_from_rt_ticket(self, tid, pid, action, bibref):
'''
Deletes a transaction from an RT ticket.
'''
webapi.delete_transaction_from_request_ticket(pid, tid, action, bibref)
def _commit_rt_ticket(self, req, bibref, pid):
'''
Commits an RT ticket: creates a real ticket and commits it.
'''
pinfo = session["personinfo"]
ulevel = pinfo["ulevel"]
ticket = pinfo["ticket"]
open_rt_tickets = webapi.get_person_request_ticket(pid)
tic = [a for a in open_rt_tickets if str(a[1]) == str(bibref)]
if len(tic) > 0:
tic = tic[0][0]
#create temporary ticket
tempticket = []
for t in tic:
if t[0] in ['confirm', 'repeal']:
tempticket.append({'pid': pid, 'bibref': t[1], 'action': t[0]})
#check if ticket targets (bibref for pid) are already in ticket
for t in tempticket:
for e in list(ticket):
if e['pid'] == t['pid'] and e['bibref'] == t['bibref']:
ticket.remove(e)
ticket.append(t)
session.dirty = True
#start ticket processing chain
webapi.delete_request_ticket(pid, bibref)
return self.adf['ticket_dispatch'][ulevel](req)
def _error_page(self, req, ln=CFG_SITE_LANG, message=None, intro=True):
'''
Create a page that contains a message explaining the error.
@param req: Apache Request Object
@type req: Apache Request Object
@param ln: language
@type ln: string
@param message: message to be displayed
@type message: string
'''
body = []
_ = gettext_set_language(ln)
if not message:
message = "No further explanation available. Sorry."
if intro:
body.append(_("<p>We're sorry. An error occurred while "
"handling your request. Please find more information "
"below:</p>"))
body.append("<p><strong>%s</strong></p>" % message)
return page(title=_("Notice"),
body="\n".join(body),
description="%s - Internal Error" % CFG_SITE_NAME,
keywords="%s, Internal Error" % CFG_SITE_NAME,
language=ln,
req=req)
def __session_cleanup(self, req):
'''
Removes all bibauthorid-specific settings from the session and
thereby cancels any transaction currently in progress.
@param req: Apache Request Object
@type req: Apache Request Object
'''
try:
pinfo = session["personinfo"]
except KeyError:
return
if "ticket" in pinfo:
pinfo['ticket'] = []
if "search_ticket" in pinfo:
pinfo['search_ticket'] = dict()
# clear up bibref checker if it's done.
if ("bibref_check_required" in pinfo
and not pinfo["bibref_check_required"]):
if 'bibrefs_to_confirm' in pinfo:
del(pinfo['bibrefs_to_confirm'])
if "bibrefs_auto_assigned" in pinfo:
del(pinfo["bibrefs_auto_assigned"])
del(pinfo["bibref_check_required"])
if "checkout_confirmed" in pinfo:
del(pinfo["checkout_confirmed"])
if "checkout_faulty_fields" in pinfo:
del(pinfo["checkout_faulty_fields"])
#pinfo['ulevel'] = ulevel
# pinfo["claimpaper_admin_last_viewed_pid"] = -1
pinfo["admin_requested_ticket_id"] = -1
session.dirty = True
def _generate_search_ticket_box(self, req):
'''
Generate the search ticket to remember a pending search for Person
entities in an attribution process
@param req: Apache request object
@type req: Apache request object
'''
pinfo = session["personinfo"]
search_ticket = None
if 'ln' in pinfo:
ln = pinfo["ln"]
else:
ln = CFG_SITE_LANG
_ = gettext_set_language(ln)
if 'search_ticket' in pinfo:
search_ticket = pinfo['search_ticket']
if not search_ticket:
return ''
else:
return TEMPLATE.tmpl_search_ticket_box('person_search', 'assign_papers', search_ticket['bibrefs'])
def search(self, req, form, is_fallback=False, fallback_query='', fallback_title='', fallback_message=''):
'''
Searches for a person based on the name supplied in the query.
@param req: Apache Request Object
@type req: Apache Request Object
@param form: Parameters sent via GET or POST request
@type form: dict
@return: a full page formatted in HTML
@rtype: string
'''
self._session_bareinit(req)
no_access = self._page_access_permission_wall(req)
new_person_link = False
if no_access:
return no_access
pinfo = session["personinfo"]
search_ticket = None
if 'search_ticket' in pinfo:
search_ticket = pinfo['search_ticket']
if "ulevel" in pinfo:
if pinfo["ulevel"] == "admin":
new_person_link = True
body = ''
if search_ticket:
body = body + self._generate_search_ticket_box(req)
max_num_show_papers = 5
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0),
'q': (str, None)})
ln = argd['ln']
# ln = wash_language(argd['ln'])
query = None
recid = None
nquery = None
search_results = None
title = "Person Search"
if 'q' in argd:
if argd['q']:
query = escape(argd['q'])
if is_fallback and fallback_query:
query = fallback_query
if query:
authors = []
if query.count(":"):
try:
left, right = query.split(":")
try:
recid = int(left)
nquery = str(right)
except (ValueError, TypeError):
try:
recid = int(right)
nquery = str(left)
except (ValueError, TypeError):
recid = None
nquery = query
except ValueError:
recid = None
nquery = query
else:
nquery = query
sorted_results = webapi.search_person_ids_by_name(nquery)
for index, results in enumerate(sorted_results):
pid = results[0]
# authorpapers = webapi.get_papers_by_person_id(pid, -1)
# authorpapers = sorted(authorpapers, key=itemgetter(0),
# reverse=True)
if index < PERSON_SEARCH_RESULTS_SHOW_PAPERS_PERSON_LIMIT:
# We are no longer sorting by date because of the huge impact this has
# on the system.
# The sorting is now done per record ID.
# authorpapers = [[paper] for paper in
# sort_records(None, [i[0] for i in
# webapi.get_papers_by_person_id(pid, -1)],
# sort_field="year", sort_order="a")]
authorpapers = sorted([[p[0]] for p in webapi.get_papers_by_person_id(pid, -1)],
key=itemgetter(0))
else:
authorpapers = [['Not retrieved to increase performance.']]
if (recid and
not (str(recid) in [row[0] for row in authorpapers])):
continue
authors.append([results[0], results[1],
authorpapers[0:max_num_show_papers], len(authorpapers)])
search_results = authors
if recid and (len(search_results) == 1) and not is_fallback:
return redirect_to_url(req, "/person/%s" % search_results[0][0])
body = body + TEMPLATE.tmpl_author_search(query, search_results, search_ticket, author_pages_mode=True, fallback_mode=is_fallback,
fallback_title=fallback_title, fallback_message=fallback_message, new_person_link=new_person_link)
if not is_fallback:
body = TEMPLATE.tmpl_person_detail_layout(body)
return page(title=title,
metaheaderadd=self._scripts(kill_browser_cache=True),
body=body,
req=req,
language=ln)
def claimstub(self, req, form):
'''
Generate stub page before claiming process
@param req: Apache request object
@type req: Apache request object
@param form: GET/POST request params
@type form: dict
'''
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'person': (str, '')})
ln = argd['ln']
# ln = wash_language(argd['ln'])
_ = gettext_set_language(ln)
person = '-1'
if 'person' in argd and argd['person']:
person = argd['person']
try:
pinfo = session["personinfo"]
if pinfo['ulevel'] == 'admin':
return redirect_to_url(req, '%s/person/%s?open_claim=True' % (CFG_SITE_URL, person))
except KeyError:
pass
if BIBAUTHORID_UI_SKIP_ARXIV_STUB_PAGE:
return redirect_to_url(req, '%s/person/%s?open_claim=True' % (CFG_SITE_URL, person))
body = TEMPLATE.tmpl_claim_stub(person)
pstr = 'Person ID missing or invalid'
if person != '-1':
pstr = person
title = _('You are going to claim papers for: %s') % pstr
return page(title=title,
metaheaderadd=self._scripts(kill_browser_cache=True),
body=body,
req=req,
language=ln)
def welcome(self, req, form):
'''
Generate SSO landing/welcome page
@param req: Apache request object
@type req: Apache request object
@param form: GET/POST request params
@type form: dict
'''
uid = getUid(req)
self._session_bareinit(req)
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'chosen_profile': (int, None)})
ln = argd['ln']
chosen_profile = argd['chosen_profile']
# ln = wash_language(argd['ln'])
_ = gettext_set_language(ln)
if uid == 0:
return page_not_authorized(req, text=_("This page is not accessible directly."))
title_message = _('Welcome!')
# start continuous writing to the browser...
req.content_type = "text/html"
req.send_http_header()
ssl_param = 0
if req.is_https():
ssl_param = 1
req.write(pageheaderonly(req=req, title=title_message,
language=ln, secure_page_p=ssl_param))
req.write(TEMPLATE.tmpl_welcome_start())
body = ""
if CFG_INSPIRE_SITE:
body = TEMPLATE.tmpl_welcome_arxiv()
else:
body = TEMPLATE.tmpl_welcome()
req.write(body)
req.write("USERID: %s " % str(uid))
if chosen_profile is None:
archive_user_info = webapi.arxiv_login(req)
elif chosen_profile == -1:
archive_user_info = webapi.arxiv_login(req, -1)
else:
archive_user_info = webapi.arxiv_login(req, chosen_profile)
# now do what will take time...
#req.write("ARXIV USER INFO: %s" % str(archive_user_info))
if archive_user_info[0] == "pid": ##### differentiate the two cases
pid = archive_user_info[1]
elif archive_user_info[0] == "chosen pid not available":
pid = archive_user_info[1]
req.write(TEMPLATE.tmpl_profile_not_available())
elif archive_user_info[0] == "pid assigned by user":
pid = archive_user_info[1]
req.write(TEMPLATE.tmpl_profile_assigned_by_user())
else:
req.write(TEMPLATE.tmpl_claim_profile())
req.write(TEMPLATE.tmpl_profile_option(archive_user_info[1]))
req.write(pagefooteronly(req=req))
return
# session must be read after webapi.arxiv_login has done its work
pinfo = session["personinfo"]
pinfo["claimpaper_admin_last_viewed_pid"] = pid
session.dirty = True
link = TEMPLATE.tmpl_welcome_link()
req.write(link)
req.write("<br><br>")
uinfo = collect_user_info(req)
arxivp = []
if 'external_arxivids' in uinfo and uinfo['external_arxivids']:
try:
for i in uinfo['external_arxivids'].split(';'):
arxivp.append(i)
except (IndexError, KeyError):
pass
req.write(TEMPLATE.tmpl_welcome_personid_association(pid))
req.write(TEMPLATE.tmpl_welcome_arXiv_papers(arxivp))
if CFG_INSPIRE_SITE:
#log arXiv logins, for debugging purposes
dbg = ('uinfo= ' + str(uinfo) + '\npinfo= ' + str(pinfo) + '\nreq= ' + str(req)
+ '\nsession= ' + str(session))
userinfo = "%s||%s" % (uid, req.remote_ip)
webapi.insert_log(userinfo, pid, 'arXiv_login', 'dbg', '', comment=dbg)
req.write(TEMPLATE.tmpl_welcome_end())
req.write(pagefooteronly(req=req))
def tickets_admin(self, req, form):
'''
Generate the admin page listing open RT tickets
@param req: Apache request object
@type req: Apache request object
@param form: GET/POST request params
@type form: dict
'''
self._session_bareinit(req)
no_access = self._page_access_permission_wall(req, req_level='admin')
if no_access:
return no_access
tickets = webapi.get_persons_with_open_tickets_list()
tickets = [[webapi.get_most_frequent_name_from_pid(int(t[0])),
webapi.get_person_redirect_link(t[0]), t[0], t[1]]
for t in tickets]
body = TEMPLATE.tmpl_tickets_admin(tickets)
body = TEMPLATE.tmpl_person_detail_layout(body)
title = 'Open RT tickets'
return page(title=title,
metaheaderadd=self._scripts(),
body=body,
req=req)
def export(self, req, form):
'''
Generate JSONized export of Person data
@param req: Apache request object
@type req: Apache request object
@param form: GET/POST request params
@type form: dict
'''
argd = wash_urlargd(
form,
{'ln': (str, CFG_SITE_LANG),
'request': (str, None),
'userid': (str, None)})
if not CFG_JSON_AVAILABLE:
return "500_json_not_found__install_package"
request = None
userid = None
if "userid" in argd and argd['userid']:
userid = argd['userid']
else:
return "404_user_not_found"
if "request" in argd and argd['request']:
request = argd["request"]
# find user from ID
user_email = get_email_from_username(userid)
if user_email == userid:
return "404_user_not_found"
uid = get_uid_from_email(user_email)
uinfo = collect_user_info(uid)
# find person by uid
pid = webapi.get_pid_from_uid(uid)
# find papers by pid that are confirmed by a human.
papers = webapi.get_papers_by_person_id(pid, 2)
# filter by request param, e.g. arxiv
if not request:
return "404__no_filter_selected"
if request not in VALID_EXPORT_FILTERS:
return "500_filter_invalid"
if request == "arxiv":
query = "(recid:"
query += " OR recid:".join(papers)
query += ") AND 037:arxiv"
db_docs = perform_request_search(p=query, rg=0)
nickmail = ""
nickname = ""
db_arxiv_ids = []
try:
nickname = uinfo["nickname"]
except KeyError:
pass
if not nickname:
try:
nickmail = uinfo["email"]
except KeyError:
nickmail = user_email
nickname = nickmail
db_arxiv_ids = get_fieldvalues(db_docs, "037__a")
construct = {"nickname": nickname,
"claims": ";".join(db_arxiv_ids)}
jsondmp = json.dumps(construct)
signature = webapi.sign_assertion("arXiv", jsondmp)
construct["digest"] = signature
return json.dumps(construct)
index = __call__
me = welcome
you = welcome
# pylint: enable=C0301
# pylint: enable=W0613
diff --git a/invenio/legacy/bibauthorid/wedge.py b/invenio/legacy/bibauthorid/wedge.py
index beb65ff4a..9c17f458d 100644
--- a/invenio/legacy/bibauthorid/wedge.py
+++ b/invenio/legacy/bibauthorid/wedge.py
@@ -1,430 +1,430 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
from itertools import izip, starmap
from operator import mul
from invenio.bibauthorid_backinterface import Bib_matrix
-from invenio.bibauthorid_general_utils import update_status \
+from invenio.legacy.bibauthorid.general_utils import update_status \
, update_status_final \
, bibauthor_print \
, wedge_print
from invenio.bibauthorid_prob_matrix import ProbabilityMatrix
import numpy
#import cPickle as SER
import msgpack as SER
import gc
SP_NUMBERS = Bib_matrix.special_numbers
SP_SYMBOLS = Bib_matrix.special_symbols
SP_CONFIRM = Bib_matrix.special_symbols['+']
SP_QUARREL = Bib_matrix.special_symbols['-']
eps = 0.01
edge_cut_prob = ''
wedge_thrsh = ''
import os
PID = lambda : str(os.getpid())
def wedge(cluster_set, report_cluster_status=False, force_wedge_thrsh=False):
# The lower bound of the edges being processed by the wedge algorithm.
global edge_cut_prob
global wedge_thrsh
if not force_wedge_thrsh:
edge_cut_prob = bconfig.WEDGE_THRESHOLD / 3.
wedge_thrsh = bconfig.WEDGE_THRESHOLD
else:
edge_cut_prob = force_wedge_thrsh / 3.
wedge_thrsh = force_wedge_thrsh
matr = ProbabilityMatrix()
matr.load(cluster_set.last_name)
convert_cluster_set(cluster_set, matr)
del matr # be sure that this is the last reference!
do_wedge(cluster_set)
report = []
if bconfig.DEBUG_WEDGE_PRINT_FINAL_CLUSTER_COMPATIBILITIES or report_cluster_status:
msg = []
for cl1 in cluster_set.clusters:
for cl2 in cluster_set.clusters:
if cl2 > cl1:
id1 = cluster_set.clusters.index(cl1)
id2 = cluster_set.clusters.index(cl2)
c12 = _compare_to(cl1,cl2)
c21 = _compare_to(cl2,cl1)
report.append((id1,id2,c12+c21))
msg.append( ' %s vs %s : %s + %s = %s -- %s' % (id1, id2, c12, c21, c12+c21, cl1.hates(cl2)))
msg = 'Wedge final clusters for %s: \n' % str(wedge_thrsh) + '\n'.join(msg)
if not bconfig.DEBUG_WEDGE_OUTPUT and bconfig.DEBUG_WEDGE_PRINT_FINAL_CLUSTER_COMPATIBILITIES:
print
print msg
print
wedge_print(msg)
restore_cluster_set(cluster_set)
if bconfig.DEBUG_CHECKS:
assert cluster_set._debug_test_hate_relation()
assert cluster_set._debug_duplicated_recs()
if report_cluster_status:
destfile = '/tmp/baistats/cluster_status_report_pid_%s_lastname_%s_thrsh_%s' % (str(PID()),str(cluster_set.last_name),str(wedge_thrsh))
f = open(destfile, 'w')
SER.dump([wedge_thrsh,cluster_set.last_name,report,cluster_set.num_all_bibs],f)
f.close()
gc.collect()
def _decide(cl1, cl2):
score1 = _compare_to(cl1, cl2)
score2 = _compare_to(cl2, cl1)
s = score1 + score2
wedge_print("Wedge: _decide (%f+%f) = %f cmp to %f" % (score1,score2,s,wedge_thrsh))
return s > wedge_thrsh, s
def _compare_to(cl1, cl2):
pointers = [cl1.out_edges[v] for v in cl2.bibs]
assert pointers, PID()+"Wedge: no edges between clusters!"
vals, probs = zip(*pointers)
wedge_print("Wedge: _compare_to: vals = %s, probs = %s" % (str(vals), str(probs)))
if SP_QUARREL in vals:
ret = 0.
wedge_print('Wedge: _compare_to: - edge present, returning 0')
elif SP_CONFIRM in vals:
ret = 0.5
wedge_print('Wedge: _compare_to: + edge present, returning 0.5')
else:
avg = sum(vals) / len(vals)
if avg > eps:
nvals = [(val / avg) ** prob for val, prob in pointers]
else:
wedge_print("Wedge: _compare_to: vals too low to compare, skipping")
return 0
coeff = _gini(nvals)
weight = sum(starmap(mul, pointers)) / sum(probs)
ret = (coeff * weight) / 2.
assert ret <= 0.5, PID()+'COMPARE_TO big value returned ret %s coeff %s weight %s nvals %s vals %s prob %s' % (ret, coeff, weight, nvals, vals, probs)
wedge_print("Wedge: _compare_to: coeff = %f, weight = %f, retval = %f" % (coeff, weight, ret))
return ret
def _gini(arr):
arr = sorted(arr, reverse=True)
dividend = sum(starmap(mul, izip(arr, xrange(1, 2 * len(arr), 2))))
divisor = len(arr) * sum(arr)
return float(dividend) / divisor
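The `_gini` helper is a compact concentration measure over the normalized edge values: the array is sorted in descending order and the i-th element is weighted by the odd number 2i + 1, so a uniform array scores 1.0 and a fully concentrated one scores 1/n. A standalone Python 3 re-statement for illustration (replacing `izip`/`xrange` with the built-ins):

```python
from itertools import starmap
from operator import mul

def gini(arr):
    # Sort descending; weight the i-th element by the odd number 2*i + 1.
    arr = sorted(arr, reverse=True)
    dividend = sum(starmap(mul, zip(arr, range(1, 2 * len(arr), 2))))
    divisor = len(arr) * sum(arr)
    return float(dividend) / divisor
```

For example, `gini([1, 1, 1, 1])` gives 1.0, while `gini([1, 0, 0, 0])` gives 0.25.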
def _compare_to_final_bounds(score1, score2):
return score1 + score2 > bconfig.WEDGE_THRESHOLD
def _edge_sorting(edge):
'''
probability + certainty / 10
'''
return edge[2][0] + edge[2][1] / 10.
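The `_edge_sorting` key orders value edges primarily by probability, with certainty acting as a tie-breaker at one-tenth weight, so `do_wedge` processes the most trustworthy edges first. A small standalone sketch of how it drives the sort (the edge tuples here are made-up illustration data):

```python
def edge_sorting(edge):
    # edge = (vertex1, vertex2, (probability, certainty))
    # Primary key: probability; certainty breaks ties at 1/10 weight.
    return edge[2][0] + edge[2][1] / 10.0

edges = [(0, 1, (0.9, 0.2)), (0, 2, (0.5, 1.0)), (1, 2, (0.9, 0.8))]
# Keys: 0.92, 0.60, 0.98 -- the two 0.9-probability edges are
# separated by their certainties.
edges_sorted = sorted(edges, key=edge_sorting, reverse=True)
```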
def do_wedge(cluster_set, deep_debug=False):
'''
Rearranges the cluster_set according to the values in the probability_matrix.
The deep debug option will produce a lot of output. Avoid using it with more
than 20 bibs in the cluster set.
'''
bib_map = create_bib_2_cluster_dict(cluster_set)
plus_edges, minus_edges, edges = group_edges(cluster_set)
interval = 1000
for i, (bib1, bib2) in enumerate(plus_edges):
if (i % interval) == 0:
update_status(float(i) / len(plus_edges), "Agglomerating obvious clusters...")
cl1 = bib_map[bib1]
cl2 = bib_map[bib2]
if cl1 != cl2 and not cl1.hates(cl2):
join(cl1, cl2)
cluster_set.clusters.remove(cl2)
for v in cl2.bibs:
bib_map[v] = cl1
update_status_final("Agglomerating obvious clusters done.")
interval = 1000
for i, (bib1, bib2) in enumerate(minus_edges):
if (i % interval) == 0:
update_status(float(i) / len(minus_edges), "Dividing obvious clusters...")
cl1 = bib_map[bib1]
cl2 = bib_map[bib2]
if cl1 != cl2 and not cl1.hates(cl2):
cl1.quarrel(cl2)
update_status_final("Dividing obvious clusters done.")
bibauthor_print("Sorting the value edges.")
edges = sorted(edges, key=_edge_sorting, reverse=True)
interval = 500000
wedge_print("Wedge: New wedge, %d edges." % len(edges))
for current, (v1, v2, unused) in enumerate(edges):
if (current % interval) == 0:
update_status(float(current) / len(edges), "Wedge...")
assert unused != '+' and unused != '-', PID()+"Signed edge after filter!"
cl1 = bib_map[v1]
cl2 = bib_map[v2]
idcl1 = cluster_set.clusters.index(cl1)
idcl2 = cluster_set.clusters.index(cl2)
#keep the ids low!
if idcl1 > idcl2:
idcl1, idcl2 = idcl2, idcl1
cl1, cl2 = cl2, cl1
wedge_print("Wedge: popped new edge: Verts = (%s,%s) from (%s, %s) Value = (%f, %f)" % (idcl1, idcl2, v1, v2, unused[0], unused[1]))
if cl1 != cl2 and not cl1.hates(cl2):
if deep_debug:
export_to_dot(cluster_set, "/tmp/%s%d.dot" % (cluster_set.last_name, current), bib_map, (v1, v2, unused))
decision, value = _decide(cl1, cl2)
if decision:
wedge_print("Wedge: Joined %s to %s with %s"% (idcl1, idcl2, value))
join(cl1, cl2)
cluster_set.clusters.remove(cl2)
for v in cl2.bibs:
bib_map[v] = cl1
else:
wedge_print("Wedge: Quarreled %s from %s with %s " % (idcl1, idcl2, value))
cl1.quarrel(cl2)
elif cl1 == cl2:
wedge_print("Wedge: Clusters already joined! (%s,%s)" % (idcl1, idcl2))
else:
wedge_print("Wedge: Clusters hate each other! (%s,%s)" % (idcl1, idcl2))
update_status_final("Wedge done.")
bibauthor_print("")
if deep_debug:
export_to_dot(cluster_set, "/tmp/%sfinal.dot" % cluster_set.last_name, bib_map)
def meld_edges(p1, p2):
'''
Creates one out_edges set from two.
The operation is associative and commutative.
The objects are: (out_edges for a cluster, number of vertices in the same cluster)
'''
out_edges1, verts1 = p1
out_edges2, verts2 = p2
assert verts1 > 0 and verts2 > 0, PID()+'MELD_EDGES: verts problem %s %s ' % (str(verts1), str(verts2))
vsum = verts1 + verts2
invsum = 1. / vsum
special_numbers = Bib_matrix.special_numbers #local reference optimization
def median(e1, e2):
#dirty optimization, should check if value is in dictionary instead
# if e1[0] in special_numbers: return e1
# if e2[0] in special_numbers: return e2
if e1[0] < 0:
assert e1[0] in special_numbers, "MELD_EDGES: wrong value for median? %s" % str(e1)
return e1
if e2[0] < 0:
assert e2[0] in special_numbers, "MELD_EDGES: wrong value for median? %s" % str(e2)
return e2
i1 = e1[1] * verts1
i2 = e2[1] * verts2
inter_cert = i1 + i2
inter_prob = e1[0] * i1 + e2[0] * i2
return (inter_prob / inter_cert, inter_cert * invsum)
assert len(out_edges1) == len(out_edges2), "Invalid arguments for meld edges"
size = len(out_edges1)
result = numpy.ndarray(shape=(size, 2), dtype=float, order='C')
for i in xrange(size):
result[i] = median(out_edges1[i], out_edges2[i])
assert (result[i][0] >= 0 and result[i][0] <= 1) or result[i][0] in Bib_matrix.special_numbers, PID()+'MELD_EDGES: value %s' % result[i]
assert (result[i][1] >= 0 and result[i][1] <= 1) or result[i][1] in Bib_matrix.special_numbers, PID()+'MELD_EDGES: compat %s' % result[i]
return (result, vsum)
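The melding step combines two (probability, certainty) edges by weighting each certainty with its cluster's vertex count, then taking the certainty-weighted mean of the probabilities. A minimal sketch of the inner `median` arithmetic, with the special-number handling omitted for clarity:

```python
def meld_pair(e1, e2, verts1, verts2):
    # e1, e2 are (probability, certainty) pairs; verts1, verts2 are the
    # sizes of the clusters they come from. Certainties are scaled by
    # cluster size, and the probabilities are averaged with those weights.
    i1 = e1[1] * verts1
    i2 = e2[1] * verts2
    inter_cert = i1 + i2
    inter_prob = e1[0] * i1 + e2[0] * i2
    return (inter_prob / inter_cert, inter_cert / (verts1 + verts2))
```

With equal sizes and certainties the result is the plain average, e.g. melding (0.8, 1.0) and (0.4, 1.0) for two singleton clusters yields (0.6, 1.0); this also makes the operation associative and commutative, as the docstring claims.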
def convert_cluster_set(cs, prob_matr):
'''
Converts a normal cluster set to a wedge cluster set.
@param cs: a cluster set to be converted
@type cs: cluster set
@return: a mapping from a number to a bibrefrec.
'''
gc.disable()
# step 1:
# + Assign a number to each bibrefrec.
# + Replace the arrays of bibrefrecs with arrays of numbers.
# + Store the result and prepare it to be returned.
result_mapping = []
for clus in cs.clusters:
start = len(result_mapping)
result_mapping += list(clus.bibs)
end = len(result_mapping)
clus.bibs = range(start, end)
assert len(result_mapping) == len(set(result_mapping)), PID()+"Cluster set conversion failed"
assert len(result_mapping) == cs.num_all_bibs, PID()+"Cluster set conversion failed"
cs.new2old = result_mapping
# step 2:
# + Using the prob matrix, create a vector of values to all other bibs.
# + Meld those vectors into one for each cluster.
special_symbols = Bib_matrix.special_symbols #locality optimization
interval = 10000
for current, c1 in enumerate(cs.clusters):
if (current % interval) == 0:
update_status(float(current) / len(cs.clusters), "Converting the cluster set...")
assert len(c1.bibs) > 0, PID()+"Empty cluster sent to wedge"
pointers = []
for v1 in c1.bibs:
pointer = numpy.ndarray(shape=(len(result_mapping), 2), dtype=float, order='C')
pointer.fill(special_symbols[None])
rm = result_mapping[v1] #locality optimization
for c2 in cs.clusters:
if c1 != c2 and not c1.hates(c2):
for v2 in c2.bibs:
val = prob_matr[rm, result_mapping[v2]]
try:
numb = special_symbols[val]
val = (numb, numb)
except KeyError:
pass
assert len(val) == 2, "Edge coding failed"
pointer[v2] = val
pointers.append((pointer, 1))
c1.out_edges = reduce(meld_edges, pointers)[0]
update_status_final("Converting the cluster set done.")
gc.enable()
def restore_cluster_set(cs):
for cl in cs.clusters:
cl.bibs = set(cs.new2old[b] for b in cl.bibs)
del cl.out_edges
cs.update_bibs()
def create_bib_2_cluster_dict(cs):
'''
Creates and returns a dictionary bibrefrec -> cluster.
The cluster set must be converted!
'''
size = sum(len(cl.bibs) for cl in cs.clusters)
ret = range(size)
for cl in cs.clusters:
for bib in cl.bibs:
ret[bib] = cl
return ret
def group_edges(cs):
plus = []
minus = []
pairs = []
gc.disable()
interval = 1000
for current, cl1 in enumerate(cs.clusters):
if (current % interval) == 0:
update_status(float(current) / len(cs.clusters), "Grouping all edges...")
bib1 = tuple(cl1.bibs)[0]
pointers = cl1.out_edges
for bib2 in xrange(len(cl1.out_edges)):
val = pointers[bib2]
if val[0] not in Bib_matrix.special_numbers:
if val[0] > edge_cut_prob:
pairs.append((bib1, bib2, val))
elif val[0] == Bib_matrix.special_symbols['+']:
plus.append((bib1, bib2))
elif val[0] == Bib_matrix.special_symbols['-']:
minus.append((bib1, bib2))
else:
assert val[0] == Bib_matrix.special_symbols[None], "Invalid Edge"
update_status_final("Finished with the edge grouping.")
bibauthor_print("Positive edges: %d, Negative edges: %d, Value edges: %d."
% (len(plus), len(minus), len(pairs)))
gc.enable()
return plus, minus, pairs
def join(cl1, cl2):
'''
Joins two clusters from a cluster set into the first.
'''
cl1.out_edges = meld_edges((cl1.out_edges, len(cl1.bibs)),
(cl2.out_edges, len(cl2.bibs)))[0]
cl1.bibs += cl2.bibs
assert not cl1.hates(cl1), PID()+"Joining hateful clusters"
assert not cl2.hates(cl2), PID()+"Joining hateful clusters2"
cl1.hate |= cl2.hate
for cl in cl2.hate:
cl.hate.remove(cl2)
cl.hate.add(cl1)
def export_to_dot(cs, fname, graph_info, extra_edge=None):
- from invenio.bibauthorid_dbinterface import get_name_by_bibrecref
+ from invenio.legacy.bibauthorid.dbinterface import get_name_by_bibrecref
fptr = open(fname, "w")
fptr.write("graph wedgy {\n")
fptr.write(" overlap=prism\n")
for idx, bib in enumerate(graph_info):
fptr.write(' %d [color=black label="%s"];\n' % (idx, get_name_by_bibrecref(idx)))
if extra_edge:
v1, v2, (prob, cert) = extra_edge
fptr.write(' %d -- %d [color=green label="p: %.2f, c: %.2f"];\n' % (v1, v2, prob, cert))
for clus in cs.clusters:
fptr.write(" %s [color=blue];\n" % " -- ".join(str(x) for x in clus.bibs))
fptr.write("".join(" %d -- %d [color=red]\n" % (b1, b2)
for b1 in clus.bibs for h in clus.hate for b2 in h.bibs))
fptr.write("}")
diff --git a/invenio/legacy/bibcatalog/api.py b/invenio/legacy/bibcatalog/api.py
index 9c93d922d..922128f72 100644
--- a/invenio/legacy/bibcatalog/api.py
+++ b/invenio/legacy/bibcatalog/api.py
@@ -1,34 +1,34 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Provide a "ticket" interface with a request tracker.
See: https://twiki.cern.ch/twiki/bin/view/Inspire/SystemDesignBibCatalogue
This creates an instance of the class that has been configured for this installation,
or returns None if no ticket system is configured.
"""
from invenio.config import CFG_BIBCATALOG_SYSTEM
bibcatalog_system = None
if CFG_BIBCATALOG_SYSTEM == 'RT':
- from invenio.bibcatalog_system_rt import BibCatalogSystemRT
+ from invenio.legacy.bibcatalog.system_rt import BibCatalogSystemRT
bibcatalog_system = BibCatalogSystemRT()
elif CFG_BIBCATALOG_SYSTEM == 'EMAIL':
- from invenio.bibcatalog_system_email import BibCatalogSystemEmail
+ from invenio.legacy.bibcatalog.system_email import BibCatalogSystemEmail
bibcatalog_system = BibCatalogSystemEmail()
diff --git a/invenio/legacy/bibcatalog/doc/hacking/hacking/bibcatalog-api.webdoc b/invenio/legacy/bibcatalog/doc/hacking/hacking/bibcatalog-api.webdoc
index d540efcc6..cf2fb6395 100644
--- a/invenio/legacy/bibcatalog/doc/hacking/hacking/bibcatalog-api.webdoc
+++ b/invenio/legacy/bibcatalog/doc/hacking/hacking/bibcatalog-api.webdoc
@@ -1,89 +1,89 @@
## -*- mode: html; coding: utf-8; -*-
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
<!-- WebDoc-Page-Title: BibCatalog API -->
<!-- WebDoc-Page-Navtrail: <a class="navtrail" href="<CFG_SITE_URL>/help/hacking">Hacking Invenio</a> -->
<pre>
BibCatalog supports "ticketing" so that cataloguers can keep track of their tasks.
Of several ticketing systems, RT (Request Tracker) is currently supported.
1. The API
bibcatalog.py consists of ticket operations, as follows.
* check_system(uid) returns an empty string if things are OK, and an error string otherwise.
* ticket_search(uid, recordid, subject, text, creator, owner, date_from, date_until,
status, priority) search tickets by various criteria.
* ticket_submit(uid, subject, recordid, text, queue, priority, owner)
submit a ticket and initially set its fields.
* ticket_assign(uid, ticketid, to_user) assign a ticket to someone.
* ticket_set_attribute(uid, ticketid, attribute, new_value) sets an attribute.
These are members of TICKET_ATTRIBUTES in bibcatalog_system.py.
* ticket_get_attribute(uid, ticketid, attrname) returns the value of an attribute.
* ticket_get_info(uid, ticketid, attrlist) return ticket information as a dictionary.
2. Configuring your access to RT
To use the BibCatalog module you first need to set up the ticket system you want to use (currently only RT).
Edit these lines in invenio.conf:
CFG_BIBCATALOG_SYSTEM = RT
CFG_BIBCATALOG_SYSTEM_RT_CLI = /usr/bin/rt
CFG_BIBCATALOG_SYSTEM_RT_URL = http://xxx.server.org/rt3
Your RT installation does not need to be on the same computer as your
Invenio installation. However, you will need the CLI (/usr/bin/rt)
Perl program.
3. Configuring RT
RT version 3 has been tested with this installation.
There are two custom fields in tickets. These should be created by the administrator by
using the "Admin/CustomFields" URL in RT. The fields are:
(i) name: RecordID - applies to: tickets
(ii) name: TicketSetID - applies to: tickets
In general, the Invenio cataloguers need to have the right to submit/create tickets in the queues.
Creating users is done by the RT admin using the "Admin/Users" URL.
Ticket creation etc. should be enabled by giving the following rights to the group "Everyone" in queues:
AssignCustomFields
CommentOnTicket
CreateTicket
ModifyTicket
ReplyToTicket
4. Using the API
-from invenio.bibcatalog import bibcatalog_system
+from invenio.legacy.bibcatalog.api import bibcatalog_system
if bibcatalog_system is not None:
uid = 1 #or whatever..
x = bibcatalog_system.check_system(uid)
if len(x) > 0:
print "errors: "+str(x)
sys.exit()
else:
print "ok"
</pre>
diff --git a/invenio/legacy/bibcatalog/system_email.py b/invenio/legacy/bibcatalog/system_email.py
index a34c9049b..6d55ffcf5 100644
--- a/invenio/legacy/bibcatalog/system_email.py
+++ b/invenio/legacy/bibcatalog/system_email.py
@@ -1,174 +1,174 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Provide a "ticket" interface with Email.
This is a subclass of BibCatalogSystem
"""
import datetime
from time import mktime
import invenio.legacy.webuser
from invenio.utils.shell import escape_shell_arg
-from invenio.bibcatalog_system import BibCatalogSystem
+from invenio.legacy.bibcatalog.system import BibCatalogSystem
from invenio.ext.email import send_email
from invenio.ext.logging import register_exception
EMAIL_SUBMIT_CONFIGURED = False
import invenio.config
if hasattr(invenio.config, 'CFG_BIBCATALOG_SYSTEM') and invenio.config.CFG_BIBCATALOG_SYSTEM == "EMAIL":
if hasattr(invenio.config, 'CFG_BIBCATALOG_SYSTEM_EMAIL_ADDRESS'):
EMAIL_SUBMIT_CONFIGURED = True
FROM_ADDRESS = invenio.config.CFG_SITE_SUPPORT_EMAIL
TO_ADDRESS = invenio.config.CFG_BIBCATALOG_SYSTEM_EMAIL_ADDRESS
class BibCatalogSystemEmail(BibCatalogSystem):
#BIBCATALOG_RT_SERVER = "" #construct this by http://user:password@RT_URL
def check_system(self, uid=None):
"""return an error string if there are problems"""
ret = ''
if not EMAIL_SUBMIT_CONFIGURED:
ret = "Please configure bibcatalog email sending in CFG_BIBCATALOG_SYSTEM and CFG_BIBCATALOG_SYSTEM_EMAIL_ADDRESS"
return ret
def ticket_search(self, uid, recordid=-1, subject="", text="", creator="", owner="", \
date_from="", date_until="", status="", priority="", queue=""):
"""Not implemented."""
raise NotImplementedError
def ticket_submit(self, uid=None, subject="", recordid=-1, text="", queue="", priority="", owner="", requestor=""):
"""creates a ticket. return true on success, otherwise false"""
if not EMAIL_SUBMIT_CONFIGURED:
register_exception(stream='warning',
subject='bibcatalog email not configured',
prefix="please configure bibcatalog email sending in CFG_BIBCATALOG_SYSTEM and CFG_BIBCATALOG_SYSTEM_EMAIL_ADDRESS")
ticket_id = self._get_ticket_id()
priorityset = ""
queueset = ""
requestorset = ""
ownerset = ""
recidset = " cf-recordID:" + escape_shell_arg(str(recordid)) + '\n'
textset = ""
subjectset = ""
if subject:
subjectset = 'ticket #' + ticket_id + ' - ' + escape_shell_arg(subject)
if priority:
priorityset = " priority:" + escape_shell_arg(str(priority)) + '\n'
if queue:
queueset = " queue:" + escape_shell_arg(queue) + '\n'
if requestor:
requestorset = " requestor:" + escape_shell_arg(requestor) + '\n'
if owner:
ownerprefs = invenio.legacy.webuser.get_user_preferences(owner)
if ownerprefs.has_key("bibcatalog_username"):
owner = ownerprefs["bibcatalog_username"]
ownerset = " owner:" + escape_shell_arg(owner) + '\n'
textset = textset + ownerset + requestorset + recidset + queueset + priorityset + '\n'
textset = textset + escape_shell_arg(text) + '\n'
ok = send_email(fromaddr=FROM_ADDRESS, toaddr=TO_ADDRESS, subject=subjectset, header='Hello,\n\n', content=textset)
if ok:
return ticket_id
return None
def ticket_comment(self, uid, ticketid, comment):
""" Comment on ticket with given ticketid"""
subjectset = 'ticket #' + ticketid + ' - Comment ...'
textset = '...\n\n*Comment on ticket #' + ticketid + '\nComment:' + comment
ok = send_email(fromaddr=FROM_ADDRESS, toaddr=TO_ADDRESS, subject=subjectset, header='Hello,\n\n', content=textset)
if ok:
return 1
return 0
def ticket_assign(self, uid, ticketid, to_user):
""" Re-assign existing ticket with given ticketid to user to_user"""
subjectset = 'ticket #' + ticketid + ' - Re-assign ...'
textset = '...\n\n*Please re-assign ticket #' + ticketid + ' to ' + to_user
ok = send_email(fromaddr=FROM_ADDRESS, toaddr=TO_ADDRESS, subject=subjectset, header='Hello,\n\n', content=textset)
if ok:
return 1
return 0
def ticket_set_attribute(self, uid, ticketid, attribute, new_value):
""" Request to set attribute to new value on ticket with given ticketid"""
subjectset = 'ticket #' + ticketid + ' - Attribute Update ...'
textset = '...\n\n*Please modify attribute:' + attribute + ' to:' + new_value + ' on ticket:' + ticketid
ok = send_email(fromaddr=FROM_ADDRESS, toaddr=TO_ADDRESS, subject=subjectset, header='Hello,\n\n', content=textset)
if ok:
return 1
return 0
def ticket_get_attribute(self, uid, ticketid, attribute):
"""Not implemented."""
raise NotImplementedError
def ticket_get_info(self, uid, ticketid, attributes = None):
"""Not implemented."""
raise NotImplementedError
def _str_base(self, num, base, numerals = '0123456789abcdefghijklmnopqrstuvwxyz'):
""" Convert number to base (2 to 36) """
if base < 2 or base > len(numerals):
raise ValueError("str_base: base must be between 2 and %i" % len(numerals))
if num == 0:
return '0'
if num < 0:
sign = '-'
num = -num
else:
sign = ''
result = ''
while num:
result = numerals[num % (base)] + result
num //= base
return sign + result
def _get_ticket_id(self):
""" Return timestamp in seconds since the Epoch converted to base36 """
now = datetime.datetime.now()
t = mktime(now.timetuple())+1e-6*now.microsecond
t_str = str("%.6f" % t)
t1, t2 = t_str.split('.')
t_str = t1 + t2
#return base64.encodestring(t_str).strip()
return self._str_base(int(t_str), 36)
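The ticket id is the current epoch timestamp, microseconds included, rendered in base 36 by `_str_base`, which repeatedly takes the remainder modulo the base and prepends the matching numeral. A standalone Python 3 re-statement of the base-conversion helper:

```python
def str_base(num, base, numerals='0123456789abcdefghijklmnopqrstuvwxyz'):
    """Convert an integer to a string in the given base (2 to 36)."""
    if base < 2 or base > len(numerals):
        raise ValueError("str_base: base must be between 2 and %i" % len(numerals))
    if num == 0:
        return '0'
    sign = ''
    if num < 0:
        sign, num = '-', -num
    result = ''
    while num:
        # Prepend the numeral for the least significant digit.
        result = numerals[num % base] + result
        num //= base
    return sign + result
```

For example, `str_base(255, 16)` gives `'ff'` and `str_base(36, 36)` gives `'10'`; applied to a 16-digit timestamp string this yields a short, roughly time-ordered ticket id.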
diff --git a/invenio/legacy/bibcatalog/system_rt.py b/invenio/legacy/bibcatalog/system_rt.py
index 09f932649..96c249c65 100644
--- a/invenio/legacy/bibcatalog/system_rt.py
+++ b/invenio/legacy/bibcatalog/system_rt.py
@@ -1,408 +1,408 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Provide a "ticket" interface with a request tracker.
This is a subclass of BibCatalogSystem
"""
import os
import invenio.legacy.webuser
from invenio.utils.shell import run_shell_command, escape_shell_arg
-from invenio.bibcatalog_system import BibCatalogSystem, get_bibcat_from_prefs
+from invenio.legacy.bibcatalog.system import BibCatalogSystem, get_bibcat_from_prefs
from invenio.config import CFG_BIBCATALOG_SYSTEM, \
CFG_BIBCATALOG_SYSTEM_RT_CLI, \
CFG_BIBCATALOG_SYSTEM_RT_URL, \
CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_USER, \
CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_PWD
class BibCatalogSystemRT(BibCatalogSystem):
BIBCATALOG_RT_SERVER = "" #construct this by http://user:password@RT_URL
def check_system(self, uid=None):
"""return an error string if there are problems"""
if uid:
rtuid, rtpw = get_bibcat_from_prefs(uid)
else:
# Assume default RT user
rtuid = CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_USER
rtpw = CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_PWD
if not rtuid and not rtpw:
return "No valid RT user login specified"
if not CFG_BIBCATALOG_SYSTEM == 'RT':
return "CFG_BIBCATALOG_SYSTEM is not RT though this is an RT module"
if not CFG_BIBCATALOG_SYSTEM_RT_CLI:
return "CFG_BIBCATALOG_SYSTEM_RT_CLI not defined or empty"
if not os.path.exists(CFG_BIBCATALOG_SYSTEM_RT_CLI):
return "CFG_BIBCATALOG_SYSTEM_RT_CLI " + CFG_BIBCATALOG_SYSTEM_RT_CLI + " file does not exist"
# Check that you can execute the binary.. this is a safe call unless someone can fake CFG_BIBCATALOG_SYSTEM_RT_CLI (unlikely)
dummy, myout, myerr = run_shell_command(CFG_BIBCATALOG_SYSTEM_RT_CLI + " help")
helpfound = False
if myerr.count("help") > 0:
helpfound = True
if not helpfound:
return "Execution of CFG_BIBCATALOG_SYSTEM_RT_CLI " + CFG_BIBCATALOG_SYSTEM_RT_CLI + " help did not produce output 'help'"
if not CFG_BIBCATALOG_SYSTEM_RT_URL:
return "CFG_BIBCATALOG_SYSTEM_RT_URL not defined or empty"
# Construct URL, split RT_URL at //
if not CFG_BIBCATALOG_SYSTEM_RT_URL.startswith('http://') and \
not CFG_BIBCATALOG_SYSTEM_RT_URL.startswith('https://'):
return "CFG_BIBCATALOG_SYSTEM_RT_URL does not start with 'http://' or 'https://'"
httppart, siteandpath = CFG_BIBCATALOG_SYSTEM_RT_URL.split("//")
# Assemble by http://user:password@RT_URL
BIBCATALOG_RT_SERVER = httppart + "//" + rtuid + ":" + rtpw + "@" + siteandpath
#set as env var
os.environ["RTUSER"] = rtuid
os.environ["RTSERVER"] = BIBCATALOG_RT_SERVER
#try to talk to RT server
#this is a safe call since rtpw is the only variable in it, and it is escaped
rtpw = escape_shell_arg(rtpw)
dummy, myout, myerr = run_shell_command("echo "+rtpw+" | " + CFG_BIBCATALOG_SYSTEM_RT_CLI + " ls \"Subject like 'F00'\"")
if len(myerr) > 0:
return "could not connect to " + BIBCATALOG_RT_SERVER + " " + myerr
#finally, check that there is some sane output like tickets or 'No matching result'
saneoutput = (myout.count('matching') > 0) or (myout.count('1') > 0)
if not saneoutput:
return CFG_BIBCATALOG_SYSTEM_RT_CLI + " returned " + myout + " instead of 'matching' or '1'"
return ""
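The URL assembly in check_system() above (splitting the configured RT URL at `//` and embedding the credentials) can be sketched as a standalone helper; the URL and credentials below are hypothetical placeholders, not values from any real installation:

```python
def build_rt_server_url(rt_url, user, password):
    """Embed user:password into an http(s) RT URL, as check_system() does."""
    if not (rt_url.startswith('http://') or rt_url.startswith('https://')):
        raise ValueError("RT URL must start with 'http://' or 'https://'")
    # split only at the first '//' so a '//' later in the path survives
    httppart, siteandpath = rt_url.split("//", 1)
    return httppart + "//" + user + ":" + password + "@" + siteandpath

print(build_rt_server_url("https://rt.example.org/rt", "rtuser", "rtpass"))
# https://rtuser:rtpass@rt.example.org/rt
```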
def ticket_search(self, uid, recordid=-1, subject="", text="", creator="", owner="", \
date_from="", date_until="", status="", priority="", queue=""):
"""return a list of ticket IDs related to this record or
matching the subject, creator or owner of the ticket."""
search_atoms = [] #the search expression will be made by and'ing these
if (recordid > -1):
#search by recid
search_atoms.append("CF.{RecordID} = " + escape_shell_arg(str(recordid)))
if (len(subject) > 0):
#search by subject
search_atoms.append("Subject like " + escape_shell_arg(str(subject)))
if (len(text) > 0):
search_atoms.append("Content like " + escape_shell_arg(str(text)))
if (len(str(creator)) > 0):
#search for this person's bibcatalog_username in preferences
creatorprefs = invenio.legacy.webuser.get_user_preferences(creator)
creator = "Nobody can Have This Kind of Name"
if creatorprefs.has_key("bibcatalog_username"):
creator = creatorprefs["bibcatalog_username"]
search_atoms.append("Creator = " + escape_shell_arg(str(creator)))
if (len(str(owner)) > 0):
ownerprefs = invenio.legacy.webuser.get_user_preferences(owner)
owner = "Nobody can Have This Kind of Name"
if ownerprefs.has_key("bibcatalog_username"):
owner = ownerprefs["bibcatalog_username"]
search_atoms.append("Owner = " + escape_shell_arg(str(owner)))
if (len(date_from) > 0):
search_atoms.append("Created >= " + escape_shell_arg(str(date_from)))
if (len(date_until) > 0):
search_atoms.append("Created <= " + escape_shell_arg(str(date_until)))
if (len(str(status)) > 0) and (type(status) == type("this is a string")):
search_atoms.append("Status = " + escape_shell_arg(str(status)))
if (len(str(priority)) > 0):
#try to convert to int
intpri = -1
try:
intpri = int(priority)
except:
pass
if (intpri > -1):
search_atoms.append("Priority = " + str(intpri))
if queue:
search_atoms.append("Queue = " + escape_shell_arg(queue))
searchexp = " and ".join(search_atoms)
tickets = []
if len(searchexp) == 0:
#just make an expression that is true for all tickets
searchexp = "Created > '1900-01-01'"
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " ls -l \"" + searchexp + "\""
command_out = self._run_rt_command(command, uid)
if command_out is None:
return tickets
statuses = []
for line in command_out.split("\n"):
#if there are matching lines they will look like NUM:subj.. so pick num
if (line.count('id: ticket/') > 0):
dummy, tnum = line.split('/') #get the ticket id
try:
dummy = int(tnum)
tickets.append(tnum)
except:
pass
if (line.count('Status: ') > 0):
dummy, tstatus = line.split('Status: ')
statuses.append(tstatus)
if (type(status) == type([])):
#take only those tickets whose status matches with one of the status list
alltickets = tickets
tickets = []
for i in range(len(alltickets)):
tstatus = statuses[i]
tnum = alltickets[i]
if (status.count(tstatus) > 0): #match
tickets.append(tnum)
return tickets
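The query construction in ticket_search() — each criterion collected as an atom, the atoms AND-ed together, with an always-true fallback when nothing is given — can be sketched in isolation (the criteria handled here are a subset, and real atoms are shell-escaped before use):

```python
def build_search_expression(recordid=-1, subject="", status=""):
    """Build an RT search expression by AND-ing the given criteria."""
    search_atoms = []
    if recordid > -1:
        search_atoms.append("CF.{RecordID} = " + str(recordid))
    if subject:
        search_atoms.append("Subject like " + subject)
    if status:
        search_atoms.append("Status = " + status)
    if not search_atoms:
        # expression that is true for all tickets
        return "Created > '1900-01-01'"
    return " and ".join(search_atoms)

print(build_search_expression(recordid=42, status="open"))
# CF.{RecordID} = 42 and Status = open
```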
def ticket_submit(self, uid=None, subject="", recordid=-1, text="", queue="",
priority="", owner="", requestor=""):
"""creates a ticket. return ticket num on success, otherwise None"""
queueset = ""
textset = ""
priorityset = ""
ownerset = ""
subjectset = ""
requestorset = ""
if subject:
subjectset = " subject=" + escape_shell_arg(subject)
recidset = " CF-RecordID=" + escape_shell_arg(str(recordid))
if priority:
priorityset = " priority=" + escape_shell_arg(str(priority))
if queue:
queueset = " queue=" + escape_shell_arg(queue)
if requestor:
requestorset = " requestor=" + escape_shell_arg(requestor)
if owner:
#get the owner name from prefs
ownerprefs = invenio.legacy.webuser.get_user_preferences(owner)
if ownerprefs.has_key("bibcatalog_username"):
owner = ownerprefs["bibcatalog_username"]
ownerset = " owner=" + escape_shell_arg(owner)
if text:
if '\n' in text:
# contains newlines (\n) return with error
return "Newlines are not allowed in text parameter. Use ticket_comment() instead."
else:
textset = " text=" + escape_shell_arg(text)
# make a command.. note that all set 'set' parts have been escaped
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " create -t ticket set " + subjectset + recidset + \
queueset + textset + priorityset + ownerset + requestorset
command_out = self._run_rt_command(command, uid)
if command_out is None:
return None
inum = -1
for line in command_out.split("\n"):
if line.count(' ') > 0:
stuff = line.split(' ')
try:
inum = int(stuff[2])
except:
pass
if inum > 0:
return inum
return None
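ticket_submit() recovers the new ticket number by scanning the CLI reply for a line whose third whitespace-separated token parses as an integer; a minimal sketch of just that parsing (the sample reply is hypothetical — the exact wording depends on the RT version):

```python
def parse_created_ticket(cli_output):
    """Return the ticket number from an 'rt create' reply, or None."""
    for line in cli_output.split("\n"):
        parts = line.split(' ')
        if len(parts) >= 3:
            try:
                # e.g. "# Ticket 12345 created." -> token index 2 is the id
                return int(parts[2])
            except ValueError:
                pass
    return None

print(parse_created_ticket("# Ticket 12345 created."))  # 12345
```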
def ticket_comment(self, uid, ticketid, comment):
"""comment on a given ticket. Returns 1 on success, None on failure"""
command = '%s comment -m %s %s' % (CFG_BIBCATALOG_SYSTEM_RT_CLI, \
escape_shell_arg(comment), str(ticketid))
command_out = self._run_rt_command(command, uid)
if command_out is None:
return None
return 1
def ticket_assign(self, uid, ticketid, to_user):
"""assign a ticket to an RT user. Returns 1 on success, 0 on failure"""
return self.ticket_set_attribute(uid, ticketid, 'owner', to_user)
def ticket_set_attribute(self, uid, ticketid, attribute, new_value):
"""change the ticket's attribute. Returns 1 on success, 0 on failure"""
#check that the attribute is accepted..
if attribute not in BibCatalogSystem.TICKET_ATTRIBUTES:
return 0
#we cannot change read-only values.. including text that is an attachment. pity
if attribute in ['creator', 'date', 'ticketid', 'url_close', 'url_display', 'recordid', 'text']:
return 0
#check attribute
setme = ""
if (attribute == 'priority'):
try:
dummy = int(new_value)
except:
return 0
setme = " set Priority=" + str(new_value)
if (attribute == 'subject'):
subject = escape_shell_arg(new_value)
setme = " set Subject='" + subject + "'"
if (attribute == 'owner'):
#convert from invenio to RT
ownerprefs = invenio.legacy.webuser.get_user_preferences(new_value)
if not ownerprefs.has_key("bibcatalog_username"):
return 0
else:
owner = escape_shell_arg(ownerprefs["bibcatalog_username"])
setme = " set owner='" + owner +"'"
if (attribute == 'status'):
setme = " set status='" + escape_shell_arg(new_value) +"'"
if (attribute == 'queue'):
setme = " set queue='" + escape_shell_arg(new_value) +"'"
#make sure ticketid is numeric
try:
dummy = int(ticketid)
except:
return 0
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " edit ticket/" + str(ticketid) + setme
command_out = self._run_rt_command(command, uid)
if command_out is None:
return 0
mylines = command_out.split("\n")
for line in mylines:
if line.count('updated') > 0:
return 1
return 0
def ticket_get_attribute(self, uid, ticketid, attribute):
"""return an attribute of a ticket"""
ticinfo = self.ticket_get_info(uid, ticketid, [attribute])
if ticinfo.has_key(attribute):
return ticinfo[attribute]
return None
def ticket_get_info(self, uid, ticketid, attributes = None):
"""return ticket info as a dictionary of pre-defined attribute names,
or just those listed in the attributes parameter.
Returns None on failure"""
#make sure ticketid is numeric
try:
dummy = int(ticketid)
except:
return None
if attributes is None:
attributes = []
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " show ticket/" + str(ticketid)
command_out = self._run_rt_command(command, uid)
if command_out is None:
return None
tdict = {}
for line in command_out.split("\n"):
if line.count(": ") > 0:
tattr, tvaluen = line.split(": ", 1)
tvalue = tvaluen.rstrip()
tdict[tattr] = tvalue
#query again to get attachments -> Contents
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " show ticket/" + str(ticketid) + "/attachments/"
command_out = self._run_rt_command(command, uid)
if command_out is None:
return None
attachments = []
for line in command_out.split("\n"):
if line.count(": ") > 1: #there is a line Attachments: 40: xxx
aline = line.split(": ")
attachments.append(aline[1])
#query again for each attachment
for att in attachments:
command = CFG_BIBCATALOG_SYSTEM_RT_CLI + " show ticket/" + str(ticketid) + "/attachments/" + att
command_out = self._run_rt_command(command, uid)
if command_out is None:
return None
#get the contents line
for line in command_out.split("\n"):
if line.count("Content: ") > 0:
cstuff = line.split("Content: ")
tdict['Text'] = cstuff[1].rstrip()
if (len(tdict) > 0):
#iterate over TICKET_ATTRIBUTES to make a canonical ticket
candict = {}
for f in BibCatalogSystem.TICKET_ATTRIBUTES:
tcased = f.title()
if tdict.has_key(tcased):
candict[f] = tdict[tcased]
if tdict.has_key('CF.{RecordID}'):
candict['recordid'] = tdict['CF.{RecordID}']
if tdict.has_key('id'):
candict['ticketid'] = tdict['id']
#make specific URL attributes:
url_display = CFG_BIBCATALOG_SYSTEM_RT_URL + "/Ticket/Display.html?id="+str(ticketid)
candict['url_display'] = url_display
url_close = CFG_BIBCATALOG_SYSTEM_RT_URL + "/Ticket/Update.html?Action=Comment&DefaultStatus=resolved&id="+str(ticketid)
candict['url_close'] = url_close
url_modify = CFG_BIBCATALOG_SYSTEM_RT_URL + "/Ticket/ModifyAll.html?id="+str(ticketid)
candict['url_modify'] = url_modify
#change the ticket owner into invenio UID
if tdict.has_key('owner'):
rt_owner = tdict["owner"]
uid = invenio.legacy.webuser.get_uid_based_on_pref("bibcatalog_username", rt_owner)
candict['owner'] = uid
if len(attributes) == 0: #return all fields
return candict
else: #return only the fields that were requested
tdict = {}
for myatt in attributes:
if candict.has_key(myatt):
tdict[myatt] = candict[myatt]
return tdict
else:
return None
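The `Key: value` parsing that ticket_get_info() applies to the `rt show` output can be sketched as follows; note the maxsplit of 1, which keeps values that themselves contain `": "` intact (the sample output is made up):

```python
def parse_ticket_fields(cli_output):
    """Parse 'Key: value' lines from 'rt show' output into a dict."""
    fields = {}
    for line in cli_output.split("\n"):
        if ": " in line:
            # split only on the first ': ' so values containing ': ' survive
            key, value = line.split(": ", 1)
            fields[key] = value.rstrip()
    return fields

sample = "id: ticket/7\nSubject: Fix record\nStatus: open"
print(parse_ticket_fields(sample))
# {'id': 'ticket/7', 'Subject': 'Fix record', 'Status': 'open'}
```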
def _run_rt_command(self, command, uid=None):
"""
This function will run an RT CLI command as the given user. If no user is specified
the default RT user will be used, if configured.
Should any of the configuration parameters be missing this function will return
None. Otherwise it will return the standard output from the CLI command.
@param command: RT CLI command to execute
@type command: string
@param uid: the Invenio user id to submit on behalf of. Optional.
@type uid: int
@return: standard output from the command given. None, if any errors.
@rtype: string
"""
if not CFG_BIBCATALOG_SYSTEM_RT_URL:
return None
if uid:
username, passwd = get_bibcat_from_prefs(uid)
else:
username = CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_USER
passwd = CFG_BIBCATALOG_SYSTEM_RT_DEFAULT_PWD
httppart, siteandpath = CFG_BIBCATALOG_SYSTEM_RT_URL.split("//", 1)
BIBCATALOG_RT_SERVER = httppart + "//" + username + ":" + passwd + "@" + siteandpath
#set as env var
os.environ["RTUSER"] = username
os.environ["RTSERVER"] = BIBCATALOG_RT_SERVER
passwd = escape_shell_arg(passwd)
error_code, myout, dummyerr = run_shell_command("echo "+passwd+" | " + command)
if error_code > 0:
raise ValueError, 'Problem running "%s": %d' % (command, error_code)
return myout
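The invocation pattern in _run_rt_command() — credentials passed through the RTUSER/RTSERVER environment variables, password echoed into the CLI's stdin — can be sketched with subprocess; here `cat` stands in for the RT binary, and all values are placeholders:

```python
import os
import subprocess

def run_with_password(command, username, password, server_url):
    """Run a shell command with RT-style env vars and password on stdin."""
    env = dict(os.environ, RTUSER=username, RTSERVER=server_url)
    # equivalent of: echo <password> | <command>
    proc = subprocess.Popen(command, shell=True, env=env,
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            universal_newlines=True)
    out, _ = proc.communicate(password + "\n")
    return out

# 'cat' simply echoes its stdin, standing in for the RT CLI
print(run_with_password("cat", "rtuser", "secret", "https://rt.example.org"))
```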
diff --git a/invenio/legacy/bibcatalog/templates.py b/invenio/legacy/bibcatalog/templates.py
index c56023f6d..5c0b768f0 100644
--- a/invenio/legacy/bibcatalog/templates.py
+++ b/invenio/legacy/bibcatalog/templates.py
@@ -1,77 +1,77 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio BibCatalog HTML generator."""
-from invenio.bibcatalog import bibcatalog_system
+from invenio.legacy.bibcatalog.api import bibcatalog_system
from invenio.base.i18n import wash_language, gettext_set_language
from invenio.config import CFG_SITE_LANG
from invenio.legacy.webstyle.templates import Template as DefaultTemplate
class Template(DefaultTemplate):
""" HTML generators for BibCatalog """
SHOW_MAX_TICKETS = 25
def tmpl_your_tickets(self, uid, ln=CFG_SITE_LANG, start=1):
""" make a pretty html body of tickets that belong to the user given as param """
ln = wash_language(ln)
_ = gettext_set_language(ln)
if bibcatalog_system is None:
return _("Error: No BibCatalog system configured.")
#errors? tell what happened and get out
bibcat_probs = bibcatalog_system.check_system(uid)
if bibcat_probs:
return _("Error")+" "+bibcat_probs
tickets = bibcatalog_system.ticket_search(uid, owner=uid) #get ticket id's
lines = "" #put result here
i = 1
lines += (_("You have %i tickets.") % len(tickets)) + "<br/>"
#make a prev link if needed
if (start > 1):
newstart = start - self.SHOW_MAX_TICKETS
if (newstart < 1):
newstart = 1
lines += '<a href="/yourtickets/display?start='+str(newstart)+'">'+_("Previous")+'</a>'
lines += """<table border="1">"""
lastshown = len(tickets) #what was the number of the last shown ticket?
for ticket in tickets:
#get info and show only for those that are within the show range
if (i >= start) and (i < start+self.SHOW_MAX_TICKETS):
ticket_info = bibcatalog_system.ticket_get_info(uid, ticket)
subject = ticket_info['subject']
status = ticket_info['status']
text = ""
if ticket_info.has_key('text'):
text = ticket_info['text']
display = '<a href="'+ticket_info['url_display']+'">'+_("show")+'</a>'
close = '<a href="'+ticket_info['url_close']+'">'+_("close")+'</a>'
lines += "<tr><td>"+str(ticket)+"</td><td>"+subject+" "+text+"</td><td>"+status+"</td><td>"+display+"</td><td>"+close+"</td></tr>\n"
lastshown = i
i = i+1
lines += "</table>"
#make next link if needed
if (len(tickets) > lastshown):
newstart = lastshown+1
lines += '<a href="/yourtickets/display?start='+str(newstart)+'">'+_("Next")+'</a>'
return lines
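The Previous/Next arithmetic in tmpl_your_tickets() amounts to sliding a SHOW_MAX_TICKETS-wide window over the ticket list; a sketch of just that arithmetic (the helper name is ours, not part of the module):

```python
SHOW_MAX_TICKETS = 25

def pagination_window(start, total):
    """Return (prev_start, next_start); None when there is no such page."""
    prev_start = max(1, start - SHOW_MAX_TICKETS) if start > 1 else None
    last_shown = min(total, start + SHOW_MAX_TICKETS - 1)
    next_start = last_shown + 1 if total > last_shown else None
    return prev_start, next_start

print(pagination_window(26, 60))  # (1, 51)
print(pagination_window(1, 10))   # (None, None)
```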
diff --git a/invenio/legacy/bibcirculation/adminlib.py b/invenio/legacy/bibcirculation/adminlib.py
index 7a15c5fce..6d6ad51da 100644
--- a/invenio/legacy/bibcirculation/adminlib.py
+++ b/invenio/legacy/bibcirculation/adminlib.py
@@ -1,6234 +1,6234 @@
## Administrator interface for Bibcirculation
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
## """Invenio Bibcirculation Administrator Interface."""
from __future__ import division
"""
Invenio Bibcirculation Administrator.
The functions are grouped into logical categories ('User Pages',
'Loans, Returns and Loan requests', 'ILLs', 'Libraries',
'Vendors', ...).
This ordering should be maintained, and improved for readability
where necessary, as additional methods are added.
When applicable, methods should be renamed, refactored and
documented appropriately.
"""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
import datetime, time, types
# Other Invenio imports
from invenio.config import \
CFG_SITE_LANG, \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_CERN_SITE
import invenio.modules.access.engine as acce
from invenio.legacy.webpage import page
from invenio.legacy.webuser import getUid, page_not_authorized
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
from invenio.ext.logging import register_exception
from invenio.ext.email import send_email
from invenio.legacy.search_engine import perform_request_search, record_exists
from invenio.utils.url import create_html_link, create_url, redirect_to_url
from invenio.base.i18n import gettext_set_language
from invenio.config import \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, \
CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS, \
CFG_BIBCIRCULATION_ITEM_STATUS_UNDER_REVIEW, \
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, \
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE, \
CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_ILL_STATUS_NEW, \
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN, \
CFG_BIBCIRCULATION_ACQ_STATUS_NEW, \
CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED
# Bibcirculation imports
-from invenio.bibcirculation_config import \
+from invenio.legacy.bibcirculation.config import \
CFG_BIBCIRCULATION_TEMPLATES, CFG_BIBCIRCULATION_LIBRARIAN_EMAIL, \
CFG_BIBCIRCULATION_LOANS_EMAIL, CFG_BIBCIRCULATION_ILLS_EMAIL, \
CFG_BIBCIRCULATION_PROPOSAL_TYPE, CFG_BIBCIRCULATION_ACQ_STATUS
-from invenio.bibcirculation_utils import book_title_from_MARC, \
+from invenio.legacy.bibcirculation.utils import book_title_from_MARC, \
update_status_if_expired, \
renew_loan_for_X_days, \
print_pending_hold_requests_information, \
print_new_loan_information, \
validate_date_format, \
generate_email_body, \
book_information_from_MARC, \
search_user, \
tag_all_requests_as_done, \
update_user_info_from_ldap, \
update_request_data, \
update_requests_statuses, \
has_date_format, \
generate_tmp_barcode, \
looks_like_dictionary
-import invenio.bibcirculation_dblayer as db
+import invenio.legacy.bibcirculation.db_layer as db
import invenio.legacy.template
bc_templates = invenio.legacy.template.load('bibcirculation')
def is_adminuser(req):
"""check if user is a registered administrator. """
return acce.acc_authorize_action(req, "runbibcirculation")
def mustloginpage(req, message):
"""show a page asking the user to login."""
navtrail_previous_links = '<a class="navtrail" href="%s/admin/">' \
'Admin Area</a> &gt; ' \
'<a class="navtrail" href="%s/admin/bibcirculation/">' \
'BibCirculation Admin</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL)
return page_not_authorized(req=req, text=message,
navtrail=navtrail_previous_links)
def load_template(template):
"""
Load a letter/notification template from
bibcirculation_config.py.
@type template: string.
@param template: template that will be used.
@return: template(string)
"""
if template == "overdue_letter":
output = CFG_BIBCIRCULATION_TEMPLATES['OVERDUE']
elif template == "reminder":
output = CFG_BIBCIRCULATION_TEMPLATES['REMINDER']
elif template == "notification":
output = CFG_BIBCIRCULATION_TEMPLATES['NOTIFICATION']
elif template == "ill_received":
output = CFG_BIBCIRCULATION_TEMPLATES['ILL_RECEIVED']
elif template == "ill_recall1":
output = CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL1']
elif template == "ill_recall2":
output = CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL2']
elif template == "ill_recall3":
output = CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL3']
elif template == "claim_return":
output = CFG_BIBCIRCULATION_TEMPLATES['SEND_RECALL']
elif template == "proposal_notification":
output = CFG_BIBCIRCULATION_TEMPLATES['PROPOSAL_NOTIFICATION']
elif template == "proposal_acceptance":
output = CFG_BIBCIRCULATION_TEMPLATES['PROPOSAL_ACCEPTANCE_NOTIFICATION']
elif template == "proposal_refusal":
output = CFG_BIBCIRCULATION_TEMPLATES['PROPOSAL_REFUSAL_NOTIFICATION']
elif template == "purchase_notification":
output = CFG_BIBCIRCULATION_TEMPLATES['PURCHASE_NOTIFICATION']
elif template == "purchase_received_tid":
output = CFG_BIBCIRCULATION_TEMPLATES['PURCHASE_RECEIVED_TID']
elif template == "purchase_received_cash":
output = CFG_BIBCIRCULATION_TEMPLATES['PURCHASE_RECEIVED_CASH']
else:
output = CFG_BIBCIRCULATION_TEMPLATES['EMPTY']
return output
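The if/elif chain in load_template() is a fixed name-to-key mapping; the same dispatch can be written as a dictionary lookup with 'EMPTY' as the default. The keys below are a subset mirroring the function, with the template bodies left out:

```python
TEMPLATE_KEYS = {
    'overdue_letter': 'OVERDUE',
    'reminder': 'REMINDER',
    'notification': 'NOTIFICATION',
    'ill_received': 'ILL_RECEIVED',
    'claim_return': 'SEND_RECALL',
}

def template_key(name):
    # unknown names fall back to the EMPTY template, like the else branch
    return TEMPLATE_KEYS.get(name, 'EMPTY')

print(template_key('reminder'))  # REMINDER
print(template_key('unknown'))   # EMPTY
```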
def index(req, ln=CFG_SITE_LANG):
"""
main function to show pages for bibcirculationadmin
"""
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_index(ln=ln)
return page(title=_("BibCirculation Admin"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
###
### Loans, Loan Requests, Loan Returns related templates.
###
def loan_on_desk_step1(req, key, string, ln=CFG_SITE_LANG):
"""
Step 1/4 of loan procedure.
Search a user/borrower and return a list with all the possible results.
@type key: string.
@param key: attribute that will be considered during the search. Can be 'name',
'email' or 'ccid/id'.
@type string: string.
@param string: keyword used during the search.
@return: list of potential borrowers.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
infos = []
_ = gettext_set_language(ln)
if key and not string:
infos.append(_('Empty string. Please, try again.'))
body = bc_templates.tmpl_loan_on_desk_step1(result=None, key=key,
string=string, infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Loan on desk"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
result = search_user(key, string)
borrowers_list = []
if len(result) == 0 and key:
if CFG_CERN_SITE:
infos.append(_("0 borrowers found.") + ' ' +_("Search by CCID."))
else:
new_borrower_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
infos.append(message)
elif len(result) == 1:
return loan_on_desk_step2(req, result[0][0], ln)
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
body = bc_templates.tmpl_loan_on_desk_step1(result=borrowers_list,
key=key,
string=string,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Circulation management"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def loan_on_desk_step2(req, user_id, ln=CFG_SITE_LANG):
"""
Step 2/4 of loan procedure.
Display the user/borrower's information.
@type user_id: integer
@param user_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
infos = []
body = bc_templates.tmpl_loan_on_desk_step2(user_id=user_id,
infos=infos,
ln=ln)
return page(title=_("Circulation management"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def loan_on_desk_step3(req, user_id, list_of_barcodes, ln=CFG_SITE_LANG):
"""
Step 3/4 of loan procedure.
Checks that the barcodes exist and that there are no requests on these records.
Lets the librarian change the due dates and add notes.
@type user_id: integer
@param user_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type list_of_barcodes: list
@param list_of_barcodes: list of strings with the barcodes
introduced by the librarian with the barcode reader
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
infos = []
list_of_books = []
# to avoid duplicates
aux = []
for bc in list_of_barcodes:
if bc not in aux:
aux.append(bc)
list_of_barcodes = aux
for value in list_of_barcodes:
recid = db.get_id_bibrec(value)
loan_id = db.is_item_on_loan(value)
item_description = db.get_item_description(value)
if recid is None:
infos.append(_('%(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s Unknown barcode.') % {'x_barcode': value, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'} + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_loan_on_desk_step2(user_id=user_id,
infos=infos,
ln=ln)
elif loan_id:
infos.append(_('The item with barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is already on loan. It cannot be checked out.') % {'x_barcode': value, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_loan_on_desk_step2(user_id=user_id,
infos=infos,
ln=ln)
elif user_id is None:
infos.append(_('You must select one borrower.'))
body = bc_templates.tmpl_loan_on_desk_step1(result=None,
key='',
string='',
infos=infos,
ln=ln)
else:
queue = db.get_queue_request(recid, item_description)
(library_id, location) = db.get_lib_location(value)
tup = (recid, value, library_id, location)
list_of_books.append(tup)
book_details = db.get_item_info(value)
item_status = book_details[7]
if item_status != CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF:
message = _("%(x_strong_tag_open)sWARNING:%(x_strong_tag_close)s Note that item %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s status is %(x_strong_tag_open)s%(x_status)s%(x_strong_tag_close)s") % {'x_barcode': value, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>', 'x_status': item_status}
infos.append(message)
if CFG_CERN_SITE:
library_type = db.get_library_type(library_id)
if library_type != CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN:
library_name = db.get_library_name(library_id)
message = _("%(x_strong_tag_open)sWARNING:%(x_strong_tag_close)s Note that item %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s location is %(x_strong_tag_open)s%(x_location)s%(x_strong_tag_close)s") % {'x_barcode': value, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>', 'x_location': library_name}
infos.append(message)
if len(queue) != 0 and queue[0][0] != user_id:
message = _("Another user is waiting for the book: %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s. \n\n If you want to continue with this loan, choose %(x_strong_tag_open)s[Continue]%(x_strong_tag_close)s.") % {'x_title': book_title_from_MARC(recid), 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
infos.append(message)
body = bc_templates.tmpl_loan_on_desk_step3(user_id=user_id,
list_of_books=list_of_books,
infos=infos, ln=ln)
if list_of_barcodes == []:
infos.append(_('Empty barcode.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_loan_on_desk_step2(user_id=user_id,
infos=infos,
ln=ln)
if infos == []:
# shortcut to simplify loan process
due_dates = []
for bc in list_of_barcodes:
due_dates.append(renew_loan_for_X_days(bc))
return loan_on_desk_step4(req, list_of_barcodes, user_id,
due_dates, None, ln)
else:
return page(title=_("Circulation management"),
uid=id_user,
req=req,
body=body,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
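The barcode de-duplication loop at the start of loan_on_desk_step3() keeps the first occurrence of each barcode in scan order; extracted as a small helper (the sample barcodes are made up):

```python
def dedup_preserving_order(barcodes):
    """Drop duplicate barcodes while keeping the original scan order."""
    seen = []
    for bc in barcodes:
        if bc not in seen:
            seen.append(bc)
    return seen

print(dedup_preserving_order(['A1', 'B2', 'A1', 'C3']))  # ['A1', 'B2', 'C3']
```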
def loan_on_desk_step4(req, list_of_barcodes, user_id,
due_date, note, ln=CFG_SITE_LANG):
"""
Step 4/4 of loan procedure.
Checks that the items are not on loan and that the
dates are correctly formatted, then creates the loans.
@type user_id: integer
@param user_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type list_of_barcodes: list
@param list_of_barcodes: list of strings with the barcodes
introduced by the librarian with the barcode reader
@type due_date: list.
@param due_date: list of due dates.
@type note: string.
@param note: note about the new loan.
@return: page with the list 'Last Loans'
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
infos = []
#loaned_on = datetime.date.today()
#Check if one of the given items is on loan.
on_loan = []
for barcode in list_of_barcodes:
is_on_loan = db.is_item_on_loan(barcode)
if is_on_loan:
on_loan.append(barcode)
if len(on_loan) != 0:
message = _("The items with barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s are already on loan.") % {'x_barcode': on_loan, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
infos.append(message)
body = bc_templates.tmpl_loan_on_desk_step1(result=None, key='',
string='', infos=infos,
ln=ln)
return page(title=_("Loan on desk"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
# validate the period of interest given by the admin
for date in due_date:
if validate_date_format(date) is False:
infos = []
message = _("The given due date %(x_strong_tag_open)s%(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': date, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
infos.append(message)
list_of_books = []
for bc in list_of_barcodes:
recid = db.get_id_bibrec(bc)
(library_id, location) = db.get_lib_location(bc)
tup = (recid, bc, library_id, location)
list_of_books.append(tup)
body = bc_templates.tmpl_loan_on_desk_step3(user_id=user_id,
list_of_books=list_of_books,
infos=infos, ln=ln)
return page(title=_("Circulation management"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
#if borrower_id == None:
# db.new_borrower(ccid, name, email, phone, address, mailbox, '')
# borrower_id = db.get_borrower_id_by_email(email)
for i in range(len(list_of_barcodes)):
note_format = {}
if note:
note_format[time.strftime("%Y-%m-%d %H:%M:%S")] = str(note)
barcode = list_of_barcodes[i]
recid = db.get_id_bibrec(barcode)
db.new_loan(user_id, recid, barcode, due_date[i],
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
'normal', note_format)
# Duplicate requests on items belonging to a single record have been disabled.
db.tag_requests_as_done(user_id, barcode)
# tag_all_requests_as_done(barcode, user_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
update_requests_statuses(barcode)
infos.append(_("A loan for the item %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s, with barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s, has been registered with success.") % {'x_title': book_title_from_MARC(recid), 'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
infos.append(_("You could enter the barcode for this user's next loan, if any."))
body = bc_templates.tmpl_loan_on_desk_step2(user_id=user_id,
infos=infos, ln=ln)
return page(title=_("Circulation management"),
uid=id_user,
req=req,
body=body,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def loan_on_desk_confirm(req, barcode=None, borrower_id=None, ln=CFG_SITE_LANG):
"""
*** Obsolete and unmaintained function ***
Confirm the return of an item.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
result = db.loan_on_desk_confirm(barcode, borrower_id)
body = bc_templates.tmpl_loan_on_desk_confirm(result=result,
barcode=barcode,
borrower_id=borrower_id,
ln=ln)
return page(title=_("Loan on desk confirm"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_new_loan(req, barcode, borrower_id,
request_id, new_note, print_data, ln=CFG_SITE_LANG):
"""
Register a new loan. This function is from the "Create Loan" pages.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@type new_note: string.
@param new_note: associate a note to this loan.
@type print_data: string.
@param print_data: print the information about this loan.
@return: new loan
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
has_recid = db.get_id_bibrec(barcode)
loan_id = db.is_item_on_loan(barcode)
recid = db.get_request_recid(request_id)
req_barcode = db.get_requested_barcode(request_id)
req_description = db.get_item_description(req_barcode)
# Get all the items belonging to the record whose
# description is the same.
list_of_barcodes = db.get_barcodes(recid, req_description)
infos = []
if print_data == 'true':
return print_new_loan_information(req, ln)
else:
if has_recid is None:
message = _('%(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s Unknown barcode.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'} + ' ' + _('Please, try again.')
infos.append(message)
borrower = db.get_borrower_details(borrower_id)
title = _("Create Loan")
body = bc_templates.tmpl_create_loan(request_id=request_id,
recid=recid,
borrower=borrower,
infos=infos,
ln=ln)
elif loan_id:
infos.append(_('The item with the barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is on loan.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
borrower = db.get_borrower_details(borrower_id)
title = _("Create Loan")
body = bc_templates.tmpl_create_loan(request_id=request_id,
recid=recid,
borrower=borrower,
infos=infos,
ln=ln)
elif barcode not in list_of_barcodes:
infos.append(_('The given barcode "%(x_barcode)s" does not correspond to the requested item.') % {'x_barcode': barcode})
borrower = db.get_borrower_details(borrower_id)
title = _("Create Loan")
body = bc_templates.tmpl_create_loan(request_id=request_id,
recid=recid,
borrower=borrower,
infos=infos,
ln=ln)
else:
recid = db.get_id_bibrec(barcode)
#loaned_on = datetime.date.today()
due_date = renew_loan_for_X_days(barcode)
if new_note:
note_format = '[' + time.ctime() + '] ' + new_note + '\n'
else:
note_format = ''
last_id = db.new_loan(borrower_id, recid, barcode,
due_date, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
'normal', note_format)
# register event in webstat
try:
register_customevent("loanrequest", [request_id, last_id])
except Exception:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
tag_all_requests_as_done(barcode, borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_DONE,
request_id)
db.update_request_barcode(barcode, request_id)
update_requests_statuses(barcode)
result = db.get_all_loans(20)
infos.append(_('A new loan has been registered with success.'))
title = _("Current loans")
body = bc_templates.tmpl_all_loans(result=result,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
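The two registration paths above store loan notes in different shapes: the on-desk flow builds a dict keyed by a `"%Y-%m-%d %H:%M:%S"` timestamp, while `register_new_loan` appends a `'[ctime] text\n'` line. A standalone sketch of both conventions (helper names are illustrative, not part of this module):

```python
import time

def note_as_dict(text, when=None):
    # On-desk convention: {"YYYY-mm-dd HH:MM:SS": text}; empty note -> {}
    if not text:
        return {}
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", when or time.localtime())
    return {stamp: str(text)}

def note_as_line(text, when=None):
    # register_new_loan convention: "[ctime] text\n"; empty note -> ''
    if not text:
        return ''
    return '[' + time.ctime(when) + '] ' + text + '\n'
```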
def create_loan(req, request_id, recid, borrower_id, ln=CFG_SITE_LANG):
"""
Create a new loan from a hold request.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
borrower = db.get_borrower_details(borrower_id)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_create_loan(request_id=request_id,
recid=recid,
borrower=borrower,
infos=infos,
ln=ln)
return page(title=_("Create Loan"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def make_new_loan_from_request(req, check_id, barcode, ln=CFG_SITE_LANG):
"""
Turns a request into a loan.
@type check_id: integer.
@param check_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
recid = db.get_request_recid(check_id)
borrower_id = db.get_request_borrower_id(check_id)
borrower_info = db.get_borrower_details(borrower_id)
due_date = renew_loan_for_X_days(barcode)
if db.is_item_on_loan(barcode):
infos.append(_('The item with the barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is on loan.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
return redirect_to_url(req,
'%s/admin2/bibcirculation/all_loans?ln=%s&msg=ok' % (CFG_SITE_SECURE_URL, ln))
else:
db.new_loan(borrower_id, recid, barcode, due_date,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN, 'normal', '')
infos.append(_('A new loan has been registered with success.'))
#try:
# register_customevent("baskets", ["display", "", user_str])
#except:
# register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
tag_all_requests_as_done(barcode, borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
update_requests_statuses(barcode)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_register_new_loan(borrower_info=borrower_info,
infos=infos,
recid=recid,
ln=ln)
return page(title=_("New Loan"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def loan_return(req, ln=CFG_SITE_LANG):
"""
Page where it is possible to register the return of an item.
"""
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
body = bc_templates.tmpl_loan_return(infos=infos, ln=ln)
return page(title=_("Loan return"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def loan_return_confirm(req, barcode, ln=CFG_SITE_LANG):
"""
Performs the return of a loan and displays a confirmation page.
In case the book is requested, it is possible to select a request
and make a loan from it (make_new_loan_from_request)
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
infos = []
_ = gettext_set_language(ln)
recid = db.get_id_bibrec(barcode)
loan_id = db.is_item_on_loan(barcode)
if recid is None:
infos.append(_('%(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s Unknown barcode.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'} + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_loan_return(infos=infos, ln=ln)
elif loan_id is None:
message = _("The item with the barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is not on loan. Please, try again.") % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
infos.append(message)
body = bc_templates.tmpl_loan_return(infos=infos, ln=ln)
else:
library_id = db.get_item_info(barcode)[1]
if CFG_CERN_SITE:
library_type = db.get_library_type(library_id)
if library_type != CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN:
library_name = db.get_library_name(library_id)
message = _("%(x_strong_tag_open)sWARNING:%(x_strong_tag_close)s Note that item %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s location is %(x_strong_tag_open)s%(x_location)s%(x_strong_tag_close)s") % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>', 'x_location': library_name}
infos.append(message)
borrower_id = db.get_borrower_id(barcode)
borrower_name = db.get_borrower_name(borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, barcode)
db.return_loan(barcode)
update_requests_statuses(barcode)
description = db.get_item_description(barcode)
result = db.get_pending_loan_request(recid, description)
body = bc_templates.tmpl_loan_return_confirm(
infos=infos,
borrower_name=borrower_name,
borrower_id=borrower_id,
recid=recid,
barcode=barcode,
return_date=datetime.date.today(),
result=result,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Loan return"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
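`loan_return_confirm` validates in two stages before touching the database: the barcode must resolve to a record, and that record must have an open loan. The guard order can be sketched as a pure function (names are illustrative, not part of this module):

```python
def classify_return(recid, loan_id):
    # Mirrors the guards in loan_return_confirm: an unknown barcode is
    # reported first, then a known item that is not currently on loan;
    # only 'ok' proceeds to return_loan / update_item_status.
    if recid is None:
        return 'unknown-barcode'
    if loan_id is None:
        return 'not-on-loan'
    return 'ok'
```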
def claim_book_return(req, borrower_id, recid, loan_id,
template, ln=CFG_SITE_LANG):
"""
Claim the return of an item.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
recid: identify the record. It is also the primary key of
the table bibrec.
template: letter template.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
email_body = generate_email_body(load_template(template), loan_id)
email = db.get_borrower_email(borrower_id)
subject = book_title_from_MARC(int(recid))
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_borrower_notification(email=email,
subject=subject,
email_body=email_body,
borrower_id=borrower_id,
from_address=CFG_BIBCIRCULATION_LOANS_EMAIL,
ln=ln)
return page(title=_("Claim return"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def change_due_date_step1(req, barcode, borrower_id, ln=CFG_SITE_LANG):
"""
Change the due date of a loan, step1.
barcode: identify the item. It is the primary key of the table
crcITEM.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
loan_id = db.get_current_loan_id(barcode)
loan_details = db.get_loan_infos(loan_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_change_due_date_step1(loan_details=loan_details,
loan_id=loan_id,
borrower_id=borrower_id,
ln=ln)
return page(title=_("Change due date"),
uid=id_user,
req=req,
body=body, language=ln,
#metaheaderadd = '<link rel="stylesheet" '\
# 'href="%s/img/jquery-ui/themes/redmond/ui.theme.css" '\
# 'type="text/css" />' % CFG_SITE_SECURE_URL,
metaheaderadd = '<link rel="stylesheet" href="%s/img/jquery-ui.css" '\
'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def change_due_date_step2(req, new_due_date, loan_id, borrower_id,
ln=CFG_SITE_LANG):
"""
Change the due date of a loan, step2.
new_due_date: new due date.
loan_id: identify a loan. It is the primary key of the table
crcLOAN.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
db.update_due_date(loan_id, new_due_date)
update_status_if_expired(loan_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_change_due_date_step2(new_due_date=new_due_date,
borrower_id=borrower_id,
ln=ln)
return page(title=_("Change due date"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def place_new_request_step1(req, barcode, recid, key, string, ln=CFG_SITE_LANG):
"""
Place a new request from the item's page, step1.
barcode: identify the item. It is the primary key of the table
crcITEM.
recid: identify the record. It is also the primary key of
the table bibrec.
key: search field.
string: search pattern.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
recid = db.get_id_bibrec(barcode)
infos = []
if key and not string:
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_place_new_request_step1(result=None,
key=key,
string=string,
barcode=barcode,
recid=recid,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
result = search_user(key, string)
borrowers_list = []
if len(result) == 0 and key:
if CFG_CERN_SITE:
infos.append(_("0 borrowers found.") + ' ' + _("Search by CCID."))
else:
new_borrower_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
infos.append(message)
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
if len(result) == 1:
return place_new_request_step2(req, barcode, recid,
borrowers_list[0], ln)
else:
body = bc_templates.tmpl_place_new_request_step1(result=borrowers_list,
key=key,
string=string,
barcode=barcode,
recid=recid,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
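`place_new_request_step1` maps each search hit to full borrower data and, when exactly one account matches, skips the selection page and calls step 2 directly. That dispatch can be isolated as follows (names are illustrative; `lookup` stands in for `db.get_borrower_data_by_id`):

```python
def resolve_borrowers(result, lookup):
    # result: search hits, each a tuple whose first element is the user id,
    # as returned by search_user. Returns (all rows, single match or None);
    # a non-None single match lets the caller jump straight to step 2.
    borrowers = [lookup(user[0]) for user in result]
    single = borrowers[0] if len(borrowers) == 1 else None
    return borrowers, single
```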
def place_new_request_step2(req, barcode, recid, user_info, ln=CFG_SITE_LANG):
"""
Place a new request from the item's page, step2.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type user_info: list.
@param user_info: information of the user/borrower who was selected.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
body = bc_templates.tmpl_place_new_request_step2(barcode=barcode,
recid=recid,
user_info=user_info,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def place_new_request_step3(req, barcode, recid, user_info,
period_from, period_to, ln=CFG_SITE_LANG):
"""
Place a new request from the item's page, step3.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@return: new request.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
(_id, ccid, name, email, phone, address, mailbox) = user_info
# validate the period of interest given by the admin
if validate_date_format(period_from) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sFrom: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_from, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_place_new_request_step2(barcode=barcode,
recid=recid,
user_info=user_info,
infos=infos,
ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
elif validate_date_format(period_to) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sTo: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_to, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_place_new_request_step2(barcode=barcode,
recid=recid,
user_info=user_info,
infos=infos,
ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
# Register request
borrower_id = db.get_borrower_id_by_email(email)
if borrower_id is None:
db.new_borrower(ccid, name, email, phone, address, mailbox, '')
borrower_id = db.get_borrower_id_by_email(email)
req_id = db.new_hold_request(borrower_id, recid, barcode,
period_from, period_to,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
pending_request = update_requests_statuses(barcode)
if req_id == pending_request:
(title, year, author,
isbn, publisher) = book_information_from_MARC(int(recid))
details = db.get_loan_request_details(req_id)
if details:
library = details[3]
location = details[4]
request_date = details[7]
else:
location = ''
library = ''
request_date = ''
link_to_holdings_details = CFG_SITE_URL + \
'/record/%s/holdings' % str(recid)
subject = _('New request')
message = load_template('notification')
message = message % (name, ccid, email, address, mailbox, title,
author, publisher, year, isbn, location, library,
link_to_holdings_details, request_date)
send_email(fromaddr = CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr = CFG_BIBCIRCULATION_LOANS_EMAIL,
subject = subject,
content = message,
header = '',
footer = '',
attempt_times=1,
attempt_sleeptime=10
)
send_email(fromaddr = CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr = email,
subject = subject,
content = message,
header = '',
footer = '',
attempt_times=1,
attempt_sleeptime=10
)
body = bc_templates.tmpl_place_new_request_step3(ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
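The registration step above reuses an existing borrower matched by email and creates one only on a miss. A self-contained sketch of that lookup-or-create pattern, with a tiny in-memory stand-in for the `db` module (both illustrative, not part of BibCirculation):

```python
def get_or_create_borrower(db, ccid, name, email, phone, address, mailbox):
    # Same shape as place_new_request_step3: match by email first,
    # register a new borrower only when no account exists yet.
    borrower_id = db.get_borrower_id_by_email(email)
    if borrower_id is None:
        db.new_borrower(ccid, name, email, phone, address, mailbox, '')
        borrower_id = db.get_borrower_id_by_email(email)
    return borrower_id

class FakeDB(object):
    # Minimal in-memory stand-in exposing the two calls used above.
    def __init__(self):
        self.rows = {}
        self.next_id = 1
    def get_borrower_id_by_email(self, email):
        return self.rows.get(email)
    def new_borrower(self, ccid, name, email, phone, address, mailbox, notes):
        self.rows[email] = self.next_id
        self.next_id += 1
```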
def place_new_loan_step1(req, barcode, recid, key, string, ln=CFG_SITE_LANG):
"""
Place a new loan from the item's page, step1.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type key: string.
@param key: search field.
@type string: string.
@param string: search pattern.
@return: list of users/borrowers.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
recid = db.get_id_bibrec(barcode)
infos = []
if key and not string:
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_place_new_loan_step1(result=None,
key=key,
string=string,
barcode=barcode,
recid=recid,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New loan"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
result = search_user(key, string)
borrowers_list = []
if len(result) == 0 and key:
if CFG_CERN_SITE:
infos.append(_("0 borrowers found.") + ' ' + _("Search by CCID."))
else:
new_borrower_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
infos.append(message)
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
body = bc_templates.tmpl_place_new_loan_step1(result=borrowers_list,
key=key,
string=string,
barcode=barcode,
recid=recid,
infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New loan"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def place_new_loan_step2(req, barcode, recid, user_info, ln=CFG_SITE_LANG):
"""
Place a new loan from the item's page, step2.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type user_info: list.
@param user_info: information of the user/borrower who was selected.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_place_new_loan_step2(barcode=barcode,
recid=recid,
user_info=user_info,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("New loan"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def place_new_loan_step3(req, barcode, recid, ccid, name, email, phone,
address, mailbox, due_date, notes, ln=CFG_SITE_LANG):
"""
Place a new loan from the item's page, step3.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type name: string.
@type email: string.
@type phone: string.
@type address: string.
@type mailbox: string.
@type due_date: string.
@type notes: string.
@return: new loan.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if notes:
notes_format = '[' + time.ctime() + '] ' + notes + '\n'
else:
notes_format = ''
#loaned_on = datetime.date.today()
borrower_id = db.get_borrower_id_by_email(email)
borrower_info = db.get_borrower_data(borrower_id)
if db.is_on_loan(barcode):
infos.append(_("Item with barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is already on loan.") % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
copies = db.get_item_copies_details(recid)
requests = db.get_item_requests(recid)
loans = db.get_item_loans(recid)
purchases = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_NEW, recid)
req_hist_overview = db.get_item_requests_historical_overview(recid)
loans_hist_overview = db.get_item_loans_historical_overview(recid)
purchases_hist_overview = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, recid)
title = _("Item details")
body = bc_templates.tmpl_get_item_details(
recid=recid, copies=copies,
requests=requests, loans=loans,
purchases=purchases,
req_hist_overview=req_hist_overview,
loans_hist_overview=loans_hist_overview,
purchases_hist_overview=purchases_hist_overview,
infos=infos, ln=ln)
elif borrower_id != 0:
db.new_loan(borrower_id, recid, barcode,
due_date, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
'normal', notes_format)
tag_all_requests_as_done(barcode, borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
update_requests_statuses(barcode)
title = _("New loan")
body = bc_templates.tmpl_register_new_loan(borrower_info=borrower_info,
infos=infos,
recid=recid, ln=ln)
else:
db.new_borrower(ccid, name, email, phone, address, mailbox, '')
borrower_id = db.get_borrower_id_by_email(email)
db.new_loan(borrower_id, recid, barcode,
due_date, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
'normal', notes_format)
tag_all_requests_as_done(barcode, borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
update_requests_statuses(barcode)
title = _("New loan")
body = bc_templates.tmpl_register_new_loan(borrower_info=borrower_info,
infos=infos,
recid=recid,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_request_step1(req, borrower_id, p="", f="", search=None,
ln=CFG_SITE_LANG):
"""
Create a new request from the borrower's page, step1.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
p: search pattern.
f: field
search: search an item.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if borrower_id is not None:
borrower = db.get_borrower_details(borrower_id)
else:
message = _('Empty borrower ID.')
return borrower_search(req, message, False, ln)
if search and p == '':
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
result = ''
elif search and f == 'barcode':
p = p.strip('\'" \t')
has_recid = db.get_id_bibrec(p)
if has_recid is None:
infos.append(_('The barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s does not exist in the BibCirculation database.') % {'x_barcode': p, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
result = ''
else:
result = has_recid
elif search:
result = perform_request_search(cc="Books", sc="1", p=p, f=f)
else:
result = ''
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if type(result) is types.IntType or type(result) is types.LongType:
recid = result
holdings_information = db.get_holdings_information(recid)
user_info = db.get_borrower_details(borrower_id)
body = bc_templates.tmpl_create_new_request_step2(user_info=user_info,
holdings_information=holdings_information,
recid=recid, ln=ln)
else:
body = bc_templates.tmpl_create_new_request_step1(borrower=borrower,
infos=infos,
result=result,
p=p,
f=f,
ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_request_step2(req, recid, borrower_id, ln=CFG_SITE_LANG):
"""
Create a new request from the borrower's page, step2.
recid: identify the record. It is also the primary key of
the table bibrec.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
holdings_information = db.get_holdings_information(recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
user_info = db.get_borrower_details(borrower_id)
body = bc_templates.tmpl_create_new_request_step2(user_info=user_info,
holdings_information=holdings_information,
recid=recid, ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_request_step3(req, borrower_id, barcode, recid,
ln=CFG_SITE_LANG):
"""
Create a new request from the borrower's page, step3.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
barcode: identify the item. It is the primary key of the table
crcITEM.
recid: identify the record. It is also the primary key of
the table bibrec.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
item_info = db.get_item_info(barcode)
if item_info[6] == 'Reference':
body = bc_templates.tmpl_book_not_for_loan(ln=ln)
else:
body = bc_templates.tmpl_create_new_request_step3(
borrower_id=borrower_id,
barcode=barcode,
recid=recid,
ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_request_step4(req, period_from, period_to, barcode,
borrower_id, recid, ln=CFG_SITE_LANG):
"""
Create a new request from the borrower's page, step4.
period_from: beginning of the period of interest.
period_to: end of the period of interest.
barcode: identify the item. It is the primary key of the table
crcITEM.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
recid: identify the record. It is also the primary key of
the table bibrec.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
db.new_hold_request(borrower_id, recid, barcode,
period_from, period_to,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
update_requests_statuses(barcode)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_create_new_request_step4(ln=ln)
return page(title=_("New request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_loan_step1(req, borrower_id, ln=CFG_SITE_LANG):
"""
Create a new loan from the borrower's page, step1.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
borrower = db.get_borrower_details(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_create_new_loan_step1(borrower=borrower,
infos=infos,
ln=ln)
return page(title=_("New loan"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def create_new_loan_step2(req, borrower_id, barcode, notes, ln=CFG_SITE_LANG):
"""
Create a new loan from the borrower's page, step2.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
barcode: identify the item. It is the primary key of the table
crcITEM.
notes: notes about the new loan.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
#borrower_info = db.get_borrower_data(borrower_id)
has_recid = db.get_id_bibrec(barcode)
loan_id = db.is_item_on_loan(barcode)
if notes:
notes_format = '[' + time.ctime() + '] ' + notes + '\n'
else:
notes_format = ''
infos = []
if has_recid is None:
infos.append(_('%(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s Unknown barcode.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'} + ' ' + _('Please, try again.'))
borrower = db.get_borrower_details(borrower_id)
title = _("New loan")
body = bc_templates.tmpl_create_new_loan_step1(borrower=borrower,
infos=infos,
ln=ln)
elif loan_id:
infos.append(_('The item with the barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s is on loan.') % {'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
borrower = db.get_borrower_details(borrower_id)
title = _("New loan")
body = bc_templates.tmpl_create_new_loan_step1(borrower=borrower,
infos=infos,
ln=ln)
else:
#loaned_on = datetime.date.today()
due_date = renew_loan_for_X_days(barcode)
db.new_loan(borrower_id, has_recid, barcode,
due_date, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
'normal', notes_format)
tag_all_requests_as_done(barcode, borrower_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
update_requests_statuses(barcode)
result = db.get_all_loans(20)
title = _("Current loans")
infos.append(_('A new loan has been registered successfully.'))
body = bc_templates.tmpl_all_loans(result=result, infos=infos, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=title,
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
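The handler above builds the loan note by prefixing it with a `ctime` timestamp and appending a newline. A minimal standalone sketch of that formatting (the helper name `format_loan_note` is hypothetical, not part of this module):

```python
import time

def format_loan_note(notes):
    # Mirror the note formatting used in create_new_loan_step2:
    # '[<ctime stamp>] <note>\n'; empty input yields an empty string.
    if not notes:
        return ''
    return '[' + time.ctime() + '] ' + notes + '\n'
```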
def all_requests(req, request_id, ln=CFG_SITE_LANG):
"""
Display all requests.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_id:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
request_id)
result = db.get_all_requests()
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_all_requests(result=result, ln=ln)
return page(title=_("List of hold requests"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def all_loans(req, msg=None, ln=CFG_SITE_LANG):
"""
Display all current loans.
@type msg: string.
@param msg: show a confirmation message when set to 'ok'.
@return: list with all loans (current loans).
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if msg == 'ok':
infos.append(_('A new loan has been registered successfully.'))
result = db.get_all_loans(20)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_all_loans(result=result, infos=infos, ln=ln)
return page(title=_("Current loans"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def all_expired_loans(req, ln=CFG_SITE_LANG):
"""
Display all expired (overdue) loans.
@return: list with all expired loans (overdue loans).
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
result = db.get_all_expired_loans()
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_all_expired_loans(result=result,
infos=infos,
ln=ln)
return page(title=_('Overdue loans'),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_pending_requests(req, request_id, print_data, ln=CFG_SITE_LANG):
"""
Return all loan requests that are pending. If request_id is not None,
cancel that request first and then return all pending loan requests.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@type print_data: string.
@param print_data: print requests information.
@return: list of pending requests (on shelf with hold).
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if print_data == 'true':
return print_pending_hold_requests_information(req, ln)
if request_id:
# Cancel the request before listing.
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
request_id)
barcode = db.get_request_barcode(request_id)
update_requests_statuses(barcode)
result = db.get_loan_request_by_status(CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_pending_requests(result=result, ln=ln)
return page(title=_("Items on shelf with holds"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_waiting_requests(req, request_id, print_data, ln=CFG_SITE_LANG):
"""
Get all loans requests that are waiting.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@type print_data: string.
@param print_data: print requests information.
@return: list of waiting requests (on loan with hold).
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if print_data == 'true':
return print_pending_hold_requests_information(req, ln)
elif request_id:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
request_id)
result = db.get_loan_request_by_status(CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
aux = ()
for request in result:
if db.get_nb_copies_on_loan(request[1]):
aux += request,
result = aux
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_waiting_requests(result=result, ln=ln)
return page(title=_("Items on loan with holds"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
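`get_waiting_requests` keeps only the waiting requests whose record (field 1 of each row) has at least one copy currently on loan, accumulating matches into a tuple with `aux += request,`. The same filter can be written as a tuple comprehension; this sketch uses a hypothetical `nb_copies_on_loan` callable standing in for `db.get_nb_copies_on_loan`:

```python
def filter_waiting_requests(requests, nb_copies_on_loan):
    # Keep requests whose record id (index 1) has copies on loan;
    # nb_copies_on_loan is a stand-in for db.get_nb_copies_on_loan.
    return tuple(r for r in requests if nb_copies_on_loan(r[1]))
```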
def get_expired_loans_with_waiting_requests(req, request_id, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_id:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
request_id)
result = db.get_expired_loans_with_waiting_requests()
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_get_expired_loans_with_waiting_requests(result=result,
ln=ln)
return page(title=_("Overdue loans with holds"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_loans_notes(req, loan_id, delete_key,
library_notes, back, ln=CFG_SITE_LANG):
"""
Get loan's note(s).
@type loan_id: integer.
@param loan_id: identify a loan. It is the primary key of the table
crcLOAN.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and loan_id:
if looks_like_dictionary(db.get_loan_notes(loan_id)):
loans_notes = eval(db.get_loan_notes(loan_id))
if delete_key in loans_notes.keys():
del loans_notes[delete_key]
db.update_loan_notes(loan_id, loans_notes)
elif library_notes:
if db.get_loan_notes(loan_id):
if looks_like_dictionary(db.get_loan_notes(loan_id)):
loans_notes = eval(db.get_loan_notes(loan_id))
else:
loans_notes = {}
else:
loans_notes = {}
note_time = time.strftime("%Y-%m-%d %H:%M:%S")
if note_time not in loans_notes.keys():
loans_notes[note_time] = str(library_notes)
db.update_loan_notes(loan_id, loans_notes)
loans_notes = db.get_loan_notes(loan_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
referer = req.headers_in.get('referer')
body = bc_templates.tmpl_get_loans_notes(loans_notes=loans_notes,
loan_id=loan_id,
referer=referer, back=back,
ln=ln)
return page(title=_("Loan notes"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
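`get_loans_notes` round-trips the notes dictionary through `str()` and `eval()`, guarded only by `looks_like_dictionary`. Evaluating stored text with `eval` is risky; `ast.literal_eval` accepts only Python literals and rejects anything else. A minimal sketch of a safer parser (the helper name `parse_loan_notes` is hypothetical):

```python
import ast

def parse_loan_notes(raw):
    # Parse a stored notes dictionary without eval(); return {} for
    # empty input, non-literal text, or a literal that is not a dict.
    if not raw:
        return {}
    try:
        notes = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return {}
    return notes if isinstance(notes, dict) else {}
```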
def get_item_loans_notes(req, loan_id, add_notes, new_note, ln=CFG_SITE_LANG):
"""
Get an item loan's notes.
@param loan_id: identify a loan. It is the primary key of the table
crcLOAN.
@param add_notes: display the textarea where a new note will be written.
@param new_note: note that will be added to the other library notes.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if new_note:
date = '[' + time.ctime() + '] '
new_line = '\n'
new_note = date + new_note + new_line
db.add_new_loan_note(new_note, loan_id)
loans_notes = db.get_loan_notes(loan_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_loans_notes(loans_notes=loans_notes,
loan_id=loan_id,
add_notes=add_notes,
ln=ln)
return page(title=_("Loan notes"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
###
### Items and their copies related functions.
###
def get_item_details(req, recid, ln=CFG_SITE_LANG):
"""
Display the details of an item.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@return: item details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if recid is None:
infos.append(_("Record id not valid"))
copies = db.get_item_copies_details(recid)
requests = db.get_item_requests(recid)
loans = db.get_item_loans(recid)
purchases = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_NEW, recid)
req_hist_overview = db.get_item_requests_historical_overview(recid)
loans_hist_overview = db.get_item_loans_historical_overview(recid)
purchases_hist_overview = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_item_details(recid=recid,
copies=copies,
requests=requests,
loans=loans,
purchases=purchases,
req_hist_overview=req_hist_overview,
loans_hist_overview=loans_hist_overview,
purchases_hist_overview=purchases_hist_overview,
infos=infos,
ln=ln)
return page(title=_("Item details"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_item_requests_details(req, recid, request_id, ln=CFG_SITE_LANG):
"""
Display all requests for a specific item.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type request_id: integer.
@param request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
@return: Item requests details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_id:
db.cancel_request(request_id)
update_request_data(request_id)
result = db.get_item_requests(recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_item_requests_details(result=result,
ln=ln)
return page(title=_("Hold requests") + \
" - %s" % (book_title_from_MARC(recid)),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_item_loans_details(req, recid, barcode, loan_id, force,
ln=CFG_SITE_LANG):
"""
Show all the details of the current loans related to a record.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type loan_id: integer.
@param loan_id: identify a loan. It is the primary key of the table
crcLOAN.
@type force: string.
@param force: force the renewal of a loan when it would normally not be allowed.
@return: item loans details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if loan_id and barcode and force == 'true':
new_due_date = renew_loan_for_X_days(barcode)
#db.update_due_date(loan_id, new_due_date)
db.renew_loan(loan_id, new_due_date)
update_status_if_expired(loan_id)
infos.append(_("Loan renewed successfully."))
elif barcode:
recid = db.get_id_bibrec(barcode)
item_description = db.get_item_description(barcode)
queue = db.get_queue_request(recid, item_description)
new_due_date = renew_loan_for_X_days(barcode)
force_renew_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/get_item_loans_details',
{'barcode': barcode, 'loan_id': loan_id, 'force': 'true',
'recid': recid, 'ln': ln}, (_("Yes")))
no_renew_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/get_item_loans_details',
{'recid': recid, 'ln': ln},
(_("No")))
if len(queue) != 0:
title = book_title_from_MARC(recid)
message = _("Another user is waiting for this book %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s.") % {'x_title': title, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
message += '\n\n'
message += _("Do you want to renew this loan anyway?")
message += '\n\n'
message += "[%s] [%s]" % (force_renew_link, no_renew_link)
infos.append(message)
else:
db.renew_loan(loan_id, new_due_date)
#db.update_due_date(loan_id, new_due_date)
update_status_if_expired(loan_id)
infos.append(_("Loan renewed successfully."))
result = db.get_item_loans(recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_item_loans_details(result=result,
recid=recid,
infos=infos,
ln=ln)
return page(title=_("Loans details") + \
" - %s" % (book_title_from_MARC(int(recid))),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_item_req_historical_overview(req, recid, ln=CFG_SITE_LANG):
"""
Display the requests historical overview of an item.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@return: Item requests - historical overview.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
req_hist_overview = db.get_item_requests_historical_overview(recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_item_req_historical_overview(
req_hist_overview=req_hist_overview,
ln=ln)
return page(title=_("Requests") + " - " + _("historical overview"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_item_loans_historical_overview(req, recid, ln=CFG_SITE_LANG):
"""
Display the loans historical overview of an item.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@return: Item loans - historical overview.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
loans_hist_overview = db.get_item_loans_historical_overview(recid)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_get_item_loans_historical_overview(
loans_hist_overview=loans_hist_overview,
ln=ln)
return page(title=_("Loans") + " - " + _("historical overview"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_copy_step1(req, ln=CFG_SITE_LANG):
"""
Add a new copy.
"""
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_add_new_copy_step1(ln)
return page(title=_("Add new copy") + " - I",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_copy_step2(req, p, f, ln=CFG_SITE_LANG):
"""
Add a new copy.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
result = perform_request_search(cc="Books", sc="1", p=p, f=f)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_add_new_copy_step2(result=result, ln=ln)
return page(title=_("Add new copy") + " - II",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_copy_step3(req, recid, barcode, ln=CFG_SITE_LANG):
"""
Add a new copy.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
result = db.get_item_copies_details(recid)
libraries = db.get_internal_libraries()
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if barcode is not None:
if not db.barcode_in_use(barcode):
barcode = None
tmp_barcode = generate_tmp_barcode()
body = bc_templates.tmpl_add_new_copy_step3(recid=recid,
result=result,
libraries=libraries,
original_copy_barcode=barcode,
tmp_barcode=tmp_barcode,
infos=infos,
ln=ln)
return page(title=_("Add new copy") + " - III",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_copy_step4(req, barcode, library, location, collection, description,
loan_period, status, expected_arrival_date, recid,
ln=CFG_SITE_LANG):
"""
Add a new copy.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
infos = []
result = db.get_item_copies_details(recid)
libraries = db.get_internal_libraries()
if db.barcode_in_use(barcode):
infos.append(_("The given barcode <strong>%s</strong> is already in use.") % barcode)
title = _("Add new copy") + " - III"
body = bc_templates.tmpl_add_new_copy_step3(recid=recid,
result=result,
libraries=libraries,
original_copy_barcode=None,
tmp_barcode=None,
infos=infos,
ln=ln)
elif not barcode:
infos.append(_("The given barcode is empty."))
title = _("Add new copy") + " - III"
body = bc_templates.tmpl_add_new_copy_step3(recid=recid,
result=result,
libraries=libraries,
original_copy_barcode=None,
tmp_barcode=None,
infos=infos,
ln=ln)
elif barcode[:3] == 'tmp' \
and status in [CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF,
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS]:
infos.append(_("The selected status does not accept temporary barcodes."))
title = _("Add new copy") + " - III"
tmp_barcode = generate_tmp_barcode()
body = bc_templates.tmpl_add_new_copy_step3(recid=recid,
result=result,
libraries=libraries,
original_copy_barcode=None,
tmp_barcode=tmp_barcode,
infos=infos,
ln=ln)
else:
library_name = db.get_library_name(library)
tup_infos = (barcode, library, library_name, location, collection,
description, loan_period, status, expected_arrival_date,
recid)
title = _("Add new copy") + " - IV"
body = bc_templates.tmpl_add_new_copy_step4(tup_infos=tup_infos, ln=ln)
return page(title=title,
uid=id_user,
req=req,
body=body,
metaheaderadd='<link rel="stylesheet" href="%s/img/jquery-ui.css" '\
'type="text/css" />' % CFG_SITE_SECURE_URL,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_copy_step5(req, barcode, library, location, collection, description,
loan_period, status, expected_arrival_date, recid,
ln=CFG_SITE_LANG):
"""
Add a new copy.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if not db.barcode_in_use(barcode):
db.add_new_copy(barcode, recid, library, collection, location, description.strip() or '-',
loan_period, status, expected_arrival_date)
update_requests_statuses(barcode)
else:
infos.append(_("The given barcode <strong>%s</strong> is already in use.") % barcode)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_add_new_copy_step5(infos=infos, recid=recid, ln=ln)
return page(title=_("Add new copy") + " - V",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def delete_copy_step1(req, barcode, ln):
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
barcode = barcode.strip('\'" \t')
recid = db.get_id_bibrec(barcode)
if recid:
#recid = recid[0]
infos.append(_("Do you really want to delete this copy of the book?"))
copies = db.get_item_copies_details(recid)
title = _("Delete copy")
body = bc_templates.tmpl_delete_copy_step1(barcode_to_delete=barcode,
recid=recid,
result=copies,
infos=infos,
ln=ln)
else:
message = _("The barcode <strong>%s</strong> was not found") % (barcode)
infos.append(message)
title = _("Item search")
body = bc_templates.tmpl_item_search(infos=infos, ln=ln)
return page(title=title,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def delete_copy_step2(req, barcode, ln):
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
barcode = barcode.strip('\'" \t')
recid = db.get_id_bibrec(barcode)
if recid:
#recid = recid[0]
if db.delete_copy(barcode) == 1:
message = _("The copy with barcode <strong>%s</strong> has been deleted.") % (barcode)
else:
message = _('It was NOT possible to delete the copy with barcode <strong>%s</strong>') % (barcode)
infos.append(message)
copies = db.get_item_copies_details(recid)
requests = db.get_item_requests(recid)
loans = db.get_item_loans(recid)
purchases = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_NEW, recid)
req_hist_overview = db.get_item_requests_historical_overview(recid)
loans_hist_overview = db.get_item_loans_historical_overview(recid)
purchases_hist_overview = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, recid)
title = _("Item details")
body = bc_templates.tmpl_get_item_details(
recid=recid, copies=copies,
requests=requests, loans=loans,
purchases=purchases,
req_hist_overview=req_hist_overview,
loans_hist_overview=loans_hist_overview,
purchases_hist_overview=purchases_hist_overview,
infos=infos, ln=ln)
else:
message = _("The barcode <strong>%s</strong> was not found") % (barcode)
infos.append(message)
title = _("Item search")
body = bc_templates.tmpl_item_search(infos=infos, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=title,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
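Both delete steps normalise the submitted barcode with `barcode.strip('\'" \t')`, removing surrounding single quotes, double quotes, spaces, and tabs before the database lookup. A tiny illustration of that normalisation (the helper name `normalize_barcode` is hypothetical):

```python
def normalize_barcode(barcode):
    # Strip surrounding quotes, spaces, and tabs, as
    # delete_copy_step1/delete_copy_step2 do before the lookup.
    return barcode.strip('\'" \t')
```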
def update_item_info_step1(req, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_update_item_info_step1(ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_item_info_step2(req, p, f, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
result = perform_request_search(cc="Books", sc="1", p=p, f=f)
body = bc_templates.tmpl_update_item_info_step2(result=result, ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_item_info_step3(req, recid, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
result = db.get_item_copies_details(recid)
body = bc_templates.tmpl_update_item_info_step3(recid=recid, result=result,
ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_item_info_step4(req, barcode, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
recid = db.get_id_bibrec(barcode)
result = db.get_item_info(barcode)
libraries = db.get_internal_libraries()
libraries += db.get_hidden_libraries()
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if recid is None:
infos = []
infos.append(_("Barcode <strong>%s</strong> not found") % barcode)
return item_search(req, infos, ln)
body = bc_templates.tmpl_update_item_info_step4(recid=recid,
result=result,
libraries=libraries,
ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_item_info_step5(req, barcode, old_barcode, library, location,
collection, description, loan_period, status,
expected_arrival_date, recid, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
library_name = db.get_library_name(library)
tup_infos = (barcode, old_barcode, library, library_name, location,
collection, description, loan_period, status,
expected_arrival_date, recid)
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_update_item_info_step5(tup_infos=tup_infos, ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_item_info_step6(req, tup_infos, ln=CFG_SITE_LANG):
"""
Update the item's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
# tuple containing information for the update process.
(barcode, old_barcode, library_id, location, collection,
description, loan_period, status, expected_arrival_date, recid) = tup_infos
is_on_loan = db.is_on_loan(old_barcode)
#is_requested = db.is_requested(old_barcode)
# if item on loan and new status is CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF,
# item has to be returned.
if is_on_loan and status == CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF:
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, old_barcode)
db.return_loan(old_barcode)
if not is_on_loan and status == CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN:
status = db.get_copy_details(barcode)[7]
infos.append(_("Item <strong>[%s]</strong> updated, but the <strong>status was not modified</strong>.") % (old_barcode))
# update item information.
db.update_item_info(old_barcode, library_id, collection, location, description.strip(),
loan_period, status, expected_arrival_date)
update_requests_statuses(old_barcode)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if barcode != old_barcode:
if db.barcode_in_use(barcode):
infos.append(_("Item <strong>[%s]</strong> updated, but the <strong>barcode was not modified</strong> because it is already in use.") % (old_barcode))
else:
if db.update_barcode(old_barcode, barcode):
infos.append(_("Item <strong>[%s]</strong> successfully updated to <strong>[%s]</strong>.") % (old_barcode, barcode))
else:
infos.append(_("Item <strong>[%s]</strong> updated, but the <strong>barcode was not modified</strong> because it was not found (!?).") % (old_barcode))
copies = db.get_item_copies_details(recid)
requests = db.get_item_requests(recid)
loans = db.get_item_loans(recid)
purchases = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_NEW, recid)
req_hist_overview = db.get_item_requests_historical_overview(recid)
loans_hist_overview = db.get_item_loans_historical_overview(recid)
purchases_hist_overview = db.get_item_purchases(CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, recid)
body = bc_templates.tmpl_get_item_details(recid=recid,
copies=copies,
requests=requests,
loans=loans,
purchases=purchases,
req_hist_overview=req_hist_overview,
loans_hist_overview=loans_hist_overview,
purchases_hist_overview=purchases_hist_overview,
infos=infos,
ln=ln)
return page(title=_("Update item information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
else:
return redirect_to_url(req, CFG_SITE_SECURE_URL +
"/record/edit/#state=edit&recid=" + str(recid))
def item_search(req, infos=None, ln=CFG_SITE_LANG):
"""
Display a form where it is possible to search for an item.
"""
if infos is None:
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
body = bc_templates.tmpl_item_search(infos=infos, ln=ln)
return page(title=_("Item search"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def item_search_result(req, p, f, ln=CFG_SITE_LANG):
"""
Search for an item and return a list of all possible results. To retrieve
the information desired, we use the method 'perform_request_search' (from
search_engine.py). In the case of BibCirculation, we are just looking for
books (items) inside the collection 'Books'.
@type p: string
@param p: search pattern
@type f: string
@param f: search field
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if p == '':
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
return item_search(req, infos, ln)
if f == 'barcode':
p = p.strip('\'" \t')
recid = db.get_id_bibrec(p)
if recid is None:
infos.append(_('The barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s does not exist in the BibCirculation database.') % {'x_barcode': p, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_item_search(infos=infos, ln=ln)
else:
return get_item_details(req, recid, ln=ln)
elif f == 'recid':
p = p.strip('\'" \t')
recid = p
if not record_exists(recid):
infos.append(_("Requested record does not seem to exist."))
body = bc_templates.tmpl_item_search(infos=infos, ln=ln)
else:
return get_item_details(req, recid, ln=ln)
else:
result = perform_request_search(cc="Books", sc="1", p=p, f=f)
body = bc_templates.tmpl_item_search_result(result=result, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Item search result"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
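In the 'barcode' and 'recid' branches above, the search pattern is normalized by stripping surrounding quote characters, spaces, and tabs before the database lookup. A minimal sketch of that normalization (the helper name is hypothetical, for illustration only):

```python
def normalize_search_pattern(p):
    # Mirrors the handling above: str.strip() with a character set removes
    # any surrounding single quotes, double quotes, spaces, and tabs.
    return p.strip('\'" \t')
```

```python
# e.g. a barcode pasted with quotes around it:
# normalize_search_pattern("'  CM-0001 '") yields "CM-0001"
```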
###
### "Borrower" related templates
###
def get_borrower_details(req, borrower_id, update, ln=CFG_SITE_LANG):
"""
Display the details of a borrower.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if update and CFG_CERN_SITE:
update_user_info_from_ldap(borrower_id)
borrower = db.get_borrower_details(borrower_id)
if borrower is None:
info = _('Borrower not found.') + ' ' + _('Please, try again.')
return borrower_search(req, info, False, ln)
else:
requests = db.get_borrower_request_details(borrower_id)
loans = db.get_borrower_loan_details(borrower_id)
notes = db.get_borrower_notes(borrower_id)
ill = db.get_ill_requests_details(borrower_id)
proposals = db.get_proposal_requests_details(borrower_id)
req_hist = db.bor_requests_historical_overview(borrower_id)
loans_hist = db.bor_loans_historical_overview(borrower_id)
ill_hist = db.bor_ill_historical_overview(borrower_id)
proposal_hist = db.bor_proposal_historical_overview(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_borrower_details(borrower=borrower,
requests=requests,
loans=loans,
notes=notes,
ill=ill,
proposals=proposals,
req_hist=req_hist,
loans_hist=loans_hist,
ill_hist=ill_hist,
proposal_hist=proposal_hist,
ln=ln)
return page(title=_("Borrower details"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_borrower_step1(req, ln=CFG_SITE_LANG):
"""
Add new borrower. Step 1
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_add_new_borrower_step1(ln=ln)
return page(title=_("Add new borrower") + " - I",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_borrower_step2(req, name, email, phone, address, mailbox,
notes, ln=CFG_SITE_LANG):
"""
Add new borrower. Step 2.
@type name: string.
@type email: string.
@type phone: string.
@type address: string.
@type mailbox: string.
@type notes: string.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if name == '':
infos.append(_("Please, insert a name"))
if email == '':
infos.append(_("Please, insert a valid email address"))
else:
borrower_id = db.get_borrower_id_by_email(email)
if borrower_id is not None:
infos.append(_("There is already a borrower using the following email:")
+ " <strong>%s</strong>" % (email))
tup_infos = (name, email, phone, address, mailbox, notes)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if len(infos) > 0:
body = bc_templates.tmpl_add_new_borrower_step1(tup_infos=tup_infos,
infos=infos, ln=ln)
title = _("Add new borrower") + " - I"
else:
if notes != '':
borrower_notes = {}
note_time = time.strftime("%Y-%m-%d %H:%M:%S")
borrower_notes[note_time] = notes
else:
borrower_notes = ''
borrower_id = db.new_borrower(None, name, email, phone,
address, mailbox, borrower_notes)
return redirect_to_url(req,
'%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s' \
% (CFG_SITE_SECURE_URL, ln, borrower_id))
#body = bc_templates.tmpl_add_new_borrower_step2(tup_infos=tup_infos,
# infos=infos, ln=ln)
#title = _("Add new borrower") + " - II"
return page(title=title,
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_borrower_step3(req, tup_infos, ln=CFG_SITE_LANG):
"""
Add new borrower. Step 3.
@type tup_infos: tuple.
@param tup_infos: tuple containing borrower information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if tup_infos[5] != '':
borrower_notes = {}
note_time = time.strftime("%Y-%m-%d %H:%M:%S")
borrower_notes[note_time] = str(tup_infos[5])
else:
borrower_notes = ''
db.new_borrower(None, tup_infos[0], tup_infos[1], tup_infos[2],
tup_infos[3], tup_infos[4], str(borrower_notes))
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_add_new_borrower_step3(ln=ln)
return page(title=_("Add new borrower") + " - III",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
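Both add_new_borrower_step2 and add_new_borrower_step3 build the notes value with the same convention: a dict keyed by a "%Y-%m-%d %H:%M:%S" creation timestamp, with the empty string standing in for "no notes". A sketch of that convention (the helper name is hypothetical):

```python
import time

def build_borrower_notes(note_text):
    # Empty or missing text maps to '', the "no notes" sentinel used above;
    # otherwise a one-entry dict keyed by the current timestamp is built.
    if not note_text:
        return ''
    return {time.strftime("%Y-%m-%d %H:%M:%S"): str(note_text)}
```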
def update_borrower_info_step1(req, borrower_id, ln=CFG_SITE_LANG):
"""
Update the borrower's information.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
borrower_details = db.get_borrower_details(borrower_id)
tup_infos = (borrower_details[0], borrower_details[2], borrower_details[3],
borrower_details[4], borrower_details[5], borrower_details[6])
body = bc_templates.tmpl_update_borrower_info_step1(tup_infos=tup_infos,
ln=ln)
return page(title=_("Update borrower information"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_borrower_info_step2(req, borrower_id, name, email, phone, address,
mailbox, ln=CFG_SITE_LANG):
"""
Update the borrower's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if name == '':
infos.append(_("Please, insert a name"))
if email == '':
infos.append(_("Please, insert a valid email address"))
else:
borrower_email_id = db.get_borrower_id_by_email(email)
if borrower_email_id is not None and borrower_id != borrower_email_id:
infos.append(_("There is already a borrower using the following email:")
+ " <strong>%s</strong>" % (email))
tup_infos = (borrower_id, name, email, phone, address, mailbox)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if len(infos) > 0:
body = bc_templates.tmpl_update_borrower_info_step1(tup_infos=tup_infos,
infos=infos, ln=ln)
else:
db.update_borrower_info(borrower_id, name, email,
phone, address, mailbox)
return redirect_to_url(req,
'%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s' \
% (CFG_SITE_SECURE_URL, ln, borrower_id))
return page(title=_("Update borrower information"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_borrower_requests_details(req, borrower_id, request_id,
ln=CFG_SITE_LANG):
"""
Display the details of a borrower's hold requests.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type request_id: integer.
@param request_id: identify the hold request to be cancelled
@return: borrower requests details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_id:
db.cancel_request(request_id)
update_request_data(request_id)
result = db.get_borrower_request_details(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
name = db.get_borrower_name(borrower_id)
title = _("Hold requests details") + " - %s" % (name)
body = bc_templates.tmpl_borrower_request_details(result=result,
borrower_id=borrower_id,
ln=ln)
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_borrower_loans_details(req, recid, barcode, borrower_id,
renewal, force, loan_id, ln=CFG_SITE_LANG):
"""
Show borrower's loans details.
@type recid: integer.
@param recid: identify the record. It is also the primary key of
the table bibrec.
@type barcode: string.
@param barcode: identify the item. It is the primary key of the table
crcITEM.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type renewal: string.
@param renewal: renew all loans.
@type force: string.
@param force: force the renew of a loan, when usually this is not possible.
@type loan_id: integer.
@param loan_id: identify a loan. It is the primary key of the table
crcLOAN.
@return: borrower loans details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
force_renew_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/get_borrower_loans_details',
{'barcode': barcode, 'borrower_id': borrower_id,
'loan_id': loan_id, 'force': 'true', 'ln': ln},
(_("Yes")))
no_renew_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/get_borrower_loans_details',
{'borrower_id': borrower_id, 'ln': ln},
(_("No")))
if barcode and loan_id and recid:
item_description = db.get_item_description(barcode)
queue = db.get_queue_request(recid, item_description)
new_due_date = renew_loan_for_X_days(barcode)
if len(queue) != 0:
title = book_title_from_MARC(recid)
message = _("Another user is waiting for this book %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s.") % {'x_title': title, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
message += '\n\n'
message += _("Do you want to renew this loan anyway?")
message += '\n\n'
message += "[%s] [%s]" % (force_renew_link, no_renew_link)
infos.append(message)
else:
#db.update_due_date(loan_id, new_due_date)
db.renew_loan(loan_id, new_due_date)
#update_status_if_expired(loan_id)
infos.append(_("Loan renewed successfully."))
elif loan_id and barcode and force == 'true':
new_due_date = renew_loan_for_X_days(barcode)
db.renew_loan(loan_id, new_due_date)
update_status_if_expired(loan_id)
infos.append(_("Loan renewed successfully."))
elif borrower_id and renewal=='true':
list_of_loans = db.get_recid_borrower_loans(borrower_id)
for (loan_id, recid, barcode) in list_of_loans:
item_description = db.get_item_description(barcode)
queue = db.get_queue_request(recid, item_description)
new_due_date = renew_loan_for_X_days(barcode)
force_renewall_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/get_borrower_loans_details',
{'barcode': barcode, 'borrower_id': borrower_id,
'loan_id': loan_id, 'force': 'true', 'ln': ln},
(_("Yes")))
if len(queue) != 0:
title = book_title_from_MARC(recid)
message = _("Another user is waiting for this book %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s.") % {'x_title': title, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
message += '\n\n'
message += _("Do you want to renew this loan anyway?")
message += '\n\n'
message += "[%s] [%s]" % (force_renewall_link, no_renew_link)
infos.append(message)
else:
db.renew_loan(loan_id, new_due_date)
update_status_if_expired(loan_id)
if infos == []:
infos.append(_("All loans renewed successfully."))
borrower_loans = db.get_borrower_loan_details(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_borrower_loans_details(
borrower_loans=borrower_loans,
borrower_id=borrower_id,
infos=infos, ln=ln)
return page(title=_("Loans details") + \
" - %s" %(db.get_borrower_name(borrower_id)),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
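The renewal branches above all apply the same rule: a loan is renewed immediately only when nobody is queued for the record, or when the librarian confirms through the force='true' link. A sketch of that decision (the helper name is hypothetical):

```python
def renewal_allowed(queue_length, force):
    # Renew at once when the request queue for the record is empty, or
    # when the librarian forced the renewal via the 'force' URL argument
    # (passed as the string 'true', as in the handler above).
    return queue_length == 0 or force == 'true'
```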
def bor_loans_historical_overview(req, borrower_id, ln=CFG_SITE_LANG):
"""
Display the loans historical overview of a borrower.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@return: borrower loans - historical overview.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
loans_hist_overview = db.bor_loans_historical_overview(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_bor_loans_historical_overview(
loans_hist_overview = loans_hist_overview,
ln=ln)
return page(title=_("Loans") + " - " + _("historical overview"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def bor_requests_historical_overview(req, borrower_id, ln=CFG_SITE_LANG):
"""
Display the requests historical overview of a borrower.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@return: borrower requests - historical overview.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
req_hist_overview = db.bor_requests_historical_overview(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_bor_requests_historical_overview(
req_hist_overview = req_hist_overview,
ln=ln)
return page(title=_("Requests") + " - " + _("historical overview"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_borrower_ill_details(req, borrower_id, request_type='', ln=CFG_SITE_LANG):
"""
Display ILL details of a borrower.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type request_type: string.
@param request_type: type of the request; 'proposal-book' selects the
borrower's book proposal requests instead of the
ILL requests.
@return: borrower ILL details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_type == 'proposal-book':
result = db.get_proposal_requests_details(borrower_id)
else:
result = db.get_ill_requests_details(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
name = db.get_borrower_name(borrower_id)
title = _("ILL details") + " - %s" % (name)
body = bc_templates.tmpl_borrower_ill_details(result=result,
borrower_id=borrower_id,
ln=ln)
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def bor_ill_historical_overview(req, borrower_id, request_type='', ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if request_type == 'proposal-book':
result = db.bor_proposal_historical_overview(borrower_id)
else:
result = db.bor_ill_historical_overview(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
name = db.get_borrower_name(borrower_id)
title = _("ILL historical overview") + " - %s" % (name)
body = bc_templates.tmpl_borrower_ill_details(result=result,
borrower_id=borrower_id,
ln=ln)
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def borrower_notification(req, borrower_id, template, message, load_msg_template,
subject, send_message, from_address, ln=CFG_SITE_LANG):
"""
Send an email to a borrower or simply load and display an editable email
template.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
@type template: string.
@param template: The name of the notification template to be loaded, or,
when @param load_msg_template is 'False', the email body
itself (no template is loaded in that case).
@type message: string.
@param message: Message to be sent if the flag @param send_message is set.
@type subject: string.
@param subject: Subject of the message.
@type from_address: string.
@param from_address: From address in the message sent.
@return: Display the email template or send an email to a borrower.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
email = db.get_borrower_email(borrower_id)
if load_msg_template == 'False' and template is not None:
# Do not load the template. It is the email body itself.
body = bc_templates.tmpl_borrower_notification(email=email,
subject=subject,
email_body=template,
borrower_id=borrower_id,
from_address=from_address,
ln=ln)
elif send_message:
send_email(fromaddr = from_address,
toaddr = email,
subject = subject,
content = message,
header = '',
footer = '',
attempt_times = 1,
attempt_sleeptime = 10
)
body = bc_templates.tmpl_send_notification(ln=ln)
else:
show_template = load_template(template)
body = bc_templates.tmpl_borrower_notification(email=email,
subject=subject,
email_body=show_template,
borrower_id=borrower_id,
from_address=from_address,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title="User Notification",
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_borrower_notes(req, borrower_id, delete_key, library_notes,
ln=CFG_SITE_LANG):
"""
Retrieve the notes of a borrower.
@type borrower_id: integer.
@param borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and borrower_id:
if looks_like_dictionary(db.get_borrower_notes(borrower_id)):
borrower_notes = eval(db.get_borrower_notes(borrower_id))
if delete_key in borrower_notes.keys():
del borrower_notes[delete_key]
db.update_borrower_notes(borrower_id, borrower_notes)
elif library_notes:
if db.get_borrower_notes(borrower_id):
if looks_like_dictionary(db.get_borrower_notes(borrower_id)):
borrower_notes = eval(db.get_borrower_notes(borrower_id))
else:
borrower_notes = {}
else:
borrower_notes = {}
note_time = time.strftime("%Y-%m-%d %H:%M:%S")
if note_time not in borrower_notes.keys():
borrower_notes[note_time] = str(library_notes)
db.update_borrower_notes(borrower_id, borrower_notes)
borrower_notes = db.get_borrower_notes(borrower_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_borrower_notes(borrower_notes=borrower_notes,
borrower_id=borrower_id,
ln=ln)
return page(title=_("Borrower notes"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
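get_borrower_notes reads the notes column back with eval() after a looks_like_dictionary check, because the notes dict is stored serialized with str(). A stricter parsing sketch using ast.literal_eval, which accepts only plain Python literals and so cannot execute expressions stored in the database (the helper name is hypothetical, for illustration only):

```python
import ast

def parse_borrower_notes(raw_notes):
    # Empty column or non-literal content both map to an empty notes dict;
    # ast.literal_eval raises ValueError/SyntaxError on anything that is
    # not a plain literal, and non-dict literals are discarded as well.
    if not raw_notes:
        return {}
    try:
        notes = ast.literal_eval(raw_notes)
    except (ValueError, SyntaxError):
        return {}
    return notes if isinstance(notes, dict) else {}
```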
def borrower_search(req, empty_barcode, redirect_to_new_request=False,
ln=CFG_SITE_LANG):
"""
Page (for administrators) where it is possible to search
for a borrower (present in the crcBORROWER table) by name,
email, phone or id.
If redirect_to_new_request is False, the returned page will be "Borrower details"
If redirect_to_new_request is True, the returned page will be "New Request"
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if empty_barcode:
infos.append(empty_barcode)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_borrower_search(infos=infos,
redirect_to_new_request=redirect_to_new_request,
ln=ln)
if redirect_to_new_request:
title = _("New Request")
else:
title = _("Borrower Search")
return page(title=title,
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def borrower_search_result(req, column, string, redirect_to_new_request=False,
ln=CFG_SITE_LANG):
"""
Search a borrower and return a list with all the possible results.
@type column: string
@param column: identify the column, of the table crcBORROWER, that will be
considered during the search. Can be 'name', 'email' or 'id'.
@type string: string
@param string: string used for the search process.
If redirect_to_new_request is True, the returned page will be "New Request"
If redirect_to_new_request is False, the returned page will be "Borrower details"
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if string == '':
message = _('Empty string.') + ' ' + _('Please, try again.')
return borrower_search(req, message, redirect_to_new_request, ln)
else:
result = search_user(column, string)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
if len(result) == 1:
if redirect_to_new_request:
return create_new_request_step1(req, result[0][0])
else:
return get_borrower_details(req, result[0][0], False, ln)
#return create_new_request_step1(req, borrower_id, p, f, search, ln)
else:
body = bc_templates.tmpl_borrower_search_result(result=result,
redirect_to_new_request=redirect_to_new_request,
ln=ln)
return page(title=_("Borrower search result"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
###
### ILL/Purchase/Acquisition related functions.
### Naming of the methods is not intuitive. Should be improved
### and appropriate documentation added, when required.
### Also, methods could be refactored.
###
def register_ill_from_proposal(req, ill_request_id, bor_id=None, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
book_info = db.get_ill_book_info(ill_request_id)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
if looks_like_dictionary(book_info):
book_info = eval(book_info)
if not bor_id:
bid = db.get_ill_borrower(ill_request_id)
else:
bid = bor_id
if 'recid' in book_info and bid:
recid = book_info['recid']
if not db.has_loan_request(bid, recid, ill=1):
db.tag_requests_as_done(bid, recid=recid)
library_notes = {}
library_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = \
_("This ILL has been created from a proposal.")
db.register_ill_from_proposal(ill_request_id,
bid, library_notes)
infos.append(_('An ILL has been created for the user.'))
else:
infos.append(_('An active ILL already exists for this user on this record.'))
else:
infos.append(_('Could not create an ILL from the proposal.'))
else:
infos.append(_('Could not create an ILL from the proposal.'))
ill_req = db.get_ill_requests(CFG_BIBCIRCULATION_ILL_STATUS_NEW)
body = bc_templates.tmpl_list_ill(ill_req, infos=infos, ln=ln)
return page(title=_("ILL requests"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
#return redirect_to_url(req,
# '%s/admin2/bibcirculation/list_proposal?status=%s' % \
# (CFG_SITE_SECURE_URL, CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE))
def register_ill_request_with_no_recid_step1(req, borrower_id,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_register_ill_request_with_no_recid_step1(
infos=infos,
borrower_id=borrower_id,
admin=True, ln=ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_request_with_no_recid_step2(req, title, authors, place,
publisher, year, edition, isbn, budget_code,
period_of_interest_from, period_of_interest_to,
additional_comments, only_edition, key, string,
borrower_id, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
book_info = (title, authors, place, publisher, year, edition, isbn)
request_details = (budget_code, period_of_interest_from,
period_of_interest_to, additional_comments, only_edition)
if borrower_id in (None, '', 'None'):
body = None
if not key:
borrowers_list = None
elif not string:
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
borrowers_list = None
else:
if validate_date_format(period_of_interest_from) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sFrom: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_of_interest_from, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_register_ill_request_with_no_recid_step1(
infos=infos,
borrower_id=None,
admin=True,
ln=ln)
elif validate_date_format(period_of_interest_to) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sTo: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_of_interest_to, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_register_ill_request_with_no_recid_step1(
infos=infos,
ln=ln)
else:
result = search_user(key, string)
borrowers_list = []
if len(result) == 0:
infos.append(_("0 borrowers found."))
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
if body is None:
body = bc_templates.tmpl_register_ill_request_with_no_recid_step2(
book_info=book_info, request_details=request_details,
result=borrowers_list, key=key, string=string,
infos=infos, ln=ln)
else:
user_info = db.get_borrower_data_by_id(borrower_id)
return register_ill_request_with_no_recid_step3(req, title, authors,
place, publisher,year, edition,
isbn, user_info, budget_code,
period_of_interest_from,
period_of_interest_to,
additional_comments, only_edition,
ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_request_with_no_recid_step3(req, title, authors, place,
publisher, year, edition, isbn,
user_info, budget_code,
period_of_interest_from,
period_of_interest_to,
additional_comments,
only_edition, ln=CFG_SITE_LANG):
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
request_details = (budget_code, period_of_interest_from,
period_of_interest_to, additional_comments, only_edition)
book_info = (title, authors, place, publisher, year, edition, isbn)
if user_info is None:
return register_ill_request_with_no_recid_step2(req, title, authors,
place, publisher, year, edition, isbn, budget_code,
period_of_interest_from, period_of_interest_to,
additional_comments, only_edition, 'name', None,
None, ln)
else:
body = bc_templates.tmpl_register_ill_request_with_no_recid_step3(
book_info=book_info,
user_info=user_info,
request_details=request_details,
admin=True,
ln=ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_request_with_no_recid_step4(req, book_info, borrower_id,
request_details, ln):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
_ = gettext_set_language(ln)
(title, authors, place, publisher, year, edition, isbn) = book_info
#create_ill_record(book_info))
(budget_code, period_of_interest_from,
period_of_interest_to, library_notes, only_edition) = request_details
ill_request_notes = {}
if library_notes:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = \
str(library_notes)
### budget_code ###
if db.get_borrower_data_by_id(borrower_id) is None:
_ = gettext_set_language(ln)
infos = []
infos.append(_("<strong>Request not registered:</strong> wrong borrower id"))
body = bc_templates.tmpl_register_ill_request_with_no_recid_step2(
book_info=book_info,
request_details=request_details, result=[],
key='name', string=None, infos=infos, ln=ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
else:
book_info = {'title': title, 'authors': authors, 'place': place,
'publisher': publisher, 'year': year, 'edition': edition,
                     'isbn': isbn}
db.ill_register_request_on_desk(borrower_id, book_info,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
str(ill_request_notes),
only_edition, 'book', budget_code)
return list_ill_request(req, CFG_BIBCIRCULATION_ILL_STATUS_NEW, ln)
def register_ill_book_request(req, borrower_id, ln=CFG_SITE_LANG):
"""
    Display a form where it is possible to search for an item.
    """
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
_ = gettext_set_language(ln)
infos = []
body = bc_templates.tmpl_register_ill_book_request(infos=infos,
borrower_id=borrower_id,
ln=ln)
return page(title=_("Register ILL Book request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_book_request_result(req, borrower_id, p, f, ln=CFG_SITE_LANG):
"""
Search for an item and return a list of all possible results. To retrieve
    the desired information, we use the method 'perform_request_search' (from
    search_engine.py). In the case of BibCirculation, we only look for
    books (items) inside the collection 'Books'.
@type p: string
@param p: search pattern
@type f: string
@param f: search field
@return: list of recids
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if p == '':
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_register_ill_book_request(infos=infos,
borrower_id=borrower_id,
ln=ln)
else:
if f == 'barcode':
p = p.strip('\'" \t')
recid = db.get_id_bibrec(p)
if recid is None:
infos.append(_('The barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s does not exist on BibCirculation database.') % {'x_barcode': p, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_register_ill_book_request(infos=infos,
borrower_id=borrower_id,
ln=ln)
else:
body = bc_templates.tmpl_register_ill_book_request_result(
result=[recid],
borrower_id=borrower_id,
ln=ln)
else:
result = perform_request_search(cc="Books", sc="1", p=p, f=f)
if len(result) == 0:
return register_ill_request_with_no_recid_step1(req,
borrower_id, ln)
else:
body = bc_templates.tmpl_register_ill_book_request_result(
result=result,
borrower_id=borrower_id,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Register ILL Book request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_article_request_step1(req, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">' \
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_register_ill_article_request_step1(infos=infos,
ln=ln)
return page(title=_("Register ILL Article request"),
uid=id_user,
req=req,
body=body,
metaheaderadd='<link rel="stylesheet" ' \
                              'href="%s/img/jquery-ui.css" ' \
                              'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_article_request_step2(req, periodical_title, article_title,
author, report_number, volume, issue,
pages, year, budget_code, issn,
period_of_interest_from,
period_of_interest_to,
additional_comments, key, string,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if key and not string:
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
article_info = (periodical_title, article_title, author, report_number,
volume, issue, pages, year, issn)
request_details = (period_of_interest_from, period_of_interest_to,
budget_code, additional_comments)
body = bc_templates.tmpl_register_ill_article_request_step2(
article_info=article_info,
request_details=request_details,
result=None, key=key,
string=string, infos=infos,
ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
result = search_user(key, string)
borrowers_list = []
if len(result) == 0 and key:
if CFG_CERN_SITE:
infos.append(_("0 borrowers found.") + ' ' + _("Search by CCID."))
else:
new_borrower_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
infos.append(message)
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
if validate_date_format(period_of_interest_from) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sFrom: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_of_interest_from, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_register_ill_article_request_step1(infos=infos,
ln=ln)
elif validate_date_format(period_of_interest_to) is False:
infos = []
infos.append(_("The period of interest %(x_strong_tag_open)sTo: %(x_date)s%(x_strong_tag_close)s is not a valid date or date format") % {'x_date': period_of_interest_to, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'})
body = bc_templates.tmpl_register_ill_article_request_step1(infos=infos,
ln=ln)
else:
article_info = (periodical_title, article_title, author, report_number,
volume, issue, pages, year, issn)
request_details = (period_of_interest_from, period_of_interest_to,
budget_code, additional_comments)
body = bc_templates.tmpl_register_ill_article_request_step2(
article_info=article_info,
request_details=request_details,
result=borrowers_list,
key=key, string=string,
infos=infos, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_ill_article_request_step3(req, periodical_title, title, authors,
report_number, volume, issue,
page_number, year, issn, user_info,
request_details, ln=CFG_SITE_LANG):
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
#info = (title, authors, "", "", year, "", issn)
#create_ill_record(info)
item_info = {'periodical_title': periodical_title, 'title': title,
'authors': authors, 'place': "", 'publisher': "",
'year' : year, 'edition': "", 'issn' : issn,
'volume': volume, 'issue': issue, 'page': page_number }
(period_of_interest_from, period_of_interest_to, budget_code,
library_notes) = request_details
only_edition = ""
if user_info is None:
return register_ill_article_request_step2(req, periodical_title, title,
authors, report_number, volume, issue,
page_number, year, budget_code, issn,
period_of_interest_from,
period_of_interest_to,
library_notes, 'name', None, ln)
else:
borrower_id = user_info[0]
ill_request_notes = {}
if library_notes:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] \
= str(library_notes)
db.ill_register_request_on_desk(borrower_id, item_info,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
str(ill_request_notes),
only_edition, 'article', budget_code)
return list_ill_request(req, CFG_BIBCIRCULATION_ILL_STATUS_NEW, ln)
def register_purchase_request_step1(req, request_type, recid, title, authors,
place, publisher, year, edition, this_edition_only,
isbn, standard_number,
budget_code, cash, period_of_interest_from,
period_of_interest_to, additional_comments,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
if recid:
fields = (request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments)
else:
fields = (request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code,
cash, period_of_interest_from, period_of_interest_to,
additional_comments)
body = bc_templates.tmpl_register_purchase_request_step1(infos=infos,
fields=fields, admin=True, ln=ln)
return page(title=_("Register purchase request"),
uid=id_user,
req=req,
body=body,
language=ln,
metaheaderadd='<link rel="stylesheet" ' \
'href="%s/img/jquery-ui.css" ' \
'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_purchase_request_step2(req, request_type, recid, title, authors,
place, publisher, year, edition, this_edition_only,
isbn, standard_number,
budget_code, cash, period_of_interest_from,
period_of_interest_to, additional_comments,
p, f, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
infos = []
if cash and budget_code == '':
budget_code = 'cash'
if recid:
fields = (request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments)
else:
fields = (request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code,
cash, period_of_interest_from, period_of_interest_to,
additional_comments)
if budget_code == '' and not cash:
infos.append(_("Payment method information is mandatory. "
                       "Please, type your budget code or tick "
                       "the 'cash' checkbox."))
body = bc_templates.tmpl_register_purchase_request_step1(infos=infos,
fields=fields, admin=True, ln=ln)
else:
if p and not f:
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_register_purchase_request_step2(
infos=infos, fields=fields,
result=None, p=p, f=f, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Register ILL request"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
result = search_user(f, p)
borrowers_list = []
if len(result) == 0 and f:
if CFG_CERN_SITE:
infos.append(_("0 borrowers found.") + ' ' + _("Search by CCID."))
else:
new_borrower_link = create_html_link(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
infos.append(message)
else:
for user in result:
borrower_data = db.get_borrower_data_by_id(user[0])
borrowers_list.append(borrower_data)
body = bc_templates.tmpl_register_purchase_request_step2(
infos=infos, fields=fields,
result=borrowers_list, p=p,
f=f, ln=ln)
return page(title=_("Register purchase request"),
uid=id_user,
req=req,
body=body,
language=ln,
metaheaderadd='<link rel="stylesheet" ' \
'href="%s/img/jquery-ui.css" ' \
'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def register_purchase_request_step3(req, request_type, recid, title, authors,
place, publisher, year, edition, this_edition_only,
isbn, standard_number,
budget_code, cash, period_of_interest_from,
period_of_interest_to, additional_comments,
borrower_id, ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
infos = []
if recid:
fields = (request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments)
else:
fields = (request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code,
cash, period_of_interest_from, period_of_interest_to,
additional_comments)
if budget_code == '' and not cash:
infos.append(_("Payment method information is mandatory. "
                       "Please, type your budget code or tick "
                       "the 'cash' checkbox."))
body = bc_templates.tmpl_register_purchase_request_step1(infos=infos,
fields=fields, admin=True, ln=ln)
else:
if recid:
# Note: item_info is stored as the string form of a dict here.
            item_info = "{'recid': " + str(recid) + "}"
title = book_title_from_MARC(recid)
else:
item_info = {'title': title, 'authors': authors, 'place': place,
'publisher': publisher, 'year' : year, 'edition': edition,
'isbn' : isbn, 'standard_number': standard_number}
ill_request_notes = {}
if additional_comments:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] \
= str(additional_comments)
if cash and budget_code == '':
budget_code = 'cash'
if borrower_id:
borrower_email = db.get_borrower_email(borrower_id)
else:
borrower_email = db.get_invenio_user_email(id_user)
borrower_id = db.get_borrower_id_by_email(borrower_email)
db.ill_register_request_on_desk(borrower_id, item_info,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ACQ_STATUS_NEW,
str(ill_request_notes),
this_edition_only, request_type, budget_code)
msg_for_user = load_template('purchase_notification') % title
send_email(fromaddr = CFG_BIBCIRCULATION_ILLS_EMAIL,
toaddr = borrower_email,
subject = _("Your book purchase request"),
header = '', footer = '',
content = msg_for_user,
attempt_times=1,
attempt_sleeptime=10
)
return redirect_to_url(req,
'%s/admin2/bibcirculation/list_purchase?ln=%s&status=%s' % \
(CFG_SITE_SECURE_URL, ln,
CFG_BIBCIRCULATION_ACQ_STATUS_NEW))
return page(title=_("Register purchase request"),
uid=id_user,
req=req,
body=body,
language=ln,
metaheaderadd='<link rel="stylesheet" ' \
'href="%s/img/jquery-ui.css" ' \
'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def ill_request_details_step1(req, delete_key, ill_request_id, new_status,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if delete_key and ill_request_id:
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_notes = eval(db.get_ill_request_notes(ill_request_id))
if delete_key in library_notes:
del library_notes[delete_key]
db.update_ill_request_notes(ill_request_id, library_notes)
if new_status:
db.update_ill_request_status(ill_request_id, new_status)
ill_request_borrower_details = \
db.get_ill_request_borrower_details(ill_request_id)
if ill_request_borrower_details is None \
or len(ill_request_borrower_details) == 0:
infos.append(_("Borrower request details not found."))
ill_request_details = db.get_ill_request_details(ill_request_id)
if ill_request_details is None or len(ill_request_details) == 0:
infos.append(_("Request not found."))
libraries = db.get_external_libraries()
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
title = _("ILL request details")
if infos == []:
body = bc_templates.tmpl_ill_request_details_step1(
ill_request_id=ill_request_id,
ill_request_details=ill_request_details,
libraries=libraries,
ill_request_borrower_details=ill_request_borrower_details,
ln=ln)
else:
body = bc_templates.tmpl_display_infos(infos, ln)
return page(title=title,
uid=id_user,
req=req,
metaheaderadd='<link rel="stylesheet" ' \
'href="%s/img/jquery-ui.css" ' \
'type="text/css" />' % CFG_SITE_SECURE_URL,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def ill_request_details_step2(req, delete_key, ill_request_id, new_status,
library_id, request_date, expected_date,
arrival_date, due_date, return_date,
cost, _currency, barcode, library_notes,
book_info, article_info, ln=CFG_SITE_LANG):
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and ill_request_id:
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_previous_notes = eval(db.get_ill_request_notes(ill_request_id))
if delete_key in library_previous_notes:
del library_previous_notes[delete_key]
db.update_ill_request_notes(ill_request_id, library_previous_notes)
if db.get_ill_request_notes(ill_request_id):
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_previous_notes = eval(db.get_ill_request_notes(ill_request_id))
else:
library_previous_notes = {}
else:
library_previous_notes = {}
if library_notes:
library_previous_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = \
str(library_notes)
if new_status == CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED:
borrower_id = db.get_ill_borrower(ill_request_id)
barcode = db.get_ill_barcode(ill_request_id)
db.update_ill_loan_status(borrower_id, barcode, return_date, 'ill')
db.update_ill_request(ill_request_id, library_id, request_date,
expected_date, arrival_date, due_date, return_date,
new_status, cost, barcode,
str(library_previous_notes))
request_type = db.get_ill_request_type(ill_request_id)
if request_type == 'book':
item_info = book_info
else:
item_info = article_info
db.update_ill_request_item_info(ill_request_id, item_info)
if new_status == CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN:
# Redirect to an email template when the ILL 'book' arrives
# (Not for articles.)
subject = _("ILL received: ")
book_info = db.get_ill_book_info(ill_request_id)
if looks_like_dictionary(book_info):
book_info = eval(book_info)
if 'recid' in book_info:
subject += "'" + book_title_from_MARC(int(book_info['recid'])) + "'"
bid = db.get_ill_borrower(ill_request_id)
msg = load_template("ill_received")
return redirect_to_url(req,
create_url(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/borrower_notification',
{'borrower_id': bid,
'subject': subject,
'load_msg_template': False,
'template': msg,
'from_address': CFG_BIBCIRCULATION_ILLS_EMAIL
}
)
)
return list_ill_request(req, new_status, ln)
def purchase_details_step1(req, delete_key, ill_request_id, new_status,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
if delete_key and ill_request_id:
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_notes = eval(db.get_ill_request_notes(ill_request_id))
if delete_key in library_notes:
del library_notes[delete_key]
db.update_ill_request_notes(ill_request_id, library_notes)
if new_status:
db.update_ill_request_status(ill_request_id, new_status)
ill_request_borrower_details = \
db.get_purchase_request_borrower_details(ill_request_id)
if ill_request_borrower_details is None \
or len(ill_request_borrower_details) == 0:
infos.append(_("Borrower request details not found."))
ill_request_details = db.get_ill_request_details(ill_request_id)
if ill_request_details is None or len(ill_request_details) == 0:
infos.append(_("Request not found."))
vendors = db.get_all_vendors()
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
title = _("Purchase details")
    if infos == []:
        body = bc_templates.tmpl_purchase_details_step1(
                        ill_request_id=ill_request_id,
                        ill_request_details=ill_request_details,
                        libraries=vendors,
                        ill_request_borrower_details=ill_request_borrower_details,
                        ln=ln)
else:
body = bc_templates.tmpl_display_infos(infos, ln)
return page(title=title,
uid=id_user,
req=req,
metaheaderadd='<link rel="stylesheet" ' \
                              'href="%s/img/jquery-ui.css" ' \
                              'type="text/css" />' % CFG_SITE_SECURE_URL,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def purchase_details_step2(req, delete_key, ill_request_id, new_status,
library_id, request_date, expected_date,
arrival_date, due_date, return_date,
cost, budget_code, library_notes,
item_info, ln=CFG_SITE_LANG):
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and ill_request_id:
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_previous_notes = eval(db.get_ill_request_notes(ill_request_id))
if delete_key in library_previous_notes:
del library_previous_notes[delete_key]
db.update_ill_request_notes(ill_request_id, library_previous_notes)
if db.get_ill_request_notes(ill_request_id):
if looks_like_dictionary(db.get_ill_request_notes(ill_request_id)):
library_previous_notes = eval(db.get_ill_request_notes(ill_request_id))
else:
library_previous_notes = {}
else:
library_previous_notes = {}
if library_notes:
library_previous_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = \
str(library_notes)
if new_status == CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED:
borrower_id = db.get_ill_borrower(ill_request_id)
db.update_purchase_request(ill_request_id, library_id, request_date,
expected_date, arrival_date, due_date, return_date,
new_status, cost, budget_code,
str(library_previous_notes))
request_type = db.get_ill_request_type(ill_request_id)
if request_type not in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
db.update_ill_request_item_info(ill_request_id, item_info)
if new_status in (CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE):
barcode = db.get_ill_barcode(ill_request_id)
if new_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER:
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER, barcode)
subject = _("Book suggestion accepted: ")
template = "proposal_acceptance"
else:
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_UNDER_REVIEW, barcode)
subject = _("Book suggestion refused: ")
template = "proposal_refusal"
book_info = db.get_ill_book_info(ill_request_id)
if looks_like_dictionary(book_info):
book_info = eval(book_info)
if 'recid' in book_info:
bid = db.get_ill_borrower(ill_request_id)
if db.has_loan_request(bid, book_info['recid']):
subject += "'" + book_title_from_MARC(int(book_info['recid'])) + "'"
return redirect_to_url(req,
create_url(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/borrower_notification',
{'borrower_id': bid,
'subject': subject,
'template': template,
'from_address': CFG_BIBCIRCULATION_ILLS_EMAIL
}
)
)
if new_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED:
barcode = db.get_ill_barcode(ill_request_id)
# Reset the item description to the default value.
db.set_item_description(barcode, '-')
#db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS, barcode)
borrower_id = db.get_ill_borrower(ill_request_id)
recid = db.get_id_bibrec(barcode)
if db.has_loan_request(borrower_id, recid):
# If an ILL has already been created (after the book was put aside),
            # there will be no waiting request by the proposer.
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
barcode=barcode,
borrower_id=borrower_id)
return redirect_to_url(req,
'%s/admin2/bibcirculation/update_item_info_step4?barcode=%s' % \
(CFG_SITE_SECURE_URL, barcode))
if new_status == CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED:
subject = _("Purchase received: ")
book_info = db.get_ill_book_info(ill_request_id)
if looks_like_dictionary(book_info):
book_info = eval(book_info)
if 'recid' in book_info:
subject += "'" + book_title_from_MARC(int(book_info['recid'])) + "'"
bid = db.get_ill_borrower(ill_request_id)
if budget_code == 'cash':
msg = load_template("purchase_received_cash") % cost
else:
msg = load_template("purchase_received_tid") % cost
return redirect_to_url(req,
create_url(CFG_SITE_SECURE_URL +
'/admin2/bibcirculation/borrower_notification',
{'borrower_id': bid,
'subject': subject,
'load_msg_template': False,
'template': msg,
'from_address': CFG_BIBCIRCULATION_ILLS_EMAIL
}
)
)
if new_status in CFG_BIBCIRCULATION_ACQ_STATUS or \
new_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER:
# Items 'on order', whether acquired for the library or purchased
        # on behalf of a user, are displayed in the same list.
return redirect_to_url(req,
'%s/admin2/bibcirculation/list_purchase?ln=%s&status=%s' % \
(CFG_SITE_SECURE_URL, ln, new_status))
else:
return redirect_to_url(req,
'%s/admin2/bibcirculation/list_proposal?ln=%s&status=%s' % \
(CFG_SITE_SECURE_URL, ln, new_status))
def get_ill_library_notes(req, ill_id, delete_key, library_notes,
ln=CFG_SITE_LANG):
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and ill_id:
if looks_like_dictionary(db.get_ill_notes(ill_id)):
ill_notes = eval(db.get_ill_notes(ill_id))
if delete_key in ill_notes:
del ill_notes[delete_key]
db.update_ill_notes(ill_id, ill_notes)
elif library_notes:
if db.get_ill_notes(ill_id):
if looks_like_dictionary(db.get_ill_notes(ill_id)):
ill_notes = eval(db.get_ill_notes(ill_id))
else:
ill_notes = {}
else:
ill_notes = {}
ill_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = str(library_notes)
db.update_ill_notes(ill_id, ill_notes)
ill_notes = db.get_ill_notes(ill_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_ill_notes(ill_notes=ill_notes,
ill_id=ill_id,
ln=ln)
return page(title=_("ILL notes"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def list_ill_request(req, status, ln=CFG_SITE_LANG):
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
ill_req = db.get_ill_requests(status)
body = bc_templates.tmpl_list_ill(ill_req=ill_req, ln=ln)
return page(title=_("List of ILL requests"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def list_purchase(req, status, recid=None, ln=CFG_SITE_LANG):
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if recid:
# Purchases of a particular item to be displayed in the item info page.
purchase_reqs = db.get_item_purchases(status, recid)
else:
purchase_reqs = db.get_purchases(status)
body = bc_templates.tmpl_list_purchase(purchase_reqs, ln=ln)
return page(title=_("List of purchase requests"),
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def list_proposal(req, status, ln=CFG_SITE_LANG):
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if status == "requests-putaside":
requests = db.get_requests_on_put_aside_proposals()
body = bc_templates.tmpl_list_requests_on_put_aside_proposals(requests, ln=ln)
        title = _("List of requests on put aside proposals")
else:
proposals = db.get_proposals(status)
body = bc_templates.tmpl_list_proposal(proposals, ln=ln)
        title = _("List of proposals")
return page(title=title,
uid=id_user,
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def ill_search(req, ln=CFG_SITE_LANG):
infos = []
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_ill_search(infos=infos, ln=ln)
return page(title=_("ILL search"),
uid=id_user,
req=req,
body=body,
language=ln,
metaheaderadd='<link rel="stylesheet" href="%s/img/jquery-ui.css" '\
'type="text/css" />' % CFG_SITE_SECURE_URL,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def ill_search_result(req, p, f, date_from, date_to, ln):
    """
    Search ILL and purchase requests and return a list with all the possible
    results, restricted to the given date range.
    @type p: string
    @param p: search pattern
    @type f: string
    @param f: search field
    @return: list of matching requests
    """
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
#id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if not has_date_format(date_from):
date_from = '0000-00-00'
if not has_date_format(date_to):
date_to = '9999-12-31'
if f == 'title':
ill_req = db.search_ill_requests_title(p, date_from, date_to)
body = bc_templates.tmpl_list_ill(ill_req=ill_req, ln=ln)
elif f == 'ILL_request_ID':
ill_req = db.search_ill_requests_id(p, date_from, date_to)
body = bc_templates.tmpl_list_ill(ill_req=ill_req, ln=ln)
elif f == 'cost':
purchase_reqs = db.search_requests_cost(p, date_from, date_to)
body = bc_templates.tmpl_list_purchase(purchase_reqs=purchase_reqs, ln=ln)
elif f == 'notes':
purchase_reqs = db.search_requests_notes(p, date_from, date_to)
body = bc_templates.tmpl_list_purchase(purchase_reqs=purchase_reqs, ln=ln)
    else:
        # Unknown search field: fall back to an empty ILL request list
        # instead of leaving 'body' unbound.
        body = bc_templates.tmpl_list_ill(ill_req=[], ln=ln)
return page(title=_("List of ILL requests"),
req=req,
body=body,
language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
###
### "Library" related templates ###
###
def get_library_details(req, library_id, ln=CFG_SITE_LANG):
"""
Display the details of a library.
@type library_id: integer.
@param library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
@return: library details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail" ' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
library_details = db.get_library_details(library_id)
if library_details is None:
_ = gettext_set_language(ln)
infos = []
infos.append(_('Library ID not found.'))
return search_library_step1(req, infos, ln)
library_items = db.get_library_items(library_id)
body = bc_templates.tmpl_library_details(library_details=library_details,
library_items=library_items,
ln=ln)
return page(title=_("Library details"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def merge_libraries_step1(req, library_id, f=None, p=None, ln=CFG_SITE_LANG):
"""
Step 1/3 of library merging procedure
@param library_id: ID of the library to be deleted
@param p: search pattern.
@param f: field
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail" ' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
library_details = db.get_library_details(library_id)
library_items = db.get_library_items(library_id)
result = None
if f is not None:
if p in (None, '', '*'):
result = db.get_all_libraries() #list of (id, name)
elif f == 'name':
result = db.search_library_by_name(p)
elif f == 'email':
result = db.search_library_by_email(p)
body = bc_templates.tmpl_merge_libraries_step1(
library_details=library_details,
library_items=library_items,
result=result,
p=p,
ln=ln)
return page(title=_("Merge libraries"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def merge_libraries_step2(req, library_from, library_to, ln=CFG_SITE_LANG):
"""
Step 2/3 of library merging procedure
Confirm the libraries selected
@param library_from: ID of the library to be deleted
@param library_to: ID of the resulting library
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail" ' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
library_from_details = db.get_library_details(library_from)
library_from_items = db.get_library_items(library_from)
library_to_details = db.get_library_details(library_to)
library_to_items = db.get_library_items(library_to)
body = bc_templates.tmpl_merge_libraries_step2(
library_from_details=library_from_details,
library_from_items=library_from_items,
library_to_details=library_to_details,
library_to_items=library_to_items,
ln=ln)
return page(title=_("Merge libraries"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def merge_libraries_step3(req, library_from, library_to, ln=CFG_SITE_LANG):
"""
Step 3/3 of library merging procedure
Perform the merge and display the details of the resulting library
@param library_from: ID of the library to be deleted
@param library_to: ID of the resulting library
"""
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
db.merge_libraries(library_from, library_to)
return get_library_details(req, library_to, ln)
def add_new_library_step1(req, ln=CFG_SITE_LANG):
"""
Add a new Library.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_add_new_library_step1(ln=ln)
return page(title=_("Add new library"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_library_step2(req, name, email, phone, address,
lib_type, notes, ln=CFG_SITE_LANG):
"""
Add a new Library.
"""
tup_infos = (name, email, phone, address, lib_type, notes)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
body = bc_templates.tmpl_add_new_library_step2(tup_infos=tup_infos, ln=ln)
return page(title=_("Add new library"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_library_step3(req, name, email, phone, address,
lib_type, notes, ln=CFG_SITE_LANG):
"""
Add a new Library.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
db.add_new_library(name, email, phone, address, lib_type, notes)
body = bc_templates.tmpl_add_new_library_step3(ln=ln)
return page(title=_("Add new library"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_library_info_step1(req, ln=CFG_SITE_LANG):
"""
Update the library's information.
"""
infos = []
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_update_library_info_step1(infos=infos, ln=ln)
return page(title=_("Update library information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_library_info_step2(req, column, string, ln=CFG_SITE_LANG):
"""
Update the library's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if not string:
infos = []
infos.append(_("Empty string.") + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_update_library_info_step1(infos=infos, ln=ln)
elif string == '*':
result = db.get_all_libraries()
body = bc_templates.tmpl_update_library_info_step2(result=result, ln=ln)
else:
if column == 'name':
result = db.search_library_by_name(string)
else:
result = db.search_library_by_email(string)
body = bc_templates.tmpl_update_library_info_step2(result=result, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Update library information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_library_info_step3(req, library_id, ln=CFG_SITE_LANG):
"""
Update the library's information.
library_id - identify the library. It is also the primary key of
the table crcLIBRARY.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
library_info = db.get_library_details(library_id)
body = bc_templates.tmpl_update_library_info_step3(
library_info=library_info,
ln=ln)
return page(title=_("Update library information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_library_info_step4(req, name, email, phone, address, lib_type,
library_id, ln=CFG_SITE_LANG):
"""
Update the library's information.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
tup_infos = (library_id, name, email, phone, address, lib_type)
body = bc_templates.tmpl_update_library_info_step4(tup_infos=tup_infos,
ln=ln)
return page(title=_("Update library information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_library_info_step5(req, name, email, phone, address, lib_type,
library_id, ln=CFG_SITE_LANG):
"""
Update the library's information.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
#(library_id, name, email, phone, address) = tup_infos
db.update_library_info(library_id, name, email, phone, address, lib_type)
body = bc_templates.tmpl_update_library_info_step5(ln=ln)
return page(title=_("Update library information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_library_notes(req, library_id, delete_key,
library_notes, ln=CFG_SITE_LANG):
"""
    Retrieve notes related to a library.
library_id - identify the library. It is also the primary key of
the table crcLIBRARY.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if delete_key and library_id:
if looks_like_dictionary(db.get_library_notes(library_id)):
lib_notes = eval(db.get_library_notes(library_id))
if delete_key in lib_notes.keys():
del lib_notes[delete_key]
db.update_library_notes(library_id, lib_notes)
elif library_notes:
if db.get_library_notes(library_id):
if looks_like_dictionary(db.get_library_notes(library_id)):
lib_notes = eval(db.get_library_notes(library_id))
else:
lib_notes = {}
else:
lib_notes = {}
lib_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = str(library_notes)
db.update_library_notes(library_id, lib_notes)
lib_notes = db.get_library_notes(library_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
body = bc_templates.tmpl_library_notes(library_notes=lib_notes,
library_id=library_id,
ln=ln)
return page(title=_("Library notes"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def search_library_step1(req, infos=None, ln=CFG_SITE_LANG):
    """
    Display the form where we can search a library (by name or email).
    """
    if infos is None:
        infos = []
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_search_library_step1(infos=infos,
ln=ln)
return page(title=_("Search library"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def search_library_step2(req, column, string, ln=CFG_SITE_LANG):
"""
Search a library and return a list with all the possible results, using the
parameters received from the previous step.
    column - identifies the column of the table crcLIBRARY that will be
             considered during the search. Can be 'name' or 'email'.
    string - string used for the search process.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if not string:
infos = []
        infos.append(_("Empty string.") + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_search_library_step1(infos=infos, ln=ln)
elif string == '*':
result = db.get_all_libraries()
body = bc_templates.tmpl_search_library_step2(result=result, ln=ln)
else:
if column == 'name':
result = db.search_library_by_name(string)
else:
result = db.search_library_by_email(string)
body = bc_templates.tmpl_search_library_step2(result=result, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a> &gt; <a class="navtrail" ' \
'href="%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s">'\
'Circulation Management' \
'</a> ' % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, ln)
return page(title=_("Search library"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
###
### "Vendor" related templates ###
###
def get_vendor_details(req, vendor_id, ln=CFG_SITE_LANG):
"""
Display the details of a vendor.
@type vendor_id: integer.
@param vendor_id: identify the vendor. It is also the primary key of
the table crcVENDOR.
@return: vendor details.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
vendor_details = db.get_vendor_details(vendor_id)
navtrail_previous_links = '<a class="navtrail" ' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_vendor_details(vendor_details=vendor_details,
ln=ln)
return page(title=_("Vendor details"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_vendor_step1(req, ln=CFG_SITE_LANG):
"""
Add a new Vendor.
"""
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
body = bc_templates.tmpl_add_new_vendor_step1(ln=ln)
return page(title=_("Add new vendor"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_vendor_step2(req, name, email, phone, address,
notes, ln=CFG_SITE_LANG):
"""
Add a new Vendor.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
tup_infos = (name, email, phone, address, notes)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_add_new_vendor_step2(tup_infos=tup_infos, ln=ln)
return page(title=_("Add new vendor"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def add_new_vendor_step3(req, name, email, phone, address,
notes, ln=CFG_SITE_LANG):
"""
Add a new Vendor.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
db.add_new_vendor(name, email, phone, address, notes)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_add_new_vendor_step3(ln=ln)
return page(title=_("Add new vendor"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_vendor_info_step1(req, ln=CFG_SITE_LANG):
"""
Update the vendor's information.
"""
infos = []
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
_ = gettext_set_language(ln)
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
body = bc_templates.tmpl_update_vendor_info_step1(infos=infos, ln=ln)
return page(title=_("Update vendor information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_vendor_info_step2(req, column, string, ln=CFG_SITE_LANG):
"""
Update the vendor's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if not string:
infos = []
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_update_vendor_info_step1(infos=infos, ln=ln)
elif string == '*':
result = db.get_all_vendors()
body = bc_templates.tmpl_update_vendor_info_step2(result=result, ln=ln)
else:
if column == 'name':
result = db.search_vendor_by_name(string)
else:
result = db.search_vendor_by_email(string)
body = bc_templates.tmpl_update_vendor_info_step2(result=result, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Update vendor information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_vendor_info_step3(req, vendor_id, ln=CFG_SITE_LANG):
"""
    Update the vendor's information.
vendor_id - identify the vendor. It is also the primary key of
the table crcVENDOR.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
vendor_info = db.get_vendor_details(vendor_id)
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_update_vendor_info_step3(vendor_info=vendor_info,
ln=ln)
return page(title=_("Update vendor information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_vendor_info_step4(req, name, email, phone, address,
vendor_id, ln=CFG_SITE_LANG):
"""
Update the vendor's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
tup_infos = (vendor_id, name, email, phone, address)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_update_vendor_info_step4(tup_infos=tup_infos,
ln=ln)
return page(title=_("Update vendor information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def update_vendor_info_step5(req, name, email, phone, address,
vendor_id, ln=CFG_SITE_LANG):
"""
    Update the vendor's information.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
db.update_vendor_info(vendor_id, name, email, phone, address)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_update_vendor_info_step5(ln=ln)
return page(title=_("Update vendor information"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def get_vendor_notes(req, vendor_id, add_notes, new_note, ln=CFG_SITE_LANG):
"""
    Retrieve notes related to a vendor.
vendor_id - identify the vendor. It is also the primary key of
the table crcVENDOR.
    @param add_notes: if set, display the textarea where a new note will be
        written.
    @param new_note: note that will be added to the vendor's other notes.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if new_note:
date = '[' + time.ctime() + '] '
new_line = '\n'
new_note = date + new_note + new_line
db.add_new_vendor_note(new_note, vendor_id)
vendor_notes = db.get_vendor_notes(vendor_id)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_vendor_notes(vendor_notes=vendor_notes,
vendor_id=vendor_id,
add_notes=add_notes,
ln=ln)
return page(title=_("Vendor notes"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def search_vendor_step1(req, ln=CFG_SITE_LANG):
"""
Display the form where we can search a vendor (by name or email).
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
infos = []
navtrail_previous_links = '<a class="navtrail"' \
' href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
body = bc_templates.tmpl_search_vendor_step1(infos=infos,
ln=ln)
return page(title=_("Search vendor"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
def search_vendor_step2(req, column, string, ln=CFG_SITE_LANG):
"""
Search a vendor and return a list with all the possible results, using the
parameters received from the previous step.
    column - identifies the column of the table crcVENDOR that will be
             considered during the search. Can be 'name' or 'email'.
    string - string used for the search process.
"""
id_user = getUid(req)
(auth_code, auth_message) = is_adminuser(req)
if auth_code != 0:
return mustloginpage(req, auth_message)
_ = gettext_set_language(ln)
if not string:
infos = []
infos.append(_('Empty string.') + ' ' + _('Please, try again.'))
body = bc_templates.tmpl_search_vendor_step1(infos=infos,
ln=ln)
elif string == '*':
result = db.get_all_vendors()
body = bc_templates.tmpl_search_vendor_step2(result=result, ln=ln)
else:
if column == 'name':
result = db.search_vendor_by_name(string)
else:
result = db.search_vendor_by_email(string)
body = bc_templates.tmpl_search_vendor_step2(result=result, ln=ln)
navtrail_previous_links = '<a class="navtrail" ' \
'href="%s/help/admin">Admin Area' \
'</a>' % (CFG_SITE_SECURE_URL,)
return page(title=_("Search vendor"),
uid=id_user,
req=req,
body=body, language=ln,
navtrail=navtrail_previous_links,
lastupdated=__lastupdated__)
diff --git a/invenio/legacy/bibcirculation/api.py b/invenio/legacy/bibcirculation/api.py
index 84787612a..6e694ecd7 100644
--- a/invenio/legacy/bibcirculation/api.py
+++ b/invenio/legacy/bibcirculation/api.py
@@ -1,799 +1,799 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio Bibcirculation User.
When applicable, methods should be renamed, refactored and
appropriate documentation added.
"""
__revision__ = "$Id$"
import datetime, time
# Invenio imports
from invenio.config import \
CFG_SITE_LANG, \
CFG_CERN_SITE, \
CFG_SITE_URL
from invenio.legacy.webuser import collect_user_info
from invenio.ext.email import send_email
from invenio.base.i18n import gettext_set_language
from invenio.legacy.bibrecord import record_get_field_value
from invenio.legacy.search_engine import get_record
# Bibcirculation imports
-import invenio.bibcirculation_dblayer as db
-from invenio.bibcirculationadminlib import load_template
-from invenio.bibcirculation_utils import book_title_from_MARC, \
+import invenio.legacy.bibcirculation.db_layer as db
+from invenio.legacy.bibcirculation.adminlib import load_template
+from invenio.legacy.bibcirculation.utils import book_title_from_MARC, \
book_information_from_MARC, \
create_ill_record, \
tag_all_requests_as_done, \
generate_tmp_barcode, \
generate_new_due_date, \
update_requests_statuses, \
search_user
-from invenio.bibcirculation_cern_ldap import get_user_info_from_ldap
-from invenio.bibcirculation_config import CFG_BIBCIRCULATION_LIBRARIAN_EMAIL, \
+from invenio.legacy.bibcirculation.cern_ldap import get_user_info_from_ldap
+from invenio.legacy.bibcirculation.config import CFG_BIBCIRCULATION_LIBRARIAN_EMAIL, \
CFG_BIBCIRCULATION_LOANS_EMAIL, \
CFG_BIBCIRCULATION_ITEM_STATUS_UNDER_REVIEW, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED, \
CFG_BIBCIRCULATION_ILL_STATUS_NEW, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW, \
AMZ_BOOK_PUBLICATION_DATE_TAG, \
CFG_BIBCIRCULATION_DEFAULT_LIBRARY_ID
import invenio.legacy.template
bc_templates = invenio.legacy.template.load('bibcirculation')
def perform_borrower_loans(uid, barcode, borrower_id,
request_id, action, ln=CFG_SITE_LANG):
"""
Display all the loans and the requests of a given borrower.
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
    @param request_id: identify the request. Primary key of crcLOANREQUEST.
@type request_id: int
@return body(html)
"""
_ = gettext_set_language(ln)
infos = []
borrower_id = db.get_borrower_id_by_email(db.get_invenio_user_email(uid))
new_due_date = generate_new_due_date(30)
#renew loan
if action == 'renew':
recid = db.get_id_bibrec(barcode)
item_description = db.get_item_description(barcode)
queue = db.get_queue_request(recid, item_description)
if len(queue) != 0 and queue[0][0] != borrower_id:
message = "It is not possible to renew your loan for %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s" % {'x_title': book_title_from_MARC(recid), 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
message += ' ' + _("Another user is waiting for this book.")
infos.append(message)
else:
loan_id = db.get_current_loan_id(barcode)
db.renew_loan(loan_id, new_due_date)
#update_status_if_expired(loan_id)
tag_all_requests_as_done(barcode, borrower_id)
            infos.append(_("Your loan has been successfully renewed."))
#cancel request
elif action == 'cancel':
db.cancel_request(request_id)
barcode_requested = db.get_requested_barcode(request_id)
update_requests_statuses(barcode_requested)
#renew all loans
elif action == 'renew_all':
list_of_barcodes = db.get_borrower_loans_barcodes(borrower_id)
for bc in list_of_barcodes:
bc_recid = db.get_id_bibrec(bc)
item_description = db.get_item_description(bc)
queue = db.get_queue_request(bc_recid, item_description)
#check if there are requests
if len(queue) != 0 and queue[0][0] != borrower_id:
message = "It is not possible to renew your loan for %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s" % {'x_title': book_title_from_MARC(bc_recid), 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
message += ' ' + _("Another user is waiting for this book.")
infos.append(message)
else:
loan_id = db.get_current_loan_id(bc)
db.renew_loan(loan_id, new_due_date)
#update_status_if_expired(loan_id)
tag_all_requests_as_done(barcode, borrower_id)
        if not infos:
            infos.append(_("All loans have been successfully renewed."))
loans = db.get_borrower_loans(borrower_id)
requests = db.get_borrower_requests(borrower_id)
proposals = db.get_borrower_proposals(borrower_id)
body = bc_templates.tmpl_yourloans(loans=loans, requests=requests, proposals=proposals,
borrower_id=borrower_id, infos=infos, ln=ln)
return body
def perform_loanshistoricaloverview(uid, ln=CFG_SITE_LANG):
"""
Display Loans historical overview for user uid.
@param uid: user id
@param ln: language of the page
@return body(html)
"""
invenio_user_email = db.get_invenio_user_email(uid)
borrower_id = db.get_borrower_id_by_email(invenio_user_email)
result = db.get_historical_overview(borrower_id)
body = bc_templates.tmpl_loanshistoricaloverview(result=result, ln=ln)
return body
def perform_get_holdings_information(recid, req, action="borrowal", ln=CFG_SITE_LANG):
"""
Display all the copies of an item. If the parameter action is 'proposal', display
appropriate information to the user.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param action: whether the current record is put up to solicit acquisition
proposals ("proposal") or regular borrowal ("borrowal").
@type action: string
@return body(html)
"""
_ = gettext_set_language(ln)
if action == "proposal":
tag = AMZ_BOOK_PUBLICATION_DATE_TAG
publication_date = record_get_field_value(get_record(recid), tag[:3],
ind1=tag[3], ind2=tag[4],
code=tag[5])
msg = ''
if publication_date:
cur_date = datetime.date.today()
try:
pub_date = time.strptime(publication_date, '%d %b %Y')
pub_date = datetime.date(pub_date[0], pub_date[1], pub_date[2])
if cur_date < pub_date:
msg += _("The publication date of this book is %s.") % (publication_date)
msg += "<br /><br />"
else:
msg += _("This book has no copies in the library. ")
except ValueError:
msg += _("This book has no copies in the library. ")
msg += _("If you think this book is interesting, suggest it and tell us why you consider this \
book important. The library will consider your opinion and, if we decide to buy the \
book, we will issue a loan for you as soon as it arrives and send it by internal mail.")
msg += "<br /><br />"
msg += _("In case we decide not to buy the book, we will offer you an interlibrary loan.")
body = bc_templates.tmpl_book_proposal_information(recid, msg, ln=ln)
else:
holdings_information = db.get_holdings_information(recid, False)
body = bc_templates.tmpl_holdings_information(recid=recid,
req=req,
holdings_info=holdings_information,
ln=ln)
return body
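The publication-date handling above can be sketched in isolation. This is a minimal, self-contained illustration (a hypothetical helper, not part of the module) of how a '%d %b %Y' publication date is parsed and compared against today's date, falling back to False on malformed input just as the code above falls back to the "no copies" message.

```python
import datetime
import time

def is_not_yet_published(publication_date):
    """Return True if publication_date (e.g. '25 Dec 2030') is in the future."""
    try:
        parsed = time.strptime(publication_date, '%d %b %Y')
    except ValueError:
        # Unparseable date: treat the book as already published.
        return False
    # struct_time fields 0..2 are (year, month, day).
    pub_date = datetime.date(parsed[0], parsed[1], parsed[2])
    return datetime.date.today() < pub_date
```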
def perform_new_request(recid, barcode, action="borrowal", ln=CFG_SITE_LANG):
"""
Display form to be filled by the user.
@param action: request type ("borrowal" or "proposal").
@type action: string
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@return request form
"""
body = bc_templates.tmpl_new_request(recid=recid, barcode=barcode, action=action, ln=ln)
return body
def perform_book_proposal_send(uid, recid, period_from, period_to,
remarks, ln=CFG_SITE_LANG):
"""
The subfield recording the source from which the record was imported
acts as the marker for records put up for acquisition proposals.
Register the user's book proposal, his period of interest and his remarks
in the 'ILLREQUEST' table. Add a new 'dummy' copy for the proposed book.
Create a loan(hold) request on behalf of the user for that copy and send
a confirmation e-mail to her/him.
"""
_ = gettext_set_language(ln)
user = collect_user_info(uid)
if CFG_CERN_SITE:
try:
borrower = search_user('ccid', user['external_personid'])
except Exception:
borrower = ()
else:
borrower = search_user('email', user['email'])
if borrower != ():
if not db.has_copies(recid):
tmp_barcode = generate_tmp_barcode()
ill_register_request_with_recid(recid, uid, period_from, period_to, remarks,
conditions='register_acquisition_suggestion',
only_edition='False', barcode=tmp_barcode, ln=CFG_SITE_LANG)
db.add_new_copy(tmp_barcode, recid, library_id=CFG_BIBCIRCULATION_DEFAULT_LIBRARY_ID,
collection='', location='',
description=_("This book was suggested for acquisition"), loan_period='',
status=CFG_BIBCIRCULATION_ITEM_STATUS_UNDER_REVIEW, expected_arrival_date='')
db.delete_brief_format_cache(recid)
return perform_new_request_send_message(uid, recid, period_from, period_to, tmp_barcode,
status=CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED,
mail_subject='Acquisition Suggestion',
mail_template='proposal_notification',
mail_remarks=remarks, ln=CFG_SITE_LANG)
return _("This item already has copies.")
else:
if CFG_CERN_SITE:
message = bc_templates.tmpl_message_request_send_fail_cern("Borrower ID not found.")
else:
message = bc_templates.tmpl_message_request_send_fail_other("Borrower ID not found.")
body = bc_templates.tmpl_new_request_send(message=message, ln=ln)
return body
def perform_new_request_send(uid, recid, period_from, period_to,
barcode, ln=CFG_SITE_LANG):
"""
@param recid: recID - Invenio record identifier
@param ln: language of the page
"""
nb_requests = 0
all_copies_on_loan = True
description = db.get_item_description(barcode)
copies = db.get_barcodes(recid, description)
for bc in copies:
nb_requests += db.get_number_requests_per_copy(bc)
if db.is_item_on_loan(bc) is None:
all_copies_on_loan = False
if nb_requests == 0:
if all_copies_on_loan:
status = CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING
else:
status = CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING
else:
status = CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING
return perform_new_request_send_message(uid, recid, period_from, period_to, barcode,
status, mail_subject='New request',
mail_template='notification',
mail_remarks='', ln=ln)
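The status decision above reduces to one rule: a request becomes 'pending' only when nobody else is queued and at least one copy is off loan; otherwise it joins the waiting list. A minimal sketch of that rule (a hypothetical helper with plain-string statuses, not the module's CFG_* constants):

```python
def decide_request_status(nb_requests, all_copies_on_loan,
                          pending='pending', waiting='waiting'):
    """Mirror the branching in perform_new_request_send.

    nb_requests -- total requests already queued on the record's copies
    all_copies_on_loan -- True if every copy is currently on loan
    """
    if nb_requests == 0 and not all_copies_on_loan:
        # A copy is free and no one is ahead: the request can be served.
        return pending
    # Either someone is queued or every copy is out: wait.
    return waiting
```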
def perform_new_request_send_message(uid, recid, period_from, period_to, barcode,
status, mail_subject, mail_template,
mail_remarks='', ln=CFG_SITE_LANG):
user = collect_user_info(uid)
if CFG_CERN_SITE:
try:
borrower = search_user('ccid', user['external_personid'])
except Exception:
borrower = ()
else:
borrower = search_user('email', user['email'])
if borrower != ():
borrower_id = borrower[0][0]
if db.is_doc_already_requested(recid, barcode, borrower_id):
message = bc_templates.tmpl_message_send_already_requested()
return bc_templates.tmpl_new_request_send(message=message, ln=ln)
borrower_details = db.get_borrower_details(borrower_id)
(_id, ccid, name, email, _phone, address, mailbox) = borrower_details
(title, year, author,
isbn, publisher) = book_information_from_MARC(recid)
req_id = db.new_hold_request(borrower_id, recid, barcode,
period_from, period_to, status)
location = '-'
library = ''
request_date = ''
if status != CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED:
details = db.get_loan_request_details(req_id)
if details:
library = details[3]
location = details[4]
request_date = details[7]
message_template = load_template(mail_template)
# A message to be sent to the user detailing his loan request
# or his new book proposal.
if status == CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED:
message_for_user = message_template % (title)
else:
link_to_holdings_details = CFG_SITE_URL + \
'/record/%s/holdings' % str(recid)
message_for_user = message_template % (name, ccid, email, address,
mailbox, title, author, publisher,
year, isbn, location, library,
link_to_holdings_details, request_date)
send_email(fromaddr = CFG_BIBCIRCULATION_LOANS_EMAIL,
toaddr = email,
subject = mail_subject,
content = message_for_user,
header = '',
footer = '',
attempt_times=1,
attempt_sleeptime=10
)
if status == CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING:
# A message to be sent to the librarian about the pending status.
link_to_item_request_details = CFG_SITE_URL + \
"/admin2/bibcirculation/get_item_requests_details?ln=%s&recid=%s" \
% (ln, str(recid))
message_for_librarian = message_template % (name, ccid, email, address,
mailbox, title, author, publisher,
year, isbn, location, library,
link_to_item_request_details,
request_date)
send_email(fromaddr = CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr = CFG_BIBCIRCULATION_LOANS_EMAIL,
subject = mail_subject,
content = message_for_librarian,
header = '',
footer = '',
attempt_times=1,
attempt_sleeptime=10
)
if CFG_CERN_SITE:
if status == CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED:
message = bc_templates.tmpl_message_proposal_send_ok_cern()
else:
message = bc_templates.tmpl_message_request_send_ok_cern()
else:
if status == CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED:
message = bc_templates.tmpl_message_proposal_send_ok_other()
else:
message = bc_templates.tmpl_message_request_send_ok_other()
else:
if CFG_CERN_SITE:
message = bc_templates.tmpl_message_request_send_fail_cern("Borrower ID not found")
else:
message = bc_templates.tmpl_message_request_send_fail_other("Borrower ID not found")
body = bc_templates.tmpl_new_request_send(message=message, ln=ln)
return body
def display_ill_form(ln=CFG_SITE_LANG):
"""
Display ILL form
@param ln: language of the page
@type ln: string
"""
body = bc_templates.tmpl_display_ill_form(infos=[], ln=ln)
return body
def ill_request_with_recid(recid, ln=CFG_SITE_LANG):
"""
Display ILL form.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param ln: language of the page
@type ln: string
"""
body = bc_templates.tmpl_ill_request_with_recid(recid=recid,
infos=[],
ln=ln)
return body
def ill_register_request_with_recid(recid, uid, period_of_interest_from,
period_of_interest_to, additional_comments,
conditions, only_edition, barcode='',
ln=CFG_SITE_LANG):
"""
Register a new ILL request.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param uid: user id
@type: int
@param period_of_interest_from: period of interest - from(date)
@type period_of_interest_from: string
@param period_of_interest_to: period of interest - to(date)
@type period_of_interest_to: string
"""
_ = gettext_set_language(ln)
# Build a string that looks like a dictionary; it is stored as item_info.
book_info = "{'recid': " + str(recid) + "}"
user = collect_user_info(uid)
borrower_id = db.get_borrower_id_by_email(user['email'])
if borrower_id is None:
if CFG_CERN_SITE == 1:
result = get_user_info_from_ldap(email=user['email'])
try:
name = result['cn'][0]
except KeyError:
name = None
try:
email = result['mail'][0]
except KeyError:
email = None
try:
phone = result['telephoneNumber'][0]
except KeyError:
phone = None
try:
address = result['physicalDeliveryOfficeName'][0]
except KeyError:
address = None
try:
mailbox = result['postOfficeBox'][0]
except KeyError:
mailbox = None
try:
ccid = result['employeeID'][0]
except KeyError:
ccid = ''
if address is not None:
db.new_borrower(ccid, name, email, phone, address, mailbox, '')
else:
message = bc_templates.tmpl_message_request_send_fail_cern("Office address not available.")
else:
message = bc_templates.tmpl_message_request_send_fail_other("Office address not available.")
return bc_templates.tmpl_ill_register_request_with_recid(
message=message,
ln=ln)
address = db.get_borrower_address(user['email'])
if not address:
if CFG_CERN_SITE == 1:
email = user['email']
result = get_user_info_from_ldap(email)
try:
address = result['physicalDeliveryOfficeName'][0]
except KeyError:
address = None
if address is not None:
db.add_borrower_address(address, email)
else:
message = bc_templates.tmpl_message_request_send_fail_cern("Office address not available.")
else:
message = bc_templates.tmpl_message_request_send_fail_other("Office address not available.")
return bc_templates.tmpl_ill_register_request_with_recid(
message=message,
ln=ln)
if not conditions:
infos = []
infos.append(_("You didn't accept the ILL conditions."))
return bc_templates.tmpl_ill_request_with_recid(recid,
infos=infos,
ln=ln)
elif conditions == 'register_acquisition_suggestion':
# This ILL request entry is a book proposal.
db.ill_register_request(book_info, borrower_id,
period_of_interest_from, period_of_interest_to,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW,
additional_comments,
only_edition or 'False', 'proposal-book', barcode=barcode)
else:
db.ill_register_request(book_info, borrower_id,
period_of_interest_from, period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
additional_comments,
only_edition or 'False', 'book', barcode=barcode)
if CFG_CERN_SITE == 1:
message = bc_templates.tmpl_message_request_send_ok_cern()
else:
message = bc_templates.tmpl_message_request_send_ok_other()
#Notify librarian about new ILL request.
send_email(fromaddr=CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr=CFG_BIBCIRCULATION_LOANS_EMAIL,
subject='ILL request for books confirmation',
content='',
#hold_request_mail(recid=recid, borrower_id=borrower_id),
attempt_times=1,
attempt_sleeptime=10)
return bc_templates.tmpl_ill_register_request_with_recid(
message=message,
ln=ln)
def ill_register_request(uid, title, authors, place, publisher, year, edition,
isbn, period_of_interest_from, period_of_interest_to,
additional_comments, conditions, only_edition, request_type,
barcode='', ln=CFG_SITE_LANG):
"""
Register new ILL request. Create new record (collection: ILL Books)
@param uid: user id
@type: int
@param authors: book's authors
@type authors: string
@param place: place of publication
@type place: string
@param publisher: book's publisher
@type publisher: string
@param year: year of publication
@type year: string
@param edition: book's edition
@type edition: string
@param isbn: book's isbn
@type isbn: string
@param period_of_interest_from: period of interest - from(date)
@type period_of_interest_from: string
@param period_of_interest_to: period of interest - to(date)
@type period_of_interest_to: string
@param additional_comments: comments given by the user
@type additional_comments: string
@param conditions: ILL conditions
@type conditions: boolean
@param only_edition: borrower wants only the given edition
@type only_edition: boolean
"""
_ = gettext_set_language(ln)
item_info = (title, authors, place, publisher, year, edition, isbn)
create_ill_record(item_info)
book_info = {'title': title, 'authors': authors, 'place': place,
'publisher': publisher, 'year': year, 'edition': edition,
'isbn': isbn}
user = collect_user_info(uid)
borrower_id = db.get_borrower_id_by_email(user['email'])
#Check if borrower is on DB.
if borrower_id != 0:
address = db.get_borrower_address(user['email'])
#Check if borrower has an address.
if address != 0:
#Check if borrower has accepted ILL conditions.
if conditions:
#Register ILL request on crcILLREQUEST.
db.ill_register_request(book_info, borrower_id,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
additional_comments,
only_edition or 'False', request_type,
budget_code='', barcode=barcode)
#Display confirmation message.
message = _("Your ILL request has been registered and the " \
"document will be sent to you via internal mail.")
#Notify librarian about new ILL request.
send_email(fromaddr=CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr=CFG_BIBCIRCULATION_LOANS_EMAIL,
subject=_('ILL request for books confirmation'),
content="",
attempt_times=1,
attempt_sleeptime=10
)
#Borrower did not accept ILL conditions.
else:
infos = []
infos.append(_("You didn't accept the ILL conditions."))
body = bc_templates.tmpl_display_ill_form(infos=infos, ln=ln)
#Borrower doesn't have an address.
else:
#If BibCirculation at CERN, use LDAP.
if CFG_CERN_SITE == 1:
email = user['email']
result = get_user_info_from_ldap(email)
try:
ldap_address = result['physicalDeliveryOfficeName'][0]
except KeyError:
ldap_address = None
# verify address
if ldap_address is not None:
db.add_borrower_address(ldap_address, email)
db.ill_register_request(book_info, borrower_id,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
additional_comments,
only_edition or 'False',
request_type, budget_code='', barcode=barcode)
message = _("Your ILL request has been registered and" \
" the document will be sent to you via" \
" internal mail.")
send_email(fromaddr=CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr=CFG_BIBCIRCULATION_LOANS_EMAIL,
subject=_('ILL request for books confirmation'),
content="",
attempt_times=1,
attempt_sleeptime=10
)
else:
message = _("It is not possible to validate your request.")
message += ' ' + _("Your office address is not available.")
message += ' ' + _("Please contact %(contact_email)s") % \
{'contact_email': CFG_BIBCIRCULATION_LIBRARIAN_EMAIL}
else:
# Get information from CERN LDAP
if CFG_CERN_SITE == 1:
result = get_user_info_from_ldap(email=user['email'])
try:
name = result['cn'][0]
except KeyError:
name = None
try:
email = result['mail'][0]
except KeyError:
email = None
try:
phone = result['telephoneNumber'][0]
except KeyError:
phone = None
try:
address = result['physicalDeliveryOfficeName'][0]
except KeyError:
address = None
try:
mailbox = result['postOfficeBox'][0]
except KeyError:
mailbox = None
try:
ccid = result['employeeID'][0]
except KeyError:
ccid = ''
# verify address
if address is not None:
db.new_borrower(ccid, name, email, phone, address, mailbox, '')
borrower_id = db.get_borrower_id_by_email(email)
db.ill_register_request(book_info, borrower_id,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
additional_comments,
only_edition or 'False',
request_type, budget_code='', barcode=barcode)
message = _("Your ILL request has been registered and" \
" the document will be sent to you via" \
" internal mail.")
send_email(fromaddr=CFG_BIBCIRCULATION_LIBRARIAN_EMAIL,
toaddr=CFG_BIBCIRCULATION_LOANS_EMAIL,
subject='ILL request for books confirmation',
content="",
attempt_times=1,
attempt_sleeptime=10
)
else:
message = _("It is not possible to validate your request.")
message += ' ' + _("Your office address is not available.")
message += ' ' + _("Please contact %(contact_email)s") % \
{'contact_email': CFG_BIBCIRCULATION_LIBRARIAN_EMAIL}
body = bc_templates.tmpl_ill_register_request_with_recid(message=message,
ln=ln)
return body
diff --git a/invenio/legacy/bibcirculation/daemon.py b/invenio/legacy/bibcirculation/daemon.py
index 25e5cc138..327e8d10e 100644
--- a/invenio/legacy/bibcirculation/daemon.py
+++ b/invenio/legacy/bibcirculation/daemon.py
@@ -1,272 +1,272 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibCirculation daemon.
"""
__revision__ = "$Id$"
import sys
import time
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import task_init, \
+from invenio.legacy.bibsched.bibtask import task_init, \
task_sleep_now_if_required, \
task_update_progress, \
task_set_option, \
task_get_option, \
write_message
from invenio.ext.email import send_email
-import invenio.bibcirculation_dblayer as db
-from invenio.bibcirculation_config import CFG_BIBCIRCULATION_TEMPLATES, \
+import invenio.legacy.bibcirculation.db_layer as db
+from invenio.legacy.bibcirculation.config import CFG_BIBCIRCULATION_TEMPLATES, \
CFG_BIBCIRCULATION_LOANS_EMAIL, \
CFG_BIBCIRCULATION_ILLS_EMAIL, \
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING, \
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED
-from invenio.bibcirculation_utils import generate_email_body, \
+from invenio.legacy.bibcirculation.utils import generate_email_body, \
book_title_from_MARC, \
update_user_info_from_ldap, \
update_requests_statuses, \
looks_like_dictionary
import datetime
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key, check its meaning and return True if
the key has been handled.
Possible keys:
"""
write_message(key)
if key in ('-o', '--overdue-letters'):
task_set_option('overdue-letters', True)
elif key in ('-b', '--update-borrowers'):
task_set_option('update-borrowers', True)
elif key in ('-r', '--update-requests'):
task_set_option('update-requests', True)
else:
return False
return True
def update_expired_loan(loan_id, ill=0):
"""
Update the status, the number of overdue letters and the date of the last overdue letter.
@param loan_id: identify the loan. Primary key of crcLOAN.
@type loan_id: int
"""
if ill:
run_sql("""update crcILLREQUEST
set overdue_letter_number=overdue_letter_number+1,
overdue_letter_date=NOW()
where id=%s
""", (loan_id,))
else:
run_sql("""update crcLOAN
set overdue_letter_number=overdue_letter_number+1,
status=%s,
overdue_letter_date=NOW()
where id=%s
""", (CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED,
loan_id))
def send_overdue_letter(borrower_id, from_address, subject, content):
"""
Send an overdue letter
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param subject: subject of the overdue letter
@type subject: string
"""
to_borrower = db.get_borrower_email(borrower_id)
send_email(fromaddr=from_address,
toaddr=to_borrower,
subject=subject,
content=content,
header='',
footer='',
attempt_times=1,
attempt_sleeptime=10
)
return 1
def must_send_second_recall(date_letters):
"""
@param date_letters: date of the last letter.
@type date_letters: string
@return boolean
"""
today = datetime.date.today()
time_tuple = time.strptime(date_letters, "%Y-%m-%d")
# datetime.strptime is only available from Python 2.5, so use time.strptime.
tmp_date = datetime.datetime(*time_tuple[0:3]) + datetime.timedelta(weeks=1)
try:
if tmp_date.strftime("%Y-%m-%d") <= today.strftime("%Y-%m-%d"):
return True
else:
return False
except ValueError:
return False
def must_send_third_recall(date_letters):
"""
@param date_letters: date of the last letter.
@type date_letters: string
@return boolean
"""
today = datetime.date.today()
time_tuple = time.strptime(date_letters, "%Y-%m-%d")
# datetime.strptime is only available from Python 2.5, so use time.strptime.
tmp_date = datetime.datetime(*time_tuple[0:3]) + datetime.timedelta(days=3)
try:
if tmp_date.strftime("%Y-%m-%d") <= today.strftime("%Y-%m-%d"):
return True
else:
return False
except ValueError:
return False
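The two checks above share one pattern: a recall is due once a fixed interval (one week for the second letter, three days for the third) has elapsed since the last letter. A generic sketch of that pattern, assuming the same 'YYYY-MM-DD' date strings (a hypothetical helper, not part of the module):

```python
import datetime
import time

def must_send_recall(date_letters, delta):
    """True if date_letters ('YYYY-MM-DD') plus delta is today or earlier."""
    try:
        time_tuple = time.strptime(date_letters, "%Y-%m-%d")
    except ValueError:
        # Malformed date: do not send a recall.
        return False
    # Add the grace period to the date of the last letter.
    due = datetime.datetime(*time_tuple[0:3]) + delta
    return due.date() <= datetime.date.today()
```

Comparing `date` objects directly avoids the string comparison of strftime output used above, though the two agree for ISO-formatted dates.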
def task_run_core():
"""
Run daemon
"""
write_message("Starting...")
if task_get_option("update-borrowers"):
write_message("Started update-borrowers")
list_of_borrowers = db.get_all_borrowers()
total_borrowers = len(list_of_borrowers)
for done, borrower in enumerate(list_of_borrowers):
user_id = borrower[0]
update_user_info_from_ldap(user_id)
if done % 10 == 0:
task_update_progress("Borrower: updated %d out of %d." % (done, total_borrowers))
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Borrower: updated %d out of %d." % (done+1, total_borrowers))
write_message("Updated %d out of %d total borrowers" % (done+1, total_borrowers))
if task_get_option("update-requests"):
write_message("Started update-requests")
list_of_reqs = db.get_loan_request_by_status(CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
for (_request_id, recid, bc, _name, borrower_id, _library, _location,
_date_from, _date_to, _request_date) in list_of_reqs:
description = db.get_item_description(bc)
list_of_barcodes = db.get_barcodes(recid, description)
for barcode in list_of_barcodes:
update_requests_statuses(barcode)
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Requests updated from 'waiting' to 'pending'.")
write_message("Requests updated from 'waiting' to 'pending'.")
if task_get_option("overdue-letters"):
write_message("Started overdue-letters")
expired_loans = db.get_all_expired_loans()
total_expired_loans = len(expired_loans)
for done, (borrower_id, _bor_name, recid, _barcode, _loaned_on,
_due_date, _number_of_renewals, number_of_letters,
date_letters, _notes, loan_id) in enumerate(expired_loans):
number_of_letters = int(number_of_letters)
content = ''
if number_of_letters == 0:
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['RECALL1'], loan_id)
elif number_of_letters == 1 and must_send_second_recall(date_letters):
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['RECALL2'], loan_id)
elif number_of_letters >= 2 and must_send_third_recall(date_letters):
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['RECALL3'], loan_id)
if content != '':
title = book_title_from_MARC(recid)
subject = "LOAN RECALL: " + title
update_expired_loan(loan_id)
send_overdue_letter(borrower_id, CFG_BIBCIRCULATION_LOANS_EMAIL, subject, content)
if done % 10 == 0:
task_update_progress("Loan recall: sent %d out of %d." % (done, total_expired_loans))
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Loan recall: processed %d out of %d expired loans." % (done+1, total_expired_loans))
write_message("Processed %d out of %d expired loans." % (done+1, total_expired_loans))
# Recalls for expired ILLs
write_message("Started overdue-letters for interlibrary loans (ILL)")
expired_ills = db.get_all_expired_ills()
total_expired_ills = len(expired_ills)
for done, (ill_id, borrower_id, item_info, number_of_letters,
date_letters) in enumerate(expired_ills):
number_of_letters = int(number_of_letters)
content = ''
if number_of_letters == 0:
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL1'], ill_id, ill=1)
elif number_of_letters == 1 and must_send_second_recall(date_letters):
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL2'], ill_id, ill=1)
elif number_of_letters >= 2 and must_send_third_recall(date_letters):
content = generate_email_body(CFG_BIBCIRCULATION_TEMPLATES['ILL_RECALL3'], ill_id, ill=1)
if content != '' and looks_like_dictionary(item_info):
item_info = eval(item_info)
if 'title' in item_info:
book_title = item_info['title']
subject = "ILL RECALL: " + str(book_title)
update_expired_loan(loan_id=ill_id, ill=1)
send_overdue_letter(borrower_id, CFG_BIBCIRCULATION_ILLS_EMAIL, subject, content)
if done % 10 == 0:
task_update_progress("ILL recall: sent %d out of %d." % (done, total_expired_ills))
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("ILL recall: processed %d out of %d expired ILLs." % (done+1, total_expired_ills))
write_message("Processed %d out of %d expired ILLs." % (done+1, total_expired_ills))
return 1
def main():
task_init(authorization_action='runbibcircd',
authorization_msg="BibCirculation Task Submission",
help_specific_usage="""-o, --overdue-letters\tCheck overdue loans and send recall emails if necessary.\n
-b, --update-borrowers\tUpdate borrowers information from ldap.\n
-r, --update-requests\tUpdate pending requests of users\n\n""",
description="""Example: %s -u admin \n\n""" % (sys.argv[0]),
specific_params=("obr", ["overdue-letters", "update-borrowers", "update-requests"]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
version=__revision__,
task_run_fnc = task_run_core
)
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibcirculation/db_layer.py b/invenio/legacy/bibcirculation/db_layer.py
index dcef86e66..53c5b0f24 100644
--- a/invenio/legacy/bibcirculation/db_layer.py
+++ b/invenio/legacy/bibcirculation/db_layer.py
@@ -1,2913 +1,2913 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Every db-related function of the bibcirculation module.
The functions are grouped into logical categories ('Loans',
'Returns', 'Loan requests', 'ILLs', 'Libraries', 'Vendors' ...).
This grouping should be maintained and, when necessary, improved
for readability as additional functions are added. Where applicable,
functions should be renamed, refactored and documented.
Currently, the same table 'crcILLREQUEST' is used for the ILLs,
purchases as well as proposals.
"""
__revision__ = "$Id$"
from invenio.legacy.dbquery import run_sql
-from invenio.bibcirculation_config import \
+from invenio.legacy.bibcirculation.config import \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED, \
CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, \
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED, \
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE, \
CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_ILL_STATUS_NEW, \
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED, \
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ILL_STATUS_RETURNED, \
CFG_BIBCIRCULATION_ILL_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_ACQ_STATUS_NEW, \
CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_ACQ_STATUS_PARTIAL_RECEIPT, \
CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_ACQ_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE, \
CFG_BIBCIRCULATION_LIBRARY_TYPE_INTERNAL, \
CFG_BIBCIRCULATION_LIBRARY_TYPE_EXTERNAL, \
CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN, \
CFG_BIBCIRCULATION_LIBRARY_TYPE_HIDDEN
###
### Loan Requests related functions ###
###
def new_hold_request(borrower_id, recid, barcode, date_from, date_to, status):
"""
Create a new hold request.
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param date_from: beginning of the period of interest.
@type date_from: string
@param date_to: end of the period of interest.
@type date_to: string
@param status: hold request status.
@type status: string
"""
res = run_sql("""INSERT INTO crcLOANREQUEST(id_crcBORROWER,
id_bibrec,
barcode,
period_of_interest_from,
period_of_interest_to,
status,
request_date)
VALUES (%s, %s, %s, %s, %s, %s, NOW())
""", (borrower_id, recid, barcode, date_from,
date_to, status))
return res
def has_loan_request(borrower_id, recid, ill=0):
- from invenio.bibcirculation_utils import looks_like_dictionary
+ from invenio.legacy.bibcirculation.utils import looks_like_dictionary
if ill == 0:
return run_sql("""
SELECT id
FROM crcLOANREQUEST
WHERE id_crcBORROWER=%s and
id_bibrec=%s and
status in (%s, %s, %s)""",
(borrower_id, recid,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED
)) != ()
else:
res = run_sql("""
SELECT item_info
FROM crcILLREQUEST
WHERE id_crcBORROWER=%s and
request_type=%s and
status in (%s, %s, %s)""",
(borrower_id, 'book',
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED,
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN
))
for record in res:
if looks_like_dictionary(record[0]):
item_info = eval(record[0])
try:
if str(recid) == str(item_info['recid']): return True
except KeyError:
continue
return False
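The stored item_info is a Python-dict string that the code above parses with eval() after a looks_like_dictionary() sanity check. A safer, equivalent sketch (a hypothetical helper, not the module's actual API) uses ast.literal_eval, which accepts only Python literals and so cannot execute code:

```python
import ast

def parse_item_info(item_info):
    """Return the dict encoded in item_info, or None if it is malformed."""
    try:
        parsed = ast.literal_eval(item_info)
    except (ValueError, SyntaxError):
        # Not a valid literal (e.g. a function call): reject it.
        return None
    # Only dict payloads are meaningful here.
    return parsed if isinstance(parsed, dict) else None
```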
def is_requested(barcode):
res = run_sql("""SELECT id
FROM crcLOANREQUEST
WHERE barcode=%s
AND (status = %s or status = %s)
""", (barcode,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING))
try:
return res
except IndexError:
return None
def is_doc_already_requested(recid, barcode, borrower_id):
"""
Check if the borrower already has a waiting/pending loan request or
a proposal, or a loan on some item of the record.
"""
multi_volume_book = False
if get_item_description(barcode).strip() not in ('', '-'):
multi_volume_book = True
reqs_on_rec = run_sql("""SELECT id, barcode
FROM crcLOANREQUEST
WHERE id_bibrec=%s
AND id_crcBORROWER = %s
AND status in (%s, %s, %s)
""", (recid, borrower_id,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED,
))
if reqs_on_rec != () and not multi_volume_book: return True
for req in reqs_on_rec:
if req[1] == barcode: return True
loans_on_rec = run_sql("""SELECT id, barcode
FROM crcLOAN
WHERE id_bibrec=%s
AND id_crcBORROWER = %s
AND status in (%s, %s)
""", (recid, borrower_id,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED
))
if loans_on_rec != () and not multi_volume_book: return True
for loan in loans_on_rec:
if loan[1] == barcode: return True
return False
def cancel_request(request_id, borrower_id=None, recid=None):
"""
Cancel a hold request identified with the request_id. If it is None,
cancel the hold request identified with (borrower_id, recid), if both
are not None.
"""
if request_id:
run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE id=%s
""", (CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED, request_id))
elif borrower_id and recid:
run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE id_crcBORROWER=%s and
id_bibrec=%s and
status in (%s, %s, %s)""",
(CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
borrower_id, recid,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING
))
def tag_requests_as_done(user_id, barcode=None, recid=None):
if barcode:
run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE barcode=%s
and id_crcBORROWER=%s
""", (CFG_BIBCIRCULATION_REQUEST_STATUS_DONE,
barcode, user_id))
elif recid:
run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE id_bibrec=%s
and id_crcBORROWER=%s
""", (CFG_BIBCIRCULATION_REQUEST_STATUS_DONE,
recid, user_id))
def get_requests(recid, description, status):
"""
    Get the requests of a record.
    @param recid: identify the record. Primary key of bibrec.
    @type recid: int
    @param description: item description (e.g. volume) used to select the barcodes.
    @type description: string
    @param status: identify the status.
    @type status: string
    @return tuple of matching requests
"""
# Get all the barcodes of the items belonging to the same record and with the same description.
barcodes = tuple(rec[0] for rec in run_sql("""SELECT barcode FROM crcITEM WHERE description=%s
AND id_bibrec=%s""", (description, recid)))
query = """SELECT id, DATE_FORMAT(period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(request_date,'%%Y-%%m-%%d')
FROM crcLOANREQUEST
WHERE period_of_interest_from <= NOW()
AND period_of_interest_to >= NOW()
AND id_bibrec=%s
AND status='%s' """% (recid, status)
if len(barcodes) == 1:
query += """AND barcode='%s' ORDER BY request_date""" % barcodes[0]
elif len(barcodes) > 1:
query += """AND barcode in %s ORDER BY request_date""" % (barcodes,)
else:
query += """ORDER BY request_date"""
return run_sql(query)
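The barcode restriction above is spliced into the query with `%` string formatting. A minimal sketch (hypothetical helper, not used by this module) of building the same restriction with bound parameters instead, which sidesteps quoting issues and SQL injection:

```python
def barcode_filter(barcodes):
    """Return (sql_fragment, params) for an optional barcode restriction.

    The fragment contains one %s placeholder per barcode, so the values
    can be passed to run_sql as bound parameters.
    """
    if not barcodes:
        return '', ()
    placeholders = ', '.join(['%s'] * len(barcodes))
    return ' AND barcode IN (%s)' % placeholders, tuple(barcodes)
```

The fragment would be appended to the base query and the params tuple concatenated onto the existing bound values.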
def get_all_requests():
"""
Retrieve all requests.
"""
res = run_sql("""SELECT lr.id,
bor.id,
bor.name,
lr.id_bibrec,
lr.status,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor
WHERE bor.id = lr.id_crcBORROWER
AND (lr.status=%s OR lr.status=%s)
AND lr.period_of_interest_to >= CURDATE()
ORDER BY lr.request_date
""", (CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING))
return res
def get_loan_request_details(req_id):
res = run_sql("""SELECT lr.id_bibrec,
bor.name,
bor.id,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor,
crcITEM it,
crcLIBRARY lib
WHERE lr.id_crcBORROWER=bor.id AND it.barcode=lr.barcode
AND lib.id = it.id_crcLIBRARY
AND lr.id=%s
""", (req_id, ))
if res:
return res[0]
else:
return None
def get_loan_request_by_status(status):
query = """SELECT DISTINCT
lr.id,
lr.id_bibrec,
lr.barcode,
bor.name,
bor.id,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor,
crcITEM it,
crcLIBRARY lib
WHERE lr.id_crcBORROWER=bor.id AND it.barcode=lr.barcode AND
lib.id = it.id_crcLIBRARY AND lr.status=%s
AND lr.period_of_interest_from <= NOW()
AND lr.period_of_interest_to >= NOW()
ORDER BY lr.request_date"""
    res = run_sql(query, (status, ))
return res
def get_requested_barcode(request_id):
"""
request_id: identify the hold request. It is also the primary key
of the table crcLOANREQUEST.
"""
res = run_sql("""SELECT barcode
FROM crcLOANREQUEST
WHERE id=%s""",
(request_id, ))
if res:
return res[0][0]
else:
return None
def update_loan_request_status(new_status, request_id=None,
barcode=None, borrower_id=None):
"""
Update the hold request(s) status(es) for an item with the request_id/barcode.
If the status of the hold request on an item with a particular barcode and
    by a particular borrower is to be modified, specify the borrower_id too.
"""
if request_id:
return int(run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE id=%s""",
(new_status, request_id)))
elif barcode and borrower_id:
return int(run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE barcode=%s
AND id_crcBORROWER=%s""",
(new_status, barcode, borrower_id)))
elif barcode:
return int(run_sql("""UPDATE crcLOANREQUEST
SET status=%s
WHERE barcode=%s""",
(new_status, barcode)))
def update_request_barcode(barcode, request_id):
"""
Update the barcode of a hold request.
barcode: new barcode (after update). It is also the
primary key of the crcITEM table.
    request_id: identify the hold request to be updated.
                It is also the primary key of
                the crcLOANREQUEST table.
"""
run_sql("""UPDATE crcLOANREQUEST
set barcode = %s
WHERE id = %s
""", (barcode, request_id))
def get_pending_loan_request(recid, description):
"""
    Get the pending requests for a given recid.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
    @param description: item description (e.g. volume) of a particular
                        item in the record.
@type description: string
@return list with request_id, borrower_name, recid, status,
period_of_interest (FROM and to) and request_date.
"""
# Get all the barcodes of the items belonging to the same record and with the same description.
barcodes = tuple(rec[0] for rec in run_sql("""SELECT barcode FROM crcITEM WHERE description=%s
AND id_bibrec=%s""", (description, recid)))
query = """SELECT lr.id,
bor.name,
lr.id_bibrec,
lr.status,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor
WHERE lr.id_crcBORROWER=bor.id
AND lr.status='%s'
AND lr.id_bibrec=%s
AND lr.period_of_interest_from <= NOW()
AND lr.period_of_interest_to >= NOW() """% \
(CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING, recid)
if len(barcodes) == 1:
query += """AND lr.barcode='%s' ORDER BY lr.request_date""" % barcodes[0]
elif len(barcodes) > 1:
query += """AND lr.barcode in %s ORDER BY lr.request_date""" % (barcodes,)
else:
query += """ORDER BY lr.request_date"""
return run_sql(query)
def get_queue_request(recid, item_description):
"""
recid: identify the record. It is also the primary key of
the table bibrec.
    item_description: item description (e.g. volume) of a particular
                      item in the record.
"""
# Get all the barcodes of the items belonging to the same record and with the same description.
barcodes = tuple(rec[0] for rec in run_sql("""SELECT barcode FROM crcITEM WHERE description=%s
AND id_bibrec=%s""", (item_description, recid)))
query = """SELECT id_crcBORROWER,
status,
DATE_FORMAT(request_date,'%%Y-%%m-%%d')
FROM crcLOANREQUEST
WHERE id_bibrec=%s
AND (status='%s' or status='%s')
AND period_of_interest_from <= NOW()
AND period_of_interest_to >= NOW() """% \
(recid, CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,\
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
if len(barcodes) == 1:
query += """AND barcode='%s' ORDER BY request_date""" % barcodes[0]
elif len(barcodes) > 1:
query += """AND barcode in %s ORDER BY request_date""" % (barcodes,)
else:
query += """ORDER BY request_date"""
return run_sql(query)
def get_request_recid(request_id):
"""
Get the recid of a given request_id
@param request_id: identify the (hold) request. Primary key of crcLOANREQUEST.
@type request_id: int
@return recid
"""
res = run_sql(""" SELECT id_bibrec
FROM crcLOANREQUEST
WHERE id=%s
""", (request_id, ))
if res:
return res[0][0]
else:
return None
def get_request_barcode(request_id):
"""
Get the barcode of a given request_id
@param request_id: identify the (hold) request. Primary key of crcLOANREQUEST.
@type request_id: int
@return barcode
"""
res = run_sql(""" SELECT barcode
FROM crcLOANREQUEST
WHERE id=%s
""", (request_id, ))
if res:
return res[0][0]
else:
return None
def get_request_borrower_id(request_id):
"""
Get the borrower_id of a given request_id
@param request_id: identify the (hold) request. Primary key of crcLOANREQUEST.
@type request_id: int
@return borrower_id
"""
res = run_sql(""" SELECT id_crcBORROWER
FROM crcLOANREQUEST
WHERE id=%s
""", (request_id, ))
if res:
return res[0][0]
else:
return None
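The three lookups above (`get_request_recid`, `get_request_barcode`, `get_request_borrower_id`) differ only in the selected column. A minimal sketch (hypothetical helper, with `run_sql` injected as an argument so the example is self-contained) of factoring them into one function:

```python
# Whitelist of columns that may be selected; the column name is never
# taken from user input, so interpolating it into the query is safe.
_REQUEST_COLUMNS = ('id_bibrec', 'barcode', 'id_crcBORROWER')

def get_request_field(run_sql, request_id, column):
    """Fetch a single whitelisted column of a hold request, or None."""
    if column not in _REQUEST_COLUMNS:
        raise ValueError('unknown column: %s' % column)
    res = run_sql('SELECT %s FROM crcLOANREQUEST WHERE id=%%s' % column,
                  (request_id, ))
    return res[0][0] if res else None
```

The `%%s` in the format string survives interpolation as the `%s` placeholder that the DB layer binds `request_id` to.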
def get_number_requests_per_copy(barcode):
"""
barcode: identify the item. It is the primary key of the table
crcITEM.
"""
res = run_sql("""SELECT count(barcode)
FROM crcLOANREQUEST
WHERE barcode=%s and
(status != %s and status != %s)""",
(barcode, CFG_BIBCIRCULATION_REQUEST_STATUS_DONE,
CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED))
return res[0][0]
def get_pdf_request_data(status):
"""
status: request status.
"""
res = run_sql("""SELECT DISTINCT
lr.id_bibrec,
bor.name,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor,
crcITEM it,
crcLIBRARY lib
WHERE lr.id_crcBORROWER=bor.id AND
it.id_bibrec=lr.id_bibrec AND
lib.id = it.id_crcLIBRARY AND
lr.status=%s;
""" , (status))
return res
###
### Loans related functions ###
###
def loan_on_desk_confirm(barcode, borrower_id):
"""
barcode: identify the item. It is the primary key of the table
crcITEM.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql("""SELECT it.id_bibrec, bor.name
FROM crcITEM it, crcBORROWER bor
WHERE it.barcode=%s and bor.id=%s
""", (barcode, borrower_id))
return res
def is_on_loan(barcode):
res = run_sql("""SELECT id
FROM crcLOAN
WHERE barcode=%s
AND (status=%s or status=%s)
""", (barcode,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
if res:
return True
else:
return False
def is_item_on_loan(barcode):
"""
Check if an item is on loan.
@param barcode: identify the item. It is the primary key of the table crcITEM.
"""
res = run_sql("""SELECT id
FROM crcLOAN
WHERE (status=%s or status=%s)
and barcode=%s""",
(CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED, barcode))
try:
return res[0][0]
except IndexError:
return None
def get_loan_infos(loan_id):
"""
    loan_id: identify a loan. It is the primary key of the table
crcLOAN.
"""
res = run_sql("""SELECT l.id_bibrec,
l.barcode,
DATE_FORMAT(l.loaned_on, '%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date, '%%Y-%%m-%%d'),
l.status,
it.loan_period,
it.status
                     FROM crcLOAN l, crcITEM it
WHERE l.barcode=it.barcode and
l.id=%s""",
(loan_id, ))
if res:
return res[0]
else:
return None
def get_borrower_id(barcode):
"""
Get the borrower id who is associated to a loan.
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@return borrower_id or None
"""
res = run_sql(""" SELECT id_crcBORROWER
FROM crcLOAN
WHERE barcode=%s and
(status=%s or status=%s)""",
(barcode, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
try:
return res[0][0]
except IndexError:
return None
def get_borrower_loans_barcodes(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql("""SELECT barcode
FROM crcLOAN
WHERE id_crcBORROWER=%s
AND (status=%s OR status=%s)
""",
(borrower_id, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
list_of_barcodes = []
for bc in res:
list_of_barcodes.append(bc[0])
return list_of_barcodes
def new_loan(borrower_id, recid, barcode,
due_date, status, loan_type, notes):
"""
Create a new loan.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
recid: identify the record. It is also the primary key of
the table bibrec.
barcode: identify the item. It is the primary key of the table
crcITEM.
    due_date: due date (loaned_on is set to NOW() by this function).
    status: loan status.
    loan_type: loan type (normal, ILL, etc.)
notes: loan notes.
"""
res = run_sql(""" insert into crcLOAN (id_crcBORROWER, id_bibrec,
barcode, loaned_on, due_date,
status, type, notes)
values(%s, %s, %s, NOW(), %s, %s ,%s, %s)
""", (borrower_id, recid, barcode, due_date,
status, loan_type, str(notes)))
res = run_sql(""" UPDATE crcITEM
SET status=%s
WHERE barcode=%s""", (status, barcode))
return res
def update_due_date(loan_id, new_due_date):
"""
    loan_id: identify a loan. It is the primary key of the table
crcLOAN.
new_due_date: new due date.
"""
return int(run_sql("""UPDATE crcLOAN
SET due_date=%s,
number_of_renewals = number_of_renewals + 1
WHERE id=%s""",
(new_due_date, loan_id)))
def update_loan_status(status, loan_id):
"""
Update the status of a loan.
status: new status (after update)
    loan_id: identify the loan to be updated.
It is also the primary key of the table
crcLOAN.
"""
run_sql("""UPDATE crcLOAN
set status = %s
WHERE id = %s""",
(status, loan_id))
def get_loan_status(loan_id):
"""
Get loan's status
    loan_id: identify a loan. It is the primary key of the table
crcLOAN.
"""
res = run_sql("""SELECT status
FROM crcLOAN
WHERE id=%s""",
(loan_id, ))
if res:
return res[0][0]
else:
return None
def get_all_loans(limit):
"""
Get all loans.
"""
res = run_sql("""
SELECT bor.id,
bor.name,
it.id_bibrec,
l.barcode,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d %%T'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.number_of_renewals,
l.overdue_letter_number,
DATE_FORMAT(l.overdue_letter_date,'%%Y-%%m-%%d'),
l.notes,
l.id
FROM crcLOAN l, crcBORROWER bor, crcITEM it
WHERE l.id_crcBORROWER = bor.id
AND l.barcode = it.barcode
AND l.status = %s
ORDER BY 5 DESC
LIMIT 0,%s
""", (CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN, limit))
return res
def get_all_expired_loans():
"""
Get all expired(overdue) loans.
"""
res = run_sql(
"""
SELECT bor.id,
bor.name,
it.id_bibrec,
l.barcode,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.number_of_renewals,
l.overdue_letter_number,
DATE_FORMAT(l.overdue_letter_date,'%%Y-%%m-%%d'),
l.notes,
l.id
FROM crcLOAN l, crcBORROWER bor, crcITEM it
WHERE l.id_crcBORROWER = bor.id
and l.barcode = it.barcode
and ((l.status = %s and l.due_date < CURDATE())
or l.status = %s )
""", (CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
return res
def get_expired_loans_with_waiting_requests():
res = run_sql("""SELECT DISTINCT
lr.id,
lr.id_bibrec,
lr.id_crcBORROWER,
it.id_crcLIBRARY,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr,
crcITEM it,
crcLOAN l
WHERE it.barcode=l.barcode
AND lr.id_bibrec=it.id_bibrec
AND (lr.status=%s or lr.status=%s)
AND (l.status=%s or (l.status=%s
AND l.due_date < CURDATE()))
AND lr.period_of_interest_from <= NOW()
AND lr.period_of_interest_to >= NOW()
ORDER BY lr.request_date;
""", ( CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN))
return res
def get_current_loan_id(barcode):
res = run_sql(""" SELECT id
FROM crcLOAN
WHERE barcode=%s
AND (status=%s OR status=%s)
""", (barcode, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
if res:
return res[0][0]
def get_last_loan():
"""
Get the recid, the borrower_id and the due date of
    the last loan registered in the crcLOAN table.
"""
res = run_sql("""SELECT id_bibrec,
id_crcBORROWER,
DATE_FORMAT(due_date, '%Y-%m-%d')
FROM crcLOAN ORDER BY id DESC LIMIT 1""")
if res:
return res[0]
else:
return None
def get_loan_recid(loan_id):
res = run_sql("""SELECT id_bibrec
FROM crcLOAN
WHERE id=%s""",
(loan_id, ))
if res:
return res[0][0]
else:
return None
def get_loan_notes(loan_id):
res = run_sql("""SELECT notes
FROM crcLOAN
WHERE id=%s""",
(loan_id, ))
if res:
return res[0][0]
else:
return None
def update_loan_notes(loan_id, loan_notes):
"""
"""
run_sql("""UPDATE crcLOAN
SET notes=%s
WHERE id=%s """, (str(loan_notes), loan_id))
def add_new_loan_note(new_note, loan_id):
"""
    Add a new loan note.
    new_note: note to be appended.
    loan_id: identify the loan to which the note will be
             added. It is also the primary key of the
             table crcLOAN.
"""
run_sql("""UPDATE crcLOAN
set notes=concat(notes,%s)
WHERE id=%s;
""", (new_note, loan_id))
def renew_loan(loan_id, new_due_date):
run_sql("""UPDATE crcLOAN
SET due_date=%s,
number_of_renewals=number_of_renewals+1,
overdue_letter_number=0,
status=%s
WHERE id=%s""", (new_due_date,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
loan_id))
###
### Loan Returns related functions ###
###
def return_loan(barcode):
"""
    Update loan information when a copy is returned: set the return
    date to NOW(), clear the due date and mark the loan as returned.
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
"""
return int(run_sql("""UPDATE crcLOAN
SET returned_on=NOW(), status=%s, due_date=NULL
WHERE barcode=%s and (status=%s or status=%s)
""", (CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED,
barcode,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED)))
###
### 'Item' related functions ###
###
def get_id_bibrec(barcode):
"""
Get the id of the bibrec (recid).
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@return recid or None
"""
res = run_sql("""SELECT id_bibrec
FROM crcITEM
WHERE barcode=%s
""", (barcode, ))
if res:
return res[0][0]
else:
return None
def get_item_info(barcode):
"""
Get item's information.
barcode: identify the item. It is the primary key of the table
crcITEM.
"""
res = run_sql("""SELECT it.barcode,
it.id_crcLIBRARY,
lib.name,
it.collection,
it.location,
it.description,
it.loan_period,
it.status
FROM crcITEM it,
crcLIBRARY lib
WHERE it.barcode=%s and it.id_crcLIBRARY = lib.id""",
(barcode, ))
if res:
return res[0]
else:
return None
def get_loan_period(barcode):
"""
Retrieve the loan period of a book.
barcode: identify the item. It is the primary key of the table
crcITEM.
"""
res = run_sql("""SELECT loan_period
FROM crcITEM
WHERE barcode=%s""",
(barcode, ))
if res:
return res[0][0]
else:
return None
def update_item_info(barcode, library_id, collection, location, description,
loan_period, status, expected_arrival_date):
"""
Update item's information.
barcode: identify the item. It is the primary key of the table
crcITEM.
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
int(run_sql("""UPDATE crcITEM
set barcode=%s,
id_crcLIBRARY=%s,
collection=%s,
location=%s,
description=%s,
loan_period=%s,
status=%s,
expected_arrival_date=%s,
modification_date=NOW()
WHERE barcode=%s""",
(barcode, library_id, collection, location, description,
loan_period, status, expected_arrival_date, barcode)))
def update_barcode(old_barcode, barcode):
res = run_sql("""UPDATE crcITEM
SET barcode=%s
WHERE barcode=%s
""", (barcode, old_barcode))
run_sql("""UPDATE crcLOAN
SET barcode=%s
WHERE barcode=%s
""", (barcode, old_barcode))
run_sql("""UPDATE crcLOANREQUEST
SET barcode=%s
WHERE barcode=%s
""", (barcode, old_barcode))
run_sql("""UPDATE crcILLREQUEST
SET barcode=%s
WHERE barcode=%s
""", (barcode, old_barcode))
return res > 0
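`update_barcode` above must apply the same change to four tables; if one UPDATE fails midway, the tables go out of sync. A minimal, self-contained sketch (using sqlite3 stand-in tables, not the real crc* schema) of grouping such updates into one transaction:

```python
import sqlite3

def rename_barcode(conn, old, new):
    # Apply the same barcode change to every table that references it.
    # The "with conn" block commits all UPDATEs together, or rolls them
    # all back if any statement raises.
    with conn:
        for table in ('item', 'loan', 'loanrequest'):
            conn.execute('UPDATE %s SET barcode=? WHERE barcode=?' % table,
                         (new, old))

# Stand-in schema and data for the demonstration.
conn = sqlite3.connect(':memory:')
for table in ('item', 'loan', 'loanrequest'):
    conn.execute('CREATE TABLE %s (barcode TEXT)' % table)
    conn.execute('INSERT INTO %s VALUES (?)' % table, ('OLD-1',))
rename_barcode(conn, 'OLD-1', 'NEW-1')
```

The table names come from a fixed tuple inside the function, so the `%` interpolation is confined to trusted identifiers while the values are bound with `?` placeholders.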
def get_item_loans(recid):
"""
recid: identify the record. It is also the primary key of
the table bibrec.
"""
res = run_sql(
"""
SELECT bor.id,
bor.name,
l.barcode,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.number_of_renewals,
l.overdue_letter_number,
DATE_FORMAT(l.overdue_letter_date,'%%Y-%%m-%%d'),
l.status,
l.notes,
l.id
FROM crcLOAN l, crcBORROWER bor, crcITEM it
WHERE l.id_crcBORROWER = bor.id
and l.barcode=it.barcode
and l.id_bibrec=%s
and l.status!=%s
""", (recid, CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def get_item_requests(recid):
"""
recid: identify the record. It is also the primary key of
the table bibrec.
"""
res = run_sql("""SELECT bor.id,
bor.name,
lr.id_bibrec,
lr.barcode,
lr.status,
lib.name,
it.location,
it.description,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.id,
lr.request_date
FROM crcLOANREQUEST lr,
crcBORROWER bor,
crcITEM it,
crcLIBRARY lib
WHERE bor.id = lr.id_crcBORROWER and lr.id_bibrec=%s
and lr.status!=%s and lr.status!=%s and lr.status!=%s
and lr.barcode = it.barcode and lib.id = it.id_crcLIBRARY
""", (recid,
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE,
CFG_BIBCIRCULATION_REQUEST_STATUS_CANCELLED,
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED))
return res
def get_item_purchases(status, recid):
"""
Purchases of a particular item to be displayed in the item info page.
"""
from invenio.legacy.bibcirculation.utils import looks_like_dictionary
status1 = ''
status2 = ''
if status == CFG_BIBCIRCULATION_ACQ_STATUS_NEW:
status1 = CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER
status2 = CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER
elif status == CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED:
status1 = CFG_BIBCIRCULATION_ACQ_STATUS_PARTIAL_RECEIPT
status2 = CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED
res = run_sql("""SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.cost, ill.request_type, ''
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND ill.request_type in (%s, %s, %s)
AND ill.status in (%s, %s, %s)
ORDER BY ill.id desc""", ('acq-book', 'acq-standard',
'proposal-book', status, status1, status2))
purchases = []
for record in res:
if looks_like_dictionary(record[8]):
item_info = eval(record[8])
try:
if str(recid) == str(item_info['recid']): purchases.append(record)
except KeyError:
continue
return tuple(purchases)
def get_item_loans_historical_overview(recid):
"""
@param recid: identify the record. Primary key of bibrec.
@type recid: int
"""
res = run_sql("""SELECT bor.name,
bor.id,
l.barcode,
lib.name,
it.location,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.returned_on,
l.number_of_renewals,
l.overdue_letter_number
FROM crcLOAN l, crcBORROWER bor, crcITEM it, crcLIBRARY lib
WHERE l.id_crcBORROWER=bor.id and
lib.id = it.id_crcLIBRARY and
it.barcode = l.barcode and
l.id_bibrec = %s and
l.status = %s """
, (recid, CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def get_item_requests_historical_overview(recid):
"""
recid: identify the record. It is also the primary key of
the table bibrec.
"""
res = run_sql("""
SELECT bor.name,
bor.id,
lr.barcode,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr, crcBORROWER bor, crcITEM it, crcLIBRARY lib
WHERE lr.id_crcBORROWER=bor.id and
lib.id = it.id_crcLIBRARY and
it.barcode = lr.barcode and
lr.id_bibrec = %s and
lr.status = %s
""", (recid, CFG_BIBCIRCULATION_REQUEST_STATUS_DONE))
return res
def get_nb_copies_on_loan(recid):
"""
Get the number of copies on loan for a recid.
recid: Invenio record identifier. The number of copies
of this record will be retrieved.
"""
res = run_sql("""SELECT count(barcode)
FROM crcITEM
WHERE id_bibrec=%s and status=%s;
""", (recid, CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN))
return res[0][0]
def get_item_copies_details(recid):
"""
Get copies details of a given recid.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return list with barcode, loan_period, library_name, library_id,
location, number_of_requests, status, collection,
description and due_date.
"""
res = run_sql("""SELECT it.barcode, it.loan_period, lib.name,
lib.id, it.location, it.number_of_requests,
it.status, it.collection, it.description,
DATE_FORMAT(ln.due_date,'%%Y-%%m-%%d')
FROM crcITEM it
left join crcLOAN ln
on it.barcode = ln.barcode and ln.status != %s
left join crcLIBRARY lib
on lib.id = it.id_crcLIBRARY
WHERE it.id_bibrec=%s
ORDER BY it.creation_date
""", (CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, recid))
return res
def get_copy_details(barcode):
res = run_sql(""" SELECT *
FROM crcITEM it
WHERE barcode=%s""",
(barcode, ))
    if res:
return res[0]
else:
return None
def get_copies_status(recid, description='-'):
"""
    @param description: item description (e.g. volume) of a particular
                        item in the record.
"""
if description.strip() in ('', '-'):
res = run_sql("""SELECT status
FROM crcITEM
WHERE id_bibrec=%s""", (recid, ))
else:
res = run_sql("""SELECT status
FROM crcITEM
WHERE id_bibrec=%s
AND description=%s
""", (recid, description))
list_of_statuses = []
for status in res:
list_of_statuses.append(status[0])
if list_of_statuses == []:
return None
else:
return list_of_statuses
def update_item_status(status, barcode):
"""
Update the status of an item (using the barcode).
@param status: status of the item.
@type status: string
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
    @return number of affected rows (int)
"""
if status == CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN:
return int(run_sql("""UPDATE crcITEM
SET status=%s,
number_of_requests = number_of_requests + 1
WHERE barcode=%s""", (status, barcode)))
else:
return int(run_sql("""UPDATE crcITEM
SET status=%s
WHERE barcode=%s""", (status, barcode)))
def get_item_description(barcode):
res = run_sql(""" SELECT description
FROM crcITEM
WHERE barcode=%s
""", (barcode, ))
    # When there is no description, return '' rather than NULL so that
    # equality checks on the result do not break.
if res and res[0][0]:
return res[0][0]
else:
return ''
def set_item_description(barcode, description):
return int(run_sql("""UPDATE crcITEM
SET description=%s
WHERE barcode=%s""", (description or '-', barcode)))
def get_holdings_information(recid, include_hidden_libraries=True):
"""
Get information about holdings, using recid.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return holdings information
"""
if include_hidden_libraries:
res = run_sql("""SELECT it.barcode,
lib.name,
it.collection,
it.location,
it.description,
it.loan_period,
it.status,
DATE_FORMAT(ln.due_date, '%%Y-%%m-%%d')
FROM crcITEM it
left join crcLOAN ln
on it.barcode = ln.barcode and ln.status != %s
left join crcLIBRARY lib
on lib.id = it.id_crcLIBRARY
WHERE it.id_bibrec=%s
""", (CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, recid))
else:
res = run_sql("""SELECT it.barcode,
lib.name,
it.collection,
it.location,
it.description,
it.loan_period,
it.status,
DATE_FORMAT(ln.due_date, '%%Y-%%m-%%d')
FROM crcITEM it
left join crcLOAN ln
on it.barcode = ln.barcode and ln.status != %s
left join crcLIBRARY lib
on lib.id = it.id_crcLIBRARY
WHERE it.id_bibrec=%s
AND lib.type<>%s
""", (CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, recid,
CFG_BIBCIRCULATION_LIBRARY_TYPE_HIDDEN))
return res
def get_number_copies(recid):
"""
Get the number of copies of a given recid.
This function is used by the 'BibEdit' module to display the
number of copies for the record being edited.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return number_of_copies
"""
try:
recid = int(recid)
except ValueError:
return 0
res = run_sql("""SELECT count(barcode)
FROM crcITEM
WHERE id_bibrec=%s
""", (recid, ))
return res[0][0]
def has_copies(recid):
"""
Indicate if there are any physical copies of a document described
    by the record.
@param recid: The identifier of the record
@type recid: int
@return True or False according to the state
"""
return (get_number_copies(recid) != 0)
def add_new_copy(barcode, recid, library_id, collection, location, description,
loan_period, status, expected_arrival_date):
"""
Add a new copy
barcode: identify the item. It is the primary key of the table
crcITEM.
recid: identify the record. It is also the primary key of
the table bibrec.
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
run_sql("""insert into crcITEM (barcode, id_bibrec, id_crcLIBRARY,
collection, location, description, loan_period,
status, expected_arrival_date, creation_date,
modification_date)
values (%s, %s, %s, %s, %s, %s, %s, %s, %s, NOW(), NOW())""",
(barcode, recid, library_id, collection, location, description or '-',
loan_period, status, expected_arrival_date))
def delete_copy(barcode):
res = run_sql("""delete FROM crcITEM WHERE barcode=%s""", (barcode, ))
return res
def get_expected_arrival_date(barcode):
res = run_sql("""SELECT expected_arrival_date
FROM crcITEM
WHERE barcode=%s """, (barcode,))
if res:
return res[0][0]
else:
return ''
def get_barcodes(recid, description='-'):
"""
    @param description: item description (e.g. volume) of a particular
                        item in the record.
"""
if description.strip() in ('', '-'):
res = run_sql("""SELECT barcode
FROM crcITEM
WHERE id_bibrec=%s""",
(recid, ))
else:
res = run_sql("""SELECT barcode
FROM crcITEM
WHERE id_bibrec=%s
AND description=%s""",
(recid, description))
barcodes = []
    for row in res:
        barcodes.append(row[0])
return barcodes
def barcode_in_use(barcode):
res = run_sql("""SELECT id_bibrec
FROM crcITEM
WHERE barcode=%s""",
(barcode, ))
    return len(res) > 0
###
### "Borrower" related functions ###
###
def new_borrower(ccid, name, email, phone, address, mailbox, notes):
"""
Add/Register a new borrower on the crcBORROWER table.
name: borrower's name.
email: borrower's email.
phone: borrower's phone.
address: borrower's address.
"""
return run_sql("""insert into crcBORROWER ( ccid,
name,
email,
phone,
address,
mailbox,
borrower_since,
borrower_until,
notes)
values(%s, %s, %s, %s, %s, %s, NOW(), '0000-00-00 00:00:00', %s)""",
(ccid, name, email, phone, address, mailbox, str(notes)))
# IntegrityError: (1062, "Duplicate entry '665119' for key 2")
def get_borrower_details(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql("""SELECT id, ccid, name, email, phone, address, mailbox
FROM crcBORROWER
WHERE id=%s""", (borrower_id, ))
if res:
return res[0]
else:
return None
def update_borrower_info(borrower_id, name, email, phone, address, mailbox):
"""
Update borrower info.
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
return int(run_sql("""UPDATE crcBORROWER
set name=%s,
email=%s,
phone=%s,
address=%s,
mailbox=%s
WHERE id=%s""",
(name, email, phone, address, mailbox, borrower_id)))
def get_borrower_data(borrower_id):
"""
Get the borrower's information (name, address and email).
borrower_id: identify the borrower. The data associate
to this borrower will be retrieved. It is also
the primary key of the crcBORROWER table.
"""
res = run_sql("""SELECT name,
address,
mailbox,
email
FROM crcBORROWER
WHERE id=%s""",
(borrower_id, ))
if res:
return res[0]
else:
return None
def get_borrower_data_by_id(borrower_id):
"""
Retrieve borrower's data by borrower_id.
"""
res = run_sql("""SELECT id, ccid, name, email, phone,
address, mailbox
FROM crcBORROWER
WHERE id=%s""", (borrower_id, ))
if res:
return res[0]
else:
return None
def get_borrower_ccid(user_id):
res = run_sql("""SELECT ccid
FROM crcBORROWER
WHERE id=%s""", (user_id, ))
if res:
return res[0][0]
else:
return None
def get_all_borrowers():
    """Return (id, ccid) for all registered borrowers."""
res = run_sql("""SELECT id, ccid
FROM crcBORROWER""")
return res
def get_borrower_name(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql("""SELECT name
FROM crcBORROWER
WHERE id=%s
""", (borrower_id, ))
if res:
return res[0][0]
else:
return None
def get_borrower_email(borrower_id):
"""
Get the email of a borrower.
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@return borrower's email (string).
"""
res = run_sql("""SELECT email
FROM crcBORROWER
WHERE id=%s""", (borrower_id, ))
if res:
return res[0][0]
else:
return None
def get_borrower_id_by_email(email):
"""
Retrieve borrower's id by email.
"""
res = run_sql("""SELECT id
FROM crcBORROWER
WHERE email=%s""",
(email, ))
if res:
return res[0][0]
else:
return None
def get_borrower_address(email):
"""
Get the address of a borrower using the email.
email: borrower's email.
"""
res = run_sql("""SELECT address
FROM crcBORROWER
WHERE email=%s""", (email, ))
    if res and res[0][0]:
        return res[0][0]
    else:
        return 0
def add_borrower_address(address, email):
"""
Add the email and the address of a borrower.
address: borrower's address.
email: borrower's email.
"""
run_sql("""UPDATE crcBORROWER
set address=%s
WHERE email=%s""", (address, email))
def get_invenio_user_email(uid):
"""
Get the email of an invenio's user.
uid: identify an invenio's user.
"""
res = run_sql("""SELECT email
FROM user
WHERE id=%s""",
(uid, ))
if res:
return res[0][0]
else:
return None
def search_borrower_by_name(string):
"""
string: search pattern.
"""
    res = run_sql("""SELECT id, name
                     FROM crcBORROWER
                     WHERE upper(name) like upper(%s)
                     ORDER BY name
                  """, ('%' + string + '%', ))
return res
def search_borrower_by_email(string):
"""
string: search pattern.
"""
res = run_sql("""SELECT id, name
FROM crcBORROWER
WHERE email regexp %s
""", (string, ))
return res
def search_borrower_by_id(string):
"""
string: search pattern.
"""
res = run_sql("""SELECT id, name
FROM crcBORROWER
WHERE id=%s
""", (string, ))
return res
def search_borrower_by_ccid(string):
"""
string: search pattern.
"""
res = run_sql("""SELECT id, name
FROM crcBORROWER
WHERE ccid regexp %s
""", (string, ))
return res
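The name searches above build a `%pattern%` wildcard for LIKE, while the email/ccid searches bind the pattern to REGEXP. A minimal self-contained sqlite3 sketch of the bound-parameter LIKE style (the table shape is invented; the real helpers go through `run_sql` against MySQL):

```python
import sqlite3

def search_by_name(conn, pattern):
    # The wildcard pattern is bound as a parameter, so quotes and '%'
    # in user input cannot alter the SQL text itself.
    return conn.execute("""SELECT id, name
                           FROM borrower
                           WHERE upper(name) LIKE upper(?)
                           ORDER BY name""",
                        ('%' + pattern + '%', )).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE borrower (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO borrower VALUES (?, ?)",
                 [(1, "Ada O'Hara"), (2, "Grace Hopper")])
matches = search_by_name(conn, "o'ha")  # finds Ada despite the quote
```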
def update_borrower(user_id, name, email, phone, address, mailbox):
return run_sql(""" UPDATE crcBORROWER
SET name=%s,
email=%s,
phone=%s,
address=%s,
mailbox=%s
WHERE id=%s
""", (name, email, phone, address, mailbox, user_id))
def get_borrower_loans(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql(""" SELECT id_bibrec,
barcode,
DATE_FORMAT(loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
type
FROM crcLOAN
WHERE id_crcBORROWER=%s and status != %s
""", (borrower_id, CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def get_recid_borrower_loans(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
"""
res = run_sql(""" SELECT id, id_bibrec, barcode
FROM crcLOAN
WHERE id_crcBORROWER=%s
AND status != %s
AND type != 'ill'
""", (borrower_id, CFG_BIBCIRCULATION_ILL_STATUS_RETURNED))
return res
def get_borrower_loan_details(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
This function is also used by the Aleph Service for the display of loans
of the user for the termination sheet.
"""
res = run_sql("""
SELECT it.id_bibrec,
l.barcode,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.number_of_renewals,
l.overdue_letter_number,
DATE_FORMAT(l.overdue_letter_date,'%%Y-%%m-%%d'),
l.type,
l.notes,
l.id,
l.status
FROM crcLOAN l, crcITEM it
WHERE l.barcode=it.barcode
AND id_crcBORROWER=%s
AND l.status!=%s
""", (borrower_id, CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def get_borrower_request_details(borrower_id):
"""
borrower_id: identify the borrower. It is also the primary key of
the table crcBORROWER.
This function is also used by the Aleph Service for the display of loan
requests of the user for the termination sheet.
"""
res = run_sql("""SELECT lr.id_bibrec,
lr.status,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date,
lr.id
FROM crcLOANREQUEST lr,
crcITEM it,
crcLIBRARY lib
WHERE lr.id_crcBORROWER=%s
AND (lr.status=%s OR lr.status=%s)
and lib.id = it.id_crcLIBRARY and lr.barcode = it.barcode
""", (borrower_id,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING))
return res
def get_borrower_requests(borrower_id):
"""
Get the hold requests of a borrower.
    borrower_id: identify the borrower. All the hold requests
                 associated with this borrower will be retrieved.
It is also the primary key of the crcBORROWER table.
"""
res = run_sql("""
SELECT id,
id_bibrec,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
status
FROM crcLOANREQUEST
WHERE id_crcBORROWER=%s and
(status=%s or status=%s)""",
(borrower_id, CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING))
return res
def get_borrower_proposals(borrower_id):
"""
Get the proposals of a borrower.
borrower_id: identify the borrower. All the proposals
associated to this borrower will be retrieved.
It is also the primary key of the crcBORROWER table.
"""
res = run_sql("""
SELECT id,
id_bibrec,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
status
FROM crcLOANREQUEST
WHERE id_crcBORROWER=%s and
status=%s""",
(borrower_id, CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED))
return res
def bor_loans_historical_overview(borrower_id):
"""
Get loans historical overview of a given borrower_id.
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@return list with loans historical overview.
"""
res = run_sql("""SELECT l.id_bibrec,
l.barcode,
lib.name,
it.location,
DATE_FORMAT(l.loaned_on,'%%Y-%%m-%%d'),
DATE_FORMAT(l.due_date,'%%Y-%%m-%%d'),
l.returned_on,
l.number_of_renewals,
l.overdue_letter_number
FROM crcLOAN l, crcITEM it, crcLIBRARY lib
WHERE l.id_crcBORROWER=%s and
lib.id = it.id_crcLIBRARY and
it.barcode = l.barcode and
l.status = %s
""", (borrower_id, CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def bor_requests_historical_overview(borrower_id):
"""
Get requests historical overview of a given borrower_id.
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@return list with requests historical overview.
"""
res = run_sql("""SELECT lr.id_bibrec,
lr.barcode,
lib.name,
it.location,
DATE_FORMAT(lr.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(lr.period_of_interest_to,'%%Y-%%m-%%d'),
lr.request_date
FROM crcLOANREQUEST lr, crcITEM it, crcLIBRARY lib
WHERE lr.id_crcBORROWER=%s and
lib.id = it.id_crcLIBRARY and
it.barcode = lr.barcode and
lr.status =%s
""", (borrower_id, CFG_BIBCIRCULATION_REQUEST_STATUS_DONE))
return res
def get_historical_overview(borrower_id):
"""
Get historical information overview (recid, loan date, return date
and number of renewals).
borrower_id: identify the borrower. All the old (returned) loans
associated to this borrower will be retrieved.
It is also the primary key of the crcBORROWER table.
"""
res = run_sql("""SELECT id_bibrec,
DATE_FORMAT(loaned_on,'%%Y-%%m-%%d'),
returned_on,
number_of_renewals
FROM crcLOAN
WHERE id_crcBORROWER=%s and status=%s;
""", (borrower_id,
CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
return res
def get_borrower_notes(borrower_id):
"""The data associated to this borrower will be retrieved."""
res = run_sql("""SELECT notes
FROM crcBORROWER
WHERE id=%s""",
(borrower_id, ))
if res:
return res[0][0]
else:
return None
def update_borrower_notes(borrower_id, borrower_notes):
run_sql("""UPDATE crcBORROWER
SET notes=%s
WHERE id=%s """, (str(borrower_notes), borrower_id))
###
### "Library" related functions ###
###
def get_all_libraries():
res = run_sql("""SELECT id, name
FROM crcLIBRARY
ORDER BY name""")
return res
def get_main_libraries():
res = run_sql("""SELECT id, name
FROM crcLIBRARY
WHERE type=%s
""", (CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN, ))
if res:
return res
else:
return None
def get_internal_libraries():
res = run_sql("""SELECT id, name
FROM crcLIBRARY
WHERE (type=%s OR type=%s)
ORDER BY name
""", (CFG_BIBCIRCULATION_LIBRARY_TYPE_INTERNAL,
CFG_BIBCIRCULATION_LIBRARY_TYPE_MAIN))
return res
def get_external_libraries():
res = run_sql("""SELECT id, name
FROM crcLIBRARY
WHERE type=%s
""", (CFG_BIBCIRCULATION_LIBRARY_TYPE_EXTERNAL, ))
return res
def get_hidden_libraries():
res = run_sql("""SELECT id, name
FROM crcLIBRARY
WHERE type=%s
ORDER BY name
""", (CFG_BIBCIRCULATION_LIBRARY_TYPE_HIDDEN, ))
return res
def merge_libraries(library_from, library_to):
    """
    Re-point all items and ILL requests from library_from to library_to,
    then delete library_from.
    """
run_sql("""UPDATE crcITEM
SET id_crcLIBRARY=%s
WHERE id_crcLIBRARY=%s
""", (library_to, library_from))
run_sql("""UPDATE crcILLREQUEST
SET id_crcLIBRARY=%s
WHERE id_crcLIBRARY=%s
""", (library_to, library_from))
run_sql("""DELETE FROM crcLIBRARY
WHERE id=%s
""", (library_from,))
def get_library_items(library_id):
"""
Get all items which belong to a library.
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
res = run_sql("""SELECT barcode, id_bibrec, collection,
location, description, loan_period, status, number_of_requests
FROM crcITEM
WHERE id_crcLIBRARY=%s""",
(library_id, ))
return res
def get_library_details(library_id):
"""
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
res = run_sql("""SELECT id, name, address, email, phone, type, notes
FROM crcLIBRARY
WHERE id=%s;
""", (library_id, ))
if res:
return res[0]
else:
return None
def get_library_type(library_id):
"""
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
res = run_sql("""SELECT type
FROM crcLIBRARY
WHERE id=%s""",
(library_id, ))
if res:
return res[0][0]
else:
return None
def get_library_name(library_id):
"""
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
res = run_sql("""SELECT name
FROM crcLIBRARY
WHERE id=%s""",
(library_id, ))
if res:
return res[0][0]
else:
return None
def get_lib_location(barcode):
res = run_sql("""SELECT id_crcLIBRARY, location
FROM crcITEM
WHERE barcode=%s""",
(barcode, ))
if res:
return res[0]
else:
return None
def get_library_notes(library_id):
""" The data associated to this library will be retrieved."""
res = run_sql("""SELECT notes
FROM crcLIBRARY
WHERE id=%s""",
(library_id, ))
if res:
return res[0][0]
else:
return None
def update_library_notes(library_id, library_notes):
run_sql("""UPDATE crcLIBRARY
SET notes=%s
WHERE id=%s """, (str(library_notes), library_id))
def add_new_library(name, email, phone, address, lib_type, notes):
run_sql("""insert into crcLIBRARY (name, email, phone,
address, type, notes)
values (%s, %s, %s, %s, %s, %s)""",
(name, email, phone, address, lib_type, notes))
def update_library_info(library_id, name, email, phone, address, lib_type):
"""
library_id: identify the library. It is also the primary key of
the table crcLIBRARY.
"""
return int(run_sql("""UPDATE crcLIBRARY
set name=%s,
email=%s,
phone=%s,
address=%s,
type=%s
WHERE id=%s""",
(name, email, phone, address, lib_type, library_id)))
def search_library_by_name(string):
    res = run_sql("""SELECT id, name
                     FROM crcLIBRARY
                     WHERE upper(name) like upper(%s)
                     ORDER BY name
                  """, ('%' + string + '%', ))
return res
def search_library_by_email(string):
res = run_sql("""SELECT id, name
FROM crcLIBRARY
WHERE email regexp %s
ORDER BY name
""", (string, ))
return res
###
### "Vendor" related functions ###
###
def get_all_vendors():
res = run_sql("""SELECT id, name
FROM crcVENDOR""")
return res
def get_vendor_details(vendor_id):
"""
vendor_id: identify the vendor. It is also the primary key of
the table crcVENDOR.
"""
res = run_sql("""SELECT id, name, address, email, phone, notes
FROM crcVENDOR
WHERE id=%s;
""", (vendor_id, ))
if res:
return res[0]
else:
return None
def get_vendor_name(vendor_id):
"""
vendor_id: identify the vendor. It is also the primary key of
the table crcVENDOR.
"""
res = run_sql("""SELECT name
FROM crcVENDOR
WHERE id=%s""",
(vendor_id, ))
if res:
return res[0][0]
else:
return None
def get_vendor_notes(vendor_id):
""" The data associated to this vendor will be retrieved."""
res = run_sql("""SELECT notes
FROM crcVENDOR
WHERE id=%s""",
(vendor_id, ))
if res:
return res[0][0]
else:
return None
def add_new_vendor_note(new_note, vendor_id):
run_sql("""UPDATE crcVENDOR
SET notes=concat(notes,%s)
WHERE id=%s;
""", (new_note, vendor_id))
def add_new_vendor(name, email, phone, address, notes):
run_sql("""insert into crcVENDOR (name, email, phone,
address, notes)
values (%s, %s, %s, %s, %s)""",
(name, email, phone, address, notes))
def update_vendor_info(vendor_id, name, email, phone, address):
"""
vendor_id: identify the vendor. It is also the primary key of
the table crcVENDOR.
"""
return int(run_sql("""UPDATE crcVENDOR
SET name=%s,
email=%s,
phone=%s,
address=%s
WHERE id=%s""",
(name, email, phone, address, vendor_id)))
def search_vendor_by_name(string):
res = run_sql("""SELECT id, name
FROM crcVENDOR
WHERE name regexp %s
""", (string, ))
return res
def search_vendor_by_email(string):
res = run_sql("""SELECT id, name
FROM crcVENDOR
WHERE email regexp %s
""", (string, ))
return res
###
### ILL/Proposals/Purchases related functions ###
###
def get_ill_request_type(ill_request_id):
res = run_sql("""SELECT request_type
FROM crcILLREQUEST
WHERE id=%s""", (ill_request_id, ))
if res:
return res[0][0]
else:
return None
def ill_register_request(item_info, borrower_id, period_of_interest_from,
period_of_interest_to, status, additional_comments,
only_edition, request_type, budget_code='', barcode=''):
run_sql("""insert into crcILLREQUEST(id_crcBORROWER, barcode,
period_of_interest_from,
period_of_interest_to, status, item_info,
borrower_comments, only_this_edition,
request_type, budget_code)
values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)""",
(borrower_id, barcode, period_of_interest_from,
period_of_interest_to, status, str(item_info),
additional_comments, only_edition,
request_type, budget_code))
def ill_register_request_on_desk(borrower_id, item_info,
period_of_interest_from,
period_of_interest_to,
status, notes, only_edition, request_type,
budget_code=''):
run_sql("""insert into crcILLREQUEST(id_crcBORROWER,
period_of_interest_from, period_of_interest_to,
status, item_info, only_this_edition,
library_notes, request_type, budget_code)
values (%s, %s, %s, %s, %s, %s, %s, %s, %s)""",
(borrower_id, period_of_interest_from, period_of_interest_to,
status, str(item_info), only_edition, notes, request_type,
budget_code))
def get_ill_request_details(ill_request_id):
res = run_sql("""SELECT id_crcLIBRARY,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
DATE_FORMAT(expected_date,'%%Y-%%m-%%d'),
DATE_FORMAT(arrival_date,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
DATE_FORMAT(return_date,'%%Y-%%m-%%d'),
cost,
barcode,
library_notes,
status
FROM crcILLREQUEST
WHERE id=%s""", (ill_request_id, ))
if res:
return res[0]
else:
return None
def register_ill_from_proposal(ill_request_id, bid=None, library_notes=''):
"""
Register an ILL request created from an existing proposal.
(Used in cases where proposals are 'put aside')
"""
    if not bid:
        bid = run_sql("""SELECT id_crcBORROWER
                         FROM crcILLREQUEST
                         WHERE id = %s
                      """, (ill_request_id, ))[0][0]
run_sql("""insert into crcILLREQUEST(id_crcBORROWER,
period_of_interest_from, period_of_interest_to,
status, item_info, only_this_edition,
request_type, budget_code, library_notes)
SELECT %s, period_of_interest_from, period_of_interest_to,
%s, item_info, only_this_edition,
%s, budget_code, %s
FROM crcILLREQUEST
WHERE id = %s
""",(bid, CFG_BIBCIRCULATION_ILL_STATUS_NEW, 'book',
str(library_notes), ill_request_id))
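register_ill_from_proposal clones the proposal row into a fresh ILL request with a single INSERT ... SELECT, substituting constants (the new status, the 'book' type, the notes) for the columns that change while copying the rest from the source row. A self-contained sqlite3 sketch of that pattern (hypothetical table shape, for illustration only):

```python
import sqlite3

def clone_request(conn, request_id, new_status, notes):
    # INSERT ... SELECT copies the row server-side; the select list mixes
    # bound constants (status, notes) with columns from the source row.
    cur = conn.execute(
        """INSERT INTO request (borrower_id, item_info, status, notes)
           SELECT borrower_id, item_info, ?, ?
           FROM request
           WHERE id=?""",
        (new_status, notes, request_id))
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE request (id INTEGER PRIMARY KEY,
                    borrower_id INTEGER, item_info TEXT,
                    status TEXT, notes TEXT)""")
conn.execute("INSERT INTO request VALUES (1, 42, 'some book', 'proposal', '')")
new_id = clone_request(conn, 1, 'new', 'from proposal')
```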
def get_ill_requests(status):
    if status is None:
res = run_sql("""
SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND (ill.request_type=%s OR ill.request_type=%s)
ORDER BY ill.id desc
""", ('article', 'book'))
else:
res = run_sql("""
SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND (ill.request_type=%s OR ill.request_type=%s)
AND ill.status=%s
ORDER BY ill.id desc
""", ('article', 'book', status))
return res
def get_all_expired_ills():
"""
Get all expired(overdue) ills.
"""
res = run_sql(
"""
SELECT id,
id_crcBORROWER,
item_info,
overdue_letter_number,
DATE_FORMAT(overdue_letter_date,'%%Y-%%m-%%d')
FROM crcILLREQUEST
WHERE status = %s and due_date < CURDATE()
AND request_type in (%s, %s)
""", (CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN,
'article', 'book'))
return res
def get_proposals(proposal_status):
res = run_sql("""SELECT temp.*, count(req.barcode)
FROM (SELECT ill.id, ill.id_crcBORROWER, bor.name, ill.id_crcLIBRARY,
ill.status, ill.barcode,
ill.period_of_interest_from,
ill.period_of_interest_to,
ill.item_info, ill.cost, ill.request_type
FROM crcILLREQUEST as ill, crcBORROWER as bor
WHERE ill.request_type=%s
AND ill.status=%s
AND ill.barcode!=''
AND ill.id_crcBORROWER=bor.id) AS temp
LEFT JOIN (SELECT barcode
FROM crcLOANREQUEST
WHERE barcode!=''
AND status in (%s, %s, %s, %s)) AS req
ON temp.barcode=req.barcode
GROUP BY req.barcode
ORDER BY temp.id desc""", ('proposal-book', proposal_status,
CFG_BIBCIRCULATION_REQUEST_STATUS_PROPOSED,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE))
return res
def get_requests_on_put_aside_proposals():
"""
@return Requests on proposed books that are 'put aside'.
"""
res = run_sql("""SELECT ill.id, req.id, bor.id, bor.name, req.period_of_interest_from,
req.period_of_interest_to, ill.item_info, ill.cost
FROM crcILLREQUEST as ill, crcLOANREQUEST as req, crcBORROWER as bor
WHERE ill.barcode!='' AND req.barcode!=''
AND ill.barcode=req.barcode
AND req.id_crcBORROWER = bor.id
AND ill.request_type=%s
AND ill.status=%s
AND req.status=%s
ORDER BY req.id desc""", ('proposal-book', CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING))
return res
def get_purchases(status):
if status in (CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER):
        # Include proposals with status 'on order' since they are
        # purchases too; it is helpful to display both categories in
        # the same 'purchase - on order' list in the menu.
res = run_sql("""SELECT ill_data.*, ill_cnt.cnt FROM
(SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.cost, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND ill.request_type in (%s, %s, %s)
AND ill.status in (%s, %s)) AS ill_data
LEFT JOIN (SELECT item_info, count(item_info) AS cnt
FROM crcILLREQUEST
WHERE request_type in (%s, %s, %s)
AND status not in (%s, %s, %s)
GROUP BY item_info) AS ill_cnt
ON ill_data.item_info = ill_cnt.item_info
ORDER BY ill_data.id desc""", ('acq-standard', 'acq-book',
'proposal-book',
CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER,
'acq-standard', 'acq-book',
'proposal-book',
CFG_BIBCIRCULATION_ACQ_STATUS_CANCELLED,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE))
else:
res = run_sql("""SELECT ill_data.*, ill_cnt.cnt FROM
(SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.cost, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND ill.request_type in (%s, %s)
AND ill.status=%s) AS ill_data
LEFT JOIN (SELECT item_info, count(item_info) AS cnt
FROM crcILLREQUEST
WHERE request_type in (%s, %s)
AND status!=%s
GROUP BY item_info) AS ill_cnt
ON ill_data.item_info = ill_cnt.item_info
ORDER BY ill_data.id desc""", ('acq-standard', 'acq-book', status,
'acq-standard', 'acq-book',
CFG_BIBCIRCULATION_ACQ_STATUS_CANCELLED))
return res
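get_purchases (like get_proposals above) decorates each request with a per-item count by LEFT JOINing against a derived table that aggregates once per item_info; the LEFT JOIN keeps requests that match nothing, with a NULL count. The shape of that join in a self-contained sqlite3 sketch (invented table names, for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE request (id INTEGER, item_info TEXT);
    CREATE TABLE copy (item_info TEXT);
    INSERT INTO request VALUES (1, 'book-a'), (2, 'book-b');
    INSERT INTO copy VALUES ('book-a'), ('book-a');
""")
# The derived table aggregates once per item_info; the LEFT JOIN then
# attaches the count, leaving NULL where no copies exist.
rows = conn.execute("""
    SELECT r.id, c.cnt
    FROM request r
    LEFT JOIN (SELECT item_info, count(*) AS cnt
               FROM copy
               GROUP BY item_info) c
    ON r.item_info = c.item_info
    ORDER BY r.id""").fetchall()  # [(1, 2), (2, None)]
```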
def search_ill_requests_title(title, date_from, date_to):
    tokens = title.split()
    tokens_query = ""
    params = []
    for token in tokens:
        tokens_query += " AND ill.item_info like %s "
        params.append('%' + token + '%')
    query = """SELECT ill.id, ill.id_crcBORROWER, bor.name,
                      ill.id_crcLIBRARY, ill.status,
                      DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
                      DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
                      DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
                      ill.item_info, ill.request_type
               FROM crcILLREQUEST ill, crcBORROWER bor
               WHERE ill.id_crcBORROWER=bor.id """
    query += tokens_query
    query += """ AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') >= %s
                 AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') <= %s
                 ORDER BY ill.id desc"""
    params += [date_from, date_to]
    # Bind every user-supplied value; the previous string interpolation
    # was vulnerable to SQL injection.
    return run_sql(query, tuple(params))
def search_ill_requests_id(reqid, date_from, date_to):
res = run_sql("""
SELECT ill.id, ill.id_crcBORROWER, bor.name,
ill.id_crcLIBRARY, ill.status,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
ill.item_info, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id
AND ill.id = %s
AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') >=%s
AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') <=%s
ORDER BY ill.id desc""", (reqid, date_from, date_to))
return res
def search_requests_cost(cost, date_from, date_to):
    # Bind the pattern and dates instead of interpolating them into the
    # query text; the previous version also left the dates unquoted.
    res = run_sql("""
        SELECT ill.id, ill.id_crcBORROWER, bor.name,
               ill.id_crcLIBRARY, ill.status,
               DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
               DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
               DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
               ill.item_info, ill.cost, ill.request_type, ''
        FROM crcILLREQUEST ill, crcBORROWER bor
        WHERE ill.id_crcBORROWER=bor.id
        AND upper(ill.cost) like upper(%s)
        AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') >= %s
        AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') <= %s
        ORDER BY ill.id desc
        """, ('%' + cost + '%', date_from, date_to))
return res
def search_requests_notes(notes, date_from, date_to):
    tokens = notes.split()
    tokens_query = ""
    params = []
    for token in tokens:
        tokens_query += " AND library_notes like %s "
        params.append('%' + token + '%')
    query = """
        SELECT ill.id, ill.id_crcBORROWER, bor.name,
               ill.id_crcLIBRARY, ill.status,
               DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
               DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
               DATE_FORMAT(ill.due_date,'%%Y-%%m-%%d'),
               ill.item_info, ill.cost, ill.request_type, ''
        FROM crcILLREQUEST ill, crcBORROWER bor
        WHERE ill.id_crcBORROWER=bor.id """
    query += tokens_query
    query += """ AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') >= %s
                 AND DATE_FORMAT(ill.request_date,'%%Y-%%m-%%d') <= %s
                 ORDER BY ill.id desc
                 """
    params += [date_from, date_to]
    # Bind every user-supplied value; the previous string interpolation
    # was vulnerable to SQL injection.
    return run_sql(query, tuple(params))
def get_ill_request_borrower_details(ill_request_id):
res = run_sql("""
SELECT ill.id_crcBORROWER, bor.name, bor.email, bor.mailbox,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
ill.item_info, ill.borrower_comments,
ill.only_this_edition, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id and ill.id=%s""", (ill_request_id, ))
if res:
return res[0]
else:
return None
def get_purchase_request_borrower_details(ill_request_id):
res = run_sql("""
SELECT ill.id_crcBORROWER, bor.name, bor.email, bor.mailbox,
DATE_FORMAT(ill.period_of_interest_from,'%%Y-%%m-%%d'),
DATE_FORMAT(ill.period_of_interest_to,'%%Y-%%m-%%d'),
ill.item_info, ill.borrower_comments,
ill.only_this_edition, ill.budget_code, ill.request_type
FROM crcILLREQUEST ill, crcBORROWER bor
WHERE ill.id_crcBORROWER=bor.id and ill.id=%s""", (ill_request_id, ))
if res:
return res[0]
else:
return None
def update_ill_request(ill_request_id, library_id, request_date,
expected_date, arrival_date, due_date, return_date,
status, cost, barcode, library_notes):
run_sql("""UPDATE crcILLREQUEST
SET id_crcLIBRARY=%s,
request_date=%s,
expected_date=%s,
arrival_date=%s,
due_date=%s,
return_date=%s,
status=%s,
cost=%s,
barcode=%s,
library_notes=%s
WHERE id=%s""",
(library_id, request_date, expected_date,
arrival_date, due_date, return_date, status, cost,
barcode, library_notes, ill_request_id))
def update_purchase_request(ill_request_id, library_id, request_date,
expected_date, arrival_date, due_date, return_date,
status, cost, budget_code, library_notes):
run_sql("""UPDATE crcILLREQUEST
SET id_crcLIBRARY=%s,
request_date=%s,
expected_date=%s,
arrival_date=%s,
due_date=%s,
return_date=%s,
status=%s,
cost=%s,
budget_code=%s,
library_notes=%s
WHERE id=%s""",
(library_id, request_date, expected_date,
arrival_date, due_date, return_date, status, cost,
budget_code, library_notes, ill_request_id))
def update_ill_request_status(ill_request_id, new_status):
run_sql("""UPDATE crcILLREQUEST
SET status=%s
WHERE id=%s""", (new_status, ill_request_id))
def get_ill_request_notes(ill_request_id):
res = run_sql("""SELECT library_notes
FROM crcILLREQUEST
WHERE id=%s""",
(ill_request_id, ))
if res:
return res[0][0]
else:
return None
def update_ill_request_notes(ill_request_id, library_notes):
run_sql("""UPDATE crcILLREQUEST
SET library_notes=%s
WHERE id=%s""", (str(library_notes), ill_request_id))
def update_ill_request_item_info(ill_request_id, item_info):
run_sql("""UPDATE crcILLREQUEST
SET item_info=%s
WHERE id=%s""", (str(item_info), ill_request_id))
def get_ill_borrower(ill_request_id):
res = run_sql("""SELECT id_crcBORROWER
FROM crcILLREQUEST
WHERE id=%s""", (ill_request_id, ))
if res:
return res[0][0]
else:
return None
def get_ill_barcode(ill_request_id):
res = run_sql("""SELECT barcode
FROM crcILLREQUEST
WHERE id=%s""", (ill_request_id, ))
if res:
return res[0][0]
else:
return None
def update_ill_loan_status(borrower_id, barcode, return_date, loan_type):
run_sql("""UPDATE crcLOAN
SET status = %s,
returned_on = %s
WHERE id_crcBORROWER = %s
AND barcode = %s
AND type = %s """,
(CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED,
return_date, borrower_id, barcode, loan_type))
def get_ill_requests_details(borrower_id):
"""
    This function is also used by the Aleph Service for the display of ILLs
    of the user for the termination sheet.
"""
res = run_sql("""SELECT id, item_info, id_crcLIBRARY,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
DATE_FORMAT(expected_date,'%%Y-%%m-%%d'),
DATE_FORMAT(arrival_date,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
status, library_notes, request_type
FROM crcILLREQUEST
WHERE id_crcBORROWER=%s
AND status in (%s, %s, %s)
AND request_type in (%s, %s)
ORDER BY FIELD(status, %s, %s, %s)
""", (borrower_id, CFG_BIBCIRCULATION_ILL_STATUS_NEW,
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED,
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN,
'article', 'book',
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED))
return res
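The ORDER BY FIELD(status, ...) clause above is MySQL-specific: it sorts rows by the position of status in the given list, so 'on loan' requests come first. A portable way to express the same ranking is a CASE expression; a self-contained sqlite3 sketch (invented table, for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ill (id INTEGER, status TEXT);
    INSERT INTO ill VALUES (1, 'new'), (2, 'on loan'), (3, 'requested');
""")
# FIELD(status, 'on loan', 'new', 'requested') in MySQL orders rows by
# the position of status in that list; CASE expresses the same ranking.
rows = conn.execute("""
    SELECT id, status
    FROM ill
    ORDER BY CASE status
             WHEN 'on loan' THEN 1
             WHEN 'new' THEN 2
             WHEN 'requested' THEN 3
             ELSE 4 END""").fetchall()
```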
def get_proposal_requests_details(borrower_id):
res = run_sql("""SELECT id, item_info, id_crcLIBRARY,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
DATE_FORMAT(expected_date,'%%Y-%%m-%%d'),
DATE_FORMAT(arrival_date,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
status, library_notes, request_type
FROM crcILLREQUEST
WHERE id_crcBORROWER=%s
AND status in (%s, %s)
AND request_type = %s
""", (borrower_id, CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE,
'proposal-book'))
return res
def bor_ill_historical_overview(borrower_id):
res = run_sql("""SELECT id, item_info, id_crcLIBRARY,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
DATE_FORMAT(expected_date,'%%Y-%%m-%%d'),
DATE_FORMAT(arrival_date,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
status, library_notes, request_type
FROM crcILLREQUEST
WHERE id_crcBORROWER=%s
AND (status=%s OR status=%s)
AND request_type in (%s, %s)
""", (borrower_id, CFG_BIBCIRCULATION_ILL_STATUS_RETURNED,
CFG_BIBCIRCULATION_ILL_STATUS_RECEIVED,
'article', 'book'))
return res
def bor_proposal_historical_overview(borrower_id):
res = run_sql("""SELECT id, item_info, id_crcLIBRARY,
DATE_FORMAT(request_date,'%%Y-%%m-%%d'),
DATE_FORMAT(expected_date,'%%Y-%%m-%%d'),
DATE_FORMAT(arrival_date,'%%Y-%%m-%%d'),
DATE_FORMAT(due_date,'%%Y-%%m-%%d'),
status, library_notes, request_type
FROM crcILLREQUEST
WHERE id_crcBORROWER=%s
AND (status=%s OR status=%s)
AND request_type = %s
""", (borrower_id, CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER,
CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED,
'proposal-book'))
return res
def get_ill_notes(ill_id):
res = run_sql("""SELECT library_notes
FROM crcILLREQUEST
WHERE id=%s""",
(ill_id, ))
if res:
return res[0][0]
else:
return None
def update_ill_notes(ill_id, ill_notes):
run_sql("""UPDATE crcILLREQUEST
SET library_notes=%s
WHERE id=%s """, (str(ill_notes), ill_id))
def get_ill_book_info(ill_request_id):
res = run_sql("""SELECT item_info
FROM crcILLREQUEST
WHERE id=%s""",
(ill_request_id, ))
if res:
return res[0][0]
else:
return None
def delete_brief_format_cache(recid):
run_sql("""DELETE FROM bibfmt
WHERE format='HB'
AND id_bibrec=%s""", (recid,))
diff --git a/invenio/legacy/bibcirculation/scripts/bibcircd.py b/invenio/legacy/bibcirculation/scripts/bibcircd.py
index 16e853a39..b40ae4c5f 100644
--- a/invenio/legacy/bibcirculation/scripts/bibcircd.py
+++ b/invenio/legacy/bibcirculation/scripts/bibcircd.py
@@ -1,32 +1,32 @@
#!@PYTHON@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
try:
from invenio.flaskshell import *
- from invenio.bibcirculation_daemon import main
+ from invenio.legacy.bibcirculation.daemon import main
except ImportError, e:
print "Error: %s" % e
import sys
sys.exit(1)
main()
diff --git a/invenio/legacy/bibcirculation/templates.py b/invenio/legacy/bibcirculation/templates.py
index f189ccd26..ad22ea91f 100644
--- a/invenio/legacy/bibcirculation/templates.py
+++ b/invenio/legacy/bibcirculation/templates.py
@@ -1,16077 +1,16077 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
""" Templates for the bibcirculation module """
__revision__ = "$Id$"
import datetime
import cgi
from time import localtime
import invenio.utils.date as dateutils
from invenio.utils.url import create_html_link
from invenio.base.i18n import gettext_set_language
from invenio.legacy.search_engine import get_fieldvalues
from invenio.config import CFG_SITE_URL, CFG_SITE_LANG, \
CFG_CERN_SITE, CFG_SITE_SECURE_URL, CFG_SITE_RECORD, \
CFG_SITE_NAME
from invenio.base.i18n import gettext_set_language
-import invenio.bibcirculation_dblayer as db
-from invenio.bibcirculation_utils import get_book_cover, \
+import invenio.legacy.bibcirculation.db_layer as db
+from invenio.legacy.bibcirculation.utils import get_book_cover, \
book_information_from_MARC, \
book_title_from_MARC, \
renew_loan_for_X_days, \
get_item_info_for_search_result, \
all_copies_are_missing, \
is_periodical, \
looks_like_dictionary
-from invenio.bibcirculation_config import \
+from invenio.legacy.bibcirculation.config import \
CFG_BIBCIRCULATION_ITEM_LOAN_PERIOD, \
CFG_BIBCIRCULATION_COLLECTION, \
CFG_BIBCIRCULATION_LIBRARY_TYPE, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LIBRARIAN_EMAIL, \
CFG_BIBCIRCULATION_LOANS_EMAIL, \
CFG_BIBCIRCULATION_ILLS_EMAIL, \
CFG_BIBCIRCULATION_ITEM_STATUS, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED, \
CFG_BIBCIRCULATION_ILL_STATUS_NEW, \
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED, \
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ILL_STATUS_RETURNED, \
CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_ILL_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_ITEM_LOAN_PERIOD, \
CFG_BIBCIRCULATION_ACQ_STATUS_NEW, \
CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_ACQ_STATUS_PARTIAL_RECEIPT, \
CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_ACQ_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_ACQ_TYPE, \
CFG_BIBCIRCULATION_ACQ_STATUS, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED, \
CFG_BIBCIRCULATION_PROPOSAL_TYPE, \
CFG_BIBCIRCULATION_PROPOSAL_STATUS
def load_menu(ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
_MENU_ = """
<div>
<map name="Navigation_Bar" id="cdlnav">
<div id="bibcircmenu" class="cdsweb">
<!-- <h2><a name="localNavLinks">%(links)s:</a></h2> -->
<ul>
<!-- <li>
<a href="%(url)s/admin2/bibcirculation">%(Home)s</a>
</li> -->
<li>
<a href="%(url)s/admin2/bibcirculation/loan_on_desk_step1?ln=%(ln)s">%(Loan)s</a>
</li>
<li>
<a href="%(url)s/admin2/bibcirculation/loan_return?ln=%(ln)s">%(Return)s</a>
</li>
<li>
<a href="%(url)s/admin2/bibcirculation/borrower_search?redirect_to_new_request=yes&ln=%(ln)s">%(Request)s</a>
</li>
<li>
<a href="%(url)s/admin2/bibcirculation/borrower_search?ln=%(ln)s">%(Borrowers)s</a>
</li>
<li>
<a href="%(url)s/admin2/bibcirculation/item_search?ln=%(ln)s">%(Items)s</a>
</li>
""" % {'url': CFG_SITE_URL, 'links': _("Main navigation links"),
'Home': _("Home"), 'Loan': _("Loan"), 'Return': _("Return"),
'Request': _("Request"), 'Borrowers': _("Borrowers"),
'Items': _("Items"), 'ln': ln}
_MENU_ += """
<li class="hassubmenu"> <a href="#">%(Lists)s</a>
<ul class="submenu">
<li><a href="%(url)s/admin2/bibcirculation/all_loans?ln=%(ln)s">%(Last_loans)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/all_expired_loans?ln=%(ln)s">%(Overdue_loans)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/get_pending_requests?ln=%(ln)s">%(Items_on_shelf_with_holds)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/get_waiting_requests?ln=%(ln)s">%(Items_on_loan_with_holds)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/get_expired_loans_with_waiting_requests?ln=%(ln)s">%(Overdue_loans_with_holds)s</a></li>
</ul>
</li>
<li class="hassubmenu"> <a href="#">%(Others)s</a>
<ul class="submenu">
<li> <a href="#">%(Libraries)s</a>
<ul class="subsubmenu">
<li><a href="%(url)s/admin2/bibcirculation/search_library_step1?ln=%(ln)s">%(Search)s...</a></li>
<li><a href="%(url)s/admin2/bibcirculation/add_new_library_step1?ln=%(ln)s">%(Add_new_library)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/update_library_info_step1?ln=%(ln)s">%(Update_info)s</a></li>
</ul>
</li>
<li> <a href="#">%(Vendors)s</a>
<ul class="subsubmenu">
<li><a href="%(url)s/admin2/bibcirculation/search_vendor_step1?ln=%(ln)s">%(Search)s...</a></li>
<li><a href="%(url)s/admin2/bibcirculation/add_new_vendor_step1?ln=%(ln)s">%(Add_new_vendor)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/update_vendor_info_step1?ln=%(ln)s">%(Update_info)s</a></li>
</ul>
</li>
</ul>
</li>
""" % {'url': CFG_SITE_URL, 'Lists': _("Loan Lists"),
'Last_loans': _("Last loans"),
'Overdue_loans': _("Overdue loans"),
'Items_on_shelf_with_holds': _("Items on shelf with holds"),
'Items_on_loan_with_holds': _("Items on loan with holds"),
'Overdue_loans_with_holds': _("Overdue loans with holds"),
'Others': _("Others"), 'Libraries': _("Libraries"),
'Search': _("Search"), 'Add_new_library': _("Add new library"),
'Update_info': _("Update info"), 'Vendors': _("Vendors"),
'Add_new_vendor': _("Add new vendor"), 'ln': ln}
_MENU_ += """
<li class="hassubmenu"> <a href="#">%(ILL)s<!--Inter Library Loan--></a>
<ul class="submenu">
<li><a href="%(url)s/admin2/bibcirculation/register_ill_book_request?ln=%(ln)s">%(Register_Book_request)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/register_ill_article_request_step1">%(Register_Article)s request</a></li>
<li><a href="%(url)s/admin2/bibcirculation/register_purchase_request_step1?ln=%(ln)s">%(Register_Purchase_request)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/ill_search?ln=%(ln)s">%(Search)s...</a></li>
</ul>
</li>
<li class="hassubmenu"> <a href="#">%(Lists)s</a>
<ul class="submenu">
<li><a href="%(url)s/admin2/bibcirculation/list_ill_request?status=new&ln=%(ln)s">%(ILL)s - %(ill-new)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_ill_request?status=requested&ln=%(ln)s">%(ILL)s - %(Requested)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_ill_request?status=on loan&ln=%(ln)s">%(ILL)s - %(On_loan)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_purchase?status=%(acq-new)s&ln=%(ln)s">%(Purchase)s - %(acq-new)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_purchase?status=%(on_order)s&ln=%(ln)s">%(Purchase)s - %(on_order)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_proposal?status=%(proposal-new)s&ln=%(ln)s">%(Proposal)s - %(proposal-new)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_proposal?status=%(proposal-put_aside)s&ln=%(ln)s">%(Proposal)s - %(proposal-put_aside)s</a></li>
<li><a href="%(url)s/admin2/bibcirculation/list_proposal?status=%(requests-putaside)s&ln=%(ln)s">%(Proposal)s - %(requests-putaside)s</a></li>
</ul>
</li>
<li class="hassubmenu">
<a href="#">%(Help)s</a>
<ul class="submenu">
<li><a href="%(url)s/help/admin/bibcirculation-admin-guide" target="_blank">%(Admin_guide)s</a></li>
<!-- <li><a href="%(url)s/admin2/bibcirculation/help_contactsupport">%(Contact_Support)s</a></li> -->
</ul>
</li>
</ul>
<div class="clear"></div>
</div>
</map>
</div>
""" % {'url': CFG_SITE_URL, 'ILL': _("ILL"),
'Register_Book_request': _("Register Book request"),
'Register_Article': _("Register Article"),
'Register_Purchase_request': _("Register purchase request"),
'Search': _("Search"),
'Lists': _("ILL Lists"),
'Purchase': _("Purchase"),
'Proposal': _("Proposal"),
'ill-new': _(CFG_BIBCIRCULATION_ILL_STATUS_NEW),
'acq-new': _(CFG_BIBCIRCULATION_ACQ_STATUS_NEW),
'on_order': _(CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER),
'Requested': _(CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED),
'On_loan': _(CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN),
'proposal-new': _(CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW),
'proposal-put_aside': _(CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE),
'requests-putaside': "requests-putaside",
'Help': _("Help"),
'Admin_guide': _("Admin guide"),
'Contact_Support': _("Contact Support"),
'ln': ln}
return _MENU_
class Template:
"""
Templates for the BibCirculation module.
The template methods are grouped into logical
categories ('User Pages', 'Loans, Returns and Loan requests',
'ILLs', 'Libraries', 'Vendors', ...).
The calling methods in bibcirculation and adminlib follow
the same ordering.
This ordering should be maintained and, where necessary, improved
for readability as additional methods are added.
When applicable, methods should be renamed, refactored and
appropriate documentation added.
"""
def tmpl_infobox(self, infos, ln=CFG_SITE_LANG):
"""
Display len(infos) information fields
@param infos: list of strings
@param ln: language
@return html output
"""
_ = gettext_set_language(ln)
if not((type(infos) is list) or (type(infos) is tuple)):
infos = [infos]
infobox = ""
for info in infos:
infobox += "<div class=\"infobox\">"
lines = info.split("\n")
for line in lines[0:-1]:
infobox += line + "<br />\n"
infobox += lines[-1] + "</div><br />\n"
return infobox
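``tmpl_infobox`` above simply wraps each message in an infobox ``div`` and turns internal newlines into ``<br />`` line breaks. A minimal standalone sketch of that assembly (the function name is illustrative, not part of the module):

```python
# Minimal standalone sketch (illustrative helper name) of the infobox
# assembly in tmpl_infobox: each message becomes a <div class="infobox">,
# with internal newlines rendered as <br /> line breaks.
def render_infoboxes(infos):
    # Accept a single string as well as a list/tuple of strings.
    if not isinstance(infos, (list, tuple)):
        infos = [infos]
    out = ""
    for info in infos:
        out += '<div class="infobox">'
        out += "<br />\n".join(info.split("\n"))
        out += "</div><br />\n"
    return out
```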
def tmpl_display_infos(self, infos, ln=CFG_SITE_LANG):
"""
Returns a page where the only content is infoboxes.
@param infos: messages to be displayed
@type infos: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """ <br /> """
if type(infos) is not list or len(infos) == 0:
out += """<div class="infobox">"""
out += _("No messages to be displayed")
out += """</div> <br /> """
else:
for info in infos:
out += """<div class="infobox">"""
out += info
out += """</div> <br /> """
return out
def tmpl_holdings_information(self, recid, req, holdings_info,
ln=CFG_SITE_LANG):
"""
This template is used in the user interface. In this template
it is possible to see all details (loan period, number of copies, location, etc.)
about a book.
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param holdings_info: book's information (all copies)
@type holdings_info: list
"""
- from invenio.bibcirculationadminlib import is_adminuser
+ from invenio.legacy.bibcirculation.adminlib import is_adminuser
(auth_code, _auth_message) = is_adminuser(req)
_ = gettext_set_language(ln)
if not book_title_from_MARC(recid):
out = """<div align="center"
<div class="bibcircinfoboxmsg">%s</div>
""" % (_("This record does not exist."))
return out
elif not db.has_copies(recid):
message = _("This record has no copies.")
if auth_code == 0:
new_copy_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/add_new_copy_step3',
{'recid': recid, 'ln': ln},
_("Add a new copy"))
message += ' ' + new_copy_link
out = """<div align="center"
<div class="bibcircinfoboxmsg">%s</div>
""" % (message)
return out
# verify if all copies are missing
elif all_copies_are_missing(recid):
ill_link = """<a href='%(url)s/ill/book_request_step1?%(ln)s'>%(ILL_services)s</a>
""" % {'url': CFG_SITE_URL, 'ln': ln,
'ILL_services': _("ILL services")}
out = """<div align="center"
<div class="bibcircinfoboxmsg">%(message)s.</div>
""" % {'message': _('All the copies of %(strong_tag_open)s%(title)s%(strong_tag_close)s are missing. You can request a copy using %(strong_tag_open)s%(ill_link)s%(strong_tag_close)s') % {'strong_tag_open': '<strong>', 'strong_tag_close': '</strong>', 'title': book_title_from_MARC(recid), 'ill_link': ill_link}}
return out
# verify if there are no copies
elif not holdings_info:
out = """<div align="center"
<div class="bibcircinfoboxmsg">%s</div>
""" % (_("This item has no holdings."))
return out
out = """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
"""
if is_periodical(recid):
out += """
$(document).ready(function(){
$("#table_holdings")
.tablesorter({sortList: [[4,1]],
widthFixed: true,
widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false, size: 40});
});
</script>
"""
else:
out += """
$(document).ready(function(){
$("#table_holdings")
.tablesorter({sortList: [[6,1],[1,0]],
widthFixed: true,
widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
"""
out += """
<table id="table_holdings" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Options"), _("Library"), _("Collection"),
_("Location"), _("Description"), _("Loan period"),
_("Status"), _("Due date"), _("Barcode"))
for (barcode, library, collection, location, description,
loan_period, status, due_date) in holdings_info:
if loan_period == 'Reference':
request_button = '-'
else:
request_button = """
<input type=button
onClick="location.href='%s/%s/%s/holdings/request?barcode=%s&ln=%s'"
value='%s' class="bibcircbutton" onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'">
""" % (CFG_SITE_URL, CFG_SITE_RECORD, recid, barcode, ln, _("Request"))
if status in (CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER,
'claimed'):
expected_arrival_date = db.get_expected_arrival_date(barcode)
if expected_arrival_date != '':
status = status + ' - ' + expected_arrival_date
if status != 'missing':
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align='center'>%s</td>
</tr>
""" % (request_button, library, collection or '-', location,
description, loan_period, status, due_date or '-', barcode)
if auth_code != 0:
bibcirc_link = ''
else:
bibcirc_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
_("See this book on BibCirculation"))
out += """
</tbody>
</table>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
"""
if is_periodical(recid):
out += """
<select class="pagesize">
<option value="10">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40" selected="selected">40</option>
</select>
"""
else:
out += """
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
"""
out += """
</form>
</div>
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (bibcirc_link)
return out
def tmpl_book_proposal_information(self, recid, msg, ln=CFG_SITE_LANG):
"""
This template is used in the user interface. It is used to display the
message regarding a 'proposal' of a book when the corresponding metadata
record has been extracted from Amazon and no one has yet suggested the
book for acquisition (no copies of the book exist yet).
@param msg: information about book proposal mechanism
@type msg: string
"""
_ = gettext_set_language(ln)
out = """<div align="center"
<div class="bibcircinfoboxmsg">%s</div>
<br />
<br />
""" % (msg)
out += """<form>
<input type="button" value='%s' onClick="history.go(-1)" class="formbutton">
<input type="button" value='%s' onClick="location.href=
'%s/%s/%s/holdings/request?ln=%s&act=%s'" class="formbutton">
</form>""" % (_("Back"), _("Suggest"), CFG_SITE_URL, CFG_SITE_RECORD,
recid, ln, "pr")
return out
def tmpl_book_not_for_loan(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = """<div align="center"
<div class="bibcircinfoboxmsg">%s</div>
""" % (_("This item is not for loan."))
return message
def tmpl_message_send_already_requested(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("You already have a request on, or are in possession of this document.")
return message
def tmpl_message_request_send_ok_cern(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("Your request has been registered and the document will be sent to you via internal mail.")
return message
def tmpl_message_request_send_ok_other(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("Your request has been registered.")
return message
def tmpl_message_request_send_fail_cern(self, custom_msg='', ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("It is not possible to validate your request. ")
message += custom_msg
message += _("Please contact %(librarian_email)s") \
% {'librarian_email': CFG_BIBCIRCULATION_LIBRARIAN_EMAIL}
return message
def tmpl_message_request_send_fail_other(self, custom_msg='', ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("It is not possible to validate your request. ")
message += custom_msg
message += _("Please contact %(librarian_email)s") \
% {'librarian_email': CFG_BIBCIRCULATION_LIBRARIAN_EMAIL}
return message
def tmpl_message_proposal_send_ok_cern(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("Thank you for your suggestion. We will get back to you shortly.")
return message
def tmpl_message_proposal_send_ok_other(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("Thank you for your suggestion. We will get back to you shortly.")
return message
def tmpl_message_purchase_request_send_ok_other(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _("Your purchase request has been registered.")
return message
def tmpl_message_sever_busy(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
message = _('Server busy. Please, try again in a few seconds.')
return message
###
### The two template methods below correspond to the user pages of bibcirculation.
###
def tmpl_yourloans(self, loans, requests, proposals, borrower_id,
infos, ln=CFG_SITE_LANG):
"""
When a user is logged in, in the section 'yourloans', it is
possible to check their loans, loan requests and book proposals.
It is also possible to renew a single loan or all loans.
@param infos: additional information in the infobox
@param ln: language
"""
_ = gettext_set_language(ln)
renew_all_link = create_html_link(CFG_SITE_SECURE_URL +
'/yourloans/display',
{'borrower_id': borrower_id, 'action': 'renew_all'},
(_("Renew all loans")))
loanshistoricaloverview_link = create_html_link(CFG_SITE_SECURE_URL +
'/yourloans/loanshistoricaloverview',
{'ln': ln},
(_("Loans - historical overview")))
out = self.tmpl_infobox(infos, ln)
if len(loans) == 0:
out += """
<div class="bibcirctop_bottom">
<br />
<table class="bibcirctable_contents">
<td align="center" class="bibcirccontent">%s</td>
</table>
<br />
""" % (_("You don't have any book on loan."))
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_loans').tablesorter()
});
</script>
<table class="tablesortermedium" id="table_loans"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Item"),
_("Loaned on"),
_("Due date"),
_("Action(s)"))
for(recid, barcode, loaned_on, due_date, loan_type) in loans:
record_link = "<a href=" + CFG_SITE_URL + "/%s/%s>" % (CFG_SITE_RECORD, recid) + \
(book_title_from_MARC(recid)) + "</a>"
if loan_type == 'ill':
renew_link = '-'
else:
renew_link = create_html_link(CFG_SITE_SECURE_URL +
'/yourloans/display',
{'barcode': barcode, 'action': 'renew'},
(_("Renew")))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (record_link,
loaned_on,
due_date,
renew_link)
out += """ </tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td width="430"></td>
<td class="bibcirccontent" width="700">%s</td>
</tr>
</table>
<br />
""" % (renew_all_link)
if len(requests) == 0:
out += """
<h1 class="headline">%s</h1>
<br />
<table class="bibcirctable_contents">
<td align="center" class="bibcirccontent">%s</td>
</table>
<br />
""" % (_("Your Requests"),
_("You don't have any request (waiting or pending)."))
else:
out += """
<h1 class="headline">%s</h1>
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter()
});
</script>
<table class="tablesortermedium" id="table_requests"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Your Requests"),
_("Item"),
_("Request date"),
_("Status"),
_("Action(s)"))
for(request_id, recid, request_date, status) in requests:
record_link = "<a href=" + CFG_SITE_URL + "/%s/%s?ln=%s>" % (CFG_SITE_RECORD, recid, ln) + \
(book_title_from_MARC(recid)) + "</a>"
cancel_request_link = create_html_link(CFG_SITE_URL +
'/yourloans/display',
{'request_id': request_id,
'action': 'cancel',
'ln': ln},
(_("Cancel")))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (record_link, request_date,
status, cancel_request_link)
out += """</tbody>
</table>
<br />"""
if len(proposals) == 0:
out += """
<h1 class="headline">%s</h1>
<br />
<table class="bibcirctable_contents">
<td align="center" class="bibcirccontent">%s</td>
</table>
<br /> <br />
<hr>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="70">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button onClick="location.href='%s'" value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
""" % (_("Your Proposals"),
_("You did not propose any acquisitions."),
loanshistoricaloverview_link,
CFG_SITE_URL, _("Back to home"))
else:
out += """
<h1 class="headline">%s</h1>
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter()
});
</script>
<table class="tablesortermedium" id="table_requests"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Your Proposals under review"),
_("Item"),
_("Proposal date"))
for(request_id, recid, request_date, status) in proposals:
record_link = "<a href=" + CFG_SITE_URL + "/%s/%s?ln=%s>" % (CFG_SITE_RECORD, recid, ln) + \
(book_title_from_MARC(recid)) + "</a>"
out += """
<tr>
<td>%s</td>
<td>%s</td>
</tr>
""" % (record_link, request_date)
out += """ </tbody>
</table>
<br />
<hr>
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="70">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s'"
value='%s'
class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (loanshistoricaloverview_link,
CFG_SITE_URL,
_("Back to home"))
return out
def tmpl_loanshistoricaloverview(self, result, ln=CFG_SITE_LANG):
"""
In the section 'yourloans' it is possible to see the historical overview of the loans
of the user who is logged in.
@param result: All loans whose status = CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED
@param ln: language
"""
_ = gettext_set_language(ln)
out = """<div class="bibcirctop_bottom">
<br /> <br />
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$('#table_hist').tablesorter()
});
</script>
<table class="tablesortermedium" id="table_hist"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Item"),
_("Loaned"),
_("Returned"),
_("Renewals"))
for(recid, loaned_on, returned_on, nb_renewals) in result:
record_link = "<a href=" + CFG_SITE_URL + "/%s/%s>" % (CFG_SITE_RECORD, recid) + \
(book_title_from_MARC(recid)) + "</a>"
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (record_link, loaned_on,
returned_on, nb_renewals)
out += """</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("Back"))
return out
###
### Loans, Loan Requests, Loan Returns related templates.
###
def tmpl_new_request(self, recid, barcode, action="borrowal", ln=CFG_SITE_LANG):
"""
This template is used when a user wants to request a copy of a book.
If a copy is available (action is 'borrowal'), the 'period of interest' is solicited.
If not AND the book record is put up for proposal (action is 'proposal'),
user's comments are solicited.
@param recid: recID - Invenio record identifier
@param barcode: book copy's barcode
@param action: 'borrowal'/'proposal'
@param ln: language
"""
_ = gettext_set_language(ln)
today = datetime.date.today()
gap = datetime.timedelta(days=180)
gap_1yr = datetime.timedelta(days=360)
more_6_months = (today + gap).strftime('%Y-%m-%d')
more_1_year = (today + gap_1yr).strftime('%Y-%m-%d')
out = """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<form name="request_form" action="%s/%s/%s/holdings/send" method="post" >
<br />
<div align=center>
""" % (CFG_SITE_URL, CFG_SITE_URL, CFG_SITE_URL,
CFG_SITE_RECORD, recid)
if action == "proposal":
out += """ <table class="bibcirctable_contents" align=center>
<tr>
<td class="bibcirctableheader" align=center>"""
out += _("Why do you suggest this book for the library?")
out += """ </td>
</tr>
</table>
<br />
<table align=center class="tablesorterborrower" width="100" border="0" cellspacing="1" align="center">
<tr align=center>
<td>
<textarea align=center rows="5" cols="43" name="remarks" id="remarks"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
<input type=hidden size="12" name="period_from" value="%s">
<input type=hidden size="12" name="period_to" value="%s">
""" % (today, more_1_year)
out += """<input type=hidden name="act" value="%s">""" % ("pr")
else:
out += """<table class="bibcirctable_contents" align=center>
<tr>
<td class="bibcirctableheader" align=center>%s</td>
</tr>
</table>
<br/>
<table align=center class="tablesorterborrower" width="100" border="0" cellspacing="1" align="center">
<tr align=center>
<th align=center>%s</th>
<td>
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1" name="period_from" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr align=center>
<th align=center>%s</th>
<td>
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2" name="period_to" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>""" % (_("Enter your period of interest"),
_("From"), CFG_SITE_URL, today, _("To"),
CFG_SITE_URL, more_6_months,)
out += """</table>
</div>
<br />
<br />
<table class="bibcirctable_contents">
<input type=hidden name=barcode value='%s'>
<tr>
<td align="center">
<input type=button value='%s' onClick="history.go(-1)" class="formbutton">
<input type="submit" name="submit_button" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
""" % (barcode, _("Back"), _("Confirm"))
return out
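The defaults computed at the top of ``tmpl_new_request`` offer a period of interest starting today and ending 180 days later for a borrowal (360 days for a proposal). A small standalone sketch of that date arithmetic (the helper name is illustrative):

```python
import datetime

# Standalone sketch (illustrative helper name) of the period-of-interest
# defaults in tmpl_new_request: a window from `start` to `start + days`,
# formatted as YYYY-MM-DD strings. The template uses 180 days for a
# borrowal and 360 days for a proposal.
def interest_window(days, start=None):
    start = start or datetime.date.today()
    end = start + datetime.timedelta(days=days)
    return start.strftime('%Y-%m-%d'), end.strftime('%Y-%m-%d')
```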
def tmpl_new_request_send(self, message, ln=CFG_SITE_LANG):
"""
This template is used in the user interface to display a confirmation message
when a copy of a book is requested.
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
</table>
<br />
<br />
<table class="bibcirctable">
<td>
<input type=button onClick="location.href='%s'" value='%s' class='formbutton'>
</td>
</table>
<br />
<br />
""" % (message,
_("You can see your library account %(x_url_open)shere%(x_url_close)s."
% {'x_url_open': '<a href="' + CFG_SITE_URL + \
'/yourloans/display' + '">', 'x_url_close': '</a>'}),
CFG_SITE_URL,
_("Back to home"))
return out
def tmpl_book_proposal_send(self, ln=CFG_SITE_LANG):
"""
This template is used in the user interface to display a confirmation message
when a book is proposed for acquisition.
"""
_ = gettext_set_language(ln)
message = "Thank you for your suggestion."
out = """
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
</table>
<br />
<br />
<table class="bibcirctable">
<td>
<input type=button onClick="location.href='%s'" value='%s' class='formbutton'>
</td>
</table>
<br />
<br />
""" % (message, CFG_SITE_URL, _("Back to home"))
return out
def tmpl_get_pending_requests(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/js/tablesorter/themes/blue/style.css"); </style>
<style type="text/css"> @import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$("#table_all_loans")
.tablesorter({sortList: [[4,0], [0,0]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
<script type="text/javascript">
function confirmation(rqid) {
var answer = confirm("%s")
if (answer){
window.location = "%s/admin2/bibcirculation/get_pending_requests?request_id="+rqid;
}
else{
alert("%s")
}
}
</script>
<br />
<div class="bibcircbottom">
""" % (_("Delete this request?"), CFG_SITE_URL,
_("Request not deleted."))
if len(result) == 0:
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s' onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("No more requests are pending."),
_("Back"))
else:
out += """
<form name="borrower_form" action="%s/admin2/bibcirculation/all_loans" method="get" >
<br />
<table id="table_all_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (CFG_SITE_URL,
_("Name"),
_("Item"),
_('Library'),
_("Location"),
_("Vol."),
_("Ed."),
_("From"),
_("To"),
_("Request date"),
_("Actions"))
for (request_id, recid, barcode, name, borrower_id, library, location,
date_from, date_to, request_date) in result:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
volume = db.get_item_description(barcode)
edition = get_fieldvalues(recid, "250__a")
if edition == []:
edition = ''
else:
edition = edition[0]
out += """
<tr>
<td width='150'>%s</td>
<td width='250'>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align='center'>
<input type="button" value='%s' style="background: url(/img/dialog-cancel.png)
no-repeat #8DBDD8; width: 75px; text-align: right;"
onClick="confirmation(%s)"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
<input type="button" value='%s' class="bibcircbutton"
style="background: url(/img/dialog-yes.png) no-repeat #8DBDD8;
width: 150px; text-align: right;"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
onClick="location.href='%s/admin2/bibcirculation/create_loan?ln=%s&request_id=%s&recid=%s&borrower_id=%s'">
</td>
</tr>
""" % (borrower_link,
title_link,
library,
location,
volume,
edition,
date_from,
date_to,
request_date,
_("Delete"),
request_id,
_("Create loan"),
CFG_SITE_URL, ln,
request_id,
recid,
borrower_id)
out += """
</tbody>
</table>
</form>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
"""
out += """
<div class="back" style="position: relative; top: 5px;">
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
value='%s'
onClick="history.go(-1)"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
</div>
""" % (_("Back"))
return out
def tmpl_get_waiting_requests(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/js/tablesorter/themes/blue/style.css"); </style>
<style type="text/css"> @import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$("#table_all_loans")
.tablesorter({sortList: [[4,0], [0,0]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
<br />
<div class="bibcircbottom">
"""
if len(result) == 0:
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s' onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("No more requests are pending."),
_("Back"))
else:
out += """
<form name="borrower_form" action="%s/admin2/bibcirculation/all_loans" method="get" >
<br />
<table id="table_all_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (CFG_SITE_URL,
_("Name"),
_("Item"),
_('Library'),
_("Location"),
_("From"),
_("To"),
_("Request date"),
_("Options"))
out += """
<script type="text/javascript">
function confirmation(rqid) {
var answer = confirm("%s")
if (answer){
window.location = "%s/admin2/bibcirculation/get_waiting_requests?request_id="+rqid;
}
else{
alert("%s")
}
}
</script>
""" % (_("Delete this request?"), CFG_SITE_URL, _("Request not deleted."))
for (request_id, recid, _barcode, name, borrower_id, library, location,
date_from, date_to, request_date) in result:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align="center">
<input type="button" value='%s' style="background: url(/img/dialog-cancel.png)
no-repeat #8DBDD8; width: 75px; text-align: right;"
onClick="confirmation(%s)"
class="bibcircbutton">
<input type=button
style="background: url(/img/dialog-yes.png) no-repeat #8DBDD8;
width: 150px; text-align: right;"
onClick="location.href='%s/admin2/bibcirculation/create_loan?ln=%s&request_id=%s&recid=%s&borrower_id=%s'"
value='%s' class="bibcircbutton">
</td>
</tr>
""" % (borrower_link,
title_link,
library,
location,
date_from,
date_to,
request_date,
_("Cancel"),
request_id,
CFG_SITE_URL, ln,
request_id,
recid,
borrower_id,
_("Create Loan"))
out += """
</tbody>
</table>
</form>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
"""
out += """
<div class="back" style="position: relative; top: 5px;">
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s' onClick="history.go(-1)"
class="formbutton"></td>
</tr>
</table>
<br />
<br />
</form>
</div>
</div>
""" % (_("Back"))
return out
def tmpl_loan_return(self, infos, ln=CFG_SITE_LANG):
"""
Template for the admin interface. Used when a book is returned.
@param ln: language
"""
out = self.tmpl_infobox(infos, ln)
_ = gettext_set_language(ln)
out += load_menu(ln)
out += """
<form name="return_form"
action="%s/admin2/bibcirculation/loan_return_confirm?ln=%s" method="post">
<div class="bibcircbottom">
<br />
<br />
<br />
<table class="bibcirctable_contents">
<tr align="center">
<td class="bibcirctableheader">
%s
<input type="text" size=45 id="barcode" name="barcode"
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("barcode").focus();
</script>
</td>
</tr>
</table>
""" % (CFG_SITE_URL, ln,
_("Barcode"))
out += """
<br />
<table class="bibcirctable_contents">
<tr align="center">
<td>
<input type="reset" name="reset_button" value='%s' class="formbutton">
<input type="submit" name="ok_button" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (_("Reset"),
_("OK"))
return out
def tmpl_loan_return_confirm(self, infos, borrower_name, borrower_id, recid,
barcode, return_date, result, ln=CFG_SITE_LANG):
"""
@param borrower_name: person who returned the book
@param recid: book's record id
@param barcode: book copy's barcode
@param ln: language
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
if len(result) == 0 and len(infos) == 0:
out += """
<script type="text/javascript">
$(window).keydown(function(event){
window.location.href="%(url)s/admin2/bibcirculation/loan_return?ln=%(ln)s";
});
</script>
""" % {'url': CFG_SITE_URL, 'ln': ln}
out += """
<form name="return_form"
action="%s/admin2/bibcirculation/loan_return?ln=%s" method="get" >
<br />
<div class="infoboxsuccess">""" % (CFG_SITE_URL, ln)
out += _("The item %(x_strong_tag_open)s%(x_title)s%(x_strong_tag_close)s, with barcode %(x_strong_tag_open)s%(x_barcode)s%(x_strong_tag_close)s, has been returned with success.") % {'x_title': book_title_from_MARC(recid), 'x_barcode': barcode, 'x_strong_tag_open': '<strong>', 'x_strong_tag_close': '</strong>'}
out += """</div>
<br />"""
for info in infos:
out += """<div class="infobox">"""
out += info
out += """</div> <br /> """
if len(result) > 0:
out += """
<br />
<div class="infoboxmsg">%s</div>
<br />
""" % (_("The next (pending) request on the returned book is shown below."))
(_book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out += """
<table class="bibcirctable">
<tr valign='top'>
<td width="350">
<table class="bibcirctable">
<th class="bibcirctableheader" align='left'>%s</th>
</table>
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
<input type=hidden name=recid value='%s'>
<input type=hidden name=barcode value='%s'>
</table>
""" % (_("Loan information"),
_("Borrower"), borrower_link,
_("Item"), title_link,
_("Author"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Return date"), return_date,
str(book_cover),
recid,
barcode)
if result:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table id="table_requests" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Waiting requests"),
_("Name"),
_("Item"),
_("Request status"),
_("From"),
_("To"),
_("Request date"),
_("Request options"))
for (request_id, name, recid, status, date_from,
date_to, request_date) in result:
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>
<input type=button id="bor_select" onClick="location.href='%s/admin2/bibcirculation/make_new_loan_from_request?ln=%s&check_id=%s&barcode=%s'"
value='%s' class="bibcircbutton" style="background: url(/img/dialog-yes.png) no-repeat #8DBDD8; width: 125px; text-align: right;">
</td>
</tr>
""" % (
name, book_title_from_MARC(recid),
status, date_from, date_to,
request_date, CFG_SITE_URL, ln, request_id, barcode,
_('Select request'))
out += """
</table>
<br />
<br />
</form>
"""
else:
out += """
</form>
<form name="return_form"
action="%s/admin2/bibcirculation/loan_return_confirm?ln=%s" method="post">
<div class="bibcircbottom">
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">
%s
</td>
</tr>
<tr>
<td class="bibcirctableheader">
%s
<input type="text" size=45 id="barcode" name="barcode"
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("barcode").focus();
</script>
</td>
</tr>
</table>
""" % (CFG_SITE_URL, ln,
_("Return another book"), _("Barcode"))
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type="reset" name="reset_button" value='%s' class="formbutton">
<input type="submit" name="ok_button" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (_("Reset"),
_("OK"))
return out
def tmpl_index(self, ln=CFG_SITE_LANG):
"""
Main page of the Admin interface.
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircsubtitle">
%s
</div>
<br />
""" % (_("Welcome to Invenio BibCirculation Admin"))
out += """
<br /><br />
<br /><br />
<br /><br />
<br /><br />
<br /><br />
</div>
"""
return out
def tmpl_borrower_search(self, infos, redirect_to_new_request=False,
ln=CFG_SITE_LANG):
"""
Template for the admin interface. Search borrower.
@param ln: language
"""
_ = gettext_set_language(ln)
if CFG_CERN_SITE == 1:
id_string = 'ccid'
else:
id_string = _('id')
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
if redirect_to_new_request:
redirect_to_new_request = 'yes'
else:
redirect_to_new_request = 'no'
new_borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("register new borrower"))
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<form name="borrower_search"
action="%s/admin2/bibcirculation/borrower_search_result"
method="get" >
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="column" value="id">%s
<input type="radio" name="column" value="name" checked>%s
<input type="radio" name="column" value="email">%s
<input type="hidden" name="redirect_to_new_request" value="%s">
<br>
<br>
</td>
</tr>
<tr align="center">
<td>
<input type="text" size="45" name="string" id="string"
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("string").focus();
</script>
</td>
</tr>
""" % (CFG_SITE_URL,
_("Search borrower by"), id_string,
_("name"), _("email"),
redirect_to_new_request)
if not CFG_CERN_SITE:
out += """
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
""" % (new_borrower_link)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
<br />
<br />
</div>
""" % (_("Back"), _("Search"))
return out
def tmpl_borrower_search_result(self, result, redirect_to_new_request=False,
ln=CFG_SITE_LANG):
"""
When the admin's feature 'borrower_search' is used, this template
shows the result.
@param result: search result
@type result: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
if len(result) == 0:
if CFG_CERN_SITE:
message = _("0 borrowers found.") + ' ' + _("Search by CCID.")
else:
new_borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/add_new_borrower_step1',
{'ln': ln}, _("Register new borrower."))
message = _("0 borrowers found.") + ' ' + new_borrower_link
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (message)
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
</form>
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s borrower(s) found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr><th align="center">%s</th></tr>
""" % (len(result), _("Borrower(s)"))
for (borrower_id, name) in result:
if redirect_to_new_request:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/create_new_request_step1',
{'borrower_id': borrower_id, 'ln': ln}, (name))
else:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
out += """
<tr align="center">
<td width="70">%s
<input type=hidden name=uid value='%s'></td>
</tr>
""" % (borrower_link, borrower_id)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Back"))
return out
def tmpl_item_search(self, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="search_form"
action="%s/admin2/bibcirculation/item_search_result"
method="get" >
<br />
<br />
<br />
<input type=hidden value="0">
<input type=hidden value="10">
""" % (CFG_SITE_URL)
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="">%s
<input type="radio" name="f" value="barcode" checked>%s
<input type="radio" name="f" value="recid">%s
<br />
<br />
</td>
</tr>
""" % (_("Search item by"), _("Item details"), _("barcode"), _("recid"))
out += """
<tr align="center">
<td>
<input type="text" size="50" name="p" id="p" style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("p").focus();
</script>
</td>
</tr>
"""
out += """
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s' class="formbutton" onClick="history.go(-1)">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"), _("Search"))
return out
def tmpl_item_search_result(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
try:
number_of_results = len(result)
except TypeError:
number_of_results = 1
if result is None:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("0 item(s) found."))
### left off here (work in progress) ###
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<strong>%s</strong>
</td>
</tr>
</table>
<br />
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("%i items found.") % number_of_results, _("Title"),
_("Author"), _("Publisher"),
_("# copies"))
### FIXME: If one result -> go ahead ###
for recid in result:
(book_author, book_editor,
book_copies) = get_item_info_for_search_result(recid)
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (title_link, book_author,
book_editor, book_copies)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_loan_on_desk_step1(self, result, key, string, infos,
ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="step1_form1" action="%s/admin2/bibcirculation/loan_on_desk_step1"
method="get" >
<br />
<br />
<br />
<table class="bibcirctable" align="center">
""" % (CFG_SITE_URL)
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % ('ccid', _('name'), _('email'))
elif key == 'name':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="ccid" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<tr>
<td align="center" class="bibcirctableheader">%s
""" % (_("Search borrower by"))
if key == 'email':
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_('id'), _('name'), _('email'))
elif key == 'id':
out += """
<input type="radio" name="key" value="id" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="string"
value='%s' style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("string").focus();
</script>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" id="bor_search" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (string or '', _("Search"))
if result:
out += """
<br />
<form name="step1_form2"
action="/admin2/bibcirculation/loan_on_desk_step2"
method="get">
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="user_id" size="8" style='border: 1px solid #cfcfcf; width: 200px'>
"""
for user_info in result:
name = user_info[0]
user_id = user_info[2]
out += """
<option value='%s'>%s
""" % (user_id, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td align="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
""" % (_("Select user"))
out += """
<br />
<br />
<br />
</div>
"""
return out
def tmpl_loan_on_desk_step2(self, user_id, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
user_info = db.get_borrower_details(user_id)
(borrower_id, ccid, name, email, phone, address, mailbox) = user_info
_ = gettext_set_language(ln)
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="step2_form" action="%s/admin2/bibcirculation/loan_on_desk_step3"
method="get" >
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
""" % (CFG_SITE_URL, _("User information"))
out += """
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr> """ % (id_string, display_id)
out += """
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="77">%s</td>
<td>
<input type="text" size=45 id="barcode" name="barcode"
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("barcode").focus();
</script>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit" id="submit_barcode"
value="%s" class="formbutton">
<br /><br />""" % (_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone,
_("Enter the barcode"), _("Back"),
_("Continue"))
out += """<input type=button value="%s"
onClick="location.href='%s/admin2/bibcirculation/all_loans?ln=%s'"
class="formbutton">""" % (_("See all loans"), CFG_SITE_SECURE_URL, ln)
out += """<input type=hidden name="user_id" value="%s">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (user_id)
return out
def tmpl_loan_on_desk_step3(self, user_id, list_of_books, infos,
ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
out = self.tmpl_infobox(infos, ln)
_ = gettext_set_language(ln)
user_info = db.get_borrower_details(user_id)
(borrower_id, ccid, name, email, phone, address, mailbox) = user_info
list_of_barcodes = []
for book in list_of_books:
list_of_barcodes.append(book[1])
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script type="text/javascript" language='JavaScript'>
function groupDatePicker(){
var index = 0;
var datepicker = null;
var datepickerhidden = this.document.getElementById("datepickerhidden")
do{
datepicker = this.document.getElementById("date_picker"+index)
if(datepicker != null){
if (index != 0){
datepickerhidden.value += ",";
}
datepickerhidden.value += datepicker.value ;
}
index = index + 1;
}while(datepicker != null);
}
</script>
<form name="step3_form" action="%s/admin2/bibcirculation/loan_on_desk_step4"
method="post" >
<br />
<br />
<input type=hidden name="list_of_barcodes" value="%s">
<input type=hidden name="datepickerhidden" id="datepickerhidden" value="">
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th width="65">%s</th>
<th width="100">%s</th>
<th width="80">%s</th>
<th width="130">%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (CFG_SITE_URL, str(list_of_barcodes),
_("User information"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone,
_("List of borrowed books"),
CFG_SITE_URL,
_("Item"), _("Barcode"),
_("Library"), _("Location"),
_("Due date"), _("Write note(s)"))
iterator = 0
for (recid, barcode, library_id, location) in list_of_books:
due_date = renew_loan_for_X_days(barcode)
library_name = db.get_library_name(library_id)
out += """
<tr>
<td>%s</td>
<td width="65">%s</td>
<td width="100">%s</td>
<td width="80">%s</td>
<td width="130" class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("%s").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="%s" name="%s"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
<td>
<textarea name='note' rows="1" cols="40"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
""" % (book_title_from_MARC(recid), barcode,
library_name, location, "#date_picker"+str(iterator),
CFG_SITE_URL, "date_picker"+str(iterator),
'due_date'+str(iterator), due_date)
iterator += 1
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit" id="submit_barcode"
value="%s" class="formbutton" onmousedown="groupDatePicker();">
<input type=hidden name="user_id" value="%s">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (_("Back"), _("Continue"), user_id)
return out
def tmpl_loan_on_desk_confirm(self, barcode,
borrower, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
borrower_email = borrower.split(' [')[0]
borrower_id = borrower.split(' [')[1]
borrower_id = int(borrower_id[:-1])
out += """
<form name="return_form"
action="%s/admin2/bibcirculation/register_new_loan"
method="post" >
<div class="bibcircbottom">
<input type=hidden name=borrower_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="70">%s</td>
<td class="bibcirccontent" width="600">%s</td>
</tr>
""" % (CFG_SITE_URL,
borrower_id,
_("Borrower"),
borrower_email)
for (bar) in barcode:
recid = db.get_id_bibrec(bar)
out += """
<tr>
<td class="bibcirctableheader" width="70">%s</td>
<td class="bibcirccontent" width="600">%s</td>
</tr>
<input type=hidden name=barcode value='%s'>
""" % (_("Item"),
book_title_from_MARC(recid),
bar)
out += """
</table>
<br />
<table class="bibcirctable_contents">
<tr>
<td>
<input type=button value='%s' onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
</div>
</form>
""" % (_("Back"),
_("Confirm"))
return out
def tmpl_register_new_loan(self, borrower_info, infos,
recid, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
(borrower_id, ccid, name,
email, phone, address, mailbox) = borrower_info
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
_ = gettext_set_language(ln)
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out = load_menu(ln)
out += "<br />"
out += self.tmpl_infobox(infos, ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<br />
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<td>
<input type="button"
value="%s" class="formbutton"
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
>
<input type="button"
value="%s" class="formbutton"
onClick="location.href='%s/admin2/bibcirculation/register_new_loan?ln=%s&print_data=true'"
>
</td>
</table>
<br />
<br />
</div>
""" % (id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone,
_("Title"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Back to home"),
CFG_SITE_URL, ln,
_("Print loan information"),
CFG_SITE_URL, ln)
return out
def tmpl_create_new_loan_step1(self, borrower, infos, ln=CFG_SITE_LANG):
"""
Display the borrower's information and a form where it is
possible to search for an item.
@param borrower: borrower's information
@type borrower: tuple
@param infos: information to be displayed in the infobox
@type infos: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
(borrower_id, ccid, name, email, phone, address, mailbox) = borrower
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out += """
<form name="create_new_loan_form1"
action="%s/admin2/bibcirculation/create_new_loan_step2"
method="post" >
<div class="bibcircbottom">
<input type=hidden name=borrower_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
</table>
"""% (CFG_SITE_URL,
borrower_id,
_("Personal details"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
<tr>
<td><input type="text" size="50" name="barcode" style='border: 1px solid #cfcfcf'></td>
</tr>
</table>
""" % (_("Barcode"))
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
<tr>
<td><textarea name='new_note' rows="4" cols="43" style='border: 1px solid #cfcfcf'></textarea></td>
</tr>
</table>
<br />
""" % (_("Write notes"))
out += """
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"),
_("Confirm"))
return out
def tmpl_create_new_request_step1(self, borrower, infos, result, p, f,
ln=CFG_SITE_LANG):
"""
Display the borrower's information and the form where it is
possible to search for an item.
@param borrower: borrower's information.
@type borrower: tuple
@param infos: information to be displayed in the infobox.
@type infos: list
@param result: result of searching for an item, using p and f.
@type result: list
@param p: pattern which will be used in the search process.
@type p: string
@param f: field which will be used in the search process.
@type f: string
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
(borrower_id, ccid, name, email, phone, address, mailbox) = borrower
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tbody>
<tr>
<td width="500" valign="top">
<form name="create_new_loan_form1"
action="%s/admin2/bibcirculation/create_new_request_step1"
method="get" >
<input type=hidden name=borrower_id value="%s">
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
""" % (CFG_SITE_URL, borrower_id, _("Search item by"))
if f == 'barcode':
out += """
<input type="radio" name="f" value="">%s
<input type="radio" name="f" value="barcode" checked>%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title">%s
""" % (_("Any field"), _("barcode"), _("author"), _("title"))
elif f == 'author':
out += """
<input type="radio" name="f" value="">%s
<input type="radio" name="f" value="barcode">%s
<input type="radio" name="f" value="author" checked>%s
<input type="radio" name="f" value="title">%s
""" % (_("Any field"), _("barcode"), _("author"), _("title"))
elif f == 'title':
out += """
<input type="radio" name="f" value="">%s
<input type="radio" name="f" value="barcode">%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title" checked>%s
""" % (_("Any field"), _("barcode"), _("author"), _("title"))
else:
out += """
<input type="radio" name="f" value="" checked>%s
<input type="radio" name="f" value="barcode">%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title">%s
""" % (_("Any field"), _("barcode"), _("author"), _("title"))
out += """
<br />
<br />
</td>
</tr>
<tr align="center">
<td>
<input type="text" size="50" name="p" value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s' onClick="history.go(-1)"
class="formbutton">
<input type="submit" value='%s' name='search'
class="formbutton">
</td>
</tr>
</table>
</form>
""" % (p or '', _("Back"), _("Search"))
if result:
out += """
<br />
<form name="form2"
action="%s/admin2/bibcirculation/create_new_request_step2"
method="get" >
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="recid" size="12" style='border: 1px
solid #cfcfcf; width:77%%'>
""" % (CFG_SITE_URL)
for recid in result:
out += """
<option value ='%s'>%s
""" % (recid, book_title_from_MARC(recid))
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td ALIGN="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<input type=hidden name=borrower_id value="%s">
</form>
""" % (_("Select item"), borrower_id)
out += """
</td>
<td width="200" align="center" valign="top">
<td align="center" valign="top">
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
</tr>
<br />
"""% (_("Borrower details"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """
</table>
<br />
<br />
<br />
<br />
<br />
</div>
"""
return out
def tmpl_create_new_request_step2(self, user_info, holdings_information,
recid, ln=CFG_SITE_LANG):
"""
@param user_info: user's information (borrower_id, ccid, name,
email, phone, address, mailbox).
@type user_info: tuple
@param holdings_information: information about the holdings.
@type holdings_information: list
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param ln: language of the page
"""
_ = gettext_set_language(ln)
if not holdings_information:
return _("This item has no holdings.")
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
<td class="bibcirctableheader">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader"></td>
</tr>
""" % (_("Barcode"), _("Library"), _("Collection"),
_("Location"), _("Description"), _("Loan period"),
_("Status"), _("Due date"))
for (barcode, library, collection, location, description, loan_period,
status, due_date) in holdings_information:
out += """
<tr onMouseOver="this.className='highlight'" onmouseout="this.className='normal'">
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="right">
<input type=button onClick="location.href='%s/admin2/bibcirculation/place_new_request_step2?ln=%s&barcode=%s&recid=%s&user_info=%s,%s,%s,%s,%s,%s,%s'"
value='%s' class="formbutton"></td>
</tr>
""" % (barcode, library, collection, location,
description, loan_period, status, due_date,
CFG_SITE_URL, ln, barcode, recid, user_info[0],
user_info[1], user_info[2], user_info[3],
user_info[4], user_info[5], user_info[6],
_("Request"))
out += """
</table>
<br />
<br />
<br />
</div>
"""
return out
def tmpl_create_new_request_step3(self, borrower_id, barcode, recid,
ln=CFG_SITE_LANG):
"""
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<form name="request_form" action="%s/admin2/bibcirculation/create_new_request_step4"
method="post" >
<div class="bibcircbottom">
<br />
<br />
<br />
<table class="bibcirctable_contents">
<tr class="bibcirctableheader" align='center'>
<td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable_contents">
<tr>
<td width="90" class="bibcirctableheader" align='right'>%s</td>
<td align='left'>
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1" name="period_from" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<table class="bibcirctable_contents">
<tr>
<td width="90" class="bibcirctableheader" align='right'>%s</td>
<td align='left'>
<script type="text/javascript">
$(function(){
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2" name="period_to" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<br />
""" % (CFG_SITE_URL, CFG_SITE_URL,
_("Enter the period of interest"),
_("From: "), CFG_SITE_URL,
datetime.date.today().strftime('%Y-%m-%d'),
_("To: "), CFG_SITE_URL,
(datetime.date.today() + datetime.timedelta(days=365)).strftime('%Y-%m-%d'))
out += """
<table class="bibcirctable_contents">
<tr>
<td align="center">
<input type=hidden name=barcode value='%s'>
<input type=hidden name=borrower_id value='%s'>
<input type=hidden name=recid value='%s'>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" name="submit_button" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (barcode, borrower_id, recid, _("Back"), _('Confirm'))
return out
def tmpl_create_new_request_step4(self, ln=CFG_SITE_LANG):
"""
Last step of the request procedure.
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
</table>
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A new request has been registered with success."),
CFG_SITE_URL, ln, _("Back to home"))
return out
def tmpl_place_new_request_step1(self, result, key, string, barcode,
recid, infos, ln=CFG_SITE_LANG):
"""
@param result: borrower's information
@type result: list
@param key: search field (name, email, etc.)
@type key: string
@param string: pattern
@type string: string
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param recid: identify the record. Primary key of bibrec
@type recid: int
@param infos: information
@type infos: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td width="500" valign='top'>
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
<table>
<tr>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
</td>
""" % (_("Item details"),
_("Name"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Barcode"), barcode,
str(book_cover))
out += """
<td valign='top' align='center'>
<form name="step1_form1"
action="%s/admin2/bibcirculation/place_new_request_step1"
method="get" >
<input type=hidden name=barcode value='%s'>
<input type=hidden name=recid value='%s'>
<table>
""" % (CFG_SITE_URL, barcode, recid)
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_("ccid"), _("name"), _("email"))
elif key == 'name':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_("ccid"), _("name"), _("email"))
else:
out += """
<input type="radio" name="key" value="ccid" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_("ccid"), _("name"), _("email"))
else:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search borrower by"))
if key == 'email':
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_("id"), _("name"), _("email"))
elif key == 'id':
out += """
<input type="radio" name="key" value="id" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_("id"), _("name"), _("email"))
else:
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_("id"), _("name"), _("email"))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="string"
value='%s' style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (string or '', _("Search"))
if result:
out += """
<br />
<form name="step1_form2"
action="%s/admin2/bibcirculation/place_new_request_step2"
method="get" >
<input type=hidden name=barcode value='%s'>
<input type=hidden name=recid value='%s'>
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="user_info"
size="8"
style='border: 1px solid #cfcfcf; width:40%%'>
""" % (CFG_SITE_URL, barcode, recid)
for (borrower_id, ccid, name, email,
phone, address, mailbox) in result:
out += """
<option value ='%s,%s,%s,%s,%s,%s,%s'>%s
""" % (borrower_id, ccid, name, email, phone,
address, mailbox, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td ALIGN="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
""" % (_("Select user"))
out += """
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
"""
return out
def tmpl_place_new_request_step2(self, barcode, recid, user_info, infos,
ln=CFG_SITE_LANG):
"""
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param recid: identify the record. Primary key of bibrec
@type recid: int
@param user_info: user's information
@type user_info: tuple
@param infos: information
@type infos: list
@param ln: language of the page
"""
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
(borrower_id, ccid, name, email, phone, address, mailbox) = user_info
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step2_form" action="%s/admin2/bibcirculation/place_new_request_step3"
method="post" >
<input type=hidden name=barcode value='%s'>
<input type=hidden name=recid value='%s'>
<input type=hidden name=user_info value="%s,%s,%s,%s,%s,%s,%s">
<br />
<table class="bibcirctable">
<tr>
<td width="500" valign="top">
<table class="bibcirctable">
<tr class="bibcirctableheader">
<td>%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
<table>
<tr>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
</td>
<br />
""" % (CFG_SITE_URL, barcode, recid,
borrower_id, ccid, name, email, phone, address, mailbox,
_("Item details"),
_("Name"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Barcode"), barcode,
str(book_cover))
out += """
<td align='center' valign="top">
<table class="bibcirctable">
<tr class="bibcirctableheader">
<td>%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
</table>
""" % (_("Borrower details"),
_("ID"), ccid,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr class="bibcirctableheader">
<td>%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1" name="period_from" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2" name="period_to" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (CFG_SITE_URL,
_("Enter the period of interest"),
_("From: "), CFG_SITE_URL, datetime.date.today().strftime('%Y-%m-%d'),
_("To: "), CFG_SITE_URL,
(datetime.date.today() + datetime.timedelta(days=365)).strftime('%Y-%m-%d'),
_("Back"), _("Continue"))
return out
def tmpl_place_new_request_step3(self, ln=CFG_SITE_LANG):
"""
Last step of the request procedure.
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<div class="bibcircinfoboxsuccess">%s</div>
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
value='%s'
class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A new request has been registered with success."),
CFG_SITE_URL, ln, _("Back to home"))
return out
def tmpl_place_new_loan_step1(self, result, key, string, barcode,
recid, infos, ln=CFG_SITE_LANG):
"""
@param result: borrower's information
@type result: list
@param key: search field (name, email, etc.)
@type key: string
@param string: pattern
@type string: string
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param recid: identify the record. Primary key of bibrec
@type recid: int
@param infos: information
@type infos: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="500" valign='top'>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
<table>
<tr>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
</td>
""" % (_("Item details"),
_("Name"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Barcode"), barcode,
str(book_cover))
out += """
<td valign='top' align='center'>
<form name="step1_form1"
action="%s/admin2/bibcirculation/place_new_loan_step1"
method="get" >
<input type=hidden name=barcode value='%s'>
<input type=hidden name=recid value='%s'>
<table>
""" % (CFG_SITE_URL, barcode, recid)
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_("ccid"), _("name"), _("email"))
elif key == 'name':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_("ccid"), _("name"), _("email"))
else:
out += """
<input type="radio" name="key" value="ccid" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_("ccid"), _("name"), _("email"))
else:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_("id"), _("name"), _("email"))
elif key == 'id':
out += """
<input type="radio" name="key" value="id" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_("id"), _("name"), _("email"))
else:
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_("id"), _("name"), _("email"))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="string"
value='%s' style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (string or '', _("Search"))
if result:
out += """
<script type="text/javascript">
function checkform(form){
if (form.user_id.value == ""){
alert("%s");
return false;
}
else{
return true;
}
}
</script>
""" % (_("Please select one borrower to continue."))
out += """
<br />
<form name="step1_form2" action="%s/admin2/bibcirculation/loan_on_desk_step3"
method="get" onsubmit="return checkform(this);">
<input type=hidden name=barcode value='%s'>
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="user_id" size="8" style='border: 1px solid #cfcfcf;'>
""" % (CFG_SITE_URL, barcode)
for brw in result:
borrower_id = brw[0]
name = brw[2]
out += """
<option value ="%s">%s
""" % (borrower_id, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td ALIGN="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
""" % (_("Select user"))
out += """
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
"""
return out
def tmpl_place_new_loan_step2(self, barcode, recid, user_info,
ln=CFG_SITE_LANG):
"""
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param user_info: user's information
@type user_info: tuple
@param ln: language of the page
"""
(book_title, book_year, book_author, book_isbn,
book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
(_borrower_id, ccid, name, email, phone,
address, mailbox) = user_info.split(',')
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step2_form" action="%s/admin2/bibcirculation/place_new_loan_step3"
method="post" >
<input type=hidden name=barcode value='%s'>
<input type=hidden name=recid value='%s'>
<input type=hidden name=email value='%s'>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, barcode,
recid, email,
_("Item details"),
_("Name"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
_("Barcode"), barcode,
str(book_cover))
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
</table>
<br />
""" % (_("Borrower details"),
_("ID"), ccid,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr class="bibcirctableheader">
<td>%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="70">%s</th>
<td align='left'>
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1"
name="due_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<br />
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th>%s</th>
</tr>
<tr>
<td>
<textarea name='notes' rows="5" cols="57"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
<tr>
<td>%s</td>
</tr>
</table>
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)"
class="formbutton">
<input type="submit"
value="%s"
class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (CFG_SITE_URL,
_("Loan information"),
_("Loan date"), datetime.date.today().strftime('%Y-%m-%d'),
_("Due date"), CFG_SITE_URL, renew_loan_for_X_days(barcode),
_("Write notes"),
_("This note will be associated to this new loan, not to the borrower."),
_("Back"), _("Continue"))
return out
def tmpl_change_due_date_step1(self, loan_details, loan_id, borrower_id,
ln=CFG_SITE_LANG):
"""
Return the form where the due date can be changed.
@param loan_details: the information related to the loan.
@type loan_details: tuple
@param loan_id: identify the loan. Primary key of crcLOAN.
@type loan_id: int
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
(recid, barcode, loaned_on, due_date, loan_status,
loan_period, _item_status) = loan_details
number_of_requests = db.get_number_requests_per_copy(barcode)
if number_of_requests > 0:
request_status = 'Yes'
else:
request_status = 'No'
out += """
<div class="bibcircbottom">
<form name="borrower_notes" action="%s/admin2/bibcirculation/change_due_date_step2" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="100">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="80">%s</td> <td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, _("Loan information"),
_("Title"), book_title_from_MARC(recid),
_("Barcode"), barcode,
_("Loan date"), loaned_on,
_("Due date"), due_date,
_("Loan status"), loan_status,
_("Loan period"), loan_period,
_("Requested ?"), request_status)
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr align="left">
<td width="230" class="bibcirctableheader">%s
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1" name="new_due_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, _("New due date: "), CFG_SITE_URL, due_date)
out += """
<table class="bibcirctable">
<tr>
<td>
<input type=hidden name=loan_id value="%s">
<input type=hidden name=borrower_id value="%s">
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (loan_id, borrower_id,
_("Back"), _("Submit new due date"))
return out
def tmpl_change_due_date_step2(self, new_due_date, borrower_id,
ln=CFG_SITE_LANG):
"""
Return a page with the new due date.
@param new_due_date: new due date
@type new_due_date: string
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_borrower_loans_details?ln=%s&borrower_id=%s'"
value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("The due date has been updated. New due date: %s" % (new_due_date)),
CFG_SITE_URL, ln, borrower_id, cgi.escape(_("Back to borrower's loans"), True))
return out
def tmpl_send_notification(self, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<td class="bibcirccontent" width="30">%s</td>
</table>
<br /> <br />
<table class="bibcirctable">
<td>
<input
type=button
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
value='%s'
class='formbutton'>
</td>
</table>
<br />
<br />
</div>
""" % (_("Notification has been sent!"),
CFG_SITE_URL, ln, _("Back to home"))
return out
def tmpl_get_loans_notes(self, loans_notes, loan_id,
referer, back="", ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
if back == "":
back = referer
if not loans_notes:
loans_notes = {}
else:
if looks_like_dictionary(loans_notes):
loans_notes = eval(loans_notes)
else:
loans_notes = {}
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="loans_notes"
action="%s/admin2/bibcirculation/get_loans_notes"
method="get" >
<input type="hidden" name="loan_id" value="%s">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td>
<table class="bibcircnotes">
""" % (CFG_SITE_URL, loan_id,
_("Notes about loan"))
key_array = loans_notes.keys()
key_array.sort()
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'delete_key': key, 'loan_id': loan_id, 'ln': ln,
'back': cgi.escape(back, True)}, (_("[delete]")))
out += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160" valign="top"
align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, loans_notes[key], delete_note)
out += """
</table>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">
<textarea name="library_notes" rows="5" cols="90"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
value="%s"
onClick="window.location='%s'"
class="formbutton">
<input type="submit" value="%s" class="formbutton">
<input type="hidden" name="back" value="%s">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Write new note"),
_("Back"),
cgi.escape(back, True),
_("Confirm"),
cgi.escape(back, True))
return out
def tmpl_all_requests(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<form name="all_requests_form"
action="%s/admin2/bibcirculation/all_requests"
method="get" >
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
<td class="bibcirctableheader">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
<td class="bibcirctableheader" align="center">%s</td>
</tr>
""" % (CFG_SITE_URL,
_("Borrower"),
_("Item"),
_("Status"),
_("From"),
_("To"),
_("Request date"),
_("Option(s)"))
for (id_lr, borid, name, recid, status, date_from,
date_to, request_date) in result:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borid, 'ln': ln},(name))
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr onMouseOver="this.className='highlight'" onmouseout="this.className='normal'">
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">
<input type=button
onClick="location.href='%s/admin2/bibcirculation/all_requests?ln=%s&request_id=%s'"
value='%s' class="formbutton">
</td>
</tr>
""" % (borrower_link,
title_link,
status,
date_from,
date_to,
request_date,
CFG_SITE_URL, ln,
id_lr,
_("Cancel hold request"))
out += """
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
</form>
""" % (_("Back"))
return out
def tmpl_all_loans(self, result, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/js/tablesorter/themes/blue/style.css"); </style>
<style type="text/css"> @import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$("#table_all_loans")
.tablesorter({sortList: [[3,1], [0,0]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
<br />
<div class="bibcircbottom">
"""
if len(result) == 0:
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s' onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("No result for your search."),
_("Back"))
else:
out += """
<form name="borrower_form" action="%s/admin2/bibcirculation/all_loans" method="get" >
<br />
<table id="table_all_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th></th>
</tr>
</thead>
<tbody>
"""% (CFG_SITE_URL,
_("Borrower"),
_("Item"),
_("Barcode"),
_("Loaned on"),
_("Due date"),
_("Renewals"),
_("Overdue letters"),
_("Loan Notes"))
for (borrower_id, borrower_name, recid, barcode,
loaned_on, due_date, nb_renewal, nb_overdue,
date_overdue, notes, loan_id) in result:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln}, (_("see notes")))
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln}, (_("no notes")))
if notes == "" or str(notes) == '{}':
check_notes = no_notes_link
else:
check_notes = see_notes_link
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s - %s</td>
<td>%s</td>
<td align="center">
<input type=button onClick="location.href='%s/admin2/bibcirculation/claim_book_return?borrower_id=%s&recid=%s&loan_id=%s&template=claim_return'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class='bibcircbutton'></td>
</tr>
""" % (borrower_link, title_link, barcode,
loaned_on, due_date,
nb_renewal, nb_overdue, date_overdue,
check_notes, CFG_SITE_URL,
borrower_id, recid, loan_id, _("Send recall"))
out += """
</tbody>
</table>
</form>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
"""
out += """
<div class="back" style="position: relative; top: 5px;">
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
</div>
""" % (_("Back"))
return out
def tmpl_all_expired_loans(self, result, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/js/tablesorter/themes/blue/style.css"); </style>
<style type="text/css"> @import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$("#table_all_loans")
.tablesorter({sortList: [[3,1], [0,0]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
<br />
<div class="bibcircbottom">
"""
if len(result) == 0:
out += """
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("No result for your search."),
_("Back"))
else:
out += """
<form name="borrower_form"
action="%s/admin2/bibcirculation/all_loans"
method="get" >
<br />
<table id="table_all_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th></th>
</tr>
</thead>
<tbody>
"""% (CFG_SITE_URL,
_("Borrower"),
_("Item"),
_("Barcode"),
_("Loaned on"),
_("Due date"),
_("Renewals"),
_("Overdue letters"),
_("Loan Notes"))
for (borrower_id, borrower_name, recid, barcode,
loaned_on, due_date, nb_renewal, nb_overdue,
date_overdue, notes, loan_id) in result:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln},
(_("see notes")))
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln},
(_("no notes")))
if notes == "" or str(notes) == '{}':
check_notes = no_notes_link
else:
check_notes = see_notes_link
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s - %s</td>
<td>%s</td>
<td align="center">
<input type=button onClick="location.href='%s/admin2/bibcirculation/claim_book_return?borrower_id=%s&recid=%s&loan_id=%s&template=claim_return'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class='bibcircbutton'></td>
</tr>
""" % (borrower_link, title_link, barcode,
loaned_on, due_date,
nb_renewal, nb_overdue, date_overdue,
check_notes, CFG_SITE_URL,
borrower_id, recid, loan_id, _("Send recall"))
out += """
</tbody>
</table>
</form>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
"""
out += """
<div class="back" style="position: relative; top: 5px;">
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
</div>
""" % (_("Back"))
return out
def tmpl_get_expired_loans_with_waiting_requests(self, result, ln=CFG_SITE_LANG):
"""
@param result: loans' information
@type result: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom">
<br /> <br /> <br /> <br />
<table class="bibcirctable_contents">
<td class="bibcirccontent" align="center">%s</td>
</table>
<br /> <br /> <br />
<table class="bibcirctable_contents">
<td align="center">
<input type=button
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
value='%s'
class='formbutton'>
</td>
</table>
<br />
</div>
""" % (_("No more requests are pending or waiting."),
CFG_SITE_URL, ln,
_("Back to home"))
else:
out += """
<style type="text/css"> @import url("/js/tablesorter/themes/blue/style.css"); </style>
<style type="text/css"> @import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function(){
$("#table_requests")
.tablesorter({sortList: [[4,0], [0,0]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
<div class="bibcircbottom">
<br />
<br />
<table id="table_requests" class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("Name"),
_("Item"),
_('Library'),
_("Location"),
_("From"),
_("To"),
_("Request date"),
_("Actions"))
for (request_id, recid, borrower_id, library_id, location,
date_from, date_to, request_date) in result:
borrower_name = db.get_borrower_name(borrower_id)
library_name = db.get_library_name(library_id)
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
if borrower_name:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
else:
borrower_link = str(borrower_id)
out += """
<script type="text/javascript">
function confirmation(id){
var answer = confirm("Delete this request?")
if (answer){
window.location = "%s/admin2/bibcirculation/get_expired_loans_with_waiting_requests?request_id="+id;
}
else{
alert("Request not deleted.")
}
}
</script>
<tr>
<td width='150'>%s</td>
<td width='250'>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align='center'>
<input type="button" value='%s' style="background: url(/img/dialog-cancel.png)
no-repeat #8DBDD8; width: 75px; text-align: right;"
onClick="confirmation(%s)"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
<input type=button style="background: url(/img/dialog-yes.png) no-repeat #8DBDD8; width: 150px; text-align: right;"
onClick="location.href='%s/admin2/bibcirculation/create_loan?request_id=%s&recid=%s&borrower_id=%s&ln=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
""" % (CFG_SITE_URL,
borrower_link,
title_link,
library_name,
location,
date_from,
date_to,
request_date,
_("Delete"),
request_id,
CFG_SITE_URL,
request_id,
recid,
borrower_id,
ln,
_("Create Loan"))
out += """
</tbody>
</table>
<div id="pager" class="pager">
<form>
<br />
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button style="background: url(/img/document-print.png) no-repeat #8DBDD8; width: 135px; text-align: right;"
onClick="location.href='%s/admin2/bibcirculation/get_pending_requests?print_data=true&ln=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
</table>
<br />
</div>
""" % (CFG_SITE_URL, ln,
_("Printable format"))
return out
###
### Templates related to items and their copies.
###
def tmpl_get_item_details(self, recid, copies, requests, loans, purchases, req_hist_overview,
loans_hist_overview, purchases_hist_overview, infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
record_is_periodical = is_periodical(recid)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
try:
book_cover = get_book_cover(book_isbn)
except KeyError:
book_cover = """%s/img/book_cover_placeholder.gif
""" % (CFG_SITE_URL)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
link_to_detailed_record = "<a href='%s/%s/%s' target='_blank'>%s</a>" % (CFG_SITE_URL, CFG_SITE_RECORD, recid, book_title)
out += """
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<input type=button onClick="window.open('%s/%s/%s/edit')"
value='%s' class="formbutton">
</td>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="%s"/>
</td>""" % (_("Item details"), _("Name"), link_to_detailed_record,
_("Author(s)"), book_author, _("Year"), book_year,
_("Publisher"), book_editor, _("ISBN"), book_isbn,
CFG_SITE_URL, CFG_SITE_RECORD, recid, _("Edit this record"),
str(book_cover), _("Book Cover"))
# Search another item directly from the item details page.
out += """<td>
<form name="search_form"
action="%s/admin2/bibcirculation/item_search_result"
method="get" >
<input type=hidden value="0">
<input type=hidden value="10">
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="">%s
<input type="radio" name="f" value="barcode" checked>%s
<input type="radio" name="f" value="recid">%s
<br /><br />
</td>
</tr>
""" % (CFG_SITE_URL, _("Search another item by"), _("Item details"),
_("barcode"), _("recid"))
out += """
<tr>
<td>
<input type="text" size="50" name="p" id="p" style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("p").focus();
</script>
</td>
</tr>
"""
out += """
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s' class="formbutton" onClick="history.go(-1)">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</td>
""" % (_("Back"), _("Search"))
out += """ </tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (_("Additional details"))
out += """
<style type="text/css">
@import url("/css/tablesorter.css");
</style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<style type="text/css">
@import url("/js/tablesorter/themes/blue/style.css");
</style>
<style type="text/css">
@import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css");
</style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js"
type="text/javascript"></script>
"""
if record_is_periodical:
out += """
<script type="text/javascript">
$(document).ready(function(){
$("#table_copies")
.tablesorter({sortList: [[7,1],[4,0]],widthFixed: true,widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"),
positionFixed: false,
size: 40
});
});
</script>
"""
else:
out += """
<script type="text/javascript">
$(document).ready(function(){
$('#table_copies').tablesorter({sortList: [[1,1],[4,0]],
widthFixed: true,
widgets: ['zebra']})
});
</script>
"""
out += """
<table class="tablesorter" id="table_copies" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
""" % (_("Barcode"),
_("Status"),
_(CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED), #_("Requested"),
_("Due date"),
_("Library"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Location"))
out += """
<th>%s</th>
<th>%s</th>
""" % (_("Loan period"),
_("No of loans"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Collection"))
out += """
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Description"),
_("Actions"))
for (barcode, loan_period, library_name, library_id,
location, nb_requests, status, collection,
description, due_date) in copies:
number_of_requests = db.get_number_requests_per_copy(barcode)
if number_of_requests > 0:
requested = 'Yes'
else:
requested = 'No'
if status in ('on order', 'claimed'):
expected_arrival_date = db.get_expected_arrival_date(barcode)
if expected_arrival_date != '':
status = status + ' - ' + expected_arrival_date
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': library_id, 'ln': ln},
(library_name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
""" % (barcode, status, requested,
due_date or '-', library_link)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (location)
out += """
<td>%s</td>
<td>%s</td>
""" % (loan_period, nb_requests)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (collection or '-')
out += """
<td>%s</td>
""" % (description or '-')
if status == CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN:
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION VALUE="update_item_info_step4?barcode=%s">%s
<OPTION VALUE="add_new_copy_step3?recid=%s&barcode=%s">%s
<OPTION VALUE="place_new_request_step1?barcode=%s">%s
<OPTION VALUE="" DISABLED>%s
<OPTION VALUE="" DISABLED>%s
</SELECT>
</td>
</tr>
""" % (_("Select an action"),
barcode, _("Update"),
recid, barcode, _("Add similar copy"),
barcode, _("New request"),
_("New loan"),
_("Delete copy"))
elif status == 'missing':
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION VALUE="update_item_info_step4?barcode=%s">%s
<OPTION VALUE="add_new_copy_step3?recid=%s&barcode=%s">%s
<OPTION VALUE="" DISABLED>%s
<OPTION VALUE="" DISABLED>%s
<OPTION VALUE="delete_copy_step1?barcode=%s">%s
</SELECT>
</td>
</tr>
""" % (_("Select an action"),
barcode, _("Update"),
recid, barcode, _("Add similar copy"),
_("New request"),
_("New loan"),
barcode, _("Delete copy"))
elif status == 'Reference':
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION VALUE="update_item_info_step4?barcode=%s">%s
<OPTION VALUE="add_new_copy_step3?recid=%s&barcode=%s">%s
<OPTION VALUE="place_new_request_step1?barcode=%s">%s
<OPTION VALUE="place_new_loan_step1?barcode=%s">%s
<OPTION VALUE="delete_copy_step1?barcode=%s">%s
</SELECT>
</td>
</tr>
""" % (_("Select an action"),
barcode, _("Update"),
recid, barcode, _("Add similar copy"),
barcode, _("New request"),
barcode, _("New loan"),
barcode, _("Delete copy"))
else:
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION VALUE="update_item_info_step4?barcode=%s">%s
<OPTION VALUE="add_new_copy_step3?recid=%s&barcode=%s">%s
<OPTION VALUE="place_new_request_step1?barcode=%s">%s
<OPTION VALUE="place_new_loan_step1?barcode=%s">%s
<OPTION VALUE="delete_copy_step1?barcode=%s">%s
</SELECT>
</td>
</tr>
""" % (_("Select an action"),
barcode, _("Update"),
recid, barcode, _("Add similar copy"),
barcode, _("New request"),
barcode, _("New loan"),
barcode, _("Delete copy"))
out += """
</tbody>
</table>
"""
if record_is_periodical:
out += """
<div id="pager" class="pager">
<form>
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40" selected="selected">40</option>
</select>
</form>
</div>
"""
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/add_new_copy_step3?ln=%s&recid=%s'"
value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/get_item_requests_details?ln=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/get_item_loans_details?ln=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/list_purchase?ln=%s&status=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/get_item_req_historical_overview?ln=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/get_item_loans_historical_overview?ln=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button" value='%s'
onClick="location.href='%s/admin2/bibcirculation/list_purchase?ln=%s&status=%s&recid=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, ln, recid, _("Add new copy"),
_("Hold requests and loans overview on %(date)s")
% {'date': dateutils.convert_datestruct_to_datetext(localtime())},
_("Hold requests"), len(requests), _("More details"), CFG_SITE_URL, ln, recid,
_("Loans"), len(loans), _("More details"), CFG_SITE_URL, ln, recid,
_("Purchases"), len(purchases), _("More details"), CFG_SITE_URL, ln,
CFG_BIBCIRCULATION_ACQ_STATUS_NEW, recid,
_("Historical overview"), _("Hold requests"), len(req_hist_overview),
_("More details"), CFG_SITE_URL, ln, recid, _("Loans"), len(loans_hist_overview),
_("More details"), CFG_SITE_URL, ln, recid, _("Purchases"), len(purchases_hist_overview),
_("More details"), CFG_SITE_URL, ln, CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED, recid)
out += """
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_get_item_requests_details(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no requests."))
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<form name="all_loans_form"
action="%s/admin2/bibcirculation/update_loan_request_status" method="get" >
<div class="bibcircbottom">
<br />
<table id="table_requests" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (CFG_SITE_URL,
_("Borrower"),
_("Status"),
_("Library"),
_("Location"),
_("Barcode"),
_("Item Desc"),
_("From"),
_("To"),
_("Request date"),
_("Option(s)"))
for (borrower_id, name, id_bibrec, barcode, status, library,
location, description, date_from, date_to, request_id,
request_date) in result:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align="center">
<input type=button
onClick="location.href='%s/admin2/bibcirculation/create_loan?recid=%s&request_id=%s&borrower_id=%s'" value='%s' class='formbutton'>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_item_requests_details?recid=%s&request_id=%s'" value='%s' class='formbutton'>
</td>
</tr>
""" % (borrower_link, status, library, location, barcode, description,
date_from, date_to, request_date, CFG_SITE_URL,
id_bibrec, request_id, borrower_id, _("Create loan"),
CFG_SITE_URL, id_bibrec, request_id, _("Cancel hold request"))
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="history.go(-1)" value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"))
return out
def tmpl_get_item_loans_details(self, result, recid, infos,
ln=CFG_SITE_LANG):
"""
@param ln: language of the page
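Each loan row links either "No notes" or "See notes" depending on whether
the loan carries notes. A minimal sketch of that check (loan_has_notes is
a hypothetical name for the inline test used below):

```python
# Sketch of the notes check used per loan row: the template treats an empty
# string and the string form of an empty dict as "no notes".
def loan_has_notes(notes):
    return not (notes == "" or str(notes) == '{}')
```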
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no loans."))
else:
out += """
<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_loans').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<br />
<form name="borrower_form"
action="%s/admin2/bibcirculation/get_item_loans_details" method="get" >
<input type=hidden name=recid value="%s">
""" % (CFG_SITE_URL,
recid)
out += """
<br />
<table id="table_loans" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("Borrower"),
_("Barcode"),
_("Loaned on"),
_("Due date"),
_("Renewals"),
_("Overdue letter"),
_("Loan status"),
_("Loan notes"),
_("Loan options"))
for (borrower_id, borrower_name, barcode, loaned_on,
due_date, nb_renewal, nb_overdue, date_overdue,
status, notes, loan_id) in result:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln}, (_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln}, (_("See notes")))
if notes == "" or str(notes) == '{}':
check_notes = no_notes_link
else:
check_notes = see_notes_link
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s - %s</td>
<td>%s</td>
<td>%s</td>
""" % (borrower_link, barcode, loaned_on, due_date,
nb_renewal, nb_overdue, date_overdue,
status, check_notes)
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION VALUE="get_item_loans_details?barcode=%s&loan_id=%s&recid=%s">%s
<OPTION VALUE="loan_return_confirm?barcode=%s">%s
""" % (_("Select an action"),
barcode, loan_id, recid, _("Renew"),
barcode, _("Return"))
if status == CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED:
out += """
<OPTION VALUE="change_due_date_step1?barcode=%s" DISABLED>%s
""" % (barcode, _("Change due date"))
else:
out += """
<OPTION VALUE="change_due_date_step1?barcode=%s">%s
""" % (barcode, _("Change due date"))
out += """
<OPTION VALUE="claim_book_return?borrower_id=%s&recid=%s&loan_id=%s&template=claim_return">%s
</SELECT>
</td>
</tr>
<input type=hidden name=loan_id value="%s">
""" % (borrower_id, recid, loan_id, _("Send recall"),
loan_id)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_item_details?ln=%s&recid=%s'"
value='%s'
class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (CFG_SITE_URL, ln,
recid,
_("Back"))
return out
def tmpl_get_item_req_historical_overview(self, req_hist_overview,
ln=CFG_SITE_LANG):
"""
Return the historical requests overview of an item.
@param req_hist_overview: list of old borrowers.
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
if len(req_hist_overview) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no requests."))
else:
out += """
<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_holdings').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<br />
<br />
<table id="table_holdings" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Borrower"),
_("Barcode"),
_("Library"),
_("Location"),
_("From"),
_("To"),
_("Request date"))
for (name, borrower_id, barcode, library_name,
location, req_from, req_to, req_date) in req_hist_overview:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (borrower_link, barcode, library_name,
location, req_from, req_to, req_date)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_get_item_loans_historical_overview(self, loans_hist_overview,
ln=CFG_SITE_LANG):
"""
Return the historical loans overview of an item.
@param loans_hist_overview: list of old borrowers.
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_loans').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<br />
<br />
<table id="table_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Borrower"),
_("Barcode"),
_("Library"),
_("Location"),
_("Loaned on"),
_("Due date"),
_("Returned on"),
_("Renewals"),
_("Overdue letters"))
for (name, borrower_id, barcode, library_name, location, loaned_on,
due_date, returned_on, nb_renew,
nb_overdueletters) in loans_hist_overview:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln}, (name))
out += """ <tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (borrower_link, barcode, library_name,
location, loaned_on,
due_date, returned_on, nb_renew,
nb_overdueletters)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td><input type=button value='%s'
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_update_item_info_step1(self, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="update_item_info_step1_form"
action="%s/admin2/bibcirculation/update_item_info_step2" method="get" >
""" % (CFG_SITE_URL)
out += """
<br />
<br />
<br />
<input type=hidden name=start value="0">
<input type=hidden name=end value="10">
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="" checked>%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title">%s
<br /><br />
</td>
""" % (_("Search item by"), _("RecId/Item details"), _("year"),
_("author"), _("title"))
out += """
<tr align="center">
<td>
<input type="text" size="50" name="p" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button
value="%s"
onClick="history.go(-1)"
class="formbutton">
<input type="submit"
value="%s"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"), _("Search"))
return out
def tmpl_update_item_info_step2(self, result, ln=CFG_SITE_LANG):
"""
@param result: list with recids
@type result: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent"><strong>%s</strong></td>
</tr>
</table>
<table class="bibcirctable">
""" % (_("%(nb_items_found)i items found")
% {'nb_items_found': len(result)})
for recid in result:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/update_item_info_step3',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr align="center">
<td class="contents">%s</td>
</tr>
""" % (title_link)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_update_item_info_step3(self, recid, result, ln=CFG_SITE_LANG):
"""
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param result: book's information
@type result: tuple
@param ln: language of the page
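If the record has no ISBN, the cover image falls back to a static
placeholder under CFG_SITE_URL. A minimal sketch of that choice
(book_cover_url and the lookup argument are hypothetical names):

```python
# Sketch of the cover fallback: use the ISBN-based lookup when an ISBN
# exists, otherwise the static placeholder image.
def book_cover_url(isbn, site_url, lookup):
    if isbn:
        return lookup(isbn)
    return "%s/img/book_cover_placeholder.gif" % site_url

fallback = book_cover_url(None, 'http://cds.example.org',
                          lambda isbn: 'cover-for-' + isbn)
```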
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out += """
<form name="update_item_info_step3_form"
action="%s/admin2/bibcirculation/update_item_info_step4" method="get" >
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="100">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL,
_("Item details"),
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn,
str(book_cover))
out += """<table class="bibcirctable">
<tr>
<td>%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center">%s</td>
<td align="center"></td>
<td width="350"></td>
</tr>""" % (_("Barcode"),
_("Status"),
_("Library"),
_("Location"),
_("Loan period"),
_("No of loans"),
_("Collection"),
_("Description"))
for (barcode, loan_period, lib_name, libid, location, nb_requests,
status, collection, description) in result:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': libid, 'ln': ln}, (lib_name))
out += """
<tr>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">%s</td>
<td class="bibcirccontent" align="center">
<input type=button
onClick="location.href='%s/admin2/bibcirculation/update_item_info_step4?ln=%s&barcode=%s'"
value="%s" class="formbutton">
</td>
<td class="bibcirccontent" width="350"></td>
</tr>
""" % (barcode, status, library_link, location, loan_period,
nb_requests, collection, description, CFG_SITE_URL, ln,
barcode, _("Update"))
out += """
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type=hidden name=recid value="%s"></td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("Back"), recid)
return out
def tmpl_update_item_info_step4(self, recid, result, libraries,
ln=CFG_SITE_LANG):
"""
@param recid: identify the record. Primary key of bibrec
@type recid: int
@param result: book's information
@type result: tuple
@param libraries: list of libraries
@type libraries: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
(title, year, author, isbn, editor) = book_information_from_MARC(recid)
barcode = result[0]
expected_arrival_date = db.get_expected_arrival_date(barcode)
if isbn:
book_cover = get_book_cover(isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="update_item_info_step4_form"
action="%s/admin2/bibcirculation/update_item_info_step5" method="get" >
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL,
_("Item details"),
_("Name"),
title,
_("Author(s)"),
author,
_("Year"),
year,
_("Publisher"),
editor,
_("ISBN"),
isbn,
str(book_cover))
out += """
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf'
size=35 name="barcode" value="%s">
<input type=hidden name=old_barcode value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<select name="library_id" style='border: 1px solid #cfcfcf'>
""" % (_("Update copy information"),
_("Barcode"), result[0], result[0],
_("Library"))
for(library_id, name) in libraries:
if library_id == result[1]:
out += """<option value ="%s" selected>%s</option>
""" % (library_id, name)
else:
out += """<option value ="%s">%s</option>
""" % (library_id, name)
out += """
</select>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="location" value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<select name="collection" style='border: 1px solid #cfcfcf'>
""" % (_("Location"), result[4],
_("Collection"))
for collection in CFG_BIBCIRCULATION_COLLECTION:
if collection == result[3]:
out += """
<option value="%s" selected="selected">%s</option>
""" % (collection, collection)
else:
out += """
<option value="%s">%s</option>
""" % (collection, collection)
out += """
</select>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="description" value="%s">
</td>
</tr>
""" % (_("Description"), result[5] or '-')
out += """
<tr>
<th width="100">%s</th>
<td>
<select name="loan_period" style='border: 1px solid #cfcfcf'>
""" % (_("Loan period"))
for loan_period in CFG_BIBCIRCULATION_ITEM_LOAN_PERIOD:
if loan_period == result[6]:
out += """
<option value="%s" selected="selected">%s</option>
""" % (loan_period, loan_period)
else:
out += """
<option value="%s">%s</option>
""" % (loan_period, loan_period)
out += """
</select>
</td>
</tr>
"""
out += """
<tr>
<th width="100">%s</th>
<td>
<select name="status" style='border: 1px solid #cfcfcf'>
""" % (_("Status"))
for st in CFG_BIBCIRCULATION_ITEM_STATUS:
if st == CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN and result[7] != CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN:
pass # to avoid creating a fake loan,
# 'on loan' is only shown if the item was already on loan
elif st == result[7]:
out += """
<option value="%s" selected>%s</option>
""" % (st, st)
else:
out += """
<option value="%s">%s</option>
""" % (st, st)
out += """ </select>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="expected_arrival_date" value="%s">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button onClick="history.go(-1)"
value='%s' class='formbutton'>
<input type="submit" value='%s' class="formbutton">
<input type=hidden name=recid value="%s">
</td>
</tr>
</table>
<br />
<br />
</div>
</form>
""" % (_("Expected arrival date"), expected_arrival_date,
_("Back"), _("Continue"), recid)
return out
def tmpl_update_item_info_step5(self, tup_infos, ln=CFG_SITE_LANG):
"""
@param tup_infos: item's information
@type tup_infos: tuple
@param ln: language of the page
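The hidden fields below are escaped with cgi.escape(..., True) so the
values stay safe inside double-quoted HTML attributes. cgi.escape was
removed in Python 3.8; html.escape gives the same result for these
characters:

```python
# cgi.escape(value, True) quotes &, <, > and double quotes; html.escape is
# the Python 3 replacement with the same behaviour for these characters.
try:
    from cgi import escape       # Python 2 (and Python 3 < 3.8)
except ImportError:
    from html import escape      # Python 3.8+

escaped = escape('a "b" & <c>', True)
```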
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="update_item_info_step5_form"
action="%s/admin2/bibcirculation/update_item_info_step6" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
<tr>
<th width="100">%s</th> <td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
<input type=hidden name=barcode value="%s">
<input type=hidden name=old_barcode value="%s">
<input type=hidden name=library_id value="%s">
<input type=hidden name=location value="%s">
<input type=hidden name=collection value="%s">
<input type=hidden name=description value="%s">
<input type=hidden name=loan_period value="%s">
<input type=hidden name=status value="%s">
<input type=hidden name=expected_arrival_date value="%s">
<input type=hidden name=recid value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("New copy information"),
_("Barcode"), cgi.escape(tup_infos[0], True),
_("Library"), cgi.escape(tup_infos[3], True),
_("Location"), cgi.escape(tup_infos[4], True),
_("Collection"), cgi.escape(tup_infos[5], True),
_("Description"), cgi.escape(tup_infos[6], True),
_("Loan period"), cgi.escape(tup_infos[7], True),
_("Status"), cgi.escape(tup_infos[8], True),
_("Expected arrival date"), cgi.escape(tup_infos[9], True),
_("Back"), _("Confirm"),
cgi.escape(tup_infos[0], True),
cgi.escape(tup_infos[1], True),
tup_infos[2], cgi.escape(tup_infos[4], True),
cgi.escape(tup_infos[5], True),
cgi.escape(tup_infos[6], True),
cgi.escape(tup_infos[7], True),
cgi.escape(tup_infos[8], True),
cgi.escape(tup_infos[9], True), tup_infos[10])
return out
def tmpl_add_new_copy_step1(self, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="add_new_copy_step1_form"
action="%s/admin2/bibcirculation/add_new_copy_step2"
method="get" >
<br />
<br />
<br />
<input type=hidden name=start value="0">
<input type=hidden name=end value="10">
<table class="bibcirctable">
<tr align="center">
""" % (CFG_SITE_URL)
out += """
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="" checked>%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title">%s
""" % (_("Search item by"), _("RecId/Item details"), _("year"),
_("author"), _("title"))
out += """
<br />
<br />
</td>
<tr align="center">
<td>
<input type="text" size="50" name="p" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button
value="%s"
onClick="history.go(-1)"
class="formbutton">
<input type="submit"
value="%s"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"), _("Search"))
return out
def tmpl_add_new_copy_step2(self, result, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s</strong>
</td>
</tr>
</table>
<table class="bibcirctable">
""" % (_("%(nb_items_found)i items found")
% {'nb_items_found': len(result)})
for recid in result:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/add_new_copy_step3',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr align="center">
<td class="contents">%s</td>
</tr>
""" % (title_link)
out += """
</table>
<br />
"""
out += """
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_add_new_copy_step3(self, recid, result, libraries,
original_copy_barcode, tmp_barcode,
infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
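The Library, Collection, Loan period and Status selects below all follow
the same pattern: emit one <option> per value and mark the matching value
selected. A minimal sketch (render_options is a hypothetical name):

```python
# Sketch of the <option> preselection pattern used by the selects below.
def render_options(values, selected):
    out = ''
    for val in values:
        attr = ' selected="selected"' if val == selected else ''
        out += '<option value="%s"%s>%s</option>' % (val, attr, val)
    return out
```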
"""
_ = gettext_set_language(ln)
record_is_periodical = is_periodical(recid)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out += """
<style type="text/css">
@import url("/css/tablesorter.css");
</style>
<style type="text/css">
@import url("/js/tablesorter/themes/blue/style.css");
</style>
<style type="text/css">
@import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css");
</style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js"
type="text/javascript"></script>
"""
if record_is_periodical:
out += """
<script type="text/javascript">
$(document).ready(function(){
$("#table_copies")
.tablesorter({sortList: [[6,1]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
"""
else:
out += """
<script type="text/javascript">
$(document).ready(function() {
$('#table_copies').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
"""
out += """
<form name="add_new_copy_step3_form"
action="%s/admin2/bibcirculation/add_new_copy_step4" method="get" >
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (CFG_SITE_URL,
_("Item details"),
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn,
str(book_cover),
_("Copies of %s") % book_title)
out += """
<table class="tablesorter" id="table_copies" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
""" % (_("Barcode"),
_("Status"),
_("Due date"),
_("Library"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Location"))
out += """
<th>%s</th>
<th>%s</th>
""" % (_("Loan period"),
_("No of loans"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Collection"))
out += """
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Description"))
for (barcode, loan_period, lib_name, libid, location, nb_requests,
status, collection, description, due_date) in result:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': libid, 'ln': ln}, (lib_name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
""" % (barcode, status, due_date or '-', library_link)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (location)
out += """
<td>%s</td>
<td>%s</td>
""" % (loan_period, nb_requests)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (collection or '-')
out += """
<td>%s</td>
""" % (description or '-')
out += """
</tbody>
</table>
"""
if record_is_periodical:
out += """
<div id="pager" class="pager">
<form>
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
<br />
"""
if record_is_periodical:
colspan = 'colspan="5"'
else:
colspan = ''
if original_copy_barcode is not None:
default_details = db.get_item_info(original_copy_barcode)
if default_details is not None:
default_library_id = default_details[1]
default_collection = default_details[3]
default_location = default_details[4]
default_description = default_details[5]
default_loan_period = default_details[6]
out += """
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th>%s</th>
<td %s>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="barcode" value='%s'>
</td>
</tr>
<tr>
<th>%s</th>
<td %s>
<select name="library" style='border: 1px solid #cfcfcf'>
""" % (_("New copy details"), _("Barcode"),
colspan, tmp_barcode, _("Library"), colspan)
main_library = db.get_main_libraries()
if main_library is not None:
main_library = main_library[0][0] #id of the first main library
for(library_id, name) in libraries:
if original_copy_barcode is not None and \
default_details is not None and \
library_id == default_library_id:
out += """<option value="%s" selected="selected">%s</option>
""" % (library_id, name)
elif library_id == main_library:
out += """<option value="%s" selected="selected">%s</option>
""" % (library_id, name)
else:
out += """<option value="%s">%s</option>""" % (library_id, name)
if original_copy_barcode is not None \
and default_location is not None:
loc = default_location
else:
loc = ''
out += """
</select>
</td>
</tr>
"""
if record_is_periodical:
out += """ <input type=hidden name=collection value="%s">
""" % ("Periodical")
else:
out += """
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="location" value="%s">
</td>
</tr>
""" % (_("Location"), loc)
out += """
<tr>
<th width="100">%s</th>
<td>
<select name="collection" style='border: 1px solid #cfcfcf'>
""" % (_("Collection"))
for collection in CFG_BIBCIRCULATION_COLLECTION:
if original_copy_barcode is not None and \
default_collection is not None and \
collection == default_collection:
out += """
<option value="%s" selected="selected">%s</option>
""" % (collection, collection)
else:
out += """
<option value="%s">%s</option>
""" % (collection, collection)
out += """
</select>
</td>
</tr>
"""
if original_copy_barcode is not None \
and default_description is not None:
desc = default_description
else:
desc = ''
out += """
<tr>
<th width="100">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="description" value="%s">
</td>
</tr>
""" % (_("Description"), desc)
out += """
<tr>
<th width="100">%s</th>
<td %s>
<select name="loan_period" style='border: 1px solid #cfcfcf'>
""" % (_("Loan period"), colspan)
for loan_period in CFG_BIBCIRCULATION_ITEM_LOAN_PERIOD:
if original_copy_barcode is not None and \
default_loan_period is not None and \
loan_period == default_loan_period:
out += """
<option value="%s" selected="selected">%s</option>
""" % (loan_period, loan_period)
else:
out += """
<option value="%s">%s</option>
""" % (loan_period, loan_period)
out += """
</select>
</td>
</tr>
"""
out += """
<tr>
<th width="100">%s</th>
<td %s>
<select name="status" style='border: 1px solid #cfcfcf'>
""" % (_("Status"), colspan)
for st in CFG_BIBCIRCULATION_ITEM_STATUS:
if st == CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF:
out += """
<option value ="%s" selected="selected">%s</option>
""" % (st, st)
else:
out += """
<option value ="%s">%s</option>
""" % (st, st)
out += """
</select>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td %s>
<input type="text" style='border: 1px solid #cfcfcf' size=35
name="expected_arrival_date" value="">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
<input type=hidden name=recid value="%s">
</td>
</tr>
</table>
<br />
<br />
</div>
</form>
""" % (_("Expected arrival date"), colspan, _("Back"),
_("Continue"), recid)
return out
def tmpl_add_new_copy_step4(self, tup_infos, ln=CFG_SITE_LANG):
"""
@param tup_infos: item's information
@type tup_infos: tuple
@param ln: language of the page
"""
_ = gettext_set_language(ln)
(barcode, library, _library_name, location, collection, description,
loan_period, status, expected_arrival_date, recid) = tup_infos
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="add_new_copy_step4_form"
action="%s/admin2/bibcirculation/add_new_copy_step5"
method="get" >
<br />
<br />
<table class="tablesorterborrower">
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="90">%s</th> <td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
<input type=hidden name=barcode value="%s">
<input type=hidden name=library value="%s">
<input type=hidden name=location value="%s">
<input type=hidden name=collection value="%s">
<input type=hidden name=description value="%s">
<input type=hidden name=loan_period value="%s">
<input type=hidden name=status value="%s">
<input type=hidden name=expected_arrival_date value="%s">
<input type=hidden name=recid value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL,
_("Barcode"), tup_infos[0],
_("Library"), tup_infos[2],
_("Location"), tup_infos[3],
_("Collection"), tup_infos[4],
_("Description"), tup_infos[5],
_("Loan period"), tup_infos[6],
_("Status"), tup_infos[7],
_("Expected arrival date"), expected_arrival_date,
_("Back"), _("Continue"),
barcode, library, location, collection, description,
loan_period, status, expected_arrival_date, recid)
return out
def tmpl_add_new_copy_step5(self, infos, recid, ln=CFG_SITE_LANG):
"""
@param recid: identify the record. Primary key of bibrec.
@type recid: int
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
if infos == []:
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A %(x_url_open)snew copy%(x_url_close)s has been added.") % {'x_url_open': '<a href="' + CFG_SITE_URL + '/admin2/bibcirculation/get_item_details?ln=%s&amp;recid=%s' %(ln, recid) + '">', 'x_url_close': '</a>'},
_("Back to home"),
CFG_SITE_URL, ln)
else:
out += """<br /> """
out += self.tmpl_infobox(infos, ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="location.href='%s/admin2/bibcirculation/get_item_details?ln=%s&recid=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("Back to the record"),
CFG_SITE_URL, ln, recid)
return out
def tmpl_delete_copy_step1(self, barcode_to_delete, recid, result,
infos, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
record_is_periodical = is_periodical(recid)
out = load_menu(ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out += """
<style type="text/css">
@import url("/css/tablesorter.css");
</style>
<style type="text/css">
@import url("/js/tablesorter/themes/blue/style.css");
</style>
<style type="text/css">
@import url("/js/tablesorter/addons/pager/jquery.tablesorter.pager.css");
</style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script src="/js/tablesorter/addons/pager/jquery.tablesorter.pager.js"
type="text/javascript"></script>
"""
if record_is_periodical:
out += """
<script type="text/javascript">
$(document).ready(function(){
$("#table_copies")
.tablesorter({sortList: [[6,1]],widthFixed: true, widgets: ['zebra']})
.bind("sortStart",function(){$("#overlay").show();})
.bind("sortEnd",function(){$("#overlay").hide()})
.tablesorterPager({container: $("#pager"), positionFixed: false});
});
</script>
"""
else:
out += """
<script type="text/javascript">
$(document).ready(function() {
$('#table_copies').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
"""
out += """
<form name="delete_copy_step2_form"
action="%s/admin2/bibcirculation/delete_copy_step2"
method="get">
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf'
src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (CFG_SITE_URL,
_("Item details"),
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn,
str(book_cover),
_("Copies of %s" % book_title))
out += """
<table class="tablesorter" id="table_copies" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
""" % (_("Barcode"),
_("Status"),
_("Due date"),
_("Library"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Location"))
out += """
<th>%s</th>
<th>%s</th>
""" % (_("Loan period"),
_("No of loans"))
if not record_is_periodical:
out += """
<th>%s</th>
""" % (_("Collection"))
out += """
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Description"))
for (barcode, loan_period, lib_name, libid, location, nb_requests,
status, collection, description, due_date) in result:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': libid, 'ln': ln}, (lib_name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
""" % (barcode, status, due_date or '-', library_link)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (location)
out += """
<td>%s</td>
<td>%s</td>
""" % (loan_period, nb_requests)
if not record_is_periodical:
out += """
<td>%s</td>
""" % (collection or '-')
out += """
<td>%s</td>
""" % (description or '-')
out += """ </tbody>
</table>
"""
if record_is_periodical:
out += """
<div id="pager" class="pager">
<form>
<img src="/img/sb.gif" class="first" />
<img src="/img/sp.gif" class="prev" />
<input type="text" class="pagedisplay" />
<img src="/img/sn.gif" class="next" />
<img src="/img/se.gif" class="last" />
<select class="pagesize">
<option value="10" selected="selected">10</option>
<option value="20">20</option>
<option value="30">30</option>
<option value="40">40</option>
</select>
</form>
</div>
<br />
"""
out += self.tmpl_infobox(infos, ln)
out += """<table id="table_copies" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>""" % (_("Barcode"),
_("Status"),
_("Due date"),
_("Library"),
_("Location"),
_("Loan period"),
_("No of loans"),
_("Collection"),
_("Description"))
for (barcode, loan_period, lib_name, libid, location, nb_requests,
status, collection, description, due_date) in result:
if barcode == barcode_to_delete:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': libid, 'ln': ln}, (lib_name))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (barcode, status, due_date, library_link, location,
loan_period, nb_requests, collection or '-',
description or '-')
out += """ </tbody>
</table>
"""
out += """<input type=hidden name=barcode value="%s">
""" % (str(barcode_to_delete))
out += """<input type=button value="%s" onClick="history.go(-1)"
class="formbutton">
""" % (_("Back"))
out += """<input type="submit" value='%s' class="formbutton">
""" % (_("Delete"))
out += """</div></form>"""
return out
def tmpl_create_loan(self, request_id, recid, borrower,
infos, ln=CFG_SITE_LANG):
"""
@param request_id: identify the hold request to be converted into a loan
@param recid: identify the record. Primary key of bibrec.
@param borrower: borrower's information
@param infos: informational messages to be displayed
@param ln: language of the page
"""
_ = gettext_set_language(ln)
(book_title, _book_year, _book_author,
book_isbn, _book_editor) = book_information_from_MARC(recid)
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
(borrower_id, ccid, name, email, phone, address, mailbox) = borrower
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="return_form" action="%s/admin2/bibcirculation/register_new_loan"
method="post" >
<div class="bibcircbottom">
<input type=hidden name=borrower_id value="%s">
<input type=hidden name=request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
"""% (CFG_SITE_URL,
borrower_id,
request_id,
_("Personal details"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """
<br />
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th>%s</th>
</tr>
<tr>
<td>%s</td>
</tr>
<tr align='center'>
<td>
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
<tr>
<th>%s</th>
</tr>
<tr>
<td>
<input type="text" size="66" name="barcode"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
""" % (_("Item"),
book_title,
str(book_cover),
_("Barcode"))
out += """
<br />
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th>%s</th>
</tr>
<tr>
<td>
<textarea name='new_note' rows="4" cols="57"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
""" % (_("Write notes"))
out += """
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" onClick="history.go(-1)" class="bibcircbutton"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'">
<input type="submit" value="%s" class="bibcircbutton"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"),
_("Confirm"))
return out
###
### "Borrower" related templates
###
def tmpl_borrower_details(self, borrower, requests, loans, notes,
ill, proposals, req_hist, loans_hist, ill_hist,
proposal_hist, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
(borrower_id, ccid, name, email, phone, address, mailbox) = borrower
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_notes',
{'borrower_id': borrower_id},
(_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_notes',
{'borrower_id': borrower_id},
(_("Notes about this borrower")))
if notes == "" or str(notes) == '{}':
check_notes = no_notes_link
else:
check_notes = see_notes_link
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
""" % (_("Personal details"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone,
_("Notes"), check_notes)
nb_requests = len(requests)
nb_loans = len(loans)
nb_ill = len(ill)
nb_proposals = len(proposals)
nb_req_hist = len(req_hist)
nb_loans_hist = len(loans_hist)
nb_ill_hist = len(ill_hist)
nb_proposal_hist = len(proposal_hist)
out += """
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step2?ln=%s&user_id=%s'"
value='%s' class='formbutton'>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/create_new_request_step1?ln=%s&borrower_id=%s'"
value='%s' class='formbutton'>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/register_ill_book_request?ln=%s&borrower_id=%s'"
value='%s' class='formbutton'>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/borrower_notification?ln=%s&borrower_id=%s&from_address=%s'"
value='%s' class='formbutton'>
""" % (CFG_SITE_URL, ln, borrower_id, _("New loan"),
CFG_SITE_URL, ln, borrower_id, _("New request"),
CFG_SITE_URL, ln, borrower_id, _("New ILL request"),
CFG_SITE_URL, ln, borrower_id, CFG_BIBCIRCULATION_LOANS_EMAIL, _("Notify this borrower"))
if CFG_CERN_SITE:
out += """
<input type=button onClick=
"location.href='%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s&update=True'"
value="%s" class='formbutton'>
""" % (CFG_SITE_URL, ln, borrower_id, _("Update"))
else:
out += """
<input type=button
onClick=
"location.href='%s/admin2/bibcirculation/update_borrower_info_step1?ln=%s&borrower_id=%s'"
value="%s" class='formbutton'>
""" % (CFG_SITE_URL, ln, borrower_id, _("Update"))
out += """
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s %s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/get_borrower_requests_details?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/get_borrower_loans_details?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/get_borrower_ill_details?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/get_borrower_ill_details?ln=%s&borrower_id=%s&request_type=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/bor_requests_historical_overview?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/bor_loans_historical_overview?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/bor_ill_historical_overview?ln=%s&borrower_id=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td width="50">%s</td>
<td>
<input type="button"
onClick="location.href='%s/admin2/bibcirculation/bor_ill_historical_overview?ln=%s&borrower_id=%s&request_type=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
value='%s' class="bibcircbutton">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td><input type=button value='%s'
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
</div>
""" % (_("Requests, Loans and ILL overview on"),
dateutils.convert_datestruct_to_datetext(localtime()),
_("Requests"), nb_requests, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("Loans"), nb_loans, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("ILL"), nb_ill, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("Proposals"), nb_proposals, CFG_SITE_URL, ln, borrower_id, 'proposal-book',
_("More details"),
_("Historical overview"),
_("Requests"), nb_req_hist, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("Loans"), nb_loans_hist, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("ILL"), nb_ill_hist, CFG_SITE_URL, ln, borrower_id,
_("More details"),
_("Proposals"), nb_proposal_hist, CFG_SITE_URL, ln, borrower_id, 'proposal-book',
_("More details"),
_("Back"))
return out
def tmpl_borrower_request_details(self, result, borrower_id,
ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no requests."))
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<form name="borrower_form" action="%s/admin2/bibcirculation/get_borrower_requests_details" method="get" >
<div class="bibcircbottom">
<br />
<table id="table_requests" class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
</form>
"""% (CFG_SITE_URL,
_("Item"),
_("Request status"),
_("Library"),
_("Location"),
_("From"),
_("To"),
_("Request date"),
_("Request option(s)"))
for (recid, status, library, location, date_from,
date_to, request_date, request_id) in result:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td align="center">
<input type="button" value='%s' style="background: url(/img/dialog-cancel.png)
no-repeat #8DBDD8; width: 75px; text-align: right;"
onClick="location.href='%s/admin2/bibcirculation/get_pending_requests?ln=%s&request_id=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
class="bibcircbutton">
</td>
</tr>
""" % (title_link, status, library, location, date_from,
date_to, request_date, _("Cancel"),
CFG_SITE_URL, ln, request_id)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button onClick="location.href='%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s'"
value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
</div>
""" % (CFG_SITE_URL, ln,
borrower_id,
_("Back"))
return out
def tmpl_borrower_loans_details(self, borrower_loans, borrower_id, infos,
ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
if len(borrower_loans) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no loans."))
else:
out += """
<form name="borrower_form" action="%s/admin2/bibcirculation/get_borrower_loans_details?submit_changes=true" method="get" >
<input type=hidden name=borrower_id value="%s">
<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_loans').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<br />
<table id="table_loans" class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (CFG_SITE_URL,
borrower_id,
_("Item"),
_("Barcode"),
_("Loan date"),
_("Due date"),
_("Renewals"),
_("Overdue letters"),
_("Type"),
_("Loan notes"),
_("Loans status"),
_("Loan options"))
for (recid, barcode, loaned_on, due_date, nb_renewal,
nb_overdue, date_overdue, loan_type, notes,
loan_id, status) in borrower_loans:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln},
(_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_loans_notes',
{'loan_id': loan_id, 'ln': ln},
(_("See notes")))
if notes == "" or str(notes) == '{}':
check_notes = no_notes_link
else:
check_notes = see_notes_link
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s - %s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
""" % (title_link, barcode, loaned_on, due_date, nb_renewal,
nb_overdue, date_overdue, loan_type, check_notes, status)
out += """
<td align="center">
<SELECT style='border: 1px solid #cfcfcf'
ONCHANGE="location = this.options[this.selectedIndex].value;">
<OPTION VALUE="">%s
<OPTION
VALUE="get_borrower_loans_details?borrower_id=%s&barcode=%s&loan_id=%s&recid=%s">%s
<OPTION VALUE="loan_return_confirm?barcode=%s">%s
""" % (_("Select an action"),
borrower_id, barcode, loan_id, recid, _("Renew"),
barcode, _("Return"))
if status == CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED:
out += """
<OPTION VALUE="change_due_date_step1?barcode=%s&borrower_id=%s" DISABLED>%s
""" % (barcode, borrower_id, _("Change due date"))
else:
out += """
<OPTION VALUE="change_due_date_step1?barcode=%s&borrower_id=%s">%s
""" % (barcode, borrower_id, _("Change due date"))
out += """
<OPTION VALUE="claim_book_return?borrower_id=%s&recid=%s&loan_id=%s&template=claim_return">%s
</SELECT>
</td>
<input type=hidden name=barcode value="%s">
<input type=hidden name=loan_id value="%s">
</tr>
""" % (borrower_id, recid, loan_id, _("Send recall"),
barcode, loan_id)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" align="right" width="100">
<input type=button onClick="location.href='%s/admin2/bibcirculation/get_borrower_loans_details?ln=%s&borrower_id=%s&renewal=true'"
value='%s' class='bibcircbutton' onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"></td>
</tr>
</table>
""" % (CFG_SITE_URL, ln,
borrower_id,
_("Renew all loans"))
out += """
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s'"
value='%s' class='formbutton'></td>
</tr>
</table>
<br />
</div>
</form>
""" % (CFG_SITE_URL, ln,
borrower_id,
_("Back"))
return out
def tmpl_bor_requests_historical_overview(self, req_hist_overview,
ln=CFG_SITE_LANG):
"""
Return the historical requests overview of a borrower.
@param req_hist_overview: list of old requests.
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
if len(req_hist_overview) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no requests."))
else:
out += """<div class="bibcircbottom">
<br /> <br />
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_requests').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<table id="table_requests" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Item"), _("Barcode"), _("Library"),
_("Location"), _("From"),
_("To"), _("Request date"))
for (recid, barcode, library_name,
location, req_from, req_to, req_date) in req_hist_overview:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """ <tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (title_link, barcode, library_name, location, req_from,
req_to, req_date)
out += """
</tbody>
</table>
<br />
"""
out += """
<table class="bibcirctable">
<tr>
<td><input type=button value='%s'
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_bor_loans_historical_overview(self, loans_hist_overview,
ln=CFG_SITE_LANG):
"""
Return the historical loans overview of a borrower.
@param loans_hist_overview: list of old loans.
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
if len(loans_hist_overview) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("There are no loans."))
else:
out += """<div class="bibcircbottom">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_loans').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<br /> <br />
<table id="table_loans" class="tablesorter"
border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("Item"),
_("Barcode"),
_("Library"),
_("Location"),
_("Loaned on"),
_("Due date"),
_("Returned on"),
_("Renewals"),
_("Overdue letters"))
for (recid, barcode, library_name, location, loaned_on, due_date,
returned_on, nb_renew,
nb_overdueletters) in loans_hist_overview:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """ <tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (title_link, barcode,
library_name, location,
loaned_on, due_date,
returned_on, nb_renew,
nb_overdueletters)
out += """
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_borrower_notification(self, email, subject, email_body, borrower_id,
from_address, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
if subject is None:
subject = ""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="borrower_notification"
action="%s/admin2/bibcirculation/borrower_notification"
method="get" >
<div class="bibcircbottom">
<input type=hidden name=borrower_id value="%s">
<input type=hidden name=from_address value="%s">
<br />
<table class="tablesortermedium" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="50">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="50">%s</th>
<td>%s</td>
</tr>
""" % (CFG_SITE_URL,
borrower_id,
from_address,
_("From"),
_("CERN Library"),
_("To"),
email)
out += """
<tr>
<th width="50">%s</th>
<td>
<input type="text" name="subject" size="60"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<table class="tablesortermedium" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="500">%s</th>
<th>%s</th>
</tr>
<tr>
<td>
<textarea rows="10" cols="100" name="message"
style='border: 1px solid #cfcfcf'>%s</textarea>
</td>
""" % (_("Subject"),
subject,
_("Message"),
_("Choose a template"),
email_body)
out += """
<td>
<select name="template" style='border: 1px solid #cfcfcf'>
<option value ="">%s</option>
<option value ="overdue_letter">%s</option>
<option value ="reminder">%s</option>
<option value ="notification">%s</option>
<option value ="claim_return">%s</option>
<option value ="ill_recall1">%s</option>
<option value ="proposal_acceptance">%s</option>
<option value ="proposal_refusal">%s</option>
<option value ="purchase_received_cash">%s</option>
<option value ="purchase_received_tid">%s</option>
</select>
<br />
<br />
<button type="submit" name="load_msg_template" value="True" class="formbutton">%s</button>
</td>
</tr>
</table>
""" % (_("Templates"),
_("Overdue letter"),
_("Reminder"),
_("Notification"),
_("Loan recall"),
_("ILL recall"),
_("Proposal-accept"),
_("Proposal-refuse"),
_("Purchase-received-cash"),
_("Purchase-received-TID"),
_("Load"))
out += """
<br /> <br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="reset" name="reset_button" value="%s" class="formbutton">
<input type="submit" name="send_message" value="%s" class="formbutton">
</td>
</tr>
</table>
<br /> <br />
</div>
</form>
""" % (_("Back"),
_("Reset"),
_("Send"))
return out
def tmpl_borrower_notes(self, borrower_notes, borrower_id,
ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
if borrower_notes and looks_like_dictionary(borrower_notes):
borrower_notes = eval(borrower_notes)
else:
borrower_notes = {}
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="borrower_notes"
action="%s/admin2/bibcirculation/get_borrower_notes"
method="post" >
<input type=hidden name=borrower_id value='%s'>
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td>
<table class="bibcircnotes">
""" % (CFG_SITE_URL, borrower_id,
_("Notes about borrower"))
key_array = sorted(borrower_notes.keys())
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_notes',
{'delete_key': key, 'borrower_id': borrower_id,
'ln': ln}, (_("[delete]")))
out += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160"
valign="top" align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, borrower_notes[key], delete_note)
out += """
</table>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">
<textarea name="library_notes" rows="5" cols="90"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_borrower_details?ln=%s&borrower_id=%s'"
value="%s" class='formbutton'>
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Write new note"),
CFG_SITE_URL, ln,
borrower_id,
_("Back"),
_("Confirm"))
return out
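The notes column parsed above stores the `repr()` of a Python dict, and `eval()` on database content will execute whatever is stored there. A minimal sketch of a safer parse using `ast.literal_eval`, which accepts only literals (`parse_notes` is a hypothetical helper, not part of this module):

```python
import ast

def parse_notes(serialized):
    """Parse a repr()-serialized notes dict, returning {} on bad input."""
    try:
        # literal_eval only accepts Python literals, never calls or names
        notes = ast.literal_eval(serialized) if serialized else {}
    except (ValueError, SyntaxError):
        notes = {}
    # reject anything that parsed but is not a dict
    return notes if isinstance(notes, dict) else {}
```

`literal_eval` raises `ValueError` on anything that is not a plain literal, so a stored payload such as `__import__('os')` is rejected instead of executed.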
def tmpl_add_new_borrower_step1(self, tup_infos=None, infos=None, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
if tup_infos:
(name, email, phone, address, mailbox, notes) = tup_infos
else:
(name, email, phone, address, mailbox, notes) = ('', '', '', '', '', '')
out = ''
if infos:
out += self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="add_new_borrower_step1_form"
action="%s/admin2/bibcirculation/add_new_borrower_step2"
method="get">
<br />
<br />
<table class="bibcirctable">
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="name" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="email" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="phone" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="address" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="mailbox" value="%s">
</td>
</tr>
<tr>
<td width="70" valign="top">%s</td>
<td class="bibcirccontent">
<textarea name="notes" rows="5" cols="39"
style='border: 1px solid #cfcfcf'>%s</textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" class="formbutton"
onClick="history.go(-1)">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL,
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Mailbox"), mailbox,
_("Notes"), notes,
_("Back"), _("Continue"))
return out
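Form values such as `name` and `email` are interpolated into `value="%s"` attributes above without escaping, so a quote character in the input breaks the markup. A minimal sketch of the attribute escaping these templates would need (`hidden_input` is a hypothetical helper, not part of this module):

```python
try:
    from html import escape   # Python 3
except ImportError:
    from cgi import escape    # Python 2

def hidden_input(name, value):
    """Render a hidden <input>, escaping the value for attribute context."""
    # the second argument asks escape() to also encode double quotes
    return '<input type="hidden" name="%s" value="%s">' % (
        escape(name, True), escape(str(value), True))
```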
def tmpl_add_new_borrower_step2(self, tup_infos, infos, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
(name, email, phone, address, mailbox, notes) = tup_infos
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="add_new_borrower_step2_form"
action="%s/admin2/bibcirculation/add_new_borrower_step3"
method="post" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
</table>
""" % (CFG_SITE_URL,
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Mailbox"), mailbox,
_("Notes"), notes)
if infos:
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (_("Back"))
else:
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)"
class="formbutton">
<input type="submit" value="%s" class="formbutton">
<input type=hidden name=tup_infos value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (_("Back"), _("Continue"), tup_infos)
return out
def tmpl_add_new_borrower_step3(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A new borrower has been registered."),
_("Back to home"),
CFG_SITE_URL, ln)
return out
def tmpl_update_borrower_info_step1(self, tup_infos, infos=None,
ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
(borrower_id, name, email, phone, address, mailbox) = tup_infos
display_id = borrower_id
id_string = _("ID")
out = ''
if infos:
out += self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="update_borrower_info_step1_form"
action="%s/admin2/bibcirculation/update_borrower_info_step2"
method="get" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="name" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="address" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="mailbox" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="email" value="%s">
</td>
</tr>
<tr>
<td width="70">%s</td>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf'
size=45 name="phone" value="%s">
<input type=hidden name=borrower_id value="%s">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s" onClick="history.go(-1)"
class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("Borrower information"),
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone,
borrower_id,
_("Back"), _("Continue"))
return out
###
### ILL/Purchase/Acquisition related templates.
### The naming of these methods is not intuitive; it should be improved
### and appropriate documentation added where required.
### The methods could also be refactored.
###
def tmpl_borrower_ill_details(self, result, borrower_id,
ln=CFG_SITE_LANG):
"""
@param result: ILL request's information
@type result: list
@param borrower_id: identify the borrower. Primary key of crcBORROWER.
@type borrower_id: int
@param ill_id: identify the ILL request. Primray key of crcILLREQUEST
@type ill_id: int
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_ill').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<div class="bibcircbottom">
<br />
<table id="table_ill" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (_("ILL ID"),
_("Item"),
_("Supplier"),
_("Request date"),
_("Expected date"),
_("Arrival date"),
_("Due date"),
_("Status"),
_("Library notes"))
for (ill_id, book_info, supplier_id, request_date,
expected_date, arrival_date, due_date, status,
library_notes, request_type) in result:
#get supplier name
if supplier_id:
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE or \
request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
library_name = db.get_vendor_name(supplier_id)
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_vendor_details',
{'vendor_id': supplier_id, 'ln': ln},
(library_name))
else:
library_name = db.get_library_name(supplier_id)
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': supplier_id, 'ln': ln},
(library_name))
else:
library_link = '-'
#get book title
if looks_like_dictionary(book_info):
book_info = eval(book_info)
else:
book_info = {}
try:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': book_info['recid'], 'ln': ln},
(book_title_from_MARC(int(book_info['recid']))))
except KeyError:
title_link = book_info['title']
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE or \
request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
ill_id_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/purchase_details_step1',
{'ill_request_id': str(ill_id), 'ln': ln},
str(ill_id))
else:
ill_id_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/ill_request_details_step1',
{'ill_request_id': str(ill_id), 'ln': ln},
str(ill_id))
# links to notes pages
lib_no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_ill_library_notes',
{'ill_id': ill_id}, (_("No notes")))
lib_see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_ill_library_notes',
{'ill_id': ill_id}, (_("Notes about this ILL")))
if library_notes == "":
notes_link = lib_no_notes_link
else:
notes_link = lib_see_notes_link
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (ill_id_link, title_link, library_link, request_date,
expected_date, arrival_date, due_date, status,
notes_link)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_borrower_details?borrower_id=%s&ln=%s'"
value='%s' class='formbutton'>
</td>
</tr>
</table>
<br />
</div>
""" % (CFG_SITE_URL,
borrower_id, ln,
_("Back"))
return out
def tmpl_ill_request_with_recid(self, recid, infos, ln=CFG_SITE_LANG):
"""
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@param infos: information
@type infos: list
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
today = datetime.date.today()
within_six_months = (datetime.date.today() + \
datetime.timedelta(days=182)).strftime('%Y-%m-%d')
out += """
<div align="center">
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="update_item_info_step4_form" action="%s/record/%s/holdings/ill_register_request_with_recid" method="post" >
<table class="bibcirctable">
<tr align="center">
<td><h1 class="headline">%s</h1></td>
</tr>
</table>
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<input type=hidden name=recid value='%s'>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<br />
<br />
""" % (CFG_SITE_URL, recid,
_('Interlibrary loan request for books'),
_("Item details"),
recid,
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
"""% CFG_SITE_URL
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="period_of_interest_from" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2" name="period_of_interest_to" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th valign="top" width="150">%s</th>
<td><textarea name='additional_comments' rows="6" cols="30"
style='border: 1px solid #cfcfcf'></textarea></td>
</tr>
</table>
<table class="bibcirctable">
<tr align="center">
<td>
<input name="conditions" type="checkbox" value="accepted" />%s</td>
</tr>
<tr align="center">
<td>
<input name="only_edition" type="checkbox" value="Yes" />%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (_("ILL request details"),
_("Period of interest - From"),
CFG_SITE_URL, today,
_("Period of interest - To"),
CFG_SITE_URL, within_six_months,
_("Additional comments"),
_("I accept the %(x_url_open)sconditions%(x_url_close)s of the service in particular the return of books in due time.") % {'x_url_open': '<a href="http://library.web.cern.ch/library/Library/ill_faq.html" target="_blank">', 'x_url_close': '</a>'},
_("I want this edition only."),
_("Back"), _("Continue"))
return out
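The two date pickers above default to today and to roughly six months ahead, approximated as 182 days; the computation is plain `datetime` arithmetic:

```python
import datetime

# default range for the period-of-interest pickers: today .. today + ~6 months
today = datetime.date.today().strftime('%Y-%m-%d')
within_six_months = (datetime.date.today()
                     + datetime.timedelta(days=182)).strftime('%Y-%m-%d')
```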
def tmpl_ill_register_request_with_recid(self, message, ln=CFG_SITE_LANG):
"""
@param message: information to the borrower
@type message: string
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
<br /> <br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
<tr>
<td class="bibcirccontent" width="30">%s</td>
</tr>
</table>
<br /> <br />
<table class="bibcirctable">
<td><input type=button onClick="location.href='%s'" value='%s' class='formbutton'></td>
</table>
<br /> <br />
""" % (message,
_("Check your library account %(here_link)s.") % {'here_link':
create_html_link(CFG_SITE_URL + '/yourloans/display',
{'ln': ln}, _("here"))},
CFG_SITE_URL,
_("Back to home"))
return out
def tmpl_register_ill_request_with_no_recid_step1(self, infos, borrower_id,
admin=True,
ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
"""
_ = gettext_set_language(ln)
if admin:
form_url = CFG_SITE_URL + '/admin2/bibcirculation/register_ill_request_with_no_recid_step2'
else:
form_url = CFG_SITE_URL+'/ill/book_request_step2'
out = self.tmpl_infobox(infos, ln)
if admin:
out += load_menu(ln)
out += """
<br />
<br />
<div class="bibcircbottom" align="center">
<div class="bibcircinfoboxmsg"><strong>%s<br />%s</strong></div>
<br />
<br />
""" % (_("Book does not exist in %(CFG_SITE_NAME)s") % \
{'CFG_SITE_NAME': CFG_SITE_NAME},
_("Please fill the following form."))
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="display_ill_form" action="%s" method="get">
""" % (form_url)
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (_("Item details"))
if borrower_id not in (None, ''):
out += """
<input type=hidden name=borrower_id value="%s">
""" % (borrower_id)
out += """
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="title" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="authors" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="place" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="publisher" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="year" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="edition" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="isbn" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
""" % (_("Book title"),
_("Author(s)"),
_("Place"),
_("Publisher"),
_("Year"),
_("Edition"),
_("ISBN"))
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<!--<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="budget_code" style='border: 1px solid #cfcfcf'>
</td>
</tr>-->
<tr>
<th valign="center" width="100">%s</th>
<td valign="center" width="250">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="period_of_interest_from" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
"""% (CFG_SITE_URL,
_("ILL request details"), _("Budget code"),
_("Period of interest (From)"),
CFG_SITE_URL,
datetime.date.today().strftime('%Y-%m-%d'))
out += """
<tr>
<th valign="top" width="100">%s</th>
<td width="250">
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2" name="period_of_interest_to" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td width="250"><textarea name='additional_comments' rows="6" cols="34"
style='border: 1px solid #cfcfcf'></textarea></td>
</tr>
</table>
<table class="bibcirctable">
<!--<tr>
<td>
<input name="conditions" type="checkbox" value="accepted" />%s</td>
</tr> -->
<tr align="center">
<td>
<input name="only_edition" type="checkbox" value="Yes" />%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (_("Period of interest (To)"), CFG_SITE_URL,
(datetime.date.today() + datetime.timedelta(days=365)).strftime('%Y-%m-%d'),
_("Additional comments"),
_("Borrower accepts the %(x_url_open)sconditions%(x_url_close)s of the service in particular the return of books in due time.") % {'x_url_open': '<a href="http://library.web.cern.ch/library/Library/ill_faq.html" target="_blank">', 'x_url_close': '</a>'},
_("Borrower wants this edition only."), _("Back"), _("Continue"))
return out
def tmpl_register_ill_request_with_no_recid_step2(self, book_info,
request_details, result,
key, string, infos, ln):
"""
@param book_info: book's information
@type book_info: tuple
@param request_details: details about a given request
@type request_details: tuple
@param result: borrower's information
@type result: list
@param key: field (name, email, etc...)
@param key: string
@param string: pattern
@type string: string
@param infos: information to be displayed in the infobox
@type infos: list
"""
(title, authors, place, publisher, year, edition, isbn) = book_info
(budget_code, period_of_interest_from, period_of_interest_to,
additional_comments, only_edition) = request_details
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step1_form1" action="%s/admin2/bibcirculation/register_ill_request_with_no_recid_step2" method="get" >
<br />
<table class="bibcirctable">
<tr>
<td width="500" valign='top'>
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=title value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=authors value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=place value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=year value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=publisher value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=edition value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=isbn value="%s">
</tr>
</table>
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<!--
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=budget_code value="%s">
</tr>
-->
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_from value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_to value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=additional_comments value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=only_edition value="%s">
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
<table>
<tr>
<td>
</td>
</tr>
</table>
</td>
""" % (CFG_SITE_URL,
_("Item details"),
_("Name"), title, title,
_("Author(s)"), authors, authors,
_("Place"), place, place,
_("Year"), year, year,
_("Publisher"), publisher, publisher,
_("Edition"), edition, edition,
_("ISBN"), isbn, isbn,
_("ILL request details"),
_("Budget code"), budget_code, budget_code,
_("Period of interest - From"),
period_of_interest_from, period_of_interest_from,
_("Period of interest - To"),
period_of_interest_to, period_of_interest_to,
_("Additional comments"),
additional_comments, additional_comments,
_("Only this edition."), only_edition, only_edition)
out += """
<td valign='top' align='center'>
<table>
"""
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % ('ccid', _('name'), _('email'))
elif key == 'name':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="ccid" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<tr>
<td align="center" class="bibcirctableheader">%s
""" % (_("Search borrower by"))
if key == 'email':
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_('id'), _('name'), _('email'))
elif key == 'id':
out += """
<input type="radio" name="key" value="id" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="string"
value='%s' style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (string or '', _("Search"))
if result:
out += """
<br />
<form name="step1_form2"
action="%s/admin2/bibcirculation/register_ill_request_with_no_recid_step3"
method="get" >
<input type=hidden name=title value="%s">
<input type=hidden name=authors value="%s">
<input type=hidden name=place value="%s">
<input type=hidden name=publisher value="%s">
<input type=hidden name=year value="%s">
<input type=hidden name=edition value="%s">
<input type=hidden name=isbn value="%s">
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="user_info" size="8"
style='border: 1px solid #cfcfcf; width:80%%'>
""" % (CFG_SITE_URL, title, authors, place,
publisher, year, edition, isbn)
for (borrower_id, ccid, name, email,
phone, address, mailbox) in result:
out += """
<option value ='%s,%s,%s,%s,%s,%s,%s'>%s
""" % (borrower_id, ccid, name, email, phone,
address, mailbox, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td align="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<!-- <input type=hidden name=budget_code value="%s"> -->
<input type=hidden name=period_of_interest_from value="%s">
<input type=hidden name=period_of_interest_to value="%s">
<input type=hidden name=additional_comments value="%s">
<input type=hidden name=only_edition value="%s">
</form>
""" % (_("Select user"), budget_code,
period_of_interest_from,
period_of_interest_to,
additional_comments,
only_edition)
out += """
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
"""
return out
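The three near-identical branches above that render the search-key radio buttons differ only in which input carries `checked`; they could be collapsed into a single loop. A sketch (`render_key_radios` is a hypothetical helper, not part of this module):

```python
def render_key_radios(keys, selected):
    """Render one radio input per (value, label) pair, checking the selected one."""
    parts = []
    for value, label in keys:
        checked = ' checked' if value == selected else ''
        parts.append('<input type="radio" name="key" value="%s"%s>%s'
                     % (value, checked, label))
    return '\n'.join(parts)
```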
def tmpl_register_ill_request_with_no_recid_step3(self, book_info,
user_info, request_details,
admin=True,
ln=CFG_SITE_LANG):
"""
@param book_info: book's information
@type book_info: tuple
@param user_info: user's information
@type user_info: tuple
@param request_details: details about a given request
@type request_details: tuple
"""
_ = gettext_set_language(ln)
if admin:
form_url = CFG_SITE_URL+'/admin2/bibcirculation/register_ill_request_with_no_recid_step4'
else:
form_url = CFG_SITE_URL+'/ill/book_request_step3'
(title, authors, place, publisher, year, edition, isbn) = book_info
(borrower_id, ccid, name, email, phone, address, mailbox) = user_info
display_id = borrower_id
id_string = _("ID")
if CFG_CERN_SITE == 1:
display_id = ccid
id_string = _("CCID")
(budget_code, period_of_interest_from, period_of_interest_to,
additional_comments, only_edition) = request_details
out = ""
if admin:
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step3_form1" action="%s" method="post" >
<br />
<table class="bibcirctable">
<tr>
<td width="200" valign='top'>
""" % (form_url)
out += """
<input type=hidden name=title value='%s'>
<input type=hidden name=authors value='%s'>
<input type=hidden name=place value='%s'>
<input type=hidden name=publisher value='%s'>
<input type=hidden name=year value='%s'>
<input type=hidden name=edition value='%s'>
<input type=hidden name=isbn value='%s'>
""" % (title, authors, place, publisher, year, edition, isbn)
out += """
<!-- <input type=hidden name=budget_code value='%s'> -->
<input type=hidden name=period_of_interest_from value='%s'>
<input type=hidden name=period_of_interest_to value='%s'>
<input type=hidden name=additional_comments value='%s'>
<input type=hidden name=only_edition value='%s'>
""" % (budget_code, period_of_interest_from,
period_of_interest_to, additional_comments, only_edition)
out += """ <table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
""" % (_("Item details"),
_("Title"), title,
_("Author(s)"), authors,
_("Place"), place,
_("Year"), year,
_("Publisher"), publisher,
_("Edition"), edition,
_("ISBN"), isbn)
out += """
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td width="50" valign='top'>
<table>
<tr>
<td>
</td>
</tr>
</table>
</td>
""" % (_("ILL request details"),
_("Budget code"), budget_code,
_("Period of interest (From)"), period_of_interest_from,
_("Period of interest (To)"), period_of_interest_to,
_("Additional comments"), additional_comments,
_("Only this edition"), only_edition)
out += """
<td width="200" valign='top'>
<table>
<tr align="center">
<td class="bibcirctableheader">%s</td>
<input type=hidden name=borrower_id value="%s">
</tr>
</table>
<table class="tablesorter" width="200" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Borrower details"), borrower_id,
id_string, display_id,
_("Name"), name,
_("Address"), address,
_("Mailbox"), mailbox,
_("Email"), email,
_("Phone"), phone)
out += """<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>""" % (_("Back"), _("Continue"))
return out
def tmpl_register_ill_book_request(self, infos, borrower_id,
ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class=bibcircbottom align="center">
<form name="search_form"
action="%s/admin2/bibcirculation/register_ill_book_request_result"
method="get" >
<br />
<br />
<div class="bibcircinfoboxmsg"><strong>%s</strong></div>
<br />
<input type=hidden name=start value="0">
<input type=hidden name=end value="10">
""" % (CFG_SITE_URL,
_("Check if the book already exists on %(CFG_SITE_NAME)s, before sending your ILL request.") % {'CFG_SITE_NAME': CFG_SITE_NAME})
if borrower_id is not None:
out += """
<input type=hidden name=borrower_id value="%s">
""" % (borrower_id)
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="" checked>%s
<input type="radio" name="f" value="barcode">%s
<input type="radio" name="f" value="author">%s
<input type="radio" name="f" value="title">%s
<br />
<br />
</td>
""" % (_("Search item by"), _("RecId/Item details"), _("barcode"),
_("author"), _("title"))
out += """
</tr>
<tr align="center">
<td>
<input type="text" size="50" name="p" id='p' style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("p").focus();
</script>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"), _("Search"))
return out
def tmpl_register_ill_book_request_result(self, result, borrower_id,
ln=CFG_SITE_LANG):
"""
@param result: book's information
@type result: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom" align="center">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("0 items found."))
if borrower_id is not None:
out += """
<input type=hidden name=borrower_id value="%s">
""" % (borrower_id)
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="search_form"
action="%s/admin2/bibcirculation/register_ill_request_with_no_recid_step1"
method="get" >
<br />
""" % (CFG_SITE_URL)
if borrower_id is not None and borrower_id != '':
out += """
<input type=hidden name=borrower_id value="%s">
""" % (borrower_id)
out += """
<table class="bibcirctable">
<tr align="center">
<td>
<strong>%s item(s) found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
""" % (len(result), _("Title"),
_("Author"), _("Publisher"),
_("# copies"))
for recid in result:
(book_author, book_editor,
book_copies) = get_item_info_for_search_result(recid)
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': recid, 'ln': ln},
(book_title_from_MARC(recid)))
out += """
<tr>
<td>%s</td>
<td>%s</td>
<td>%s</td>
<td>%s</td>
</tr>
""" % (title_link, book_author,
book_editor, book_copies)
out += """
</tbody>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Back"), _("Proceed anyway"))
return out
def tmpl_register_ill_article_request_step1(self, infos, admin=True,
ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
"""
_ = gettext_set_language(ln)
if admin:
form_url = CFG_SITE_URL + \
'/admin2/bibcirculation/register_ill_article_request_step2'
method = 'get'
else:
form_url = CFG_SITE_URL+'/ill/article_request_step2'
method = 'post'
out = self.tmpl_infobox(infos, ln)
if admin:
out += load_menu(ln)
out += """
<br />
<br />
<div class="bibcircbottom" align="center">
<br />
<br />
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
"""
out += """
<form name="display_ill_form" action="%s" method="%s">
""" % (form_url, method)
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="periodical_title"
id='periodical_title'
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("periodical_title").focus();
</script>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="article_title"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="author"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="report_number"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="volume"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="issue"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="page"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="year"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<!-- <tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="budget_code"
style='border: 1px solid #cfcfcf'>
</td>
</tr> -->
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="issn"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
""" % (_("Article details"),
_("Periodical title"),
_("Article title"),
_("Author(s)"),
_("Report number"),
_("Volume"),
_("Issue"),
_("Page"),
_("Year"),
_("Budget code"),
_("ISSN"))
out += """
<script type="text/javascript" language='JavaScript' src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1"
name="period_of_interest_from" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2"
name="period_of_interest_to" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th valign="top" width="150">%s</th>
<td>
<textarea name='additional_comments' rows="6" cols="30"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
</div>
""" % (CFG_SITE_URL, _("ILL request details"),
_("Period of interest - From"), CFG_SITE_URL,
datetime.date.today().strftime('%Y-%m-%d'),
_("Period of interest - To"), CFG_SITE_URL,
(datetime.date.today() + datetime.timedelta(days=365)).strftime('%Y-%m-%d'),
_("Additional comments"),
_("Back"), _("Continue"))
return out
def tmpl_register_ill_article_request_step2(self, article_info,
request_details, result, key, string,
infos, ln=CFG_SITE_LANG):
"""
@param article_info: information about the article
@type article_info: tuple
@param request_details: details about a given ILL request
@type request_details: tuple
@param result: result with borrower's information
@type result: list
@param key: field (name, email, etc.)
@type key: string
@param string: pattern
@type string: string
@param infos: information
@type infos: list
"""
(periodical_title, article_title, author, report_number,
volume, issue, page, year, issn) = article_info
(period_of_interest_from, period_of_interest_to, budget_code,
additional_comments) = request_details
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step1_form1"
action="%s/admin2/bibcirculation/register_ill_article_request_step2"
method="get" >
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="500" valign='top'>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=periodical_title value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=article_title value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=author value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=report_number value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=volume value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=issue value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=page value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=year value="%s">
</tr>
<!-- <tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=budget_code value="%s">
</tr> -->
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=issn value="%s">
</tr>
</table>
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_from value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_to value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=additional_comments value="%s">
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
</td>
""" % (CFG_SITE_URL,
_("Item details"),
_("Periodical title"), periodical_title, periodical_title,
_("Article title"), article_title, article_title,
_("Author(s)"), author, author,
_("Report number"), report_number, report_number,
_("Volume"), volume, volume,
_("Issue"), issue, issue,
_("Page"), page, page,
_("Year"), year, year,
_("Budget code"), budget_code, budget_code,
_("ISSN"), issn, issn,
_("ILL request details"),
_("Period of interest - From"),
period_of_interest_from, period_of_interest_from,
_("Period of interest - To"),
period_of_interest_to, period_of_interest_to,
_("Additional comments"),
additional_comments, additional_comments)
out += """
<td valign='top' align='center'>
<table>
"""
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if key == 'email':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % ('ccid', _('name'), _('email'))
elif key == 'name':
out += """
<input type="radio" name="key" value="ccid">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="ccid" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<tr>
<td align="center" class="bibcirctableheader">%s
""" % (_("Search borrower by"))
if key == 'email':
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email" checked>%s
""" % (_('id'), _('name'), _('email'))
elif key == 'id':
out += """
<input type="radio" name="key" value="id" checked>%s
<input type="radio" name="key" value="name">%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
else:
out += """
<input type="radio" name="key" value="id">%s
<input type="radio" name="key" value="name" checked>%s
<input type="radio" name="key" value="email">%s
""" % (_('id'), _('name'), _('email'))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="string"
value='%s' style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (string or '', _("Search"))
if result:
out += """
<br />
<form name="step1_form2"
action="%s/admin2/bibcirculation/register_ill_article_request_step3"
method="post" >
<input type=hidden name=periodical_title value="%s">
<input type=hidden name=article_title value="%s">
<input type=hidden name=author value="%s">
<input type=hidden name=report_number value="%s">
<input type=hidden name=volume value="%s">
<input type=hidden name=issue value="%s">
<input type=hidden name=page value="%s">
<input type=hidden name=year value="%s">
<input type=hidden name=issn value="%s">
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="user_info" size="8"
style='border: 1px solid #cfcfcf; width:40%%'>
""" % (CFG_SITE_URL, periodical_title, article_title,
author, report_number, volume, issue, page, year, issn)
for (borrower_id, ccid, name, email,
phone, address, mailbox) in result:
out += """
<option value='%s,%s,%s,%s,%s,%s,%s'>%s</option>
""" % (borrower_id, ccid, name, email,
phone, address, mailbox, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td align="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<input type=hidden name=period_of_interest_from value="%s">
<input type=hidden name=period_of_interest_to value="%s">
<input type=hidden name=budget_code value="%s">
<input type=hidden name=additional_comments value="%s">
</form>
""" % (_("Select user"),
period_of_interest_from, period_of_interest_to,
budget_code, additional_comments)
out += """
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
"""
return out
def tmpl_register_purchase_request_step1(self, infos, fields, admin=False,
ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
"""
_ = gettext_set_language(ln)
recid = ''
# If admin, redirect to the second step (where the user is selected)
if admin:
form_url = CFG_SITE_URL + \
'/admin2/bibcirculation/register_purchase_request_step2'
else:
form_url = CFG_SITE_URL+'/ill/purchase_request_step2'
if len(fields) == 7:
(request_type, recid, _budget_code, cash,
_period_of_interest_from, _period_of_interest_to,
_additional_comments) = fields
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(int(recid))
else:
(request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, _budget_code, cash,
_period_of_interest_from, _period_of_interest_to,
_additional_comments) = fields
if this_edition_only == 'Yes':
checked_edition = 'checked'
else:
checked_edition = ''
if cash:
checked_cash = 'checked'
else:
checked_cash = ''
out = ''
if admin:
out += load_menu(ln)
out += """<br />""" + self.tmpl_infobox(infos, ln)
if not admin:
out += """%s<br /><br />""" % _("We will process your order immediately and contact you \
as soon as the document is received.")
out += _("According to a decision from the Scientific Information Policy Board, \
books purchased with budget codes other than Team accounts will be added to the Library catalogue, \
with the indication of the purchaser.")
out += """
<div class="bibcircbottom" align="center">
<br />
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
"""
out += """
<form name="display_ill_form" action="%s" method="post">
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0"
cellpadding="0" cellspacing="1">
""" % (form_url, _("Document details"))
if recid:
out += """<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<input type=hidden name=recid value="%s">
<br />
""" % ( _("Title"), book_title,
_("Author(s)"), book_author,
_("Year"), book_year,
_("Publisher"), book_editor,
_("ISBN"), book_isbn,
recid)
else:
out += """<tr>
<th width="100">%s</th>
<td><SELECT name="type" style='border: 1px solid #cfcfcf'>
""" % _("Document type")
for purchase_type in CFG_BIBCIRCULATION_ACQ_TYPE:
if request_type == purchase_type or request_type == '':
out += """
<OPTION VALUE="%s" selected="selected">%s
""" % (purchase_type, purchase_type)
else:
out += """
<OPTION VALUE="%s">%s
""" % (purchase_type, purchase_type)
out += """ </SELECT></td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="title"
id='title' value='%s'
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("title").focus();
</script>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="45" name="authors"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="place"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="publisher"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="year"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="edition"
value='%s'
style='border: 1px solid #cfcfcf'>
<br />
<input name="this_edition_only"
type="checkbox" value="Yes" %s/>%s</td>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="isbn"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="standard_number"
value='%s'
style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
""" % (_("Title"), title,
_("Author(s)"), authors,
_("Place"), place,
_("Publisher"), publisher,
_("Year"), year,
_("Edition"), edition,
checked_edition, _("This edition only"),
_("ISBN"), isbn,
_("Standard number"), standard_number)
out += """
<script type="text/javascript" language='JavaScript'
src="%s/js/ui.datepicker.min.js"></script>
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<input type="text" size="30" name="budget_code"
style='border: 1px solid #cfcfcf'>
<input name="cash" type="checkbox" value="Yes" %s/>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1"
name="period_of_interest_from" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td>
<script type="text/javascript">
$(function(){
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2"
name="period_of_interest_to" value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th valign="top" width="150">%s</th>
<td>
<textarea name='additional_comments' rows="6" cols="30"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit" id="submit_request"
value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
</div>
""" % (CFG_SITE_URL, _("Request details"),
_("Budget code"), checked_cash, _("Cash"),
_("Period of interest - From"), CFG_SITE_URL,
datetime.date.today().strftime('%Y-%m-%d'),
_("Period of interest - To"), CFG_SITE_URL,
(datetime.date.today() + datetime.timedelta(days=365)).strftime('%Y-%m-%d'),
_("Additional comments"), _("Back"), _("Continue"))
return out
def tmpl_register_purchase_request_step2(self, infos, fields, result,
p, f, ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
@param fields: details of the purchase request
@type fields: tuple
@param result: result with borrower's information
@type result: list
@param p: search pattern
@param f: field
@param ln: language of the page
"""
recid = ''
if len(fields) == 7:
(request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments) = fields
(title, year, authors, isbn, publisher) = book_information_from_MARC(int(recid))
else:
(request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments) = fields
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<form name="step2_form1"
action="%s/admin2/bibcirculation/register_purchase_request_step2"
method="get">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="500" valign='top'>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=type value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=title value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=authors value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=isbn value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=publisher value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=year value="%s">
</tr>""" % ( CFG_SITE_URL,
_("Item details"),
_("Type"), request_type, request_type,
_("Title"), title, title,
_("Author(s)"), authors, authors,
_("ISBN"), isbn, isbn,
_("Publisher"), publisher, publisher,
_("Year"), year, year)
if not recid:
out += """<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=place value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=edition value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=standard_number value="%s">
</tr>""" % ( _("Place"), place, place,
_("Edition"), edition, edition,
_("Standard number"), standard_number, standard_number)
out += """</table>
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=budget_code value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_from value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=period_of_interest_to value="%s">
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td><input type=hidden name=additional_comments value="%s">
</tr>
</table>
</td>
<td width="200" align='center' valign='top'>
</td>
""" % (_("Request details"),
_("Budget code"), budget_code, budget_code,
_("Period of interest - From"),
period_of_interest_from, period_of_interest_from,
_("Period of interest - To"),
period_of_interest_to, period_of_interest_to,
_("Additional comments"),
additional_comments, additional_comments)
if recid:
out += """<input type=hidden name=recid value="%s">
""" % recid
out += """
<td valign='top' align='center'>
<table>
"""
if CFG_CERN_SITE == 1:
out += """
<tr>
<td class="bibcirctableheader" align="center">%s
""" % (_("Search user by"))
if f == 'email':
out += """
<input type="radio" name="f" value="ccid">%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="email" checked>%s
""" % ('ccid', _('name'), _('email'))
elif f == 'name':
out += """
<input type="radio" name="f" value="ccid">%s
<input type="radio" name="f" value="name" checked>%s
<input type="radio" name="f" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<input type="radio" name="f" value="ccid" checked>%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="email">%s
""" % ('ccid', _('name'), _('email'))
else:
out += """
<tr>
<td align="center" class="bibcirctableheader">%s
""" % (_("Search borrower by"))
if f == 'email':
out += """
<input type="radio" name="f" value="id">%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="email" checked>%s
""" % (_('id'), _('name'), _('email'))
elif f == 'id':
out += """
<input type="radio" name="f" value="id" checked>%s
<input type="radio" name="f" value="name">%s
<input type="radio" name="f" value="email">%s
""" % (_('id'), _('name'), _('email'))
else:
out += """
<input type="radio" name="f" value="id">%s
<input type="radio" name="f" value="name" checked>%s
<input type="radio" name="f" value="email">%s
""" % (_('id'), _('name'), _('email'))
out += """
<br><br>
</td>
</tr>
<tr>
<td align="center">
<input type="text" size="40" id="string" name="p"
value='%s' style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<td align="center">
<br>
<input type="submit" id="search_user" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
""" % (p or '', _("Search"))
if result:
out += """
<br />
<form name="step1_form2"
action="%s/admin2/bibcirculation/register_purchase_request_step3"
method="post" >
<input type=hidden name=type value="%s">
<input type=hidden name=title value="%s">
<input type=hidden name=authors value="%s">
<input type=hidden name=publisher value="%s">
<input type=hidden name=year value="%s">
""" % (CFG_SITE_URL, request_type, title, authors, publisher, year)
if recid:
out += """<input type=hidden name=recid value="%s">
""" % recid
else:
out += """<input type=hidden name=place value="%s">
<input type=hidden name=edition value="%s">
<input type=hidden name=this_edition_only value="%s">
<input type=hidden name=standard_number value="%s">
""" % (place, edition, this_edition_only, standard_number)
out += """<input type=hidden name=isbn value="%s">
<input type=hidden name=budget_code value="%s">
<input type=hidden name=cash value="%s">
<input type=hidden name=period_of_interest_from value="%s">
<input type=hidden name=period_of_interest_to value="%s">
<input type=hidden name=additional_comments value="%s">
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="borrower_id" size="8"
style='border: 1px solid #cfcfcf'>
""" % (isbn, budget_code, cash, period_of_interest_from,
period_of_interest_to, additional_comments)
for borrower_info in result:
borrower_id = borrower_info[0]
name = borrower_info[2]
out += """
<option value="%s">%s</option>
""" % (borrower_id, name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td align="center">
<input type="submit" id="select_user" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
""" % (_("Select user"))
out += """
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
"""
return out
def tmpl_ill_search(self, infos, ln=CFG_SITE_LANG):
"""
Display form for ILL search
@param infos: information
@type infos: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<link rel=\"stylesheet\" href=\"%(site_url)s/img/jquery-ui.css\" type=\"text/css\" />
<script type="text/javascript" language='JavaScript'
src="%(site_url)s/js/ui.datepicker.min.js"></script>
<form name="search_form"
action="%(site_url)s/admin2/bibcirculation/ill_search_result"
method="get" >
<br />
<br />
<br />
""" % {'site_url': CFG_SITE_URL}
out += """
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="f" value="title" checked>%s
<input type="radio" name="f" value="ILL_request_ID">%s
<input type="radio" name="f" value="cost">%s
<input type="radio" name="f" value="notes">%s
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center" width=10>
<td width=10>
<input type="text" size="50" name="p" id='p'
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("p").focus();
</script>
</td>
</tr>
</table>
""" % (_("Search ILL request by"), _("ILL RecId/Item details"),
_("ILL request id"), _("cost"), _("notes"))
out += """
<br />
<table align="center">
<tr align="center">
<td class="bibcirctableheader" align="right">%s: </td>
<td class="bibcirctableheader" align="right">%s</td>
<td align="left">
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker1" name="date_from"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr align="center">
<td class="bibcirctableheader" align="right"></td>
<td class="bibcirctableheader" align="right">%s</td>
<td align="left">
<script type="text/javascript">
$(function(){
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="12" id="date_picker2" name="date_to"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
""" % (_("date restriction"),
_("From"), CFG_SITE_URL, "the beginning",
_("To"), CFG_SITE_URL, "now")
out += """
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
<br />
</div>
</form>
""" % (_("Back"), _("Search"))
return out
def tmpl_ill_request_details_step1(self, ill_request_id,
ill_request_details, libraries,
ill_request_borrower_details,
ln=CFG_SITE_LANG):
"""
@param ill_request_id: identify the ILL request. Primary key of crcILLREQUEST
@type ill_request_id: int
@param ill_req_details: information about a given ILL request
@type ill_req_details: tuple
@param libraries: list of libraries
@type libraries: list
@param ill_status: status of an ILL request
@type ill_status: string
@param ill_request_borrower_details: borrower's information
@type ill_request_borrower_details: tuple
"""
book_statuses = [CFG_BIBCIRCULATION_ILL_STATUS_NEW,
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED,
CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_ILL_STATUS_RETURNED,
CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED]
article_statuses = [CFG_BIBCIRCULATION_ILL_STATUS_NEW,
CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED,
CFG_BIBCIRCULATION_ILL_STATUS_RECEIVED,
CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED]
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script type="text/javascript" language='JavaScript'
src="%s/js/ui.datepicker.min.js"></script>
"""% CFG_SITE_URL
(_borrower_id, borrower_name, borrower_email, borrower_mailbox,
period_from, period_to, item_info, borrower_comments,
only_this_edition, request_type) = ill_request_borrower_details
(library_id, request_date, expected_date, arrival_date, due_date,
return_date, cost, barcode, library_notes,
ill_status) = ill_request_details
if library_notes == '' or library_notes is None:
previous_library_notes = {}
else:
if looks_like_dictionary(library_notes):
previous_library_notes = eval(library_notes)
else:
previous_library_notes = {}
key_array = previous_library_notes.keys()
key_array.sort()
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
today = datetime.date.today()
within_a_week = (datetime.date.today()
+ datetime.timedelta(days=7)).strftime('%Y-%m-%d')
within_a_month = (datetime.date.today()
+ datetime.timedelta(days=30)).strftime('%Y-%m-%d')
notes = ''
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/ill_request_details_step1',
{'delete_key': key, 'ill_request_id': ill_request_id,
'ln': ln}, (_("[delete]")))
notes += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160" valign="top"
align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, previous_library_notes[key], delete_note)
if library_id:
library_name = db.get_library_name(library_id)
else:
library_name = '-'
try:
(book_title, book_year, book_author, book_isbn,
book_editor) = book_information_from_MARC(int(item_info['recid']))
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = """%s/img/book_cover_placeholder.gif
""" % (CFG_SITE_URL)
out += """
<form name="ill_req_form"
action="%s/admin2/bibcirculation/ill_request_details_step2" method="get" >
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<input type=hidden name=recid value="%s">
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL,
ill_request_id,
_("Item details"),
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn,
item_info['recid'],
str(book_cover))
except KeyError:
try:
book_cover = get_book_cover(item_info['isbn'])
except KeyError:
book_cover = """%s/img/book_cover_placeholder.gif
""" % (CFG_SITE_URL)
if str(request_type) == 'book':
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="ill_req_form"
action="%s/admin2/bibcirculation/ill_request_details_step2"
method="get">
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="800">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<textarea name='title' rows="2"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='authors' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='place' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='publisher' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='year' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='edition' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='isbn' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, ill_request_id,
_("Item details"),
_("Title"), item_info['title'],
_("Author(s)"), item_info['authors'],
_("Place"), item_info['place'],
_("Publisher"), item_info['publisher'],
_("Year"), item_info['year'],
_("Edition"), item_info['edition'],
_("ISBN"), item_info['isbn'],
str(book_cover))
# for articles
elif str(request_type) == 'article':
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="ill_req_form"
action="%s/admin2/bibcirculation/ill_request_details_step2" method="get" >
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="800">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td colspan="5">
<textarea name='periodical_title' rows="2"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td colspan="5">
<textarea name='title' rows="2"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td colspan="5">
<textarea name='authors' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='volume' rows="1"
style='width:91%%;
border: 1px solid #cfcfcf;'>%s</textarea>
</td>
<th width="50" align='right'>%s</th>
<td>
<textarea name='issue' rows="1"
style='width:91%%;
border: 1px solid #cfcfcf;'>%s</textarea>
</td>
<th width="50" align='right'>%s</th>
<td>
<textarea name='page' rows="1"
style='width:91%%;
border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td colspan="3">
<textarea name='place' rows="1"
style='width:96%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
<th width="50" align='right'>%s</th>
<td>
<textarea name='issn' rows="1"
style='width:91%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td colspan="3">
<textarea name='publisher' rows="1"
style='width:96%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
<th width="50" align='right'>%s</th>
<td>
<textarea name='year' rows="1"
style='width:91%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL,
ill_request_id,
_("Item details"),
_("Periodical Title"), item_info['periodical_title'],
_("Article Title"), item_info['title'],
_("Author(s)"), item_info['authors'],
_("Volume"), item_info['volume'],
_("Issue"), item_info['issue'],
_("Page"), item_info['page'],
_("Place"), item_info['place'],
_("ISSN"), item_info['issn'],
_("Publisher"), item_info['publisher'],
_("Year"), item_info['year'],
str(book_cover))
elif str(request_type) == 'acq-book':
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="ill_req_form"
action="%s/admin2/bibcirculation/ill_request_details_step2"
method="get">
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="800">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<textarea name='title' rows="2"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='authors' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='place' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='publisher' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='year' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='edition' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='isbn' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, ill_request_id,
_("Item details"),
_("Title"), item_info['title'],
_("Author(s)"), item_info['authors'],
_("Place"), item_info['place'],
_("Publisher"), item_info['publisher'],
_("Year"), item_info['year'],
_("Edition"), item_info['edition'],
_("ISBN"), item_info['isbn'],
str(book_cover))
else:
out += """Wrong type."""
out += """
<table class="bibcirctable">
<tr valign='top'>
<td width="550">
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td width="350"><i>%s</i></td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<td>
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
""" % (_("Borrower request"), _("Name"), borrower_name,
_("Email"), borrower_email,
_("Mailbox"), borrower_mailbox,
_("Period of interest (From)"), period_from,
_("Period of interest (To)"), period_to,
_("Borrower comments"), borrower_comments or '-',
_("Only this edition?"), only_this_edition or 'No',
_("ILL request details"))
out += """
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<input type=hidden name=new_status value="%s">
<th width="100">%s</th>
<td>
<select style='border: 1px solid #cfcfcf'
onchange="location = this.options[this.selectedIndex].value;">
""" % (ill_status, _("Status"))
statuses = []
if request_type == 'book':
statuses = book_statuses
elif request_type in CFG_BIBCIRCULATION_ACQ_TYPE:
statuses = CFG_BIBCIRCULATION_ACQ_STATUS
elif request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
statuses = CFG_BIBCIRCULATION_PROPOSAL_STATUS
elif request_type == 'article':
statuses = article_statuses
for status in statuses:
if status == ill_status:
out += """
<option value ="ill_request_details_step1?ill_request_id=%s&new_status=%s" selected>
%s
</option>
""" % (ill_request_id, status, status)
else:
out += """
<option value ="ill_request_details_step1?ill_request_id=%s&new_status=%s">
%s
</option>
""" % (ill_request_id, status, status)
out += """
</select>
</td>
</tr>
"""
#### NEW ####
if ill_status == CFG_BIBCIRCULATION_ILL_STATUS_NEW \
or ill_status is None \
or ill_status == '':
out += """
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td>
<table class="bibcircnotes">
""" % (_("ILL request ID"), ill_request_id,
_("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td>
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
############# REQUESTED ##############
elif ill_status == CFG_BIBCIRCULATION_ILL_STATUS_REQUESTED:
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
""" % (_("ILL request ID"), ill_request_id)
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">
<select name="library_id" style='border: 1px solid #cfcfcf'>
""" % (_("Library/Supplier"))
for (lib_id, name) in libraries:
if lib_id == library_id:
out += """ <option value="%s" selected>%s</option>
""" % (lib_id, name)
else:
out += """ <option value="%s">%s</option>
""" % (lib_id, name)
out += """
</select>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1"
name="request_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2"
name="expected_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">
<input type="text" size="12" name="cost"
value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("Request date"),
CFG_SITE_URL, today,
_("Expected date"),
CFG_SITE_URL, within_a_week,
_("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent"><input type="text" size="12" name="barcode"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td>
<table class="bibcircnotes">
""" % (_("Barcode"), barcode or 'No barcode asociated',
_("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="150">%s</th>
<td>
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### ON LOAN ##############
elif ill_status == CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN:
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
""" % (_("ILL request ID"), ill_request_id, _("Library"),
library_name, _("Request date"), request_date,
_("Expected date"), expected_date)
if str(arrival_date) == '0000-00-00':
date1 = today
else:
date1 = arrival_date
if str(due_date) == '0000-00-00':
date2 = within_a_month
else:
date2 = due_date
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="arrival_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2" name="due_date" value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="library_id" value="%s">
</td>
</tr>
""" % (_("Arrival date"), CFG_SITE_URL, date1, _("Due date"),
CFG_SITE_URL, date2, request_date, expected_date, library_id)
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent"><input type="text" size="12" name="cost" value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent"><input type="text" size="12" name="barcode" value="%s" style='border: 1px solid #cfcfcf'>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td>
<table class="bibcircnotes">
""" % (_("Barcode"), barcode, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td><textarea name='library_notes' rows="6" cols="74" style='border: 1px solid #cfcfcf'></textarea></td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### RETURNED ##############
elif ill_status == CFG_BIBCIRCULATION_ILL_STATUS_RETURNED or \
ill_status == CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED:
date1 = return_date
if ill_status == CFG_BIBCIRCULATION_ILL_STATUS_RETURNED and \
str(return_date) == '0000-00-00':
date1 = today
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="return_date"
value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="arrival_date" value="%s">
<input type="hidden" name="due_date" value="%s">
<input type="hidden" name="library_id" value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">
<input type="text" size="12" name="cost"
value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("ILL request ID"), ill_request_id, _("Library"),
library_name, _("Request date"), request_date, _("Expected date"),
expected_date, _("Arrival date"), arrival_date, _("Due date"),
due_date, _("Return date"), CFG_SITE_URL, date1, request_date,
expected_date, arrival_date, due_date, library_id, _("Cost"), cost)
out += """
</select>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td>
<table class="bibcircnotes">
""" % (_("Barcode"), barcode, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td>
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### RECEIVED ##############
elif ill_status == CFG_BIBCIRCULATION_ILL_STATUS_RECEIVED:
if str(arrival_date) == '0000-00-00':
date1 = today
else:
date1 = arrival_date
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1"
name="arrival_date" value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="library_id" value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent"><input type="text" size="12"
name="cost" value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("ILL request ID"), ill_request_id, _("Library"), library_name,
_("Request date"), request_date, _("Expected date"), expected_date,
_("Arrival date"), CFG_SITE_URL, date1, request_date, expected_date,
library_id, _("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td>
<table class="bibcircnotes">
""" % (_("Barcode"), barcode, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td>
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
###### END STATUSES ######
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</div>
</form>
<br />
<br />
""" % (_("Back"), _("Continue"))
return out
def tmpl_purchase_details_step1(self, ill_request_id,
ill_request_details, libraries,
ill_request_borrower_details,
ln=CFG_SITE_LANG):
"""Template for the details page of a purchase/acquisition request.
@param ill_request_id: identifier of the acquisition request
@param ill_request_details: tuple with the library-side request details
@param libraries: list of (vendor_id, vendor_name) tuples
@param ill_request_borrower_details: tuple with the borrower-side details
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script type="text/javascript" language='JavaScript'
src="%s/js/ui.datepicker.min.js"></script>
"""% CFG_SITE_URL
(_borrower_id, borrower_name, borrower_email,
borrower_mailbox, period_from, period_to,
item_info, borrower_comments, only_this_edition,
budget_code, request_type) = ill_request_borrower_details
(library_id, request_date, expected_date, arrival_date, due_date,
return_date, cost, _barcode, library_notes,
ill_status) = ill_request_details
if library_notes == '' or library_notes is None:
previous_library_notes = {}
else:
if looks_like_dictionary(library_notes):
previous_library_notes = eval(library_notes)
else:
previous_library_notes = {}
key_array = sorted(previous_library_notes.keys())
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
today = datetime.date.today()
within_a_week = (datetime.date.today()
+ datetime.timedelta(days=7)).strftime('%Y-%m-%d')
within_a_month = (datetime.date.today()
+ datetime.timedelta(days=30)).strftime('%Y-%m-%d')
notes = ''
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/purchase_details_step1',
{'delete_key': key, 'ill_request_id': ill_request_id,
'ln': ln}, (_("[delete]")))
notes += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160" valign="top"
align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, previous_library_notes[key], delete_note)
if library_id:
library_name = db.get_vendor_name(library_id)
else:
library_name = '-'
try:
(book_title, book_year, book_author, book_isbn,
book_editor) = book_information_from_MARC(int(item_info['recid']))
if book_isbn:
book_cover = get_book_cover(book_isbn)
else:
book_cover = """%s/img/book_cover_placeholder.gif
""" % (CFG_SITE_URL)
out += """
<form name="ill_req_form"
action="%s/admin2/bibcirculation/purchase_details_step2" method="get" >
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="400">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
<input type=hidden name=recid value="%s">
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL,
ill_request_id,
_("Item details"),
_("Name"),
book_title,
_("Author(s)"),
book_author,
_("Year"),
book_year,
_("Publisher"),
book_editor,
_("ISBN"),
book_isbn,
item_info['recid'],
str(book_cover))
except KeyError:
try:
book_cover = get_book_cover(item_info['isbn'])
except KeyError:
book_cover = """%s/img/book_cover_placeholder.gif
""" % (CFG_SITE_URL)
if str(request_type) in CFG_BIBCIRCULATION_ACQ_TYPE or \
str(request_type) in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<form name="ill_req_form"
action="%s/admin2/bibcirculation/purchase_details_step2"
method="get">
<div class="bibcircbottom">
<input type=hidden name=ill_request_id value="%s">
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader" width="10">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr valign='top'>
<td width="800">
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>
<textarea name='title' rows="2"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='authors' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='place' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='publisher' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='year' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='edition' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='isbn' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td>
<textarea name='standard_number' rows="1"
style='width:98%%; border: 1px solid #cfcfcf;'>%s</textarea>
</td>
</tr>
</table>
</td>
<td class="bibcirccontent">
<img style='border: 1px solid #cfcfcf' src="%s" alt="Book Cover"/>
</td>
</tr>
</table>
<br />
""" % (CFG_SITE_URL, ill_request_id,
_("Item details"),
_("Title"), item_info['title'],
_("Author(s)"), item_info['authors'],
_("Place"), item_info['place'],
_("Publisher"), item_info['publisher'],
_("Year"), item_info['year'],
_("Edition"), item_info['edition'],
_("ISBN"), item_info['isbn'],
_("Standard number"), item_info['standard_number'],
str(book_cover))
else:
out += """Wrong type."""
out += """
<table class="bibcirctable">
<tr valign='top'>
<td width="550">
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>""" % (_("Borrower request"), _("Name"), borrower_name,
_("Email"), borrower_email,
_("Mailbox"), borrower_mailbox)
if request_type not in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
out += """<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>""" % (_("Period of interest (From)"), period_from,
_("Period of interest (To)"), period_to,
_("Only this edition?"), only_this_edition or 'No')
else:
out += """<tr>
<th width="150">%s</th>
<td>%s</td>
</tr>""" % (_("Date of request"), period_from)
out += """<tr>
<th width="150">%s</th>
<td width="350"><i>%s</i></td>
</tr>
</table>
</td>
<td>
<table>
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table> """ % (_("Borrower comments"), borrower_comments or '-',
_("Request details"))
out += """
<table class="tablesorter" border="0" cellpadding="0" cellspacing="1">
<tr>
<input type=hidden name=new_status value="%s">
<th width="100">%s</th>
<td colspan="3">
<select style='border: 1px solid #cfcfcf'
onchange="location = this.options[this.selectedIndex].value;">
""" % (ill_status, _("Status"))
statuses = []
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE:
statuses = CFG_BIBCIRCULATION_ACQ_STATUS
elif request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
statuses = CFG_BIBCIRCULATION_PROPOSAL_STATUS
for status in statuses:
if status == ill_status:
out += """
<option value ="purchase_details_step1?ill_request_id=%s&new_status=%s" selected>
%s
</option>
""" % (ill_request_id, status, status)
else:
out += """
<option value ="purchase_details_step1?ill_request_id=%s&new_status=%s">
%s
</option>
""" % (ill_request_id, status, status)
out += """
</select>
</td>
</tr>
"""
######## NEW ########
if ill_status == CFG_BIBCIRCULATION_ACQ_STATUS_NEW \
or ill_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_NEW \
or ill_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE \
or ill_status is None \
or ill_status == '':
out += """
<tr>
<th width="150">%s</th>
<td>%s</td>
<th width="150">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="150">%s</th>
<td colspan="3">
<input type="text" size="12"
name="budget_code"
value="%s"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td colspan="3">
<table class="bibcircnotes">
""" % (_("ILL request ID"), ill_request_id,
_("Type"), request_type,
_("Budget code"), budget_code,
_("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td colspan="3">
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
############# ON ORDER ##############
elif ill_status == CFG_BIBCIRCULATION_ACQ_STATUS_ON_ORDER \
or ill_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_ON_ORDER:
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent">%s</td>
<th width="150">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
""" % (_("ILL request ID"), ill_request_id,
_("Type"), request_type)
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent" colspan="3">
<select name="library_id" style='border: 1px solid #cfcfcf'>
""" % (_("Vendor"))
for (lib_id, name) in libraries:
if lib_id == library_id:
out += """ <option value="%s" selected>%s</option>
""" % (lib_id, name)
else:
out += """ <option value="%s">%s</option>
""" % (lib_id, name)
out += """
</select>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function(){
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1"
name="request_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2"
name="expected_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<input type="text" size="12" name="cost"
value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("Request date"),
CFG_SITE_URL, today,
_("Expected date"),
CFG_SITE_URL, within_a_week,
_("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<input type="text" size="12" name="budget_code"
value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td colspan="3">
<table class="bibcircnotes">
""" % (_("Budget code"), budget_code,
_("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="150">%s</th>
<td colspan="3">
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### PARTIAL RECEIPT ##############
elif ill_status == CFG_BIBCIRCULATION_ACQ_STATUS_PARTIAL_RECEIPT:
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
""" % (_("ILL request ID"), ill_request_id,
_("Type"), request_type, _("Library"),
library_name, _("Request date"), request_date,
_("Expected date"), expected_date)
if str(arrival_date) == '0000-00-00':
date1 = today
else:
date1 = arrival_date
if str(due_date) == '0000-00-00':
date2 = within_a_month
else:
date2 = due_date
out += """
<tr>
<th width="150">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="arrival_date" value="%s" style='border: 1px solid #cfcfcf'>
</td>
</tr>
<tr>
<th width="150">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function() {
$("#date_picker2").datepicker({dateFormat: 'yy-mm-dd', showOn: 'button', buttonImage: "%s/img/calendar.gif", buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker2" name="due_date" value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="library_id" value="%s">
</td>
</tr>
""" % (_("Arrival date"), CFG_SITE_URL, date1, _("Due date"),
CFG_SITE_URL, date2, request_date, expected_date, library_id)
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<input type="text" size="12" name="cost" value="%s"
style='border: 1px solid #cfcfcf'>
""" % (_("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<input type="text" size="12" name="budget_code"
value="%s" style='border: 1px solid #cfcfcf'>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td colspan="3">
<table class="bibcircnotes">
""" % (_("Budget code"), budget_code, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td colspan="3">
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### CANCELLED ##############
elif ill_status == CFG_BIBCIRCULATION_ACQ_STATUS_CANCELLED:
date1 = return_date
if ill_status == CFG_BIBCIRCULATION_ILL_STATUS_RETURNED and \
        str(return_date) == '0000-00-00':
date1 = today
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1" name="return_date"
value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="arrival_date" value="%s">
<input type="hidden" name="due_date" value="%s">
<input type="hidden" name="library_id" value="%s">
<input type="hidden" name="budget_code" value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<input type="text" size="12" name="cost"
value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("ILL request ID"), ill_request_id, _("Type"), request_type,
_("Library"), library_name, _("Request date"), request_date,
_("Expected date"), expected_date, _("Arrival date"),
arrival_date, _("Due date"), due_date, _("Return date"),
CFG_SITE_URL, date1, request_date, expected_date, arrival_date,
due_date, library_id, budget_code, _("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td colspan="3">
<table class="bibcircnotes">
""" % (_("Budget code"), budget_code, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td colspan="3">
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
##### RECEIVED ##############
elif ill_status == CFG_BIBCIRCULATION_ACQ_STATUS_RECEIVED \
or ill_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_RECEIVED:
if str(arrival_date) == '0000-00-00':
date1 = today
else:
date1 = arrival_date
out += """
<tr>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
<th width="100">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3">
<script type="text/javascript">
$(function() {
$("#date_picker1").datepicker({dateFormat: 'yy-mm-dd',
showOn: 'button', buttonImage: "%s/img/calendar.gif",
buttonImageOnly: true});
});
</script>
<input type="text" size="10" id="date_picker1"
name="arrival_date" value="%s" style='border: 1px solid #cfcfcf'>
<input type="hidden" name="request_date" value="%s">
<input type="hidden" name="expected_date" value="%s">
<input type="hidden" name="library_id" value="%s">
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3"><input type="text" size="12"
name="cost" value="%s" style='border: 1px solid #cfcfcf'>
""" % (_("ILL request ID"), ill_request_id, _("Type"), request_type,
_("Library"), library_name, _("Request date"), request_date,
_("Expected date"), expected_date, _("Arrival date"),
CFG_SITE_URL, date1, request_date, expected_date, library_id,
_("Cost"), cost)
out += """
</td>
</tr>
<tr>
<th width="100">%s</th>
<td class="bibcirccontent" colspan="3"><input type="text" size="12"
name="budget_code" value="%s" style='border: 1px solid #cfcfcf'></td>
</tr>
<tr>
<th width="100" valign="top">%s</th>
<td colspan="3">
<table class="bibcircnotes">
""" % (_("Budget code"), budget_code, _("Previous notes"))
out += notes
out += """
</table>
</td>
</tr>
<tr>
<th valign="top" width="100">%s</th>
<td colspan="3">
<textarea name='library_notes' rows="6" cols="74"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % (_("Library notes"))
###### END STATUSES ######
out += """
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
</div>
</form>
<br />
<br />
""" % (_("Back"), _("Continue"))
return out
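The status branches above each repeat the same check: if the stored date is the MySQL zero-date sentinel `'0000-00-00'`, fall back to a default such as `today` or `within_a_month`. A minimal sketch of that fallback as a standalone helper (the name `pick_date` is new, not part of the original module):

```python
def pick_date(stored_date, fallback):
    """Return the stored date unless it is the MySQL zero-date sentinel."""
    if str(stored_date) == '0000-00-00':
        return fallback
    return stored_date
```

With such a helper, `date1 = pick_date(arrival_date, today)` would replace each four-line `if`/`else` block.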
def tmpl_ill_notes(self, ill_notes, ill_id, ln=CFG_SITE_LANG):
"""
@param ill_notes: notes about an ILL request
@type ill_notes: dictionary
@param ill_id: identifies the ILL request. Primary key of crcILLREQUEST
@type ill_id: int
"""
_ = gettext_set_language(ln)
if not ill_notes:
ill_notes = {}
else:
if looks_like_dictionary(ill_notes):
ill_notes = eval(ill_notes)
else:
ill_notes = {}
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="borrower_notes"
action="%s/admin2/bibcirculation/get_ill_library_notes"
method="get" >
<input type=hidden name=ill_id value='%s'>
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td>
<table class="bibcircnotes">
""" % (CFG_SITE_URL, ill_id,
_("Notes about ILL"))
key_array = sorted(ill_notes.keys())
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_ill_library_notes',
{'delete_key': key, 'ill_id': ill_id, 'ln': ln},
(_("[delete]")))
out += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160" valign="top"
align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, ill_notes[key], delete_note)
out += """
</table>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">
<textarea name="library_notes" rows="5" cols="90"
style='border: 1px solid #cfcfcf'>
</textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Write new note"),
_("Back"),
_("Confirm"))
return out
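`tmpl_ill_notes` deserializes the stored notes column with `eval()` guarded by the `looks_like_dictionary()` heuristic. A safer sketch of the same parsing using only the standard library; `parse_notes` is a hypothetical name, and `ast.literal_eval` rejects anything that is not a plain Python literal, so arbitrary expressions in the stored string raise instead of executing:

```python
import ast

def parse_notes(raw_notes):
    """Parse a serialized notes dict; return {} for anything malformed."""
    if not raw_notes:
        return {}
    try:
        parsed = ast.literal_eval(raw_notes)
    except (ValueError, SyntaxError):
        return {}
    return parsed if isinstance(parsed, dict) else {}
```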
####
#### Templates for the display of "Lists" ####
####
def tmpl_list_ill(self, ill_req, infos=[], ln=CFG_SITE_LANG):
"""
@param ill_req: information about ILL requests
@type ill_req: tuple
"""
_ = gettext_set_language(ln)
out = """ """
if infos:
    out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_ill').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<div class="bibcircbottom">
<br />
<table id="table_ill" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("Borrower"),
_("Item"),
_("Supplier"),
_("Status"),
_("ID"),
_("Interest from"),
_("Due date"),
_("Type"),
_("Option(s)"))
for (ill_request_id, borrower_id, borrower_name, library_id,
ill_status, period_from, _period_to, due_date, item_info,
request_type) in ill_req:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
if library_id:
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE or \
request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
library_name = db.get_vendor_name(library_id)
else:
library_name = db.get_library_name(library_id)
else:
library_name = '-'
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
try:
title = book_title_from_MARC(int(item_info['recid']))
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': item_info['recid'], 'ln': ln}, title)
except KeyError:
if request_type in ['book'] + CFG_BIBCIRCULATION_ACQ_TYPE + CFG_BIBCIRCULATION_PROPOSAL_TYPE:
title = item_info['title']
else:
title = item_info['periodical_title']
title_link = title
out += """
<tr>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td align="center">
""" % (borrower_link, title_link, library_name, ill_status,
ill_request_id, period_from, due_date or '-',
request_type)
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE or \
request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
out += """
<input type=button onClick="location.href='%s/admin2/bibcirculation/purchase_details_step1?ill_request_id=%s'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>
</td>
</tr>
""" % (CFG_SITE_URL, ill_request_id, _('select'))
else:
out += """
<input type=button onClick="location.href='%s/admin2/bibcirculation/ill_request_details_step1?ill_request_id=%s'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>
""" % (CFG_SITE_URL, ill_request_id, _('select'))
# Create a link for a manual recall.
if ill_status == CFG_BIBCIRCULATION_ILL_STATUS_ON_LOAN:
subject = _("Inter library loan recall: ") + str(title)
out += """
<input type=button onClick="location.href='%s/admin2/bibcirculation/borrower_notification?borrower_id=%s&subject=%s&load_msg_template=True&template=%s&from_address=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'" value="%s" class='bibcircbutton'>
""" % (CFG_SITE_URL, borrower_id, subject, 'ill_recall1', CFG_BIBCIRCULATION_ILLS_EMAIL, _('Send Recall'))
out += """</td>
</tr>
"""
out += """
</tbody>
</table>
</div>
"""
return out
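Each list template above repeats the same fallback: link the title through its `recid` when one exists, otherwise show the stored title string. A sketch of that logic with stand-in callables (the real code uses `book_title_from_MARC` and `create_html_link`; `resolve_title` is a new name introduced here for illustration):

```python
def resolve_title(item_info, lookup_title, make_link):
    """Link to the record when a recid exists, else show the raw title."""
    try:
        recid = int(item_info['recid'])
    except (KeyError, ValueError):
        return item_info.get('title', '-')
    return make_link(recid, lookup_title(recid))
```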
def tmpl_list_purchase(self, purchase_reqs, ln=CFG_SITE_LANG):
"""
@param purchase_reqs: information about purchase requests
@type purchase_reqs: tuple
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_ill').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<div class="bibcircbottom">
<br />
<table id="table_ill" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("Borrower"),
_("Item"),
_("No. purchases"),
_("Supplier"),
_("Cost"),
_("Status"),
_("ID"),
_("Date requested"),
_("Type"),
_("Options"))
for (ill_request_id, borrower_id, borrower_name, vendor_id, ill_status,
period_from, _period_to, _due_date, item_info, cost, request_type,
no_purchases) in purchase_reqs:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
if vendor_id:
vendor_name = db.get_vendor_name(vendor_id)
else:
vendor_name = '-'
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
try:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': item_info['recid'], 'ln': ln},
(book_title_from_MARC(int(item_info['recid']))))
except KeyError:
title_link = item_info['title']
out += """
<tr>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td align="center">
""" % (borrower_link, title_link, no_purchases,
vendor_name, cost, ill_status, ill_request_id,
period_from, request_type)
if request_type in CFG_BIBCIRCULATION_ACQ_TYPE or \
request_type in CFG_BIBCIRCULATION_PROPOSAL_TYPE:
out += """
<input type=button id=select_purchase onClick="location.href='%s/admin2/bibcirculation/purchase_details_step1?ill_request_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>
</td>
</tr>
""" % (CFG_SITE_URL, ill_request_id, _('select'))
else:
out += """
<input type=button id=select_ill onClick="location.href='%s/admin2/bibcirculation/ill_request_details_step1?ill_request_id=%s'"
onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>
</td>
</tr>
""" % (CFG_SITE_URL, ill_request_id, _('select'))
out += """
</tbody>
</table>
</div>
"""
return out
def tmpl_list_proposal(self, proposals, ln=CFG_SITE_LANG):
"""
@param proposals: Information about proposals
@type proposals: tuple
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_ill').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<div class="bibcircbottom">
<br />
<table id="table_ill" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("ID"),
_("Proposal date"),
_("Proposer"),
_("Requests"),
_("Title"),
_("Status"),
_("Supplier"),
_("Cost"),
_("Type"),
_("Options"))
for (ill_request_id, borrower_id, borrower_name, vendor_id,
ill_status, _barcode, period_from, _period_to,
item_info, cost, request_type, number_of_requests) in proposals:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': borrower_id, 'ln': ln},
(borrower_name))
if vendor_id:
vendor_name = db.get_vendor_name(vendor_id)
else:
vendor_name = '-'
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
try:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': item_info['recid'], 'ln': ln},
(book_title_from_MARC(int(item_info['recid']))))
except KeyError:
title_link = item_info['title']
try:
hold_requests_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_requests_details',
{'recid': item_info['recid'], 'ln': ln},
str(number_of_requests))
except KeyError:
hold_requests_link = str(number_of_requests)
out += """
<tr>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td align="center">
""" % (ill_request_id, period_from, borrower_link,
hold_requests_link, title_link, ill_status,
vendor_name, cost, request_type)
out += """ <input type=button onClick="location.href='%s/admin2/bibcirculation/purchase_details_step1?ill_request_id=%s'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>""" % (CFG_SITE_URL, ill_request_id, _('select'))
if ill_status == CFG_BIBCIRCULATION_PROPOSAL_STATUS_PUT_ASIDE:
out += """ <input type=button onClick="location.href='%s/admin2/bibcirculation/register_ill_from_proposal?ill_request_id=%s'" onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'" value="%s" class='bibcircbutton'>""" % (CFG_SITE_URL, ill_request_id, _('Create ILL req'))
out += """</td></tr>"""
out += """
</tbody>
</table>
</div>
"""
return out
def tmpl_list_requests_on_put_aside_proposals(self, requests, ln=CFG_SITE_LANG):
"""
Template for the display of additional requests on proposals which are
'put aside'.
@param requests: information about requests on 'put aside' proposals
@type requests: tuple
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<script src="/js/tablesorter/jquery.tablesorter.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function() {
$('#table_ill').tablesorter({widthFixed: true, widgets: ['zebra']})
});
</script>
<div class="bibcircbottom">
<br />
<table id="table_ill" class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<thead>
<tr>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
<th>%s</th>
</tr>
</thead>
<tbody>
"""% (_("Req.ID"),
_("Requester"),
_("Period of Interest: From"),
_("Period of Interest: To"),
_("Title"),
_("Cost"),
_("Options"))
for (ill_id, req_id, bor_id, bor_name, period_from, period_to,
item_info, cost) in requests:
borrower_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_borrower_details',
{'borrower_id': bor_id, 'ln': ln},
(bor_name))
if looks_like_dictionary(item_info):
item_info = eval(item_info)
else:
item_info = {}
try:
title_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_item_details',
{'recid': item_info['recid'], 'ln': ln},
(book_title_from_MARC(int(item_info['recid']))))
except KeyError:
title_link = item_info['title']
out += """
<tr>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td align="center">
""" % (req_id, borrower_link, period_from, period_to,
title_link, cost)
out += """ <input type=button onClick="location.href='%s/admin2/bibcirculation/purchase_details_step1?ill_request_id=%s'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class='bibcircbutton'>""" % (CFG_SITE_URL, ill_id, _('Go to Proposal'))
out += """ <input type=button onClick="location.href='%s/admin2/bibcirculation/register_ill_from_proposal?ill_request_id=%s&bor_id=%s'" onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'" value="%s" class='bibcircbutton'>""" % (CFG_SITE_URL, ill_id, bor_id, _('Create ILL req'))
out += """</td></tr>"""
out += """
</tbody>
</table>
</div>
"""
return out
###
### "Library" related templates ###
###
def tmpl_merge_libraries_step1(self, library_details, library_items,
result, p, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
"""
(library_id, name, address, email, phone,
lib_type, notes) = library_details
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id},
(_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id},
(_("Notes about this library")))
if notes == "" or str(notes) == '{}':
notes_link = no_notes_link
else:
notes_link = see_notes_link
out += """
<table class="bibcirctable">
<tbody>
<tr>
<td align="left" valign="top" width="300">
<table class="bibcirctable">
<tr>
<td width="200" class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
</td>
""" % (_("Library to be deleted"),
_("Name"), name,
_("Address"), address,
_("Email"), email,
_("Phone"), phone,
_("Type"), lib_type,
_("Notes"), notes_link,
_("No of items"), len(library_items))
out += """
<td width="200" align="center" valign="top">
<td valign="top" align='left'>
<form name="search_library_step1_form"
action="%s/admin2/bibcirculation/merge_libraries_step1"
method="get" >
<input type=hidden name=library_id value="%s">
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="f"
value="name" checked>%s
<input type="radio" name="f"
value="email">%s
<br />
<br />
</td>
</tr>
<tr align="center">
<td>
<input type="text" size="45" name="p"
style='border: 1px solid #cfcfcf'
value="%s">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
""" % (CFG_SITE_URL, library_id, _("Search library"),
_("name"), _("email"), p or '', _("Search"))
if result:
out += """
<br />
<form name="form2"
action="%s/admin2/bibcirculation/merge_libraries_step2"
method="get">
<table class="bibcirctable">
<tr width="200">
<td align="center">
<select name="library_to" size="12"
style='border: 1px
solid #cfcfcf; width:77%%'>
""" % (CFG_SITE_URL)
for (library_to, library_name) in result:
if library_to != library_id:
out += """
<option value="%s">%s</option>
""" % (library_to, library_name)
out += """
</select>
</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td ALIGN="center">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<input type=hidden name=library_from value="%s">
</form>
""" % (_("Select library"), library_id)
out += """
</td>
</tr>
</tbody>
</table>
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_merge_libraries_step2(self, library_from_details,
library_from_items, library_to_details,
library_to_items, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
"""
try:
(library_id_1, name_1, address_1, email_1,
phone_1, type_1, notes_1) = library_from_details
found_1 = True
except (TypeError, ValueError):
found_1 = False
try:
(library_id_2, name_2, address_2, email_2,
phone_2, type_2, notes_2) = library_to_details
found_2 = True
except (TypeError, ValueError):
found_2 = False
if found_1:
no_notes_link_1 = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id_1},
(_("No notes")))
see_notes_link_1 = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id_1},
(_("Notes about this library")))
if notes_1 == "" or str(notes_1) == '{}':
notes_link_1 = no_notes_link_1
else:
notes_link_1 = see_notes_link_1
if found_2:
no_notes_link_2 = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id_2},
(_("No notes")))
see_notes_link_2 = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id_2},
(_("Notes about this library")))
if notes_2 == "" or str(notes_2) == '{}':
notes_link_2 = no_notes_link_2
else:
notes_link_2 = see_notes_link_2
if found_1 and found_2:
out += """
<br />
<div class="infoboxmsg">
<strong>
%s
</strong>
</div>
<br />
""" % (_("Please note that this action is NOT reversible"))
out += """
<table class="bibcirctable">
<tbody>
<tr>
"""
if found_1:
out += """
<td align="left" valign="top" width="300">
<table class="bibcirctable">
<tr>
<td width="200" class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
""" % (_("Library to be deleted"),
_("Name"), name_1,
_("Address"), address_1,
_("Email"), email_1,
_("Phone"), phone_1,
_("Type"), type_1,
_("Notes"), notes_link_1,
_("No of items"), len(library_from_items))
else:
out += """
<td align="left" valign="middle" width="300">
<div class="infoboxmsg">%s</div>
""" % (_("Library not found"))
out += """
</td>
<td width="200" align="center" valign="middle">
<strong>==></strong>
</td>
"""
if found_2:
out += """
<td align="left" valign="top" width="300">
<table class="bibcirctable">
<tr>
<td width="200" class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorter" border="0"
cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
""" % (_("Merged library"),
_("Name"), name_2,
_("Address"), address_2,
_("Email"), email_2,
_("Phone"), phone_2,
_("Type"), type_2,
_("Notes"), notes_link_2,
_("No of items"), len(library_to_items))
else:
out += """
<td align="left" valign="middle" width="300">
<div class="infoboxmsg">%s</div>
""" % (_("Library not found"))
out += """
</td>
</tr>
</tbody>
</table>
<br />
<br />
<form name="form1" action="%s/admin2/bibcirculation/merge_libraries_step3"
method="get">
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
""" % (CFG_SITE_URL, _("Back"))
if found_1 and found_2:
out += """
<input type=hidden name=library_from value="%s">
<input type=hidden name=library_to value="%s">
<input type="submit" value='%s' class="formbutton">
""" % (library_id_1, library_id_2, _("Confirm"))
out += """
</td>
</tr>
</table>
</form>
<br />
<br />
<br />
</div>
"""
return out
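The merge templates build the same "No notes" / "Notes about this library" link three times with identical logic. A hypothetical extraction of that pattern; `library_notes_link` and its callable parameters are new names, not part of the original API (the real link builder is `create_html_link` and `_` is the gettext function, as in the surrounding code):

```python
def library_notes_link(base_url, library_id, notes, make_link, _):
    """Build the notes link shown next to a library in the merge screens."""
    if notes == "" or str(notes) == '{}':
        label = _("No notes")
    else:
        label = _("Notes about this library")
    return make_link(base_url + '/admin2/bibcirculation/get_library_notes',
                     {'library_id': library_id}, label)
```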
def tmpl_add_new_library_step1(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="add_new_library_step1_form"
action="%s/admin2/bibcirculation/add_new_library_step2" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf'
size=50 name="name" id='name'>
<script type="text/javascript">
document.getElementById("name").focus();
</script>
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50 name="email">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50 name="phone">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50
name="address">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<select name="type" style='border: 1px solid #cfcfcf'>
""" % (CFG_SITE_URL, _("New library information"), _("Name"),
_("Email"), _("Phone"), _("Address"), _("Type"))
for lib in CFG_BIBCIRCULATION_LIBRARY_TYPE:
out += """
<option value="%s">%s</option>
""" % (lib, lib)
out += """
</select>
</td>
</tr>
<tr>
<th width="70" valign="top">%s</th>
<td>
<textarea name="notes" rows="5" cols="39"
style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (_("Notes"), _("Back"), _("Continue"))
return out
def tmpl_add_new_library_step2(self, tup_infos, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
(name, email, phone, address, lib_type, notes) = tup_infos
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="add_new_library_step2_form"
action="%s/admin2/bibcirculation/add_new_library_step3" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<td width="70">%s</td> <td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
value="%s"
onClick="history.go(-1)"
class="formbutton">
<input type="submit"
value="%s"
class="formbutton">
<input type=hidden name=name value="%s">
<input type=hidden name=email value="%s">
<input type=hidden name=phone value="%s">
<input type=hidden name=address value="%s">
<input type=hidden name=lib_type value="%s">
<input type=hidden name=notes value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("New library information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Type"), lib_type,
_("Notes"), notes,
_("Back"), _("Confirm"),
name, email, phone, address, lib_type, notes)
return out
def tmpl_add_new_library_step3(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value='%s'
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A new library has been registered."),
_("Back to home"),
CFG_SITE_URL, ln)
return out
def tmpl_update_library_info_step1(self, infos, ln=CFG_SITE_LANG):
"""
Template for the admin interface. Search library.
@param infos: information for the infobox
@type infos: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<form name="update_library_info_step1_form"
action="%s/admin2/bibcirculation/update_library_info_step2"
method="get" >
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="column" value="name" checked>%s
<input type="radio" name="column" value="email">%s
<br>
<br>
</td>
</tr>
""" % (CFG_SITE_URL,
_("Search library by"),
_("name"),
_("email"))
out += """
<tr align="center">
<td>
<input type="text" size="45" name="string" id='string'
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("string").focus();
</script>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br /><br />
<br />
<br />
</div>
""" % (_("Back"), _("Search"))
return out
def tmpl_update_library_info_step2(self, result, ln=CFG_SITE_LANG):
"""
@param result: search result
@type result: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s libraries found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesortersmall" border="0"
cellpadding="0" cellspacing="1">
<th align="center">%s</th>
""" % (len(result), _("Libraries"))
for (library_id, name) in result:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/update_library_info_step3',
{'library_id': library_id, 'ln': ln}, (name))
out += """
<tr align="center">
<td class="bibcirccontent" width="70">%s
<input type=hidden name=library_id value="%s"></td>
</tr>
""" % (library_link, library_id)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td><input type=button value="%s"
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_update_library_info_step3(self, library_info, ln=CFG_SITE_LANG):
"""
@param library_info: information about the library to be updated
@type library_info: tuple
@param ln: language
"""
_ = gettext_set_language(ln)
(library_id, name, address, email, phone,
lib_type, _notes) = library_info
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="update_library_info_step3_form"
action="%s/admin2/bibcirculation/update_library_info_step4" method="get" >
<input type=hidden name=library_id value="%s">
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50
name="name" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50
name="email" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50
name="phone" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<input type="text" style='border: 1px solid #cfcfcf' size=50
name="address" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td>
<select name="lib_type" style='border: 1px solid #cfcfcf'>
""" % (CFG_SITE_URL, library_id, _("Library information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Type"))
for lib in CFG_BIBCIRCULATION_LIBRARY_TYPE:
if lib == lib_type:
out += """
<option value="%s" selected="selected">%s</option>
""" % (lib, lib)
else:
out += """
<option value="%s">%s</option>
""" % (lib, lib)
out += """
</select>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (_("Back"), _("Continue"))
return out
def tmpl_update_library_info_step4(self, tup_infos, ln=CFG_SITE_LANG):
"""
@param tup_infos: updated library information
@type tup_infos: tuple
@param ln: language
"""
(library_id, name, email, phone, address, lib_type) = tup_infos
_ = gettext_set_language(ln)
out = load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="update_library_info_step4_form" action="%s/admin2/bibcirculation/update_library_info_step5" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th> <td>%s</td>
</tr>
<tr>
<th width="70">%s</th> <td>%s</td>
</tr>
<tr>
<th width="70">%s</th> <td>%s</td>
</tr>
<tr>
<th width="70">%s</th> <td>%s</td>
</tr>
<tr>
<th width="70">%s</th> <td>%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
<input type=hidden name=library_id value="%s">
<input type=hidden name=name value="%s">
<input type=hidden name=email value="%s">
<input type=hidden name=phone value="%s">
<input type=hidden name=address value="%s">
<input type=hidden name=lib_type value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("Library information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Type"), lib_type,
_("Back"), _("Continue"),
library_id, name, email, phone, address, lib_type)
return out
def tmpl_update_library_info_step5(self, ln=CFG_SITE_LANG):
"""
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("The information has been updated."),
_("Back to home"),
CFG_SITE_URL, ln)
return out
def tmpl_search_library_step1(self, infos, ln=CFG_SITE_LANG):
"""
Template for the admin interface. Search library.
@param infos: information for the infobox
@type infos: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<form name="search_library_step1_form"
action="%s/admin2/bibcirculation/search_library_step2"
method="get" >
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="column" value="name" checked>%s
<input type="radio" name="column" value="email">%s
<br>
<br>
</td>
</tr>
""" % (CFG_SITE_URL,
_("Search library by"),
_("name"), _("email"))
out += """
<tr align="center">
<td>
<input type="text" size="45" name="string" id="string"
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("string").focus();
</script>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
<br />
<br />
</div>
""" % (_("Back"),
_("Search"))
return out
def tmpl_search_library_step2(self, result, ln=CFG_SITE_LANG):
"""
@param result: search result about libraries
@type result: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
if len(result) == 0:
out += """
<div class="bibcircbottom">
<br />
<div class="bibcircinfoboxmsg">%s</div>
<br />
""" % (_("0 libraries found."))
else:
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s libraries found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<th align="center">%s</th>
""" % (len(result), _("Libraries"))
for (library_id, name) in result:
library_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_details',
{'library_id': library_id, 'ln': ln}, (name))
out += """
<tr align="center">
<td width="70">%s
<input type=hidden name=library_id value="%s">
</td>
</tr>
""" % (library_link, library_id)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_library_details(self, library_details, library_items,
ln=CFG_SITE_LANG):
"""
@param library_details: details about a given library
@type library_details: tuple
@param library_items: items belonging to the library
@type library_items: list
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom">
<br />
"""
(library_id, name, address, email, phone,
lib_type, notes) = library_details
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id},
(_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'library_id': library_id},
(_("Notes about this library")))
if notes == "" or str(notes) == '{}':
notes_link = no_notes_link
else:
notes_link = see_notes_link
out += """
<table class="bibcirctable">
<tr>
<td width="80" class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
<tr>
<th width="100">%s</th>
<td>%s</td>
</tr>
</table>
<table>
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/update_library_info_step3?ln=%s&library_id=%s'"
onmouseover="this.className='bibcircbuttonover'"
onmouseout="this.className='bibcircbutton'"
value="%s" class="bibcircbutton">
<a href="%s/admin2/bibcirculation/merge_libraries_step1?ln=%s&library_id=%s">%s</a>
</td>
</tr>
</table>
""" % (_("Library details"),
_("Name"), name,
_("Address"), address,
_("Email"), email,
_("Phone"), phone,
_("Type"), lib_type,
_("Notes"), notes_link,
_("No of items"), len(library_items),
CFG_SITE_URL, ln, library_id, _("Update"),
CFG_SITE_URL, ln, library_id, _("Duplicated library?"))
out += """
</table>
<br />
<br />
<table class="bibcirctable">
<tr>
<td><input type=button value='%s'
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_library_notes(self, library_notes, library_id,
ln=CFG_SITE_LANG):
"""
@param library_notes: notes about a library
@type library_notes: dictionary
@param library_id: identify the library. Primary key of crcLIBRARY
@type library_id: int
@param ln: language of the page
"""
_ = gettext_set_language(ln)
if not library_notes:
library_notes = {}
else:
if looks_like_dictionary(library_notes):
library_notes = eval(library_notes)
else:
library_notes = {}
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="library_notes" action="%s/admin2/bibcirculation/get_library_notes" method="get" >
<input type=hidden name=library_id value='%s'>
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td>
<table class="bibcircnotes">
""" % (CFG_SITE_URL, library_id,
_("Notes about library"))
key_array = library_notes.keys()
key_array.sort()
for key in key_array:
delete_note = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_library_notes',
{'delete_key': key, 'library_id': library_id,
'ln': ln}, (_("[delete]")))
out += """<tr class="bibcirccontent">
<td class="bibcircnotes" width="160" valign="top" align="center"><b>%s</b></td>
<td width="400"><i>%s</i></td>
<td width="65" align="center">%s</td>
</tr>
""" % (key, library_notes[key], delete_note)
out += """
</table>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">
<textarea name="library_notes" rows="5" cols="90" style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_library_details?ln=%s&library_id=%s'"
value="%s" class='formbutton'>
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (_("Write new note"),
CFG_SITE_URL, ln,
library_id,
_("Back"),
_("Confirm"))
return out
###
### "Vendor" related templates ###
###
def tmpl_add_new_vendor_step1(self, ln=CFG_SITE_LANG):
"""
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="add_new_vendor_step1_form" action="%s/admin2/bibcirculation/add_new_vendor_step2" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="name">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="email">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="phone">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="address">
</td>
</tr>
<tr>
<th width="70" valign="top">%s</th>
<td class="bibcirccontent">
<textarea name="notes" rows="5" cols="39" style='border: 1px solid #cfcfcf'></textarea>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("New vendor information"), _("Name"),
_("Email"), _("Phone"), _("Address"), _("Notes"),
_("Back"), _("Continue"))
return out
def tmpl_add_new_vendor_step2(self, tup_infos, ln=CFG_SITE_LANG):
"""
@param tup_infos: borrower's information
@type tup_infos: tuple
@param ln: language
"""
_ = gettext_set_language(ln)
(name, email, phone, address, notes) = tup_infos
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="add_new_vendor_step2_form" action="%s/admin2/bibcirculation/add_new_vendor_step3" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s" onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
<input type=hidden name=name value="%s">
<input type=hidden name=email value="%s">
<input type=hidden name=phone value="%s">
<input type=hidden name=address value="%s">
<input type=hidden name=notes value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("New vendor information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Notes"), notes,
_("Back"), _("Confirm"),
name, email, phone, address, notes)
return out
def tmpl_add_new_vendor_step3(self, ln=CFG_SITE_LANG):
"""
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("A new vendor has been registered."),
_("Back to home"),
CFG_SITE_URL, ln)
return out
def tmpl_update_vendor_info_step1(self, infos, ln=CFG_SITE_LANG):
"""
@param infos: information
@type infos: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<form name="update_vendor_info_step1_form"
action="%s/admin2/bibcirculation/update_vendor_info_step2"
method="get" >
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="column" value="name" checked>%s
<input type="radio" name="column" value="email">%s
<br>
<br>
</td>
</tr>
""" % (CFG_SITE_URL,
_("Search vendor by"),
_("name"),
_("email"))
out += """
<tr align="center">
<td>
<input type="text" size="45" name="string"
style='border: 1px solid #cfcfcf'>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value="%s" class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
<br />
<br />
</div>
""" % (_("Back"), _("Search"))
return out
def tmpl_update_vendor_info_step2(self, result, ln=CFG_SITE_LANG):
"""
@param result: search result
@type result: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s vendor(s) found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr align="center">
<th>%s</th>
</tr>
""" % (len(result), _("Vendor(s)"))
for (vendor_id, name) in result:
vendor_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/update_vendor_info_step3',
{'vendor_id': vendor_id, 'ln': ln}, (name))
out += """
<tr align="center">
<td class="bibcirccontent" width="70">%s
<input type=hidden name=vendor_id value="%s"></td>
</tr>
""" % (vendor_link, vendor_id)
out += """
</table>
<br />
"""
out += """
<table class="bibcirctable">
<tr align="center">
<td><input type=button value="%s"
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_update_vendor_info_step3(self, vendor_info, ln=CFG_SITE_LANG):
"""
@param vendor_info: information about a given vendor
@type vendor_info: tuple
@param ln: language
"""
_ = gettext_set_language(ln)
(vendor_id, name, address, email, phone, _notes) = vendor_info
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="update_vendor_info_step3_form" action="%s/admin2/bibcirculation/update_vendor_info_step4" method="get" >
<input type=hidden name=vendor_id value="%s">
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="name" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="email" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="phone" value="%s">
</td>
</tr>
<tr>
<th width="70">%s</th>
<td class="bibcirccontent">
<input type="text" style='border: 1px solid #cfcfcf' size=45 name="address" value="%s">
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, vendor_id, _("Vendor information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Back"), _("Continue"))
return out
def tmpl_update_vendor_info_step4(self, tup_infos, ln=CFG_SITE_LANG):
"""
@param tup_infos: information about a given vendor
@type tup_infos: tuple
@param ln: language
"""
(vendor_id, name, email, phone, address) = tup_infos
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<form name="update_vendor_info_step4_form" action="%s/admin2/bibcirculation/update_vendor_info_step5" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="70">%s</th> <td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton">
<input type="submit"
value="%s" class="formbutton">
<input type=hidden name=vendor_id value="%s">
<input type=hidden name=name value="%s">
<input type=hidden name=email value="%s">
<input type=hidden name=phone value="%s">
<input type=hidden name=address value="%s">
</td>
</tr>
</table>
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL, _("Vendor information"),
_("Name"), name,
_("Email"), email,
_("Phone"), phone,
_("Address"), address,
_("Back"), _("Continue"),
vendor_id, name, email, phone, address)
return out
def tmpl_update_vendor_info_step5(self, ln=CFG_SITE_LANG):
"""
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button value="%s"
onClick="location.href='%s/admin2/bibcirculation/loan_on_desk_step1?ln=%s'"
class="formbutton">
</td>
</tr>
</table>
<br />
<br />
</div>
""" % (_("The information has been updated."),
_("Back to home"),
CFG_SITE_URL, ln)
return out
def tmpl_search_vendor_step1(self, infos, ln=CFG_SITE_LANG):
"""
@param infos: information for the infobox.
@type infos: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = self.tmpl_infobox(infos, ln)
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<br />
<br />
<br />
<form name="search_vendor_step1_form"
action="%s/admin2/bibcirculation/search_vendor_step2"
method="get" >
<table class="bibcirctable">
<tr align="center">
<td class="bibcirctableheader">%s
<input type="radio" name="column" value="name" checked>%s
<input type="radio" name="column" value="email">%s
<br>
<br>
</td>
</tr>
""" % (CFG_SITE_URL,
_("Search vendor by"),
_("name"),
_("email"))
out += """
<tr align="center">
<td>
<input type="text" size="45" name="string" id='string'
style='border: 1px solid #cfcfcf'>
<script language="javascript" type="text/javascript">
document.getElementById("string").focus();
</script>
</td>
</tr>
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value='%s'
onClick="history.go(-1)" class="formbutton">
<input type="submit" value='%s' class="formbutton">
</td>
</tr>
</table>
</form>
<br />
<br />
<br />
<br />
</div>
""" % (_("Back"), _("Search"))
return out
def tmpl_search_vendor_step2(self, result, ln=CFG_SITE_LANG):
"""
@param result: search result
@type result: list
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
<div class="bibcircbottom" align="center">
<br />
<table class="bibcirctable">
<tr align="center">
<td class="bibcirccontent">
<strong>%s vendor(s) found</strong>
</td>
</tr>
</table>
<br />
<table class="tablesortersmall" border="0" cellpadding="0" cellspacing="1">
<tr align="center">
<th>%s</th>
</tr>
""" % (len(result), _("Vendor(s)"))
for (vendor_id, name) in result:
vendor_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_vendor_details',
{'vendor_id': vendor_id, 'ln': ln}, (name))
out += """
<tr align="center">
<td class="bibcirccontent" width="70">%s
<input type=hidden name=vendor_id value="%s"></td>
</tr>
""" % (vendor_link, vendor_id)
out += """
</table>
<br />
<table class="bibcirctable">
<tr align="center">
<td>
<input type=button value="%s"
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_vendor_details(self, vendor_details, ln=CFG_SITE_LANG):
"""
@param vendor_details: details about a given vendor
@type vendor_details: tuple
@param ln: language of the page
"""
_ = gettext_set_language(ln)
out = """
"""
out += load_menu(ln)
out += """
<div class="bibcircbottom" align="center">
<br />
<style type="text/css"> @import url("/css/tablesorter.css"); </style>
"""
(vendor_id, name, address, email, phone, notes) = vendor_details
no_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_vendor_notes',
{'vendor_id': vendor_id},
(_("No notes")))
see_notes_link = create_html_link(CFG_SITE_URL +
'/admin2/bibcirculation/get_vendor_notes',
{'vendor_id': vendor_id},
(_("Notes about this vendor")))
if notes == "" or str(notes) == '{}':
notes_link = no_notes_link
else:
notes_link = see_notes_link
out += """
<table class="bibcirctable">
<tr align="center">
<td width="80" class="bibcirctableheader">%s</td>
</tr>
</table>
<table class="tablesorterborrower" border="0" cellpadding="0" cellspacing="1">
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
<tr>
<th width="80">%s</th>
<td class="bibcirccontent">%s</td>
</tr>
</table>
<table>
<tr>
<td><input type=button onClick="location.href='%s/admin2/bibcirculation/update_vendor_info_step3?vendor_id=%s'" onmouseover="this.className='bibcircbuttonover'" onmouseout="this.className='bibcircbutton'"
value="%s" class="bibcircbutton">
</td>
</tr>
</table>
""" % (_("Vendor details"),
_("Name"), name,
_("Address"), address,
_("Email"), email,
_("Phone"), phone,
_("Notes"), notes_link,
CFG_SITE_URL, vendor_id, _("Update"))
out += """
</table>
<br />
<br />
<table class="bibcirctable">
<tr align="center">
<td><input type=button value='%s'
onClick="history.go(-1)" class="formbutton"></td>
</tr>
</table>
<br />
<br />
<br />
</div>
""" % (_("Back"))
return out
def tmpl_vendor_notes(self, vendor_notes, vendor_id, add_notes,
ln=CFG_SITE_LANG):
"""
@param vendor_notes: notes about a vendor
@param vendor_id: identify the vendor. Primary key of crcVENDOR
@type vendor_id: int
@param add_notes: display the form to add a new note
@param ln: language
"""
_ = gettext_set_language(ln)
out = """ """
out += load_menu(ln)
out += """
<div class="bibcircbottom">
<form name="vendor_notes" action="%s/admin2/bibcirculation/get_vendor_notes" method="get" >
<br />
<br />
<table class="bibcirctable">
<tr>
<td class="bibcirctableheader">%s</td>
</tr>
""" % (CFG_SITE_URL,
_("Notes about this vendor"))
notes = vendor_notes.split('\n')
for values in notes:
out += """ <tr>
<td class="bibcirccontent">%s</td>
</tr>
""" % (values)
if add_notes:
out += """
<tr>
<td><textarea name='new_note' rows="10" cols="60" style='border: 1px solid #cfcfcf'></textarea></td>
</tr>
<tr>
<td>
<input type='submit' name='confirm_note' value='%s' class='formbutton'>
<input type=hidden name=vendor_id value="%s">
</td>
</tr>
</table>
""" % (_("Confirm"), vendor_id)
else:
out += """
<tr><td></td></tr>
<tr><td></td></tr>
<tr><td></td></tr>
<tr>
<td>
<input type='submit' name='add_notes' value='%s' class='formbutton'>
<input type=hidden name=vendor_id value="%s">
</td>
</tr>
</table>
""" % (_("Add notes"), vendor_id)
out += """
<br />
<br />
<table class="bibcirctable">
<tr>
<td>
<input type=button
onClick="location.href='%s/admin2/bibcirculation/get_vendor_details?vendor_id=%s&ln=%s'"
value="%s" class='formbutton'>
</td>
</tr>
</table>
<br />
<br />
<br />
</form>
</div>
""" % (CFG_SITE_URL,
vendor_id, ln,
_("Back"))
return out
diff --git a/invenio/legacy/bibcirculation/utils.py b/invenio/legacy/bibcirculation/utils.py
index c11a39839..1c70450eb 100644
--- a/invenio/legacy/bibcirculation/utils.py
+++ b/invenio/legacy/bibcirculation/utils.py
@@ -1,953 +1,953 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibCirculation Utils: Auxiliary methods of BibCirculation """
__revision__ = "$Id$"
import datetime
import random
import re
import time
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.utils.url import make_invenio_opener
from invenio.legacy.search_engine import get_field_tags
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.utils.text import encode_for_xml
from invenio.base.i18n import gettext_set_language
from invenio.config import CFG_SITE_URL, CFG_TMPDIR, CFG_SITE_LANG
-import invenio.bibcirculation_dblayer as db
-from invenio.bibcirculation_config import \
+import invenio.legacy.bibcirculation.db_layer as db
+from invenio.legacy.bibcirculation.config import \
CFG_BIBCIRCULATION_WORKING_DAYS, \
CFG_BIBCIRCULATION_HOLIDAYS, \
CFG_CERN_SITE, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, \
CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS, \
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING, \
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING, \
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED, \
CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED
DICC_REGEXP = re.compile(r"^\{('[^']*': ?('[^']*'|\"[^\"]+\"|[0-9]*|None)(, ?'[^']*': ?('[^']*'|\"[^\"]+\"|[0-9]*|None))*)?\}$")
BIBCIRCULATION_OPENER = make_invenio_opener('BibCirculation')
def search_user(column, string):
if string is not None:
string = string.strip()
if CFG_CERN_SITE == 1:
if column == 'name':
result = db.search_borrower_by_name(string)
else:
if column == 'email':
try:
result = db.search_borrower_by_email(string)
except Exception:
result = ()
else:
try:
result = db.search_borrower_by_ccid(string)
except Exception:
result = ()
if result == ():
- from invenio.bibcirculation_cern_ldap \
+ from invenio.legacy.bibcirculation.cern_ldap \
import get_user_info_from_ldap
ldap_info = 'busy'
while ldap_info == 'busy':
time.sleep(1)
if column == 'id' or column == 'ccid':
ldap_info = get_user_info_from_ldap(ccid=string)
elif column == 'email':
ldap_info = get_user_info_from_ldap(email=string)
else:
ldap_info = get_user_info_from_ldap(nickname=string)
if len(ldap_info) == 0:
result = ()
else:
try:
name = ldap_info['displayName'][0]
except KeyError:
name = ""
try:
email = ldap_info['mail'][0]
except KeyError:
email = ""
try:
phone = ldap_info['telephoneNumber'][0]
except KeyError:
phone = ""
try:
address = ldap_info['physicalDeliveryOfficeName'][0]
except KeyError:
address = ""
try:
mailbox = ldap_info['postOfficeBox'][0]
except KeyError:
mailbox = ""
try:
ccid = ldap_info['employeeID'][0]
except KeyError:
ccid = ""
try:
db.new_borrower(ccid, name, email, phone,
address, mailbox, '')
except Exception:
pass
result = db.search_borrower_by_ccid(int(ccid))
else:
if column == 'name':
result = db.search_borrower_by_name(string)
elif column == 'email':
result = db.search_borrower_by_email(string)
else:
result = db.search_borrower_by_id(string)
return result
def update_user_info_from_ldap(user_id):
- from invenio.bibcirculation_cern_ldap import get_user_info_from_ldap
+ from invenio.legacy.bibcirculation.cern_ldap import get_user_info_from_ldap
ccid = db.get_borrower_ccid(user_id)
ldap_info = get_user_info_from_ldap(ccid=ccid)
if not ldap_info:
result = ()
else:
try:
name = ldap_info['displayName'][0]
except KeyError:
name = ""
try:
email = ldap_info['mail'][0]
except KeyError:
email = ""
try:
phone = ldap_info['telephoneNumber'][0]
except KeyError:
phone = ""
try:
address = ldap_info['physicalDeliveryOfficeName'][0]
except KeyError:
address = ""
try:
mailbox = ldap_info['postOfficeBox'][0]
except KeyError:
mailbox = ""
db.update_borrower(user_id, name, email, phone, address, mailbox)
result = db.search_borrower_by_ccid(int(ccid))
return result
def get_book_cover(isbn):
"""
Retrieve book cover using Amazon web services.
@param isbn: book's isbn
@type isbn: string
@return book cover
"""
from xml.dom import minidom
# connect to AWS
"""cover_xml = BIBCIRCULATION_OPENER.open('http://ecs.amazonaws.com/onca/xml' \
'?Service=AWSECommerceService&AWSAccessKeyId=' \
+ CFG_BIBCIRCULATION_AMAZON_ACCESS_KEY + \
'&Operation=ItemSearch&Condition=All&' \
'ResponseGroup=Images&SearchIndex=Books&' \
'Keywords=' + isbn)"""
cover_xml = ""  # AWS lookup disabled above; parsing fails and the placeholder cover is returned
# parse XML
try:
xml_img = minidom.parse(cover_xml)
retrieve_book_cover = xml_img.getElementsByTagName('MediumImage')
book_cover = retrieve_book_cover.item(0).firstChild.firstChild.data
except Exception:
book_cover = "%s/img/book_cover_placeholder.gif" % (CFG_SITE_URL)
return book_cover
def book_information_from_MARC(recid):
"""
Retrieve book's information from MARC
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return tuple with title, year, author, isbn and editor.
"""
# FIXME do the same that book_title_from_MARC
book_title = book_title_from_MARC(recid)
book_year = ''.join(get_fieldvalues(recid, "260__c"))
author_tags = ['100__a', '700__a', '721__a']
book_author = ''
for tag in author_tags:
l = get_fieldvalues(recid, tag)
for c in l:
book_author += c + '; '
book_author = book_author[:-2]
l = get_fieldvalues(recid, "020__a")
book_isbn = ''
for isbn in l:
book_isbn += isbn + ', '
book_isbn = book_isbn[:-2]
book_editor = ', '.join(get_fieldvalues(recid, "260__a") + \
get_fieldvalues(recid, "260__b"))
return (book_title, book_year, book_author, book_isbn, book_editor)
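The concatenate-then-trim pattern used above for authors and ISBNs (append `value + separator` in a loop, then slice off the trailing separator) is equivalent to a join. A hypothetical helper, not part of the module, sketching the idiom:

```python
def join_fields(values, sep='; '):
    """Join non-empty MARC field values; return '' when there are none."""
    return sep.join(v for v in values if v)
```

With `join_fields(get_fieldvalues(recid, tag) for each author tag)` there is no trailing separator to trim, and empty field lists naturally produce an empty string.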
def book_title_from_MARC(recid):
"""
Retrieve book's title from MARC
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return book's title
"""
title_tags = get_field_tags('title')
book_title = ''
i = 0
while book_title == '' and i < len(title_tags):
l = get_fieldvalues(recid, title_tags[i])
for candidate in l:
book_title = book_title + candidate + ': '
i += 1
book_title = book_title[:-2]
return book_title
def update_status_if_expired(loan_id):
"""
Update the loan's status if status is 'expired'.
@param loan_id: identify the loan. Primary key of crcLOAN.
@type loan_id: int
"""
loan_status = db.get_loan_status(loan_id)
if loan_status == CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED:
db.update_loan_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, loan_id)
return
def get_next_day(date_string):
"""
Get the next day
@param date_string: date
@type date_string: string
@return next day
"""
# add 1 day
more_1_day = datetime.timedelta(days=1)
# convert date_string to datetime format
tmp_date = time.strptime(date_string, '%Y-%m-%d')
# calculate the new date (next day)
next_day = datetime.datetime(tmp_date[0], tmp_date[1], tmp_date[2]) \
+ more_1_day
return next_day
def generate_new_due_date(days):
"""
Generate a new due date (today + X days = new due date).
@param days: number of days
@type days: string
@return new due date
"""
today = datetime.date.today()
more_X_days = datetime.timedelta(days=days)
tmp_date = today + more_X_days
week_day = tmp_date.strftime('%A')
due_date = tmp_date.strftime('%Y-%m-%d')
due_date_validated = False
while not due_date_validated:
if week_day in CFG_BIBCIRCULATION_WORKING_DAYS \
and due_date not in CFG_BIBCIRCULATION_HOLIDAYS:
due_date_validated = True
else:
next_day = get_next_day(due_date)
due_date = next_day.strftime('%Y-%m-%d')
week_day = next_day.strftime('%A')
return due_date
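The skip loop in `generate_new_due_date`/`get_next_day` can be sketched standalone. This is a hedged illustration: `WORKING_DAYS` and `HOLIDAYS` below are stand-in values, not the site's `CFG_BIBCIRCULATION_*` configuration:

```python
import datetime

# Stand-in configuration for illustration only; the real values come
# from invenio.legacy.bibcirculation.config.
WORKING_DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
HOLIDAYS = ['2013-12-25']

def next_valid_due_date(start, days):
    """Return start + days as 'YYYY-MM-DD', pushed past weekends and holidays."""
    candidate = start + datetime.timedelta(days=days)
    while (candidate.strftime('%A') not in WORKING_DAYS
           or candidate.strftime('%Y-%m-%d') in HOLIDAYS):
        candidate += datetime.timedelta(days=1)
    return candidate.strftime('%Y-%m-%d')
```

A loan falling due on a Saturday rolls to Monday; one falling due on a configured holiday rolls to the next working day.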
def renew_loan_for_X_days(barcode):
"""
Renew a loan based on its loan period
@param barcode: identify the item. Primary key of crcITEM.
@type barcode: string
@return new due date
"""
loan_period = db.get_loan_period(barcode)
if loan_period == '4 weeks':
due_date = generate_new_due_date(30)
else:
due_date = generate_new_due_date(7)
return due_date
def make_copy_available(request_id):
"""
Change the status of a copy for
CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF when
a hold request is cancelled.
@param request_id: identify the request. Primary key of crcLOANREQUEST
@type request_id: int
"""
barcode_requested = db.get_requested_barcode(request_id)
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, barcode_requested)
update_requests_statuses(barcode_requested)
def print_new_loan_information(req, ln=CFG_SITE_LANG):
"""
Create a printable format with the information of the last
loan that has been registered in the table crcLOAN.
"""
_ = gettext_set_language(ln)
# get the last loan from crcLOAN
(recid, borrower_id, due_date) = db.get_last_loan()
# get book's information
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(recid)
# get borrower's data/information (name, address, email)
(borrower_name, borrower_address,
borrower_mailbox, borrower_email) = db.get_borrower_data(borrower_id)
# Generate printable format
req.content_type = "text/html"
req.send_http_header()
out = """<table style='width:95%; margin:auto; max-width: 600px;'>"""
out += """
<tr>
<td><img src="%s/img/CERN_CDS_logo.png"></td>
</tr>
</table><br />""" % (CFG_SITE_URL)
out += """<table style='color: #79d; font-size: 82%; width:95%;
margin:auto; max-width: 400px;'>"""
out += """ <tr>
<td align="center">
<h2><strong>%s</strong></h2>
</td>
</tr>""" % (_("Loan information"))
out += """ <tr>
<td align="center"><strong>%s</strong></td>
</tr>""" % (_("This book has been sent to you:"))
out += """</table><br />"""
out += """<table style='color: #79d; font-size: 82%; width:95%;
margin:auto; max-width: 400px;'>"""
out += """ <tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
""" % (_("Title"), book_title,
_("Author"), book_author,
_("Editor"), book_editor,
_("ISBN"), book_isbn,
_("Year"), book_year)
out += """</table><br />"""
out += """<table style='color: #79d; font-size: 82%; width:95%;
margin:auto; max-width: 400px;'>"""
out += """ <tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
<tr>
<td width="70"><strong>%s</strong></td>
<td style='color: black;'>%s</td>
</tr>
""" % (_("Name"), borrower_name,
_("Mailbox"), borrower_mailbox,
_("Address"), borrower_address,
_("Email"), borrower_email)
out += """</table>
<br />"""
out += """<table style='color: #79d; font-size: 82%; width:95%;
margin:auto; max-width: 400px;'>"""
out += """ <tr>
<td align="center"><h2><strong>%s: %s</strong></h2></td>
</tr>""" % (_("Due date"), due_date)
out += """</table>"""
out += """<table style='color: #79d; font-size: 82%; width:95%;
margin:auto; max-width: 800px;'>
<tr>
<td>
<input type="button" onClick='window.print()'
value='Print' style='color: #fff;
background: #36c; font-weight: bold;'>
</td>
</tr>
</table>
"""
req.write("<html>")
req.write(out)
req.write("</html>")
return "\n"
def print_pending_hold_requests_information(req, ln):
"""
Create a printable format with all the information about all
pending hold requests.
"""
_ = gettext_set_language(ln)
requests = db.get_pdf_request_data(CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING)
req.content_type = "text/html"
req.send_http_header()
out = """<table style='width:100%; margin:auto; max-width: 1024px;'>"""
out += """
<tr>
<td><img src="%s/img/CERN_CDS_logo.png"></td>
</tr>
</table><br />""" % (CFG_SITE_URL)
out += """<table style='color: #79d; font-size: 82%;
width:95%; margin:auto; max-width: 1024px;'>"""
out += """ <tr>
<td align="center"><h2><strong>%s</strong></h2></td>
</tr>""" % (_("List of pending hold requests"))
out += """ <tr>
<td align="center"><strong>%s</strong></td>
</tr>""" % (time.ctime())
out += """</table><br/>"""
out += """<table style='color: #79d; font-size: 82%;
width:95%; margin:auto; max-width: 1024px;'>"""
out += """<tr>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
<td><strong>%s</strong></td>
</tr>
""" % (_("Borrower"),
_("Item"),
_("Library"),
_("Location"),
_("From"),
_("To"),
_("Request date"))
for (recid, borrower_name, library_name, location,
date_from, date_to, request_date) in requests:
out += """<tr style='color: black;'>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
<td class="bibcirccontent">%s</td>
</tr>
""" % (borrower_name, book_title_from_MARC(recid),
library_name, location, date_from, date_to,
request_date)
out += """</table>
<br />
<br />
<table style='color: #79d; font-size: 82%;
width:95%; margin:auto; max-width: 1024px;'>
<tr>
<td>
<input type=button value='Back' onClick="history.go(-1)"
style='color: #fff; background: #36c;
font-weight: bold;'>
<input type="button" onClick='window.print()'
value='Print' style='color: #fff;
background: #36c; font-weight: bold;'>
</td>
</tr>
</table>"""
req.write("<html>")
req.write(out)
req.write("</html>")
return "\n"
def get_item_info_for_search_result(recid):
"""
Get the item's info from MARC in order to create a
search result with more details
@param recid: identify the record. Primary key of bibrec.
@type recid: int
@return book's information (author, editor and number of copies)
"""
book_author = ' '.join(get_fieldvalues(recid, "100__a") + \
get_fieldvalues(recid, "100__u"))
book_editor = ' , '.join(get_fieldvalues(recid, "260__a") + \
get_fieldvalues(recid, "260__b") + \
get_fieldvalues(recid, "260__c"))
book_copies = ' '.join(get_fieldvalues(recid, "964__a"))
book_infos = (book_author, book_editor, book_copies)
return book_infos
def update_request_data(request_id):
"""
Update the status of a given request.
@param request_id: identify the request. Primary key of crcLOANREQUEST
@type request_id: int
"""
barcode = db.get_request_barcode(request_id)
is_on_loan = db.is_item_on_loan(barcode)
if is_on_loan is not None:
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, barcode)
else:
db.update_item_status(CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, barcode)
update_requests_statuses(barcode)
return True
def compare_dates(date):
"""
Compare given date with today
@param date: given date
@type date: string
@return boolean
"""
if date < time.strftime("%Y-%m-%d"):
return False
else:
return True
def validate_date_format(date):
"""
Verify the date format
@param date: given date
@type date: string
@return boolean
"""
try:
if time.strptime(date, "%Y-%m-%d"):
if compare_dates(date):
return True
else:
return False
except ValueError:
return False
def create_ill_record(book_info):
"""
Create a new ILL record
@param book_info: book's information
@type book_info: tuple
@return MARC record
"""
(title, author, place, publisher, year, edition, isbn) = book_info
ill_record = """
<record>
<datafield tag="020" ind1=" " ind2=" ">
<subfield code="a">%(isbn)s</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">%(author)s</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">%(title)s</subfield>
</datafield>
<datafield tag="250" ind1=" " ind2=" ">
<subfield code="a">%(edition)s</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="a">%(place)s</subfield>
<subfield code="b">%(publisher)s</subfield>
<subfield code="c">%(year)s</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ILLBOOK</subfield>
</datafield>
</record>
""" % {'isbn': encode_for_xml(isbn),
'author': encode_for_xml(author),
'title': encode_for_xml(title),
'edition': encode_for_xml(edition),
'place': encode_for_xml(place),
'publisher': encode_for_xml(publisher),
'year': encode_for_xml(year)}
file_path = '%s/%s_%s.xml' % (CFG_TMPDIR, 'bibcirculation_ill_book',
time.strftime("%Y%m%d_%H%M%S"))
with open(file_path, 'w') as xml_file:
xml_file.write(ill_record)
# Pass XML file to BibUpload.
task_low_level_submission('bibupload', 'bibcirculation',
'-P', '5', '-i', file_path)
return ill_record
def wash_recid_from_ILL_request(ill_request_id):
"""
Get dictionary and wash recid values.
@param ill_request_id: identify the ILL request. Primary key of crcILLREQUEST
@type ill_request_id: int
@return recid
"""
book_info = db.get_ill_book_info(ill_request_id)
if looks_like_dictionary(book_info):
book_info = eval(book_info)
else:
book_info = None
try:
recid = int(book_info['recid'])
except KeyError:
recid = None
return recid
def all_copies_are_missing(recid):
"""
Verify if all copies of an item are missing
@param recid: identify the record. Primary key of bibrec
@type recid: int
@return boolean
"""
copies_status = db.get_copies_status(recid)
number_of_missing = 0
if copies_status is None:
return True
else:
for (status) in copies_status:
if status == 'missing':
number_of_missing += 1
if number_of_missing == len(copies_status):
return True
else:
return False
#def has_copies(recid):
# """
# Verify if a recid is item (has copies)
#
# @param recid: identify the record. Primary key of bibrec
# @type recid: int
#
# @return boolean
# """
#
# copies_status = db.get_copies_status(recid)
#
# if copies_status is None:
# return False
# else:
# if len(copies_status) == 0:
# return False
# else:
# return True
def generate_email_body(template, loan_id, ill=0):
"""
Generate the body of an email for loan recalls.
@param template: email template
@type template: string
@param loan_id: identify the loan. Primary key of crcLOAN.
@type loan_id: int
@return email(body)
"""
if ill:
# Inter library loan.
out = template
else:
recid = db.get_loan_recid(loan_id)
(book_title, book_year, book_author,
book_isbn, book_editor) = book_information_from_MARC(int(recid))
out = template % (book_title, book_year, book_author,
book_isbn, book_editor)
return out
def create_item_details_url(recid, ln):
url = '/admin2/bibcirculation/get_item_details?ln=%s&recid=%s' % (ln,
str(recid))
return CFG_SITE_URL + url
def tag_all_requests_as_done(barcode, user_id):
recid = db.get_id_bibrec(barcode)
description = db.get_item_description(barcode)
list_of_barcodes = db.get_barcodes(recid, description)
for bc in list_of_barcodes:
db.tag_requests_as_done(user_id, bc)
def update_requests_statuses(barcode):
recid = db.get_id_bibrec(barcode)
description = db.get_item_description(barcode)
list_of_pending_requests = db.get_requests(recid, description,
CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING)
some_copy_available = False
copies_status = db.get_copies_status(recid, description)
if copies_status is not None:
for status in copies_status:
if status in (CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF,
CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS):
some_copy_available = True
if len(list_of_pending_requests) == 1:
if not some_copy_available:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
list_of_pending_requests[0][0])
else:
return list_of_pending_requests[0][0]
elif len(list_of_pending_requests) == 0:
if some_copy_available:
list_of_waiting_requests = db.get_requests(recid, description,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
if len(list_of_waiting_requests) > 0:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
list_of_waiting_requests[0][0])
return list_of_waiting_requests[0][0]
elif len(list_of_pending_requests) > 1:
for request in list_of_pending_requests:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING,
request[0])
list_of_waiting_requests = db.get_requests(recid, description,
CFG_BIBCIRCULATION_REQUEST_STATUS_WAITING)
if some_copy_available:
db.update_loan_request_status(CFG_BIBCIRCULATION_REQUEST_STATUS_PENDING,
list_of_waiting_requests[0][0])
return list_of_waiting_requests[0][0]
return None
def is_periodical(recid):
rec_type = get_fieldvalues(recid, "690C_a")
if len(rec_type) > 0:
for value in rec_type:
if value == 'PERI':
return True
return False
def has_date_format(date):
if type(date) is not str:
return False
date = date.strip()
if len(date) != 10:
return False
elif date[4] != '-' or date[7] != '-':
return False
else:
year = date[:4]
month = date[5:7]
day = date[8:]
return year.isdigit() and month.isdigit() and day.isdigit()
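An alternative check via `time.strptime` (a sketch, not a drop-in replacement): it additionally validates month/day ranges, e.g. it rejects `2013-02-29`, though it is looser about zero-padding than the character-position test above.

```python
import time

def has_date_format_strict(date):
    """Return True iff date parses as a real 'YYYY-MM-DD' calendar date."""
    if not isinstance(date, str):
        return False
    try:
        time.strptime(date.strip(), "%Y-%m-%d")
        return True
    except ValueError:
        return False
```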
def generate_tmp_barcode():
tmp_barcode = 'tmp-' + str(random.random())[-8:]
while db.barcode_in_use(tmp_barcode):
tmp_barcode = 'tmp-' + str(random.random())[-8:]
return tmp_barcode
def check_database():
from invenio.legacy.dbquery import run_sql
r1 = run_sql(""" SELECT it.barcode, it.status, ln.status
FROM crcITEM it, crcLOAN ln
WHERE ln.barcode=it.barcode
AND it.status=%s
AND ln.status!=%s
AND ln.status!=%s
AND ln.status!=%s
""", (CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED,
CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED))
r2 = run_sql(""" SELECT it.barcode
FROM crcITEM it, crcLOAN ln
WHERE ln.barcode=it.barcode
AND it.status=%s
AND (ln.status=%s or ln.status=%s)
""", (CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF,
CFG_BIBCIRCULATION_LOAN_STATUS_ON_LOAN,
CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED))
r3 = run_sql(""" SELECT l1.barcode, l1.id,
DATE_FORMAT(l1.loaned_on,'%%Y-%%m-%%d %%H:%%i:%%s'),
DATE_FORMAT(l2.loaned_on,'%%Y-%%m-%%d %%H:%%i:%%s')
FROM crcLOAN l1,
crcLOAN l2
WHERE l1.id!=l2.id
AND l1.status!=%s
AND l1.status=l2.status
AND l1.barcode=l2.barcode
ORDER BY l1.loaned_on
""", (CFG_BIBCIRCULATION_LOAN_STATUS_RETURNED, ))
r4 = run_sql(""" SELECT id, id_crcBORROWER, barcode,
due_date, number_of_renewals
FROM crcLOAN
WHERE status=%s
AND due_date>NOW()
""", (CFG_BIBCIRCULATION_LOAN_STATUS_EXPIRED, ))
return (len(r1), len(r2), len(r3), len(r4))
def looks_like_dictionary(candidate_string):
if re.match(DICC_REGEXP, candidate_string):
return True
else:
return False
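The `DICC_REGEXP` guard above is what makes the `eval()` in `wash_recid_from_ILL_request` tolerable: only flat dictionaries of string/number/None literals pass, so nested structures and arbitrary expressions never reach `eval`. A minimal demonstration (regex copied from the module):

```python
import re

DICC_REGEXP = re.compile(r"^\{('[^']*': ?('[^']*'|\"[^\"]+\"|[0-9]*|None)(, ?'[^']*': ?('[^']*'|\"[^\"]+\"|[0-9]*|None))*)?\}$")

def looks_like_dictionary(candidate_string):
    """Return True iff the string looks like a flat literal dict."""
    return bool(DICC_REGEXP.match(candidate_string))
```

Anything containing nested containers or function calls fails the match and is discarded instead of being evaluated.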
diff --git a/invenio/legacy/bibcirculation/webinterface.py b/invenio/legacy/bibcirculation/webinterface.py
index 398699042..d09cba6df 100644
--- a/invenio/legacy/bibcirculation/webinterface.py
+++ b/invenio/legacy/bibcirculation/webinterface.py
@@ -1,1310 +1,1310 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibcirculation User (URLs) Interface.
When applicable, methods should be renamed, refactored and
appropriate documentation added.
"""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
import time
import cgi
from invenio.config import CFG_SITE_LANG, \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS, \
CFG_SITE_RECORD, \
CFG_CERN_SITE
from invenio.legacy.webuser import getUid, page_not_authorized, isGuestUser, \
collect_user_info
from invenio.legacy.webpage import page, pageheaderonly, pagefooteronly
from invenio.ext.email import send_email
from invenio.legacy.search_engine import create_navtrail_links, \
guess_primary_collection_of_a_record, \
get_colID, check_user_can_view_record, \
record_exists, get_fieldvalues
from invenio.utils.url import redirect_to_url, \
make_canonical_urlargd
from invenio.base.i18n import gettext_set_language
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import mail_cookie_create_authorize_action
import invenio.legacy.template
-import invenio.bibcirculation_dblayer as db
-from invenio.bibcirculation_utils import book_title_from_MARC, search_user
-from invenio.bibcirculation import perform_new_request, \
+import invenio.legacy.bibcirculation.db_layer as db
+from invenio.legacy.bibcirculation.utils import book_title_from_MARC, search_user
+from invenio.legacy.bibcirculation.api import perform_new_request, \
perform_new_request_send, \
perform_book_proposal_send, \
perform_get_holdings_information, \
perform_borrower_loans, \
perform_loanshistoricaloverview, \
ill_register_request, \
ill_request_with_recid, \
ill_register_request_with_recid
-from invenio.bibcirculationadminlib import is_adminuser, \
+from invenio.legacy.bibcirculation.adminlib import is_adminuser, \
load_template
-from invenio.bibcirculation_config import CFG_BIBCIRCULATION_ILLS_EMAIL, \
+from invenio.legacy.bibcirculation.config import CFG_BIBCIRCULATION_ILLS_EMAIL, \
CFG_BIBCIRCULATION_ILL_STATUS_NEW, \
CFG_BIBCIRCULATION_ACQ_STATUS_NEW, \
AMZ_ACQUISITION_IDENTIFIER_TAG
webstyle_templates = invenio.legacy.template.load('webstyle')
websearch_templates = invenio.legacy.template.load('websearch')
bc_templates = invenio.legacy.template.load('bibcirculation')
class WebInterfaceYourLoansPages(WebInterfaceDirectory):
"""Defines the set of /yourloans pages."""
_exports = ['', 'display', 'loanshistoricaloverview']
def __init__(self, recid=-1):
self.recid = recid
def index(self, req, form):
"""
The function called by default
"""
redirect_to_url(req, "%s/yourloans/display?%s" % (CFG_SITE_SECURE_URL,
req.args))
def display(self, req, form):
"""
Displays all loans of a given user
@param ln: language
@return the page listing the user's loans
"""
argd = wash_urlargd(form, {'barcode': (str, ""),
'borrower_id': (int, 0),
'request_id': (int, 0),
'action': (str, "")})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/yourloans/display" % \
(CFG_SITE_SECURE_URL,),
navmenuid="yourloans")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourloans/display%s" % (
CFG_SITE_SECURE_URL, make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})),
norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use loans."))
body = perform_borrower_loans(uid=uid,
barcode=argd['barcode'],
borrower_id=argd['borrower_id'],
request_id=argd['request_id'],
action=argd['action'],
ln=argd['ln'])
return page(title = _("Your Loans"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "yourloans",
secure_page_p=1)
def loanshistoricaloverview(self, req, form):
"""
Show loans historical overview.
"""
argd = wash_urlargd(form, {})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/yourloans/loanshistoricaloverview" % \
(CFG_SITE_SECURE_URL,),
navmenuid="yourloans")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourloans/loanshistoricaloverview%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use loans."))
body = perform_loanshistoricaloverview(uid=uid,
ln=argd['ln'])
return page(title = _("Loans - historical overview"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "yourloans",
secure_page_p=1)
class WebInterfaceILLPages(WebInterfaceDirectory):
"""Defines the set of /ill pages."""
_exports = ['', 'register_request', 'book_request_step1',
'book_request_step2','book_request_step3',
'article_request_step1', 'article_request_step2',
'article_request_step3', 'purchase_request_step1',
'purchase_request_step2']
def index(self, req, form):
""" The function called by default
"""
redirect_to_url(req, "%s/ill/book_request_step1?%s" % (CFG_SITE_SECURE_URL,
req.args))
def register_request(self, req, form):
"""
Register a new ILL book request for the current user
@param ln: language
@return the ILL request confirmation page
"""
argd = wash_urlargd(form, {'ln': (str, ""),
'title': (str, ""),
'authors': (str, ""),
'place': (str, ""),
'publisher': (str, ""),
'year': (str, ""),
'edition': (str, ""),
'isbn': (str, ""),
'period_of_interest_from': (str, ""),
'period_of_interest_to': (str, ""),
'additional_comments': (str, ""),
'conditions': (str, ""),
'only_edition': (str, ""),
})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/register_request" % \
(CFG_SITE_SECURE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/register_request%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
body = ill_register_request(uid=uid,
title=argd['title'],
authors=argd['authors'],
place=argd['place'],
publisher=argd['publisher'],
year=argd['year'],
edition=argd['edition'],
isbn=argd['isbn'],
period_of_interest_from=argd['period_of_interest_from'],
period_of_interest_to=argd['period_of_interest_to'],
additional_comments=argd['additional_comments'],
conditions=argd['conditions'],
only_edition=argd['only_edition'],
request_type='book',
ln=argd['ln'])
return page(title = _("Interlibrary loan request for books"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def book_request_step1(self, req, form):
"""
Display step 1 of the ILL book request form
@param ln: language
@return the step 1 form page
"""
argd = wash_urlargd(form, {})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/book_request_step1" % \
(CFG_SITE_SECURE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/book_request_step1%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
### get borrower_id ###
borrower_id = search_user('email', user_info['email'])
if borrower_id == ():
body = "wrong user id"
else:
body = bc_templates.tmpl_register_ill_request_with_no_recid_step1([], None, False, argd['ln'])
return page(title = _("Interlibrary loan request for books"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def book_request_step2(self, req, form):
"""
Process the submitted book details and display the request confirmation step
@param ln: language
@return the confirmation page
"""
argd = wash_urlargd(form, {'title': (str, None), 'authors': (str, None),
'place': (str, None), 'publisher': (str, None), 'year': (str, None),
'edition': (str, None), 'isbn': (str, None), 'budget_code': (str, ''),
'period_of_interest_from': (str, None), 'period_of_interest_to': (str, None),
'additional_comments': (str, None), 'only_edition': (str, 'No'),'ln': (str, "en")})
title = argd['title']
authors = argd['authors']
place = argd['place']
publisher = argd['publisher']
year = argd['year']
edition = argd['edition']
isbn = argd['isbn']
budget_code = argd['budget_code']
period_of_interest_from = argd['period_of_interest_from']
period_of_interest_to = argd['period_of_interest_to']
additional_comments = argd['additional_comments']
only_edition = argd['only_edition']
ln = argd['ln']
if title is not None:
title = title.strip()
if authors is not None:
authors = authors.strip()
if place is not None:
place = place.strip()
if publisher is not None:
publisher = publisher.strip()
if year is not None:
year = year.strip()
if edition is not None:
edition = edition.strip()
if isbn is not None:
isbn = isbn.strip()
if budget_code is not None:
budget_code = budget_code.strip()
if period_of_interest_from is not None:
period_of_interest_from = period_of_interest_from.strip()
if period_of_interest_to is not None:
period_of_interest_to = period_of_interest_to.strip()
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/book_request_step2" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/book_request_step2%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : ln}, {})), norobot=True)
_ = gettext_set_language(ln)
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
if CFG_CERN_SITE:
borrower = search_user('ccid', user_info['external_personid'])
else:
borrower = search_user('email', user_info['email'])
if borrower != ():
borrower_id = borrower[0][0]
book_info = (title, authors, place, publisher,
year, edition, isbn)
user_info = db.get_borrower_data_by_id(borrower_id)
request_details = (budget_code, period_of_interest_from,
period_of_interest_to, additional_comments,
only_edition)
body = bc_templates.tmpl_register_ill_request_with_no_recid_step3(
book_info, user_info, request_details, False, ln)
else:
body = "Wrong user ID"
return page(title = _("Interlibrary loan request for books"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = ln,
navmenuid = "ill")
def book_request_step3(self, req, form):
"""
Register the interlibrary loan request for a book and display a
confirmation message.
@return the confirmation page for the registered book ILL request
"""
argd = wash_urlargd(form, {'title': (str, None), 'authors': (str, None),
'place': (str, None), 'publisher': (str, None), 'year': (str, None),
'edition': (str, None), 'isbn': (str, None), 'borrower_id': (str, None),
'budget_code': (str, ''), 'period_of_interest_from': (str, None),
'period_of_interest_to': (str, None), 'additional_comments': (str, None),
'only_edition': (str, None), 'ln': (str, "en")})
title = argd['title']
authors = argd['authors']
place = argd['place']
publisher = argd['publisher']
year = argd['year']
edition = argd['edition']
isbn = argd['isbn']
borrower_id = argd['borrower_id']
budget_code = argd['budget_code']
period_of_interest_from = argd['period_of_interest_from']
period_of_interest_to = argd['period_of_interest_to']
library_notes = argd['additional_comments']
only_edition = argd['only_edition']
ln = argd['ln']
book_info = (title, authors, place, publisher, year, edition, isbn)
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/book_request_step3" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/book_request_step3%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
book_info = {'title': title, 'authors': authors, 'place': place,
'publisher': publisher, 'year' : year,
'edition': edition, 'isbn' : isbn}
ill_request_notes = {}
if library_notes:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = str(library_notes)
### budget_code ###
db.ill_register_request_on_desk(borrower_id, book_info,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
str(ill_request_notes), only_edition,
'book', budget_code)
infos = []
infos.append(_('Interlibrary loan request done.'))
body = bc_templates.tmpl_infobox(infos, ln)
return page(title = _("Interlibrary loan request for books"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def article_request_step1(self, req, form):
"""
Display the interlibrary loan request form for articles.
@return the article ILL request form page
"""
argd = wash_urlargd(form, {})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/article_request_step1" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/article_request_step1%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
### get borrower_id ###
borrower_id = search_user('email', user_info['email'])
if borrower_id == ():
body = "Wrong user id"
else:
body = bc_templates.tmpl_register_ill_article_request_step1([], False, argd['ln'])
return page(title = _("Interlibrary loan request for articles"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def article_request_step2(self, req, form):
"""
Register the interlibrary loan request for an article and display a
confirmation message.
@return the confirmation page for the registered article ILL request
"""
argd = wash_urlargd(form, {'periodical_title': (str, None), 'article_title': (str, None),
'author': (str, None), 'report_number': (str, None), 'volume': (str, None),
'issue': (str, None), 'page': (str, None), 'year': (str, None),
'budget_code': (str, ''), 'issn': (str, None),
'period_of_interest_from': (str, None), 'period_of_interest_to': (str, None),
'additional_comments': (str, None), 'key': (str, None), 'string': (str, None),
'ln': (str, "en")})
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/article_request_step2" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/article_request_step2%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
borrower_id = search_user('email', user_info['email'])
if borrower_id != ():
borrower_id = borrower_id[0][0]
notes = argd['additional_comments']
ill_request_notes = {}
if notes:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] = str(notes)
item_info = {'periodical_title': argd['periodical_title'],
'title': argd['article_title'], 'authors': argd['author'],
'place': "", 'publisher': "", 'year' : argd['year'],
'edition': "", 'issn' : argd['issn'], 'volume': argd['volume'],
'page': argd['page'], 'issue': argd['issue'] }
### budget_code ###
db.ill_register_request_on_desk(borrower_id, item_info,
argd['period_of_interest_from'],
argd['period_of_interest_to'],
CFG_BIBCIRCULATION_ILL_STATUS_NEW,
str(ill_request_notes), 'No', 'article',
argd['budget_code'])
infos = []
infos.append(_('Interlibrary loan request done.'))
body = bc_templates.tmpl_infobox(infos, argd['ln'])
else:
body = _("Wrong user id")
return page(title = _("Interlibrary loan request for articles"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def purchase_request_step1(self, req, form):
argd = wash_urlargd(form, {'type': (str, 'acq-book'), 'recid': (str, ''),
'title': (str, ''), 'authors': (str, ''), 'place': (str, ''),
'publisher': (str, ''), 'year': (str, ''), 'edition': (str, ''),
'this_edition_only': (str, 'No'),
'isbn': (str, ''), 'standard_number': (str, ''),
'budget_code': (str, ''), 'cash': (str, 'No'),
'period_of_interest_from': (str, ''),
'period_of_interest_to': (str, ''),
'additional_comments': (str, ''), 'ln': (str, 'en')})
request_type = argd['type'].strip()
recid = argd['recid'].strip()
title = argd['title'].strip()
authors = argd['authors'].strip()
place = argd['place'].strip()
publisher = argd['publisher'].strip()
year = argd['year'].strip()
edition = argd['edition'].strip()
this_edition_only = argd['this_edition_only'].strip()
isbn = argd['isbn'].strip()
standard_number = argd['standard_number'].strip()
budget_code = argd['budget_code'].strip()
cash = argd['cash'] == 'Yes'
period_of_interest_from = argd['period_of_interest_from'].strip()
period_of_interest_to = argd['period_of_interest_to'].strip()
additional_comments = argd['additional_comments'].strip()
ln = argd['ln']
if not recid:
fields = (request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code,
cash, period_of_interest_from, period_of_interest_to,
additional_comments)
else:
fields = (request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments)
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/purchase_request_step1" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/purchase_request_step1%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
### get borrower_id ###
borrower_id = search_user('email', user_info['email'])
if borrower_id == ():
body = "Wrong user ID"
else:
(auth_code, _auth_message) = is_adminuser(req)
body = bc_templates.tmpl_register_purchase_request_step1([], fields,
auth_code == 0, ln)
return page(title = _("Purchase request"),
body = body,
uid = uid,
lastupdated = __lastupdated__,
req = req,
language = argd['ln'],
navmenuid = "ill")
def purchase_request_step2(self, req, form):
argd = wash_urlargd(form, {'type': (str, 'acq-book'), 'recid': (str, ''),
'title': (str, ''), 'authors': (str, ''), 'place': (str, ''),
'publisher': (str, ''), 'year': (str, ''), 'edition': (str, ''),
'this_edition_only': (str, 'No'),
'isbn': (str, ''), 'standard_number': (str, ''),
'budget_code': (str, ''), 'cash': (str, 'No'),
'period_of_interest_from': (str, ''),
'period_of_interest_to': (str, ''),
'additional_comments': (str, ''), 'ln': (str, "en")})
request_type = argd['type'].strip()
recid = argd['recid'].strip()
title = argd['title'].strip()
authors = argd['authors'].strip()
place = argd['place'].strip()
publisher = argd['publisher'].strip()
year = argd['year'].strip()
edition = argd['edition'].strip()
this_edition_only = argd['this_edition_only'].strip()
isbn = argd['isbn'].strip()
standard_number = argd['standard_number'].strip()
budget_code = argd['budget_code'].strip()
cash = argd['cash'] == 'Yes'
period_of_interest_from = argd['period_of_interest_from'].strip()
period_of_interest_to = argd['period_of_interest_to'].strip()
additional_comments = argd['additional_comments'].strip()
ln = argd['ln']
if recid:
fields = (request_type, recid, budget_code, cash,
period_of_interest_from, period_of_interest_to,
additional_comments)
else:
fields = (request_type, title, authors, place, publisher, year, edition,
this_edition_only, isbn, standard_number, budget_code,
cash, period_of_interest_from, period_of_interest_to,
additional_comments)
# Check if user is logged
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/ill/purchase_request_step2" % \
(CFG_SITE_URL,),
navmenuid="ill")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/ill/purchase_request_step2%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : ln}, {})), norobot=True)
_ = gettext_set_language(ln)
user_info = collect_user_info(req)
if not user_info['precached_useloans']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use ill."))
infos = []
if budget_code == '' and not cash:
infos.append(_("Payment method information is mandatory. Please, type your budget code or tick the 'cash' checkbox."))
(auth_code, _auth_message) = is_adminuser(req)
body = bc_templates.tmpl_register_purchase_request_step1(infos, fields,
auth_code == 0, ln)
else:
if recid:
item_info = "{'recid': " + str(recid) + "}"
title = book_title_from_MARC(recid)
else:
item_info = {'title': title, 'authors': authors, 'place': place,
'publisher': publisher, 'year' : year,
'edition': edition, 'isbn' : isbn,
'standard_number': standard_number}
ill_request_notes = {}
if additional_comments:
ill_request_notes[time.strftime("%Y-%m-%d %H:%M:%S")] \
= str(additional_comments)
if cash and budget_code == '':
budget_code = 'cash'
borrower_email = db.get_invenio_user_email(uid)
borrower_id = db.get_borrower_id_by_email(borrower_email)
db.ill_register_request_on_desk(borrower_id, item_info,
period_of_interest_from,
period_of_interest_to,
CFG_BIBCIRCULATION_ACQ_STATUS_NEW,
str(ill_request_notes),
this_edition_only, request_type,
budget_code)
msg_for_user = load_template('purchase_notification') % title
send_email(fromaddr = CFG_BIBCIRCULATION_ILLS_EMAIL,
toaddr = borrower_email,
subject = _("Your book purchase request"),
header = '', footer = '',
content = msg_for_user,
attempt_times=1,
attempt_sleeptime=10
)
body = bc_templates.tmpl_message_purchase_request_send_ok_other(ln=ln)
return page(title=_("Register purchase request"),
uid=uid,
req=req,
body=body,
language=ln,
metaheaderadd='<link rel="stylesheet" ' \
'href="%s/img/jquery-ui.css" ' \
'type="text/css" />' % CFG_SITE_URL,
lastupdated=__lastupdated__)
class WebInterfaceHoldingsPages(WebInterfaceDirectory):
"""Defines the set of /holdings pages."""
_exports = ['', 'display', 'request', 'send',
'ill_request_with_recid',
'ill_register_request_with_recid']
def __init__(self, recid=-1):
self.recid = recid
def index(self, req, form):
"""
Redirects to display function
"""
return self.display(req, form)
def display(self, req, form):
"""
Show the tab 'holdings'.
"""
argd = wash_urlargd(form, {'do': (str, "od"),
'ds': (str, "all"),
'nb': (int, 100),
'p' : (int, 1),
'voted': (int, -1),
'reported': (int, -1),
})
_ = gettext_set_language(argd['ln'])
record_exists_p = record_exists(self.recid)
if record_exists_p != 1:
if record_exists_p == -1:
msg = _("The record has been deleted.")
else:
msg = _("Requested record does not seem to exist.")
msg = '<span class="quicknote">' + msg + '</span>'
title, description, keywords = \
websearch_templates.tmpl_record_page_header_content(req,
self.recid,
argd['ln'])
return page(title = title,
show_title_p = False,
body = msg,
description = description,
keywords = keywords,
uid = getUid(req),
language = argd['ln'],
req = req,
navmenuid='search')
# Check if the record has been harvested from Amazon. If true, the control flow will be
# that of patron driven acquisition.
acquisition_src = get_fieldvalues(self.recid, AMZ_ACQUISITION_IDENTIFIER_TAG)
if acquisition_src and acquisition_src[0].startswith('AMZ') and not db.has_copies(self.recid):
body = perform_get_holdings_information(self.recid, req, action="proposal", ln=argd['ln'])
else:
body = perform_get_holdings_information(self.recid, req, action="borrowal", ln=argd['ln'])
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL,
{'collection': guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln': argd['ln'],
'referer': CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", text=auth_msg)
unordered_tabs = get_detailed_page_tabs(get_colID(\
guess_primary_collection_of_a_record(self.recid)),
self.recid, ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['holdings'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, _order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
req=req,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req)
# Return the same page whether we ask for /CFG_SITE_RECORD/123 or /CFG_SITE_RECORD/123/
__call__ = index
def request(self, req, form):
"""
Show new hold request form.
"""
argd = wash_urlargd(form, {'ln': (str, ""), 'barcode': (str, ""), 'act': (str, "")})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../holdings/request",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/%s/%s/holdings/request%s" % (
CFG_SITE_SECURE_URL,
CFG_SITE_RECORD,
self.recid,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'],
'referer': CFG_SITE_SECURE_URL + user_info['uri']
}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
act = "borrowal"
if argd['act'] == 'pr':
act = "proposal"
elif argd['act'] == 'pu':
act = "purchase"
body = perform_new_request(recid=self.recid,
barcode=argd['barcode'],
action=act,
ln=argd['ln'])
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)), self.recid, ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['holdings'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, _order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
req=req,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req)
def send(self, req, form):
"""
Create a new hold request and, if the 'act' parameter is "pr" (proposal),
also send a confirmation email with the proposal.
"""
argd = wash_urlargd(form, {'period_from': (str, ""),
'period_to': (str, ""),
'barcode': (str, ""),
'act': (str, ""),
'remarks': (str, "")
})
ln = argd['ln']
_ = gettext_set_language(ln)
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../holdings/request",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/%s/%s/holdings/request%s" % (
CFG_SITE_SECURE_URL,
CFG_SITE_RECORD,
self.recid,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})), norobot=True)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
period_from = argd['period_from']
period_to = argd['period_to']
period_from = period_from.strip()
period_to = period_to.strip()
barcode = argd['barcode']
if argd['act'] == 'pr':
body = perform_book_proposal_send(uid=uid,
recid=self.recid,
period_from=period_from,
period_to=period_to,
remarks=argd['remarks'].strip())
else:
body = perform_new_request_send(recid=self.recid,
uid=uid,
period_from=period_from,
period_to=period_to,
barcode=barcode)
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)), self.recid, ln=ln)
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['holdings'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, _order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
req=req,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__,
language=argd['ln'], req=req)
def ill_request_with_recid(self, req, form):
"""
Show ILL request form.
"""
argd = wash_urlargd(form, {'ln': (str, "")})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
body = ill_request_with_recid(recid=self.recid,
ln=argd['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../holdings/ill_request_with_recid",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/%s/%s/holdings/ill_request_with_recid%s" % (
CFG_SITE_SECURE_URL,
CFG_SITE_RECORD,
self.recid,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)),
self.recid,
ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['holdings'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, _order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
req=req,
metaheaderadd = "<link rel=\"stylesheet\" href=\"%s/img/jquery-ui.css\" type=\"text/css\" />" % CFG_SITE_SECURE_URL,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req)
def ill_register_request_with_recid(self, req, form):
"""
Register ILL request.
"""
argd = wash_urlargd(form, {'ln': (str, ""),
'period_of_interest_from': (str, ""),
'period_of_interest_to': (str, ""),
'additional_comments': (str, ""),
'conditions': (str, ""),
'only_edition': (str, ""),
})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../holdings/ill_request_with_recid",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/%s/%s/holdings/ill_request_with_recid%s" % (
CFG_SITE_SECURE_URL,
CFG_SITE_RECORD,
self.recid,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
# Register the ILL request only after the user has been authorized.
body = ill_register_request_with_recid(recid=self.recid,
uid=uid,
period_of_interest_from = argd['period_of_interest_from'],
period_of_interest_to = argd['period_of_interest_to'],
additional_comments = argd['additional_comments'],
conditions = argd['conditions'],
only_edition = argd['only_edition'],
ln=argd['ln'])
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)),
self.recid,
ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['holdings'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, _order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
req=req,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req)
diff --git a/invenio/legacy/bibclassify/daemon.py b/invenio/legacy/bibclassify/daemon.py
index 92c8a6088..f031751ef 100644
--- a/invenio/legacy/bibclassify/daemon.py
+++ b/invenio/legacy/bibclassify/daemon.py
@@ -1,403 +1,403 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibClassify daemon.
FIXME: the code below requires collection table to be updated to add column:
clsMETHOD_fk mediumint(9) unsigned NOT NULL,
This is not clean and should be fixed.
This module IS NOT standalone safe - it should never be run so.
"""
import sys
import time
import os
from invenio import bibclassify_config as bconfig
from invenio import bibclassify_text_extractor
from invenio import bibclassify_engine
from invenio import bibclassify_webinterface
from invenio import bibtask
from invenio.legacy.dbquery import run_sql
from invenio.intbitset import intbitset
from invenio.legacy.search_engine import get_collection_reclist
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
# Global variables allowing to retain the progress of the task.
_INDEX = 0
_RECIDS_NUMBER = 0
## INTERFACE
def bibclassify_daemon():
"""Constructs the BibClassify bibtask."""
bibtask.task_init(authorization_action='runbibclassify',
authorization_msg="BibClassify Task Submission",
description="Extract keywords and create a BibUpload "
"task.\nExamples:\n"
" $ bibclassify\n"
" $ bibclassify -i 79 -k HEP\n"
" $ bibclassify -c 'Articles' -k HEP\n",
help_specific_usage=" -i, --recid\t\tkeywords are extracted from "
"this record\n"
" -c, --collection\t\tkeywords are extracted from this collection\n"
" -k, --taxonomy\t\tkeywords are based on that reference",
version="Invenio BibClassify v%s" % bconfig.VERSION,
specific_params=("i:c:k:f",
[
"recid=",
"collection=",
"taxonomy=",
"force"
]),
task_submit_elaborate_specific_parameter_fnc=
_task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=_task_submit_check_options,
task_run_fnc=_task_run_core)
## PRIVATE METHODS
def _ontology_exists(ontology_name):
"""Check if the ontology name is registered in the database."""
if run_sql("SELECT name FROM clsMETHOD WHERE name=%s",
(ontology_name,)):
return True
return False
def _collection_exists(collection_name):
"""Check if the collection name is registered in the database."""
if run_sql("SELECT name FROM collection WHERE name=%s",
(collection_name,)):
return True
return False
def _recid_exists(recid):
"""Check if the recid number is registered in the database."""
if run_sql("SELECT id FROM bibrec WHERE id=%s",
(recid,)):
return True
return False
def _get_recids_foreach_ontology(recids=None, collections=None, taxonomy=None):
"""Returns an array containing hash objects containing the
collection, its corresponding ontology and the records belonging to
the given collection."""
rec_onts = []
# User specified record IDs.
if recids:
rec_onts.append({
'ontology': taxonomy,
'collection': None,
'recIDs': recids,
})
return rec_onts
# User specified collections.
if collections:
for collection in collections:
records = get_collection_reclist(collection)
if records:
rec_onts.append({
'ontology': taxonomy,
'collection': collection,
'recIDs': records
})
return rec_onts
# Use rules found in collection_clsMETHOD.
result = run_sql("SELECT clsMETHOD.name, clsMETHOD.last_updated, "
"collection.name FROM clsMETHOD JOIN collection_clsMETHOD ON "
"clsMETHOD.id=id_clsMETHOD JOIN collection ON "
"id_collection=collection.id")
for ontology, date_last_run, collection in result:
records = get_collection_reclist(collection)
if records:
if not date_last_run:
bibtask.write_message("INFO: Collection %s has not been previously "
"analyzed." % collection, stream=sys.stderr, verbose=3)
modified_records = intbitset(run_sql("SELECT id FROM bibrec"))
elif bibtask.task_get_option('force'):
bibtask.write_message("INFO: Analysis is forced for collection %s." %
collection, stream=sys.stderr, verbose=3)
modified_records = intbitset(run_sql("SELECT id FROM bibrec"))
else:
modified_records = intbitset(run_sql("SELECT id FROM bibrec "
"WHERE modification_date >= %s", (date_last_run, )))
records &= modified_records
if records:
rec_onts.append({
'ontology': ontology,
'collection': collection,
'recIDs': records
})
else:
bibtask.write_message("WARNING: All records from collection '%s' have "
"already been analyzed for keywords with ontology '%s' "
"on %s." % (collection, ontology, date_last_run),
stream=sys.stderr, verbose=2)
else:
bibtask.write_message("ERROR: Collection '%s' doesn't contain any record. "
"Cannot analyse keywords." % (collection,),
stream=sys.stderr, verbose=0)
return rec_onts
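The rule-based branch above intersects each collection's record set with the set of records modified since the last run, using intbitset's in-place `&=`. A minimal sketch of that filtering step, using plain Python sets in place of intbitset (the two behave alike for intersection); `filter_modified` is a hypothetical helper, not part of the Invenio API:

```python
# Sketch of the record-filtering step: keep only records modified since
# the last run, unless a forced re-analysis was requested.
def filter_modified(collection_recids, modified_recids, force=False):
    """Return the subset of collection_recids to (re)analyze."""
    if force:
        # Forced mode: analyze the whole collection regardless of dates.
        return set(collection_recids)
    # Normal mode: intersect with the recently modified records,
    # mirroring the 'records &= modified_records' line above.
    return set(collection_recids) & set(modified_recids)
```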
def _update_date_of_last_run(runtime):
"""Update bibclassify daemon table information about last run time."""
run_sql("UPDATE clsMETHOD SET last_updated=%s", (runtime,))
def _task_submit_elaborate_specific_parameter(key, value, opts, args):
"""Given the string key it checks it's meaning, eventually using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False, if it doesn't
know that key.
eg:
if key in ('-n', '--number'):
bibtask.task_get_option(\1) = value
return True
return False
"""
# Recid option
if key in ("-i", "--recid"):
try:
value = int(value)
except ValueError:
bibtask.write_message("The value specified for --recid must be a "
"valid integer, not '%s'." % value, stream=sys.stderr,
verbose=0)
if not _recid_exists(value):
bibtask.write_message("ERROR: '%s' is not a valid record ID." % value,
stream=sys.stderr, verbose=0)
return False
recids = bibtask.task_get_option('recids')
if recids is None:
recids = []
recids.append(value)
bibtask.task_set_option('recids', recids)
# Collection option
elif key in ("-c", "--collection"):
if not _collection_exists(value):
bibtask.write_message("ERROR: '%s' is not a valid collection." % value,
stream=sys.stderr, verbose=0)
return False
collections = bibtask.task_get_option("collections")
collections = collections or []
collections.append(value)
bibtask.task_set_option("collections", collections)
# Taxonomy option
elif key in ("-k", "--taxonomy"):
if not _ontology_exists(value):
bibtask.write_message("ERROR: '%s' is not a valid taxonomy name." % value,
stream=sys.stderr, verbose=0)
return False
bibtask.task_set_option("taxonomy", value)
elif key in ("-f", "--force"):
bibtask.task_set_option("force", True)
else:
return False
return True
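The parameter-elaboration callback above follows a simple contract: handle the key and return True, or return False so bibtask can report an unknown option. Repeatable options (`-i`, `-c`) accumulate into a list. A hedged sketch of that pattern, with a plain dict standing in for bibtask's option store (`elaborate_option` is an illustrative name, not the real API):

```python
# Sketch of the repeatable-option callback pattern: each occurrence of
# '-i' appends to a list in the options dict; unknown keys return False.
def elaborate_option(key, value, options):
    """Return True when the key was handled, False otherwise."""
    if key in ("-i", "--recid"):
        options.setdefault("recids", []).append(int(value))
        return True
    if key in ("-f", "--force"):
        options["force"] = True
        return True
    return False
```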
def _task_run_core():
"""Runs analyse_documents for each ontology, collection, record ids
set."""
automated_daemon_mode_p = True
recids = bibtask.task_get_option('recids')
collections = bibtask.task_get_option('collections')
taxonomy = bibtask.task_get_option('taxonomy')
if recids or collections:
# We want to run some records/collection only, so we are not
# in the automated daemon mode; this will be useful later.
automated_daemon_mode_p = False
# Check if the user specified which documents to extract keywords from.
if recids:
onto_recids = _get_recids_foreach_ontology(recids=recids,
taxonomy=taxonomy)
elif collections:
onto_recids = _get_recids_foreach_ontology(collections=collections,
taxonomy=taxonomy)
else:
onto_recids = _get_recids_foreach_ontology()
if not onto_recids:
# Nothing to do.
if automated_daemon_mode_p:
_update_date_of_last_run(bibtask.task_get_task_param('task_starting_time'))
return 1
# We will write to a temporary file as we go, because we might be processing
# big collections with many docs
_rid = time.strftime("%Y%m%d%H%M%S", time.localtime())
abs_path = bibclassify_engine.get_tmp_file(_rid)
fo = open(abs_path, 'w')
fo.write('<?xml version="1.0" encoding="UTF-8"?>\n')
fo.write('<collection xmlns="http://www.loc.gov/MARC21/slim">\n')
# Count the total number of records in order to update the progression.
global _RECIDS_NUMBER
for onto_rec in onto_recids:
_RECIDS_NUMBER += len(onto_rec['recIDs'])
rec_added = False
for onto_rec in onto_recids:
bibtask.task_sleep_now_if_required(can_stop_too=False)
if onto_rec['collection'] is not None:
bibtask.write_message('INFO: Applying taxonomy %s to collection %s (%s '
'records)' % (onto_rec['ontology'], onto_rec['collection'],
len(onto_rec['recIDs'])), stream=sys.stderr, verbose=3)
else:
bibtask.write_message('INFO: Applying taxonomy %s to recIDs %s. ' %
(onto_rec['ontology'],
', '.join([str(recid) for recid in onto_rec['recIDs']])),
stream=sys.stderr, verbose=3)
if onto_rec['recIDs']:
xml = _analyze_documents(onto_rec['recIDs'],
onto_rec['ontology'], onto_rec['collection'])
if len(xml) > 5:
fo.write(xml)
rec_added = True
fo.write('</collection>\n')
fo.close()
# Apply the changes.
if rec_added:
if bconfig.CFG_DB_SAVE_KW:
bibclassify_webinterface.upload_keywords(abs_path)
else:
bibtask.write_message("INFO: CFG_DB_SAVE_KW is false, we don't save results",
stream=sys.stderr, verbose=0)
else:
bibtask.write_message("WARNING: No keywords found, recids: %s" % onto_recids,
stream=sys.stderr, verbose=0)
os.remove(abs_path)
# Update the date of last run in the clsMETHOD table, but only if
# we were running in an automated mode.
if automated_daemon_mode_p:
_update_date_of_last_run(bibtask.task_get_task_param('task_starting_time'))
return 1
def _analyze_documents(records, taxonomy_name, collection,
output_limit=bconfig.CFG_BIBCLASSIFY_DEFAULT_OUTPUT_NUMBER):
"""For each collection, parse the documents attached to the records
in collection with the corresponding taxonomy_name.
@var records: list of recids to process
@var taxonomy_name: str, name of the taxonomy, e.g. HEP
@var collection: str, collection name
@keyword output_limit: int, max number of keywords to extract [3]
@return: str, marcxml output format of results
"""
global _INDEX
if not records:
# No records could be found.
bibtask.write_message("WARNING: No records were found in collection %s." %
collection, stream=sys.stderr, verbose=2)
return False
# Process records:
output = []
for record in records:
bibdocfiles = BibRecDocs(record).list_latest_files() # TODO: why doesn't this call list_all_files()?
keywords = {}
akws = {}
acro = {}
single_keywords = composite_keywords = author_keywords = acronyms = None
for doc in bibdocfiles:
# Get the keywords for all PDF documents contained in the record.
if bibclassify_text_extractor.is_pdf(doc.get_full_path()):
bibtask.write_message('INFO: Generating keywords for record %d.' %
record, stream=sys.stderr, verbose=3)
fulltext = doc.get_path()
single_keywords, composite_keywords, author_keywords, acronyms = \
bibclassify_engine.get_keywords_from_local_file(fulltext,
taxonomy_name, with_author_keywords=True, output_mode="raw",
output_limit=output_limit, match_mode='partial')
else:
bibtask.write_message('WARNING: BibClassify does not know how to process \
doc: %s (type: %s) -- ignoring it.' %
(doc.fullpath, doc.doctype), stream=sys.stderr, verbose=3)
if single_keywords or composite_keywords:
cleaned_single = bibclassify_engine.clean_before_output(single_keywords)
cleaned_composite = bibclassify_engine.clean_before_output(composite_keywords)
# merge the groups into one
keywords.update(cleaned_single)
keywords.update(cleaned_composite)
acro.update(acronyms)
akws.update(author_keywords)
if len(keywords):
output.append('<record>')
output.append('<controlfield tag="001">%s</controlfield>' % record)
output.append(bibclassify_engine._output_marc(keywords.items(), (), akws, acro,
spires=bconfig.CFG_SPIRES_FORMAT))
output.append('</record>')
else:
bibtask.write_message('WARNING: No keywords found for record %d.' %
record, stream=sys.stderr, verbose=0)
_INDEX += 1
bibtask.task_update_progress('Done %d out of %d.' % (_INDEX, _RECIDS_NUMBER))
bibtask.task_sleep_now_if_required(can_stop_too=False)
return '\n'.join(output)
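Each processed record is emitted as a `<record>` element with a `001` controlfield and the extracted keyword fields, which are later concatenated into one MARCXML collection. A minimal sketch of that per-record assembly, escaping keyword values for XML; `record_xml` is a hypothetical helper (the real output comes from `bibclassify_engine._output_marc`):

```python
from xml.sax.saxutils import escape

def record_xml(recid, keywords):
    """Build one MARCXML <record> snippet with a 001 controlfield and
    one 653 datafield per keyword (illustrative field layout)."""
    parts = ['<record>',
             '<controlfield tag="001">%s</controlfield>' % int(recid)]
    for kw in keywords:
        parts.append('<datafield tag="653" ind1="1" ind2=" ">'
                     '<subfield code="a">%s</subfield></datafield>' % escape(kw))
    parts.append('</record>')
    return '\n'.join(parts)
```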
def _task_submit_check_options():
"""Required by bibtask. Checks the options."""
recids = bibtask.task_get_option('recids')
collections = bibtask.task_get_option('collections')
taxonomy = bibtask.task_get_option('taxonomy')
# If a recid or a collection is specified, check that the taxonomy
# is also specified.
if (recids is not None or collections is not None) and \
taxonomy is None:
bibtask.write_message("ERROR: When specifying a record ID or a collection, "
"you have to precise which\ntaxonomy to use.", stream=sys.stderr,
verbose=0)
return False
return True
# FIXME: outfiledesc can be multiple files, e.g. when processing
# 100000 records it is good to store results by 1000 records
# (see oaiharvest)
diff --git a/invenio/legacy/bibclassify/webinterface.py b/invenio/legacy/bibclassify/webinterface.py
index a2588fc57..09a18975b 100644
--- a/invenio/legacy/bibclassify/webinterface.py
+++ b/invenio/legacy/bibclassify/webinterface.py
@@ -1,362 +1,362 @@
# This file is part of Invenio.
# Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
#
# Invenio is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version.
#
# Invenio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Invenio; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibClassify's web interface.
This module is NOT standalone safe - this component is never expected
to run in a standalone mode, but always inside invenio."""
import os
from cgi import escape
from urllib import quote
import time
from invenio import bibupload
from invenio.base.i18n import gettext_set_language
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.ext.legacy.handler import WebInterfaceDirectory
from invenio.legacy.webpage import pageheaderonly, pagefooteronly
from invenio.legacy.search_engine import get_colID, \
guess_primary_collection_of_a_record, create_navtrail_links, \
perform_request_search, get_record, print_record
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
from invenio.template import load
from invenio.ext.legacy.handler import wash_urlargd
from invenio.legacy.webuser import collect_user_info
import invenio.modules.access.engine as acce
from invenio import dbquery
from invenio import bibtask
from invenio.legacy import bibrecord
from invenio import bibclassify_config as bconfig
from invenio import bibclassify_text_extractor
from invenio import bibclassify_engine
from invenio import bibclassify_ontology_reader as bor
log = bconfig.get_logger("bibclassify.webinterface")
template = load('bibclassify')
def main_page(req, recid, tabs, ln, template):
"""Generates the main page for the keyword tab - http://url/record/[recid]/keywords
@var req: request object
@var recid: int docid
@var tabs: list of tab links
@var ln: language id
@var template: template object
@return: nothing, writes using req object
"""
form = req.form
argd = wash_urlargd(form, {
'generate': (str, 'no'),
'sorting': (str, 'occurences'),
'type': (str, 'tagcloud'),
'numbering': (str, 'off'),
'showall': (str, 'off'),
})
for k,v in argd.items():
argd[k] = escape(v)
req.write(template.detailed_record_container_top(recid, tabs, ln))
# Get the keywords from MARC (if any)
success, keywords, marcrec = record_get_keywords(recid)
if success:
# check for the cached file and delete it (we don't need it anymore, data are in the DB)
tmp_file = bibclassify_engine.get_tmp_file(recid)
if os.path.exists(tmp_file):
try:
os.remove(tmp_file)
except Exception, msg:
log.error('Error removing the cached file: %s' % tmp_file)
log.error(msg)
else:
# Give user possibility to generate them ONLY if not available already
# we may have some keywords, but they are the old ones and we want to generate new
new_found, new_keywords, marcrec = generate_keywords(req, recid, argd)
if keywords and new_keywords:
for key in keywords.keys():
if key in new_keywords:
log.warning('The old "DESY" keyword will be overwritten by the newly extracted one: %s' % key)
keywords.update(new_keywords)
if keywords:
# Output the keywords or the generate button or some message why kw not available
write_keywords_body(keywords, req, recid, argd, marcrec=marcrec)
req.write(template.detailed_record_container_bottom(recid,
tabs, ln))
def write_keywords_body(keywords, req, recid, argd, marcrec=None):
"""Writes the bibclassify keyword output into req object"""
if not keywords:
req.write(template.tmpl_page_no_keywords(req=req, **argd))
return
# test if more than half of the entries have weight (0,0) - ie. not weighted
#if argd['type'] == 'tagcloud' and len(filter(lambda x: (0,0) in x[0], keywords.values())) > (len(keywords) * .5):
# argd['type'] = 'list'
if argd['type'] == 'list':
# Display keywords as a list.
req.write(template.tmpl_page_list(keywords, req=req, **argd))
elif argd['type'] == 'tagcloud':
# Display keywords as a tag cloud.
req.write(template.tmpl_page_tagcloud(keywords=keywords, req=req, **argd))
elif argd['type'] == 'xml':
if marcrec:
marcxml = filter_marcrec(marcrec)
else:
marcxml = bibclassify_engine.build_marc(recid, keywords, {})
req.write(template.tmpl_page_xml_output(keywords,
marcxml,
req=req, **argd))
else:
_ = gettext_set_language(argd['ln'])
req.write(template.tmpl_page(top=_('Unknown type: %s') % argd['type'], **argd))
def record_get_keywords(record, main_field=bconfig.CFG_MAIN_FIELD,
others=bconfig.CFG_OTHER_FIELDS):
"""Returns a dictionary of keywordToken objects from the marc
record. Weight is set to (0,0) if no weight can be found.
This will load keywords from the field 653 and 695__a (which are the
old 'DESY' keywords)
@var record: int or marc record, if int - marc record is loaded
from the database. If you pass record instance, keywords are
extracted from it
@return: tuple (found, keywords, marcxml)
found - int indicating how many main_field keywords were found
the other fields are not counted
keywords - standard dictionary of keywordToken objects
marcrec - marc record object loaded with data
"""
keywords = {}
if isinstance(main_field, basestring):
main_field = [main_field]
if isinstance(others, basestring):
others = [others]
if isinstance(record, int):
rec = get_record(record)
else:
rec = record
found = 0
for m_field in main_field:
tag, ind1, ind2 = bibclassify_engine._parse_marc_code(m_field)
for field in rec.get(tag, []):
keyword = ''
weight = 0
type = ''
for subfield in field[0]:
if subfield[0] == 'a':
keyword = subfield[1]
elif subfield[0] == 'n':
weight = int(subfield[1])
elif subfield[0] == '9':
type = subfield[1]
if keyword:
found += 1
keywords[bor.KeywordToken(keyword, type=type)] = [[(0,0) for x in range(weight)]]
if others:
for field_no in others:
tag, ind1, ind2 = bibclassify_engine._parse_marc_code(field_no)
type = 'f%s' % field_no
for field in rec.get(tag, []):
keyword = ''
for subfield in field[0]:
if subfield[0] == 'a':
keyword = subfield[1]
keywords[bor.KeywordToken(keyword, type=type)] = [[(0,0)]]
break
return found, keywords, rec
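In `record_get_keywords` above, each MARC field is a list of `(code, value)` subfield pairs: `$a` carries the keyword, `$n` an integer weight, `$9` a type. A minimal sketch of that subfield walk (`parse_keyword_field` is an illustrative helper, not the Invenio API):

```python
def parse_keyword_field(subfields):
    """Extract (keyword, weight, type) from an iterable of
    (code, value) subfield pairs, as in field[0] above."""
    keyword, weight, kw_type = '', 0, ''
    for code, value in subfields:
        if code == 'a':
            keyword = value
        elif code == 'n':
            weight = int(value)
        elif code == '9':
            kw_type = value
    return keyword, weight, kw_type
```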
def generate_keywords(req, recid, argd):
"""Extracts keywords from the fulltexts (if found) for the
given recid. It first checks whether the keywords are not already
stored in the temp file (maybe from the previous run).
@var req: req object
@var recid: record id
@var argd: arguments passed from web
@keyword store_keywords: boolean, whether to save records in the file
@return: standard dictionary of kw objects or {}
"""
ln = argd['ln']
_ = gettext_set_language(ln)
keywords = {}
# check the files were not already generated
abs_path = bibclassify_engine.get_tmp_file(recid)
if os.path.exists(abs_path):
try:
# Try to load the data from the tmp file
recs = bibupload.xml_marc_to_records(bibupload.open_marc_file(abs_path))
return record_get_keywords(recs[0])
except:
pass
# check it is allowed (for this user) to generate pages
(exit_stat, msg) = acce.acc_authorize_action(req, 'runbibclassify')
if exit_stat != 0:
log.info('Access denied: ' + msg)
msg = _("The site settings do not allow automatic keyword extraction")
req.write(template.tmpl_page_msg(msg=msg))
return 0, keywords, None
# register generation
bibdocfiles = BibRecDocs(recid).list_latest_files()
if bibdocfiles:
# User arrived at a page, but no keywords are available
inprogress, msg = _doc_already_submitted(recid)
if argd['generate'] != 'yes':
# Display a form and give them possibility to generate keywords
if inprogress:
req.write(template.tmpl_page_msg(msg='<div class="warningbox">%s</div>' % _(msg)))
else:
req.write(template.tmpl_page_generate_keywords(req=req, **argd))
return 0, keywords, None
else: # after user clicked on "generate" button
if inprogress:
req.write(template.tmpl_page_msg(msg='<div class="warningbox">%s</div>' % _(msg) ))
else:
schedule_extraction(recid, taxonomy=bconfig.CFG_EXTRACTION_TAXONOMY)
req.write(template.tmpl_page_msg(msg='<div class="warningbox">%s</div>' %
_('We have registered your request, the automated '
'keyword extraction will run after some time. Please return back in a while.')))
else:
req.write(template.tmpl_page_msg(msg='<div class="warningbox">%s</div>' %
_("Unfortunately, we don't have a PDF fulltext for this record in the storage, \
keywords cannot be generated using an automated process.")))
return 0, keywords, None
def upload_keywords(filename, mode='correct', recids=None):
"""Stores the extracted keywords in the database
@var filename: fullpath to the file with marc record
@keyword mode: correct|replace|add|delete
use correct to add fields if they are different
replace all fields with fields from the file
add - add (even duplicate) fields
delete - delete fields which are inside the file
@keyword recids: list of record ids, this arg comes from
the bibclassify daemon and it is used when the recids
contains one entry (recid) - ie. one individual document
was processed. We use it to mark the job title so that
it is possible to query database if the bibclassify
was run over that document (in case of collections with
many recids, we simply construct a general title)
"""
if mode == 'correct':
m = '-c'
elif mode == 'replace':
m = '-r'
elif mode == 'add':
m = '-a'
elif mode == 'delete':
m = '-d'
else:
raise Exception('Unknown mode')
# let's use the user column to store the information, because there is no better alternative in sight...
user_title = 'bibclassify.upload'
if recids and len(recids) == 1:
user_title = 'extract:%d' % recids[0]
bibtask.task_low_level_submission('bibupload',
user_title, '-n', m, filename)
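The if/elif chain in `upload_keywords` maps an upload mode to a bibupload CLI flag. An equivalent table-driven sketch (same modes and flags as above; the dict-based form is merely an alternative presentation):

```python
# Mode-to-flag table for the bibupload submission, mirroring the
# correct/replace/add/delete branches above.
MODE_FLAGS = {'correct': '-c', 'replace': '-r', 'add': '-a', 'delete': '-d'}

def mode_flag(mode):
    """Return the bibupload flag for a mode, or raise on unknown modes."""
    try:
        return MODE_FLAGS[mode]
    except KeyError:
        raise ValueError('Unknown mode: %s' % mode)
```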
def schedule_extraction(recid, taxonomy):
bibtask.task_low_level_submission('bibclassify',
'extract:%s' % recid, '-k', taxonomy, '-i', '%s' % recid)
def _doc_already_submitted(recid):
# check extraction was already registered
sql = "SELECT COUNT(proc) FROM schTASK WHERE proc='bibclassify' AND user=%s\
AND (status='WAITING' OR status='RUNNING')"
if dbquery.run_sql(sql, ("extract:" + str(recid),))[0][0] > 0:
return (True, "The automated keyword extraction \
for this document has been already scheduled. Please return back in a while.")
# check the upload is inside the scheduled tasks
sql = "SELECT COUNT(proc) FROM schTASK WHERE proc='bibupload' AND user=%s\
AND (status='WAITING' OR status='RUNNING')"
if dbquery.run_sql(sql, ("extract:" + str(recid),))[0][0] > 0:
return (True, 'The document was already processed, '
'it will take a while for it to be ingested.')
# or the task was run and is already archived
sql = "SELECT COUNT(proc) FROM hstTASK WHERE proc='bibupload' AND user=%s"
if dbquery.run_sql(sql, ("extract:" + str(recid),))[0][0] > 0:
return (True, 'The document was already processed, '
'at this moment, the automated extraction is not available.')
# or the task was already run
sql = "SELECT COUNT(proc) FROM schTASK WHERE proc='bibclassify' AND user=%s\
AND (status='DONE')"
if dbquery.run_sql(sql, ("extract:" + str(recid),))[0][0] > 0:
return (True, 'The document was already processed, '
'but automated extraction identified no suitable keywords.')
# or the extraction is in error state
sql = "SELECT COUNT(proc) FROM schTASK WHERE proc='bibclassify' AND user=%s\
AND (status='ERROR')"
if dbquery.run_sql(sql, ("extract:" + str(recid),))[0][0] > 0:
return (True, 'The document was already scheduled, '
'but an error happened. This requires an '
'administrator\'s intervention. Unfortunately, '
'for the moment we cannot display any data.')
return (False, None)
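`_doc_already_submitted` runs a series of parametrised COUNT queries against the task tables, keyed on the `extract:<recid>` user title. A hedged sketch of one such check against an in-memory SQLite table mimicking `schTASK` (the real code queries MySQL via `run_sql`, and `task_pending` is an illustrative name):

```python
import sqlite3

def task_pending(conn, recid):
    """True if a bibclassify task for this recid is waiting or running.
    Uses a parametrised query, as the original does, to avoid injection."""
    row = conn.execute(
        "SELECT COUNT(*) FROM schTASK WHERE proc='bibclassify' "
        "AND user=? AND status IN ('WAITING', 'RUNNING')",
        ('extract:%d' % recid,)).fetchone()
    return row[0] > 0
```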
def filter_marcrec(marcrec, main_field=bconfig.CFG_MAIN_FIELD,
others=bconfig.CFG_OTHER_FIELDS):
"""Removes the unwanted fields and returns xml"""
if isinstance(main_field, basestring):
main_field = [main_field]
if isinstance(others, basestring):
others = [others]
key_map = ['001']
for field in main_field + others:
tag, ind1, ind2 = bibclassify_engine._parse_marc_code(field)
key_map.append(tag)
return bibrecord.print_rec(marcrec, 1, tags=key_map)
diff --git a/invenio/legacy/bibdocfile/api.py b/invenio/legacy/bibdocfile/api.py
index c6b763057..950735f9d 100644
--- a/invenio/legacy/bibdocfile/api.py
+++ b/invenio/legacy/bibdocfile/api.py
@@ -1,4843 +1,4843 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This module implements the low-level API for dealing with fulltext files.
- All the files associated to a I{record} (identified by a I{recid}) can be
managed via an instance of the C{BibRecDocs} class.
- A C{BibRecDocs} is a wrapper of the list of I{documents} attached to the
record.
- Each document is represented by an instance of the C{BibDoc} class.
- A document is identified by a C{docid} and name (C{docname}). The docname
must be unique within the record. A document is the set of all the
formats and revisions of a piece of information.
- A document has a type called C{doctype} and can have a restriction.
- Each physical file, i.e. the concretization of a document into a
particular I{version} and I{format} is represented by an instance of the
C{BibDocFile} class.
- The format is in fact the extension of the physical file.
- A comment and a description and other information can be associated to a
BibDocFile.
- A C{bibdoc} is a synonym for a document, while a C{bibdocfile} is a
synonym for a physical file.
@group Main classes: BibRecDocs,BibDoc,BibDocFile
@group Other classes: BibDocMoreInfo,Md5Folder,InvenioBibDocFileError
@group Main functions: decompose_file,stream_file,bibdocfile_*,download_url
@group Configuration Variables: CFG_*
"""
__revision__ = "$Id$"
import os
import re
import shutil
import filecmp
import time
import random
import socket
import urllib2
import urllib
import tempfile
import cPickle
import base64
import binascii
import cgi
import sys
try:
import magic
if hasattr(magic, "open"):
CFG_HAS_MAGIC = 1
if not hasattr(magic, "MAGIC_MIME_TYPE"):
## Patching RHEL6/CentOS6 version
magic.MAGIC_MIME_TYPE = 16
elif hasattr(magic, "Magic"):
CFG_HAS_MAGIC = 2
except ImportError:
CFG_HAS_MAGIC = 0
from datetime import datetime
from mimetypes import MimeTypes
from thread import get_ident
from invenio.utils import apache
## Let's set a reasonable timeout for URL request (e.g. FFT)
socket.setdefaulttimeout(40)
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from invenio.utils.shell import escape_shell_arg
from invenio.legacy.dbquery import run_sql, DatabaseError
from invenio.ext.logging import register_exception
from invenio.legacy.bibrecord import record_get_field_instances, \
field_get_subfield_values, field_get_subfield_instances, \
encode_for_xml
from invenio.utils.url import create_url, make_user_agent_string
from invenio.utils.text import nice_size
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.control import acc_is_user_in_role, acc_get_role_id
from invenio.modules.access.firerole import compile_role_definition, acc_firerole_check_user
from invenio.modules.access.local_config import SUPERADMINROLE, CFG_WEBACCESS_WARNING_MSGS
from invenio.config import CFG_SITE_URL, \
CFG_WEBDIR, CFG_BIBDOCFILE_FILEDIR,\
CFG_BIBDOCFILE_ADDITIONAL_KNOWN_FILE_EXTENSIONS, \
CFG_BIBDOCFILE_FILESYSTEM_BIBDOC_GROUP_LIMIT, CFG_SITE_SECURE_URL, \
CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS, \
CFG_TMPDIR, CFG_TMPSHAREDDIR, CFG_PATH_MD5SUM, \
CFG_WEBSUBMIT_STORAGEDIR, \
CFG_BIBDOCFILE_USE_XSENDFILE, \
CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY, \
CFG_SITE_RECORD, CFG_PYLIBDIR, \
CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS, \
CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE, \
CFG_BIBINDEX_PERFORM_OCR_ON_DOCNAMES, \
CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES
-from invenio.bibdocfile_config import CFG_BIBDOCFILE_ICON_SUBFORMAT_RE, \
+from invenio.legacy.bibdocfile.config import CFG_BIBDOCFILE_ICON_SUBFORMAT_RE, \
CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
from invenio.base.utils import import_submodules_from_packages
from invenio.utils.hash import md5
import invenio.legacy.template
bibdocfile_templates = invenio.legacy.template.load('bibdocfile')
## The following flag controls whether HTTP range requests are supported or not
## when serving static files via Python. This is disabled by default as
## it currently breaks support for opening PDF files on Windows platforms
## using the Acrobat Reader browser plugin.
CFG_ENABLE_HTTP_RANGE_REQUESTS = False
#: block size when performing I/O.
CFG_BIBDOCFILE_BLOCK_SIZE = 1024 * 8
#: threshold used to decide when to use the Python MD5 or the CLI MD5 algorithm.
CFG_BIBDOCFILE_MD5_THRESHOLD = 256 * 1024
#: chunks loaded by the Python MD5 algorithm.
CFG_BIBDOCFILE_MD5_BUFFER = 1024 * 1024
#: whether to normalize e.g. ".JPEG" and ".jpg" into .jpeg.
CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION = False
#: flags that can be associated to files.
CFG_BIBDOCFILE_AVAILABLE_FLAGS = (
'PDF/A',
'STAMPED',
'PDFOPT',
'HIDDEN',
'CONVERTED',
'PERFORM_HIDE_PREVIOUS',
'OCRED'
)
DBG_LOG_QUERIES = False
#: constant used in FFT correct mode, with the obvious meaning.
KEEP_OLD_VALUE = 'KEEP-OLD-VALUE'
_CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS = [(re.compile(_regex), _headers)
for _regex, _headers in CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS]
_mimes = MimeTypes(strict=False)
_mimes.suffix_map.update({'.tbz2' : '.tar.bz2'})
_mimes.encodings_map.update({'.bz2' : 'bzip2'})
if CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES:
for key, value in CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES.iteritems():
_mimes.add_type(key, value)
del key, value
_magic_cookies = {}
if CFG_HAS_MAGIC == 1:
def _get_magic_cookies():
"""
@return: a tuple of magic object.
@rtype: (MAGIC_NONE, MAGIC_COMPRESS, MAGIC_MIME, MAGIC_COMPRESS + MAGIC_MIME)
@note: ... not real magic. Just see: man file(1)
"""
thread_id = get_ident()
if thread_id not in _magic_cookies:
_magic_cookies[thread_id] = {
magic.MAGIC_NONE: magic.open(magic.MAGIC_NONE),
magic.MAGIC_COMPRESS: magic.open(magic.MAGIC_COMPRESS),
magic.MAGIC_MIME: magic.open(magic.MAGIC_MIME),
magic.MAGIC_COMPRESS + magic.MAGIC_MIME: magic.open(magic.MAGIC_COMPRESS + magic.MAGIC_MIME),
magic.MAGIC_MIME_TYPE: magic.open(magic.MAGIC_MIME_TYPE),
}
for key in _magic_cookies[thread_id].keys():
_magic_cookies[thread_id][key].load()
return _magic_cookies[thread_id]
elif CFG_HAS_MAGIC == 2:
def _magic_wrapper(local_path, mime=True, mime_encoding=False):
thread_id = get_ident()
if (thread_id, mime, mime_encoding) not in _magic_cookies:
magic_object = _magic_cookies[thread_id, mime, mime_encoding] = magic.Magic(mime=mime, mime_encoding=mime_encoding)
else:
magic_object = _magic_cookies[thread_id, mime, mime_encoding]
return magic_object.from_file(local_path) # pylint: disable=E1103
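Both magic branches above cache a non-thread-safe handle per OS thread, keyed on `get_ident()`. A minimal sketch of that per-thread memoisation pattern; it assumes Python 3's `threading.get_ident` (the legacy code imports it from the removed `thread` module), and `per_thread_handle` is an illustrative name:

```python
from threading import get_ident

# Module-level cache keyed on the calling thread's identifier, as the
# _magic_cookies dict is above.
_cache = {}

def per_thread_handle(factory):
    """Create the handle once per thread and reuse it afterwards."""
    tid = get_ident()
    if tid not in _cache:
        _cache[tid] = factory()
    return _cache[tid]
```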
def _generate_extensions():
"""
Generate the regular expression to match all the known extensions.
@return: the regular expression.
@rtype: regular expression object
"""
_tmp_extensions = _mimes.encodings_map.keys() + \
_mimes.suffix_map.keys() + \
_mimes.types_map[1].keys() + \
CFG_BIBDOCFILE_ADDITIONAL_KNOWN_FILE_EXTENSIONS
extensions = []
for ext in _tmp_extensions:
if ext.startswith('.'):
extensions.append(ext)
else:
extensions.append('.' + ext)
extensions.sort()
extensions.reverse()
extensions = set([ext.lower() for ext in extensions])
extensions = '\\' + '$|\\'.join(extensions) + '$'
extensions = extensions.replace('+', '\\+')
return re.compile(extensions, re.I)
#: Regular expression to recognize extensions.
_extensions = _generate_extensions()
class InvenioBibDocFileError(Exception):
"""
Exception raised in case of errors related to fulltext files.
"""
pass
class InvenioBibdocfileUnauthorizedURL(InvenioBibDocFileError):
"""
Exception raised in case of an unauthorized fulltext file URL.
"""
## NOTE: this is a legacy Exception
pass
def _val_or_null(val, eq_name = None, q_str = None, q_args = None):
"""
Auxiliary function helpful while building WHERE clauses of SQL queries
that should contain field=val or field is val.
If optional parameters q_str and q_args are provided, the lists are updated.
If val is None, a statement of the form "eq_name is NULL" is returned;
otherwise the function returns a parametrised comparison
"eq_name=%s" with val as an argument added to the query args list.
Using parametrised queries diminishes the likelihood of
SQL injection.
@param val Value to compare with
@type val
@param eq_name The name of the database column
@type eq_name string
@param q_str Query string builder - list of clauses
that should be connected by AND operator
@type q_str list
@param q_args Query arguments list. This list will be applied as
a second argument of run_sql command
@type q_args list
@return: a single part of the WHERE clause
@rtype string
"""
res = ""
if eq_name != None:
res += eq_name
if val == None:
if eq_name != None:
res += " is "
res += "NULL"
if q_str != None:
q_str.append(res)
return res
else:
if eq_name != None:
res += "="
res += "%s"
if q_str != None:
q_str.append(res)
if q_args != None:
q_args.append(str(val))
return res
def _sql_generate_conjunctive_where(to_process):
"""Generating WHERE clause of a SQL statement, consisting of conjunction
of declared terms. Terms are defined by the to_process argument.
The method creates the appropriate entry depending on whether the value
should be NULL (None in the list) or a concrete argument.
In the latter case, a parametrised query is generated, decreasing the
chance of an SQL injection.
@param to_process List of tuples (value, database_column)
@type to_process list"""
q_str = []
q_args = []
for entry in to_process:
q_str.append(_val_or_null(entry[0], eq_name = entry[1], q_args = q_args))
return (" AND ".join(q_str), q_args)
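For illustration, a minimal standalone sketch of the two builders above (the `_sketch` names are made up for the example; the real helpers also accept a `q_str` list and are used together with `run_sql`):

```python
# Hypothetical standalone sketch of the WHERE-clause builders above:
# None values become "is NULL" tests, concrete values become "%s"
# placeholders whose arguments are collected separately.
def _val_or_null_sketch(val, eq_name, q_args):
    if val is None:
        return eq_name + " is NULL"
    q_args.append(str(val))
    return eq_name + "=%s"

def _conjunctive_where_sketch(to_process):
    q_args = []
    q_str = [_val_or_null_sketch(val, name, q_args)
             for (val, name) in to_process]
    return (" AND ".join(q_str), q_args)

where, args = _conjunctive_where_sketch([(5, "id_bibdoc"), (None, "version")])
# where == "id_bibdoc=%s AND version is NULL", args == ["5"]
```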
def file_strip_ext(afile, skip_version=False, only_known_extensions=False, allow_subformat=True):
"""
Strip the extension from a filename in the best possible way.
>>> file_strip_ext("foo.tar.gz")
'foo'
>>> file_strip_ext("foo.buz.gz")
'foo.buz'
>>> file_strip_ext("foo.buz")
'foo'
>>> file_strip_ext("foo.buz", only_known_extensions=True)
'foo.buz'
>>> file_strip_ext("foo.buz;1", skip_version=False,
... only_known_extensions=True)
'foo.buz;1'
>>> file_strip_ext("foo.gif;icon")
'foo'
>>> file_strip_ext("foo.gif:icon", allow_subformat=False)
'foo.gif:icon'
@param afile: the path/name of a file.
@type afile: string
@param skip_version: whether to skip a trailing ";version".
@type skip_version: bool
@param only_known_extensions: whether to strip out only known extensions or
to consider as extension anything that follows a dot.
@type only_known_extensions: bool
@param allow_subformat: whether to consider also subformats as part of
the extension.
@type allow_subformat: bool
@return: the name/path without the extension (and version).
@rtype: string
"""
if skip_version or allow_subformat:
afile = afile.split(';')[0]
nextfile = _extensions.sub('', afile)
if nextfile == afile and not only_known_extensions:
nextfile = os.path.splitext(afile)[0]
while nextfile != afile:
afile = nextfile
nextfile = _extensions.sub('', afile)
return nextfile
def normalize_format(docformat, allow_subformat=True):
"""
Normalize the format, e.g. by adding a dot in front.
@param docformat: the format/extension to be normalized.
@type docformat: string
@param allow_subformat: whether to consider also subformats as part of
the extension.
@type allow_subformat: bool
@return: the normalized format.
@rtype: string
"""
if allow_subformat and ';' in docformat:
subformat = docformat[docformat.rfind(';'):]
docformat = docformat[:docformat.rfind(';')]
else:
subformat = ''
if docformat and docformat[0] != '.':
docformat = '.' + docformat
if CFG_BIBDOCFILE_STRONG_FORMAT_NORMALIZATION:
if docformat not in ('.Z', '.H', '.C', '.CC'):
docformat = docformat.lower()
docformat = {
'.jpg' : '.jpeg',
'.htm' : '.html',
'.tif' : '.tiff'
}.get(docformat, docformat)
return docformat + subformat
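A self-contained sketch of the normalization rules above, assuming strong format normalization is enabled (the helper name is made up for the example; the real function also special-cases '.Z', '.H', '.C' and '.CC'):

```python
def normalize_format_sketch(docformat):
    # Split off an optional ';subformat' at the last ';', add a leading
    # dot, lowercase and map a few common aliases.
    if ';' in docformat:
        docformat, _, sub = docformat.rpartition(';')
        subformat = ';' + sub
    else:
        subformat = ''
    if docformat and not docformat.startswith('.'):
        docformat = '.' + docformat
    docformat = docformat.lower()
    aliases = {'.jpg': '.jpeg', '.htm': '.html', '.tif': '.tiff'}
    return aliases.get(docformat, docformat) + subformat

normalize_format_sketch('JPG')        # '.jpeg'
normalize_format_sketch('pdf;pdfa')   # '.pdf;pdfa'
```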
def guess_format_from_url(url):
"""
Given a URL, try to guess its extension.
Several methods will be used, including an HTTP HEAD query,
downloading the resource and using the magic library.
@param url: the URL for which the extension should be guessed.
@type url: string
@return: the recognized extension or '.bin' if it's impossible to
recognize it.
@rtype: string
"""
def guess_via_magic(local_path):
try:
if CFG_HAS_MAGIC == 1:
magic_cookie = _get_magic_cookies()[magic.MAGIC_MIME_TYPE]
mimetype = magic_cookie.file(local_path)
elif CFG_HAS_MAGIC == 2:
mimetype = _magic_wrapper(local_path, mime=True, mime_encoding=False)
if CFG_HAS_MAGIC:
ext = _mimes.guess_extension(mimetype)
if ext:
## Normalize some common magic misinterpretations
ext = {'.asc': '.txt', '.obj': '.bin'}.get(ext, ext)
return normalize_format(ext)
except Exception:
pass
## Let's try to guess the extension by considering the URL as a filename
ext = decompose_file(url, skip_version=True, only_known_extensions=True)[2]
if ext.startswith('.'):
return ext
if is_url_a_local_file(url):
## The URL corresponds to a local file, so we can safely consider
## traditional extensions after the dot.
ext = decompose_file(url, skip_version=True, only_known_extensions=False)[2]
if ext.startswith('.'):
return ext
## No extensions? Let's use Magic.
ext = guess_via_magic(url)
if ext:
return ext
else:
## Since the URL is remote, let's try to perform a HEAD request
## and see the corresponding headers
try:
response = open_url(url, head_request=True)
except (InvenioBibdocfileUnauthorizedURL, urllib2.URLError):
return ".bin"
ext = get_format_from_http_response(response)
if ext:
return ext
if CFG_HAS_MAGIC:
## Last solution: let's download the remote resource
## and use the Python magic library to guess the extension
filename = ""
try:
try:
filename = download_url(url, docformat='')
ext = guess_via_magic(filename)
if ext:
return ext
except Exception:
pass
finally:
if os.path.exists(filename):
## Let's free space
os.remove(filename)
return ".bin"
_docname_re = re.compile(r'[^-\w.]*')
def normalize_docname(docname):
"""
Normalize the docname.
At the moment the normalization is just returning the same string.
@param docname: the docname to be normalized.
@type docname: string
@return: the normalized docname.
@rtype: string
"""
#return _docname_re.sub('', docname)
return docname
def normalize_version(version):
"""
Normalize the version.
The version can be either an integer or the keyword 'all'. Any other
value will be transformed into the empty string.
@param version: the version (either a number or 'all').
@type version: integer or string
@return: the normalized version.
@rtype: string
"""
try:
int(version)
except ValueError:
if version.lower().strip() == 'all':
return 'all'
else:
return ''
return str(version)
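The accepted values can be illustrated with a tiny standalone equivalent (the `_sketch` name is hypothetical, for the example only):

```python
def normalize_version_sketch(version):
    # An integer stays an integer string, 'all' (any case) is kept,
    # everything else collapses to the empty string.
    try:
        return str(int(version))
    except ValueError:
        return 'all' if str(version).lower().strip() == 'all' else ''

normalize_version_sketch('3')       # '3'
normalize_version_sketch(' ALL ')   # 'all'
normalize_version_sketch('latest')  # ''
```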
def compose_file(dirname, extension, subformat=None, version=None, storagename=None):
"""
Construct back a fullpath given the separate components.
@param dirname: the directory name.
@type dirname: string
@param extension: the file extension, with or without a leading dot.
@type extension: string
@param subformat: the optional subformat.
@type subformat: string
@param version: the optional version number.
@type version: integer
@param storagename: name under which the file should be stored in the filesystem.
@type storagename: string
@return: a fullpath to the file
@rtype: string
"""
if version:
version = ";%i" % int(version)
else:
version = ""
if subformat:
if not subformat.startswith(";"):
subformat = ";%s" % subformat
else:
subformat = ""
if extension and not extension.startswith("."):
extension = ".%s" % extension
if not storagename:
storagename = "content"
return os.path.join(dirname, storagename + extension + subformat + version)
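The composed layout ("storagename.extension;subformat;version") can be shown with a standalone sketch (the `_sketch` name is hypothetical):

```python
import os

def compose_file_sketch(dirname, extension, subformat=None, version=None,
                        storagename=None):
    # Assemble: storagename + .extension + ;subformat + ;version
    version = ';%i' % int(version) if version else ''
    subformat = ';' + subformat.lstrip(';') if subformat else ''
    if extension and not extension.startswith('.'):
        extension = '.' + extension
    return os.path.join(dirname,
                        (storagename or 'content') + extension + subformat + version)

compose_file_sketch('/tmp/rec', 'pdf', subformat='pdfa', version=2)
# '/tmp/rec/content.pdf;pdfa;2'
```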
def compose_format(extension, subformat=None):
"""
Construct the format string
"""
if not extension.startswith("."):
extension = ".%s" % extension
if subformat:
if not subformat.startswith(";"):
subformat = ";%s" % subformat
else:
subformat = ""
return extension + subformat
def decompose_file(afile, skip_version=False, only_known_extensions=False,
allow_subformat=True):
"""
Decompose a file/path into its components dirname, basename and extension.
>>> decompose_file('/tmp/foo.tar.gz')
('/tmp', 'foo', '.tar.gz')
>>> decompose_file('/tmp/foo.tar.gz;1', skip_version=True)
('/tmp', 'foo', '.tar.gz')
>>> decompose_file('http://www.google.com/index.html')
('http://www.google.com', 'index', '.html')
@param afile: the path/name of a file.
@type afile: string
@param skip_version: whether to skip a trailing ";version".
@type skip_version: bool
@param only_known_extensions: whether to strip out only known extensions or
to consider as extension anything that follows a dot.
@type only_known_extensions: bool
@param allow_subformat: whether to consider also subformats as part of
the extension.
@type allow_subformat: bool
@return: a tuple with the directory name, the basename and extension.
@rtype: (dirname, basename, extension)
@note: if a URL is provided, the scheme will be part of the dirname.
@see: L{file_strip_ext} for the algorithm used to retrieve the extension.
"""
if skip_version:
version = afile.split(';')[-1]
try:
int(version)
afile = afile[:-len(version)-1]
except ValueError:
pass
basename = os.path.basename(afile)
dirname = afile[:-len(basename)-1]
base = file_strip_ext(
basename,
only_known_extensions=only_known_extensions,
allow_subformat=allow_subformat)
extension = basename[len(base) + 1:]
if extension:
extension = '.' + extension
return (dirname, base, extension)
def decompose_file_with_version(afile):
"""
Decompose a file into dirname, basename, extension and version.
>>> decompose_file_with_version('/tmp/foo.tar.gz;1')
('/tmp', 'foo', '.tar.gz', 1)
@param afile: the path/name of a file.
@type afile: string
@return: a tuple with the directory name, the basename, extension and
version.
@rtype: (dirname, basename, extension, version)
@raise ValueError: in case the version is not present or is not an integer.
@note: if a URL is provided, the scheme will be part of the dirname.
"""
version_str = afile.split(';')[-1]
version = int(version_str)
afile = afile[:-len(version_str)-1]
basename = os.path.basename(afile)
dirname = afile[:-len(basename)-1]
base = file_strip_ext(basename)
extension = basename[len(base) + 1:]
if extension:
extension = '.' + extension
return (dirname, base, extension, version)
def get_subformat_from_format(docformat):
"""
@return the subformat if any.
@rtype: string
>>> get_subformat_from_format('foo;bar')
'bar'
>>> get_subformat_from_format('foo')
''
"""
try:
return docformat[docformat.rindex(';') + 1:]
except ValueError:
return ''
def get_superformat_from_format(docformat):
"""
@return the superformat if any.
@rtype: string
>>> get_superformat_from_format('foo;bar')
'foo'
>>> get_superformat_from_format('foo')
'foo'
"""
try:
return docformat[:docformat.rindex(';')]
except ValueError:
return docformat
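The two accessors above are each other's complement around the last ';'; a compact standalone equivalent (the helper name is made up for the example):

```python
def split_format(docformat):
    # Return (superformat, subformat); subformat is '' when absent.
    head, sep, tail = docformat.rpartition(';')
    return (head, tail) if sep else (docformat, '')

split_format('foo;bar')  # ('foo', 'bar')
split_format('foo')      # ('foo', '')
```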
def propose_next_docname(docname):
"""
Given a I{docname}, suggest a new I{docname} (useful when trying to generate
a unique I{docname}).
>>> propose_next_docname('foo')
'foo_1'
>>> propose_next_docname('foo_1')
'foo_2'
>>> propose_next_docname('foo_10')
'foo_11'
@param docname: the base docname.
@type docname: string
@return: the next possible docname based on the given one.
@rtype: string
"""
if '_' in docname:
split_docname = docname.split('_')
try:
split_docname[-1] = str(int(split_docname[-1]) + 1)
docname = '_'.join(split_docname)
except ValueError:
docname += '_1'
else:
docname += '_1'
return docname
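The renaming rule can be written compactly; a standalone variant of the logic above (the `_sketch` name is hypothetical):

```python
def propose_next_sketch(docname):
    # Bump a trailing _<number>, otherwise append '_1'.
    stem, sep, tail = docname.rpartition('_')
    if sep and tail.isdigit():
        return '%s_%d' % (stem, int(tail) + 1)
    return docname + '_1'

propose_next_sketch('foo')     # 'foo_1'
propose_next_sketch('foo_10')  # 'foo_11'
```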
class BibRecDocs(object):
"""
This class represents all the files attached to one record.
@param recid: the record identifier.
@type recid: integer
@param deleted_too: whether to consider deleted documents as normal
documents (useful when trying to recover deleted information).
@type deleted_too: bool
@param human_readable: whether numbers should be printed in human readable
format (e.g. 2048 bytes -> 2Kb)
@type human_readable: bool
@ivar id: the record identifier as passed to the constructor.
@type id: integer
@ivar human_readable: the human_readable flag as passed to the constructor.
@type human_readable: bool
@ivar deleted_too: the deleted_too flag as passed to the constructor.
@type deleted_too: bool
@ivar bibdocs: the list of documents attached to the record.
@type bibdocs: list of BibDoc
"""
def __init__(self, recid, deleted_too=False, human_readable=False):
try:
self.id = int(recid)
except ValueError:
raise ValueError("BibRecDocs: recid is %s but must be an integer." % repr(recid))
self.human_readable = human_readable
self.deleted_too = deleted_too
self.bibdocs = {}
self.attachment_types = {} # dictionary docname->attachment type
self.build_bibdoc_list()
def __repr__(self):
"""
@return: the canonical string representation of the C{BibRecDocs}.
@rtype: string
"""
return 'BibRecDocs(%s%s%s)' % (self.id,
self.deleted_too and ', True' or '',
self.human_readable and ', True' or ''
)
def __str__(self):
"""
@return: an easy to be I{grepped} string representation of the
whole C{BibRecDocs} content.
@rtype: string
"""
out = '%i::::total bibdocs attached=%i\n' % (self.id, len(self.bibdocs))
out += '%i::::total size latest version=%s\n' % (self.id, nice_size(self.get_total_size_latest_version()))
out += '%i::::total size all files=%s\n' % (self.id, nice_size(self.get_total_size()))
for (docname, (bibdoc, dummy)) in self.bibdocs.items():
out += str(docname) + ":" + str(bibdoc)
return out
def empty_p(self):
"""
@return: True when the record has no attached documents.
@rtype: bool
"""
return len(self.bibdocs) == 0
def deleted_p(self):
"""
@return: True if the corresponding record has been deleted.
@rtype: bool
"""
from invenio.legacy.search_engine import record_exists
return record_exists(self.id) == -1
def get_xml_8564(self):
"""
Return a snippet of I{MARCXML} representing the I{8564} fields
corresponding to the current state.
@return: the MARCXML representation.
@rtype: string
"""
from invenio.legacy.search_engine import get_record
out = ''
record = get_record(self.id)
fields = record_get_field_instances(record, '856', '4', ' ')
for field in fields:
urls = field_get_subfield_values(field, 'u')
if urls and not bibdocfile_url_p(urls[0]):
out += '\t<datafield tag="856" ind1="4" ind2=" ">\n'
for subfield, value in field_get_subfield_instances(field):
out += '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(value))
out += '\t</datafield>\n'
for afile in self.list_latest_files(list_hidden=False):
out += '\t<datafield tag="856" ind1="4" ind2=" ">\n'
url = afile.get_url()
description = afile.get_description()
comment = afile.get_comment()
if url:
out += '\t\t<subfield code="u">%s</subfield>\n' % encode_for_xml(url)
if description:
out += '\t\t<subfield code="y">%s</subfield>\n' % encode_for_xml(description)
if comment:
out += '\t\t<subfield code="z">%s</subfield>\n' % encode_for_xml(comment)
out += '\t</datafield>\n'
return out
def get_total_size_latest_version(self):
"""
Returns the total size used on disk by all the files belonging
to this record and corresponding to the latest version.
@return: the total size.
@rtype: integer
"""
size = 0
for (bibdoc, _) in self.bibdocs.values():
size += bibdoc.get_total_size_latest_version()
return size
def get_total_size(self):
"""
Return the total size used on disk of all the files belonging
to this record of any version (not only the last as in
L{get_total_size_latest_version}).
@return: the total size.
@rtype: integer
"""
size = 0
for (bibdoc, _) in self.bibdocs.values():
size += bibdoc.get_total_size()
return size
def build_bibdoc_list(self):
"""
This method must be called every time a I{bibdoc} is added, removed or
modified.
"""
self.bibdocs = {}
if self.deleted_too:
res = run_sql("""SELECT brbd.id_bibdoc, brbd.docname, brbd.type FROM bibrec_bibdoc as brbd JOIN
bibdoc as bd ON bd.id=brbd.id_bibdoc WHERE brbd.id_bibrec=%s
ORDER BY brbd.docname ASC""", (self.id,))
else:
res = run_sql("""SELECT brbd.id_bibdoc, brbd.docname, brbd.type FROM bibrec_bibdoc as brbd JOIN
bibdoc as bd ON bd.id=brbd.id_bibdoc WHERE brbd.id_bibrec=%s AND
bd.status<>'DELETED' ORDER BY brbd.docname ASC""", (self.id,))
for row in res:
cur_doc = BibDoc.create_instance(docid=row[0], recid=self.id,
human_readable=self.human_readable)
self.bibdocs[row[1]] = (cur_doc, row[2])
def list_bibdocs_by_names(self, doctype=None):
"""
Returns the dictionary of all bibdoc objects belonging to a recid.
Keys in the dictionary are names of documents and values are BibDoc objects.
If C{doctype} is set, it returns just the bibdocs of that doctype.
@param doctype: the optional doctype.
@type doctype: string
@return: the dictionary of bibdocs.
@rtype: dictionary of docname -> BibDoc
"""
if not doctype:
return dict((k,v) for (k,(v,_)) in self.bibdocs.iteritems())
res = {}
for (docname, (doc, attachmenttype)) in self.bibdocs.iteritems():
if attachmenttype == doctype:
res[docname] = doc
return res
def list_bibdocs(self, doctype=None):
"""
Returns the list of all bibdoc objects belonging to a recid.
If C{doctype} is set, it returns just the bibdocs of that doctype.
@param doctype: the optional doctype.
@type doctype: string
@return: the list of bibdocs.
@rtype: list of BibDoc
"""
if not doctype:
return [d for (d,_) in self.bibdocs.values()]
else:
return [bibdoc for (bibdoc, attype) in self.bibdocs.values() if doctype == attype]
def get_bibdoc_names(self, doctype=None):
"""
Returns all the names of the documents associated with the bibrec.
If C{doctype} is set, restrict the result to all the matching doctype.
@param doctype: the optional doctype.
@type doctype: string
@return: the list of document names.
@rtype: list of string
"""
return [docname for (docname, dummy) in self.list_bibdocs_by_names(doctype).items()]
def check_file_exists(self, path, f_format):
"""
Check if a file with the same content as the file pointed to by C{path}
is already attached to this record.
@param path: the file to be checked against.
@type path: string
@return: True if a file with the requested content is already attached
to the record.
@rtype: bool
"""
size = os.path.getsize(path)
# Let's consider all the latest files
files = self.list_latest_files()
# Let's consider all the latest files with same size
potential = [afile for afile in files if afile.get_size() == size and afile.format == f_format]
if potential:
checksum = calculate_md5(path)
# Let's consider all the latest files with the same size and the
# same checksum
potential = [afile for afile in potential if afile.get_checksum() == checksum]
if potential:
potential = [afile for afile in potential if \
filecmp.cmp(afile.get_full_path(), path)]
if potential:
return True
else:
# Gosh! How unlucky, same size, same checksum but not same
# content!
pass
return False
def propose_unique_docname(self, docname):
"""
Given C{docname}, return a new docname that is not already attached to
the record.
@param docname: the reference docname.
@type docname: string
@return: a docname not already attached.
@rtype: string
"""
docname = normalize_docname(docname)
goodname = docname
i = 1
while goodname in self.get_bibdoc_names():
i += 1
goodname = "%s_%s" % (docname, i)
return goodname
def merge_bibdocs(self, docname1, docname2):
"""
This method merge C{docname2} into C{docname1}.
1. Given all the formats of the latest version of the files
attached to C{docname2}, these files are added as new formats
into C{docname1}.
2. C{docname2} is marked as deleted.
@raise InvenioBibDocFileError: if at least one format in C{docname2}
already exists in C{docname1}. (In this case the two bibdocs are
preserved)
@note: comments and descriptions are also copied.
@note: if C{docname2} has a I{restriction}(i.e. if the I{status} is
set) and C{docname1} doesn't, the restriction is imported.
"""
bibdoc1 = self.get_bibdoc(docname1)
bibdoc2 = self.get_bibdoc(docname2)
## Check for possibility
for bibdocfile in bibdoc2.list_latest_files():
docformat = bibdocfile.get_format()
if bibdoc1.format_already_exists_p(docformat):
raise InvenioBibDocFileError('Format %s already exists in bibdoc %s of record %s. It\'s impossible to merge bibdoc %s into it.' % (docformat, docname1, self.id, docname2))
## Importing restriction if needed.
restriction1 = bibdoc1.get_status()
restriction2 = bibdoc2.get_status()
if restriction2 and not restriction1:
bibdoc1.set_status(restriction2)
## Importing formats
for bibdocfile in bibdoc2.list_latest_files():
docformat = bibdocfile.get_format()
comment = bibdocfile.get_comment()
description = bibdocfile.get_description()
bibdoc1.add_file_new_format(bibdocfile.get_full_path(),
description=description,
comment=comment, docformat=docformat)
## Finally deleting old bibdoc2
bibdoc2.delete()
self.build_bibdoc_list()
def get_docid(self, docname):
"""
@param docname: the document name.
@type docname: string
@return: the identifier corresponding to the given C{docname}.
@rtype: integer
@raise InvenioBibDocFileError: if the C{docname} does not
corresponds to a document attached to this record.
"""
if docname in self.bibdocs:
return self.bibdocs[docname][0].id
raise InvenioBibDocFileError, "Recid '%s' is not connected with a " \
"docname '%s'" % (self.id, docname)
def get_docname(self, docid):
"""
@param docid: the document identifier.
@type docid: integer
@return: the name of the document corresponding to the given document
identifier.
@rtype: string
@raise InvenioBibDocFileError: if the C{docid} does not
corresponds to a document attached to this record.
"""
for (docname, (bibdoc, _)) in self.bibdocs.items():
if bibdoc.id == docid:
return docname
raise InvenioBibDocFileError, "Recid '%s' is not connected with a " \
"docid '%s'" % (self.id, docid)
def change_name(self, newname, oldname=None, docid=None):
"""
Renames a document with a given name.
@param newname: the new name.
@type newname: string
@raise InvenioBibDocFileError: if the new name corresponds to
a document already attached to the record owning this document.
"""
if not oldname and not docid:
raise StandardError("Trying to rename unspecified document")
if not oldname:
oldname = self.get_docname(docid)
if not docid:
docid = self.get_docid(oldname)
doc, atttype = self.bibdocs[oldname]
try:
newname = normalize_docname(newname)
res = run_sql("SELECT id_bibdoc FROM bibrec_bibdoc WHERE id_bibrec=%s AND docname=%s", (self.id, newname))
if res:
raise InvenioBibDocFileError, "A bibdoc called %s already exists for recid %s" % (newname, self.id)
run_sql("update bibrec_bibdoc set docname=%s where id_bibdoc=%s and id_bibrec=%s", (newname, docid, self.id))
finally:
# updating the document
for a in doc.bibrec_links:
if a["recid"] == self.id:
a["docname"] = newname
# updating the record structure
del self.bibdocs[oldname]
self.bibdocs[newname] = (doc, atttype)
def has_docname_p(self, docname):
"""
@param docname: the document name,
@type docname: string
@return: True if a document with the given name is attached to this
record.
@rtype: bool
"""
return docname in self.bibdocs.keys()
def get_bibdoc(self, docname):
"""
@return: the bibdoc with a particular docname associated with
this recid"""
if docname in self.bibdocs:
return self.bibdocs[docname][0]
raise InvenioBibDocFileError, "Recid '%s' is not connected with " \
" docname '%s'" % (self.id, docname)
def delete_bibdoc(self, docname):
"""
Deletes the document with the specified I{docname}.
@param docname: the document name.
@type docname: string
"""
if docname in self.bibdocs:
self.bibdocs[docname][0].delete()
self.build_bibdoc_list()
def add_bibdoc(self, doctype="Main", docname='file', never_fail=False):
"""
Add a new empty document object (a I{bibdoc}) to the list of
documents of this record.
@param doctype: the document type.
@type doctype: string
@param docname: the document name.
@type docname: string
@param never_fail: if True, this procedure will not fail, even if
a document with the given name is already attached to this
record. In this case a new name will be generated (see
L{propose_unique_docname}).
@type never_fail: bool
@return: the newly created document object.
@rtype: BibDoc
@raise InvenioBibDocFileError: in case of any error.
"""
try:
docname = normalize_docname(docname)
if never_fail:
docname = self.propose_unique_docname(docname)
if docname in self.get_bibdoc_names():
raise InvenioBibDocFileError, \
"%s has already a bibdoc with docname %s" % (self.id, docname)
else:
bibdoc = BibDoc.create_instance(recid=self.id, doctype=doctype,
docname=docname,
human_readable=self.human_readable)
self.build_bibdoc_list()
return bibdoc
except Exception, e:
register_exception()
raise InvenioBibDocFileError(str(e))
def add_new_file(self, fullpath, doctype="Main", docname=None,
never_fail=False, description=None, comment=None,
docformat=None, flags=None, modification_date=None):
"""
Directly add a new file to this record.
Adds a new file with the following policy:
- if the C{docname} is not set it is retrieved from the name of the
file.
- If a bibdoc with the given docname doesn't already exist, it is
created and the file is added to it.
- If it exists but doesn't contain the format that is being
added, the new format is added.
- If the format already exists then if C{never_fail} is True a new
bibdoc is created with a similar name but with a progressive
number as a suffix and the file is added to it (see
L{propose_unique_docname}).
@param fullpath: the filesystem path of the document to be added.
@type fullpath: string
@param doctype: the type of the document.
@type doctype: string
@param docname: the document name.
@type docname: string
@param never_fail: if True, this procedure will not fail, even if
a document with the given name is already attached to this
record. In this case a new name will be generated (see
L{propose_unique_docname}).
@type never_fail: bool
@param description: an optional description of the file.
@type description: string
@param comment: an optional comment to the file.
@type comment: string
@param docformat: the extension of the file. If not specified it will
be guessed (see L{guess_format_from_url}).
@type docformat: string
@param flags: a set of flags to be associated with the file (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS})
@type flags: list of string
@return: the elaborated document object.
@rtype: BibDoc
@raise InvenioBibDocFileError: in case of error.
"""
if docname is None:
docname = decompose_file(fullpath)[1]
if docformat is None:
docformat = decompose_file(fullpath)[2]
docname = normalize_docname(docname)
try:
bibdoc = self.get_bibdoc(docname)
except InvenioBibDocFileError:
# bibdoc doesn't exist yet!
bibdoc = self.add_bibdoc(doctype, docname, False)
bibdoc.add_file_new_version(fullpath, description=description, comment=comment, docformat=docformat, flags=flags, modification_date=modification_date)
self.build_bibdoc_list()
else:
try:
bibdoc.add_file_new_format(fullpath, description=description, comment=comment, docformat=docformat, flags=flags, modification_date=modification_date)
self.build_bibdoc_list()
except InvenioBibDocFileError, dummy:
# Format already exists!
if never_fail:
bibdoc = self.add_bibdoc(doctype, docname, True)
bibdoc.add_file_new_version(fullpath, description=description, comment=comment, docformat=docformat, flags=flags, modification_date=modification_date)
self.build_bibdoc_list()
else:
raise
return bibdoc
def add_new_version(self, fullpath, docname=None, description=None, comment=None, docformat=None, flags=None):
"""
Adds a new file to an already existent document object as a new
version.
@param fullpath: the filesystem path of the file to be added.
@type fullpath: string
@param docname: the document name. If not specified it will be
extracted from C{fullpath} (see L{decompose_file}).
@type docname: string
@param description: an optional description for the file.
@type description: string
@param comment: an optional comment to the file.
@type comment: string
@param docformat: the extension of the file. If not specified it will
be guessed (see L{guess_format_from_url}).
@type docformat: string
@param flags: a set of flags to be associated with the file (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS})
@type flags: list of string
@return: the elaborated document object.
@rtype: BibDoc
@raise InvenioBibDocFileError: in case of error.
@note: previous files associated with the same document will be
considered obsolete.
"""
if docname is None:
docname = decompose_file(fullpath)[1]
if docformat is None:
docformat = decompose_file(fullpath)[2]
if flags is None:
flags = []
if 'pdfa' in get_subformat_from_format(docformat).split(';') and not 'PDF/A' in flags:
flags.append('PDF/A')
bibdoc = self.get_bibdoc(docname=docname)
bibdoc.add_file_new_version(fullpath, description=description, comment=comment, docformat=docformat, flags=flags)
self.build_bibdoc_list()
return bibdoc
def add_new_format(self, fullpath, docname=None, description=None, comment=None, docformat=None, flags=None, modification_date=None):
"""
Adds a new file to an already existent document object as a new
format.
@param fullpath: the filesystem path of the file to be added.
@type fullpath: string
@param docname: the document name. If not specified it will be
extracted from C{fullpath} (see L{decompose_file}).
@type docname: string
@param description: an optional description for the file.
@type description: string
@param comment: an optional comment to the file.
@type comment: string
@param docformat: the extension of the file. If not specified it will
be guessed (see L{guess_format_from_url}).
@type docformat: string
@param flags: a set of flags to be associated with the file (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS})
@type flags: list of string
@return: the elaborated document object.
@rtype: BibDoc
@raise InvenioBibDocFileError: in case the same format already
exists.
"""
if docname is None:
docname = decompose_file(fullpath)[1]
if docformat is None:
docformat = decompose_file(fullpath)[2]
if flags is None:
flags = []
if 'pdfa' in get_subformat_from_format(docformat).split(';') and not 'PDF/A' in flags:
flags.append('PDF/A')
bibdoc = self.get_bibdoc(docname=docname)
bibdoc.add_file_new_format(fullpath, description=description, comment=comment, docformat=docformat, flags=flags, modification_date=modification_date)
self.build_bibdoc_list()
return bibdoc
def list_latest_files(self, doctype='', list_hidden=True):
"""
Returns a list of the latest files.
@param doctype: if set, only document of the given type will be listed.
@type doctype: string
@param list_hidden: if True, will list also files with the C{HIDDEN}
flag being set.
@type list_hidden: bool
@return: the list of latest files.
@rtype: list of BibDocFile
"""
docfiles = []
for bibdoc in self.list_bibdocs(doctype):
docfiles += bibdoc.list_latest_files(list_hidden=list_hidden)
return docfiles
def fix(self, docname):
"""
Algorithm that transforms a broken/old bibdoc into a coherent one.
Think of it as being the fsck of BibDocs.
- All the files in the bibdoc directory will be renamed according
to the document name. Proper .recid, .type, .md5 files will be
created/updated.
- In case of more than one file with the same format and version, a new
bibdoc will be created in order to hold those files.
@param docname: the document name that need to be fixed.
@type docname: string
@return: the list of newly created bibdocs if any.
@rtype: list of BibDoc
@raise InvenioBibDocFileError: in case of issues that can not be
fixed automatically.
"""
bibdoc = self.get_bibdoc(docname)
versions = {}
res = []
new_bibdocs = [] # List of files with the same version/format of
# existing file which need new bibdoc.
counter = 0
zero_version_bug = False
if os.path.exists(bibdoc.basedir):
for filename in os.listdir(bibdoc.basedir):
if filename[0] != '.' and ';' in filename:
name, version = filename.split(';')
try:
version = int(version)
except ValueError:
# Strange name
register_exception()
raise InvenioBibDocFileError, "A file called %s exists under %s. This is not a valid name. After the ';' there must be an integer representing the file version. Please, manually fix this file either by renaming or by deleting it." % (filename, bibdoc.basedir)
if version == 0:
zero_version_bug = True
docformat = name[len(file_strip_ext(name)):]
docformat = normalize_format(docformat)
if not versions.has_key(version):
versions[version] = {}
new_name = 'FIXING-%s-%s' % (str(counter), name)
try:
shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name))
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, new_name), e)
if versions[version].has_key(docformat):
new_bibdocs.append((new_name, version))
else:
versions[version][docformat] = new_name
counter += 1
elif filename[0] != '.':
# Strange name
register_exception()
raise InvenioBibDocFileError, "A file called %s exists under %s. This is not a valid name. There should be a ';' followed by an integer representing the file version. Please, manually fix this file either by renaming or by deleting it." % (filename, bibdoc.basedir)
else:
# we create the corresponding storage directory
old_umask = os.umask(022)
os.makedirs(bibdoc.basedir)
# and save the father record id if it exists
try:
if self.id != "":
recid_fd = open("%s/.recid" % bibdoc.basedir, "w")
recid_fd.write(str(self.id))
recid_fd.close()
if bibdoc.doctype != "":
type_fd = open("%s/.type" % bibdoc.basedir, "w")
type_fd.write(str(bibdoc.doctype))
type_fd.close()
except Exception, e:
register_exception()
raise InvenioBibDocFileError, e
os.umask(old_umask)
if not versions:
bibdoc.delete()
else:
for version, formats in versions.iteritems():
if zero_version_bug:
version += 1
for docformat, filename in formats.iteritems():
destination = '%s%s;%i' % (docname, docformat, version)
try:
shutil.move('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination))
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Error in renaming '%s' to '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), '%s/%s' % (bibdoc.basedir, destination), e)
try:
recid_fd = open("%s/.recid" % bibdoc.basedir, "w")
recid_fd.write(str(self.id))
recid_fd.close()
type_fd = open("%s/.type" % bibdoc.basedir, "w")
type_fd.write(str(bibdoc.doctype))
type_fd.close()
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Error in creating .recid and .type file for '%s' folder: '%s'" % (bibdoc.basedir, e)
self.build_bibdoc_list()
res = []
for (filename, version) in new_bibdocs:
if zero_version_bug:
version += 1
new_bibdoc = self.add_bibdoc(doctype=bibdoc.doctype, docname=docname, never_fail=True)
new_bibdoc.add_file_new_format('%s/%s' % (bibdoc.basedir, filename), version)
res.append(new_bibdoc)
try:
os.remove('%s/%s' % (bibdoc.basedir, filename))
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Error in removing '%s': '%s'" % ('%s/%s' % (bibdoc.basedir, filename), e)
Md5Folder(bibdoc.basedir).update(only_new=False)
bibdoc._build_file_list()
self.build_bibdoc_list()
for (bibdoc, dummyatttype) in self.bibdocs.values():
if not run_sql('SELECT data_value FROM bibdocmoreinfo WHERE bibdocid=%s', (bibdoc.id,)):
## Import from MARC only if the bibdoc has never had
## its more_info initialized.
try:
bibdoc.import_descriptions_and_comments_from_marc()
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Error in importing description and comment from %s for record %s: %s" % (repr(bibdoc), self.id, e)
return res
def check_format(self, docname):
"""
Check for any format related issue.
In case L{CFG_BIBDOCFILE_ADDITIONAL_KNOWN_FILE_EXTENSIONS} is
altered or Python version changes, it might happen that a docname
contains files which are no longer docname + .format ; version, simply
because the .format is now recognized (and it was not before, so
it was contained in the docname).
This algorithm verifies whether a fix is necessary (see L{fix_format}).
@param docname: the document name whose formats should be verified.
@type docname: string
@return: True if format is correct. False if a fix is needed.
@rtype: bool
@raise InvenioBibDocFileError: in case of any error.
"""
bibdoc = self.get_bibdoc(docname)
correct_docname = decompose_file(docname + '.pdf')[1]
if docname != correct_docname:
return False
for filename in os.listdir(bibdoc.basedir):
if not filename.startswith('.'):
try:
dummy, dummy, docformat, version = decompose_file_with_version(filename)
except Exception:
raise InvenioBibDocFileError('Incorrect filename "%s" for docname %s for recid %i' % (filename, docname, self.id))
if '%s%s;%i' % (correct_docname, docformat, version) != filename:
return False
return True
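The check above relies on the on-disk naming convention `<docname><format>;<version>` (e.g. `thesis.pdf;2`). A minimal standalone sketch of that decomposition follows; the helper name is illustrative only (the real code uses `decompose_file_with_version`, which also knows about multi-part extensions):

```python
import os.path

def split_stored_filename(filename):
    # Illustrative sketch only -- mimics the stored-file convention
    # '<docname><format>;<version>', e.g. 'thesis.pdf;2'.
    name, sep, version = filename.rpartition(';')
    if not sep:
        raise ValueError("missing ';<version>' suffix in %r" % filename)
    docname, docformat = os.path.splitext(name)
    return docname, docformat, int(version)
```

`check_format` essentially recomposes this triple and compares it to the actual filename, returning False on any mismatch.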
def check_duplicate_docnames(self):
"""
Check whether the record is connected with at least two documents
with the same name.
@return: True if everything is fine.
@rtype: bool
"""
docnames = set()
for docname in self.get_bibdoc_names():
if docname in docnames:
return False
else:
docnames.add(docname)
return True
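The early-exit strategy above can be sketched standalone (names here are illustrative): the method returns True only when every docname is unique.

```python
def docnames_unique(docnames):
    # Mirrors check_duplicate_docnames(): True means everything is
    # fine (no docname appears twice); bail out on the first repeat.
    seen = set()
    for name in docnames:
        if name in seen:
            return False
        seen.add(name)
    return True
```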
def uniformize_bibdoc(self, docname):
"""
This algorithm corrects wrong file names belonging to a bibdoc.
@param docname: the document name whose formats should be verified.
@type docname: string
"""
bibdoc = self.get_bibdoc(docname)
for filename in os.listdir(bibdoc.basedir):
if not filename.startswith('.'):
try:
dummy, dummy, docformat, version = decompose_file_with_version(filename)
except ValueError:
register_exception(alert_admin=True, prefix= "Strange file '%s' is stored in %s" % (filename, bibdoc.basedir))
else:
os.rename(os.path.join(bibdoc.basedir, filename), os.path.join(bibdoc.basedir, '%s%s;%i' % (docname, docformat, version)))
Md5Folder(bibdoc.basedir).update()
bibdoc.touch()
bibdoc._build_file_list('rename')
def fix_format(self, docname, skip_check=False):
"""
Fixes format related inconsistencies.
@param docname: the document name whose formats should be verified.
@type docname: string
@param skip_check: if True assume L{check_format} has already been
called and the need for fix has already been found.
If False, will implicitly call L{check_format} and skip fixing
if no error is found.
@type skip_check: bool
@return: False in case merging two bibdocs is needed but not
possible; True otherwise.
@rtype: bool
"""
if not skip_check:
if self.check_format(docname):
return True
bibdoc = self.get_bibdoc(docname)
correct_docname = decompose_file(docname + '.pdf')[1]
need_merge = False
if correct_docname != docname:
need_merge = self.has_docname_p(correct_docname)
if need_merge:
proposed_docname = self.propose_unique_docname(correct_docname)
run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (proposed_docname, bibdoc.id))
self.build_bibdoc_list()
self.uniformize_bibdoc(proposed_docname)
try:
self.merge_bibdocs(docname, proposed_docname)
except InvenioBibDocFileError:
return False
else:
run_sql('UPDATE bibdoc SET docname=%s WHERE id=%s', (correct_docname, bibdoc.id))
self.build_bibdoc_list()
self.uniformize_bibdoc(correct_docname)
else:
self.uniformize_bibdoc(docname)
return True
def fix_duplicate_docnames(self, skip_check=False):
"""
Algorithm to fix duplicate docnames.
If a record is connected with at least two bibdoc having the same
docname, the algorithm will try to merge them.
@param skip_check: if True assume L{check_duplicate_docnames} has
already been called and the need for fix has already been found.
If False, will implicitly call L{check_duplicate_docnames} and skip
fixing if no error is found.
@type skip_check: bool
"""
if not skip_check:
if self.check_duplicate_docnames():
return
docnames = set()
for bibdoc in self.list_bibdocs():
docname = self.get_docname(bibdoc.id)
if docname in docnames:
new_docname = self.propose_unique_docname(self.get_docname(bibdoc.id))
self.change_name(docid=bibdoc.id, newname=new_docname)
self.merge_bibdocs(docname, new_docname)
docnames.add(docname)
def get_text(self, extract_text_if_necessary=True):
"""
@return: concatenated texts of all bibdocs separated by " ": string
"""
texts = []
for bibdoc in self.list_bibdocs():
if hasattr(bibdoc, 'has_text'):
if extract_text_if_necessary and not bibdoc.has_text(require_up_to_date=True):
re_perform_ocr = re.compile(CFG_BIBINDEX_PERFORM_OCR_ON_DOCNAMES)
perform_ocr = bool(re_perform_ocr.match(bibdoc.get_docname()))
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("... will extract words from %s (docid: %s) %s" % (bibdoc.get_docname(), bibdoc.get_id(), perform_ocr and 'with OCR' or ''), verbose=2)
bibdoc.extract_text(perform_ocr=perform_ocr)
texts.append(bibdoc.get_text())
return " ".join(texts)
class BibDoc(object):
"""
This class represents one document (i.e. a set of files with different
formats and with versioning information that constitutes a piece of
information).
To instantiate a new document, the recid and the docname are mandatory.
To instantiate an already existing document, either the recid and docname
or the docid alone are sufficient to retrieve it.
@param docid: the document identifier.
@type docid: integer
@param recid: the record identifier of the record to which this document
belongs to. If the C{docid} is specified the C{recid} is automatically
retrieved from the database.
@type recid: integer
@param docname: the document name.
@type docname: string
@param doctype: the document type (used when instantiating a new document).
@type doctype: string
@param human_readable: whether sizes should be represented in a human
readable format.
@type human_readable: bool
@raise InvenioBibDocFileError: in case of error.
"""
@staticmethod
def create_new_document(doc_type = "Main", rec_links = []):
status = ''
doc_id = run_sql("INSERT INTO bibdoc (status, creation_date, modification_date, doctype) "
"values(%s,NOW(),NOW(), %s)", (status, doc_type))
if not doc_id:
raise InvenioBibDocFileError, "New docid cannot be created"
# creating the representation on disk ... preparing the directory
try:
BibDoc.prepare_basedir(doc_id)
except Exception, e:
run_sql('DELETE FROM bibdoc WHERE id=%s', (doc_id, ))
# run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (doc_id, ))
register_exception(alert_admin=True)
raise InvenioBibDocFileError, e
# the object has been created: linking to bibliographical records
doc = BibDoc(doc_id)
for link in rec_links:
if "rec_id" in link and link["rec_id"]:
rec_id = link["rec_id"]
doc_name = normalize_docname(link["doc_name"])
a_type = link["a_type"]
doc.attach_to_record(rec_id, str(a_type), str(doc_name))
return doc_id
def __init__(self, docid, human_readable=False, initial_data=None):
"""Constructor of a bibdoc. At least the docid or the recid/docname
pair is needed.
Specifying recid, docname and doctype without specifying docid results
in attaching a newly created document to a record.
"""
# docid is known, the document already exists
res2 = run_sql("SELECT id_bibrec, type, docname FROM bibrec_bibdoc WHERE id_bibdoc=%s", (docid,))
self.bibrec_types = [(r[0], r[1], r[2]) for r in res2 ] # just in case the result was behaving like tuples but was something else
if not res2:
# fake attachment
self.bibrec_types = [(0, None, "fake_name_for_unattached_document")]
if initial_data is None:
initial_data = BibDoc._retrieve_data(docid)
self.docfiles = []
self.__md5s = None
self.human_readable = human_readable
self.cd = initial_data["cd"] # creation date
self.md = initial_data["md"] # modification date
self.td = initial_data["td"] # text extraction date # should be moved from here !!!!
self.bibrec_links = initial_data["bibrec_links"]
self.id = initial_data["id"]
self.status = initial_data["status"]
self.basedir = initial_data["basedir"]
self.doctype = initial_data["doctype"]
self.storagename = initial_data["storagename"] # the old docname -> now used as a storage name for old records
self.more_info = BibDocMoreInfo(self.id)
self._build_file_list('init')
# link with related_files
self._build_related_file_list()
@staticmethod
def prepare_basedir(doc_id):
"""Prepares the directory serving as root of a BibDoc"""
basedir = _make_base_dir(doc_id)
# we create the corresponding storage directory
if not os.path.exists(basedir):
old_umask = os.umask(022)
os.makedirs(basedir)
os.umask(old_umask)
def _update_additional_info_files_p(self):
"""Update the hidden file in the document directory ... the file contains all links to records"""
try:
reclinks_fd = open("%s/.reclinks" % (self.basedir, ), "w")
reclinks_fd.write("RECID DOCNAME TYPE\n")
for link in self.bibrec_links:
reclinks_fd.write("%(recid)s %(docname)s %(doctype)s\n" % link)
reclinks_fd.close()
except Exception, e:
register_exception(alert_admin=True)
raise InvenioBibDocFileError, e
@staticmethod
def _retrieve_data(docid = None):
"""
Fill in information about a document from its database entry.
"""
container = {}
container["bibrec_links"] = []
container["id"] = docid
container["basedir"] = _make_base_dir(container["id"])
# retrieving links between records and documents
res = run_sql("SELECT id_bibrec, type, docname FROM bibrec_bibdoc WHERE id_bibdoc=%s", (str(docid),), 1)
if res:
for r in res:
container["bibrec_links"].append({"recid": r[0], "doctype": r[1], "docname": r[2]})
# gather the other information
res = run_sql("SELECT status, creation_date, modification_date, text_extraction_date, doctype, docname FROM bibdoc WHERE id=%s LIMIT 1", (docid,), 1)
if res:
container["status"] = res[0][0]
container["cd"] = res[0][1]
container["md"] = res[0][2]
container["td"] = res[0][3]
container["doctype"] = res[0][4]
container["storagename"] = res[0][5]
else:
# this bibdoc doesn't exist
raise InvenioBibDocFileError, "The docid %s does not exist." % docid
# retrieving all available formats
fprefix = container["storagename"] or "content"
container["extensions"] = [fname[len(fprefix):] for fname in filter(lambda x: x.startswith(fprefix),os.listdir(container["basedir"]))]
return container
@staticmethod
def create_instance(docid=None, recid=None, docname=None,
doctype='Fulltext', a_type = 'Main', human_readable=False):
"""
Parameters of an attachment to the record:
a_type, recid, docname
@param a_type Type of the attachment to the record (by default Main)
@type a_type String
@param doctype Type of the document itself (by default Fulltext)
@type doctype String
"""
# first try to retrieve existing record based on obtained data
data = None
extensions = []
if docid != None:
data = BibDoc._retrieve_data(docid)
doctype = data["doctype"]
extensions = data["extensions"]
# now check if the doctype is supported by any particular plugin
def plugin_bldr(plugin_code):
"""Preparing the plugin dictionary structure"""
if not plugin_code.__name__.split('.')[-1].startswith('bom_'):
return
ret = {}
ret['create_instance'] = getattr(plugin_code, "create_instance", None)
ret['supports'] = getattr(plugin_code, "supports", None)
return ret
bibdoc_plugins = filter(None, map(
plugin_bldr, import_submodules_from_packages(
'bibdocfile_plugins', packages=['invenio'])))
# Loading an appropriate plugin (by default a generic BibDoc)
used_plugin = None
for plugin in bibdoc_plugins:
if plugin['supports'](doctype, extensions):
used_plugin = plugin
if not docid:
rec_links = []
if recid:
rec_links.append({"rec_id": recid, "doc_name" : docname, "a_type": a_type})
if used_plugin and 'create_new' in used_plugin:
docid = used_plugin['create_new'](doctype, rec_links)
else:
docid = BibDoc.create_new_document(doctype, rec_links)
if used_plugin:
return used_plugin['create_instance'](docid=docid,
human_readable=human_readable,
initial_data=data)
return BibDoc(docid=docid,
human_readable=human_readable,
initial_data=data)
# parameters can not be passed any more
@staticmethod
def _attach_to_record_p(doc_id, rec_id, a_type, docname):
"""Private core of a method attaching document of a given ID to a record
@param a_type Attachment type (the function of the document within the record)
@type a_type String
"""
run_sql("INSERT INTO bibrec_bibdoc (id_bibrec, id_bibdoc, type, docname) VALUES (%s,%s,%s,%s)",
(str(rec_id), str(doc_id), a_type, docname))
def attach_to_record(self, recid, a_type, docname):
""" Attaches given document to a record given by its identifier.
@param recid The identifier of the record
@type recid Integer
@param a_type Function of a document in the record
@type a_type String
@param docname Name of a document inside of a record
@type docname String
"""
run_sql("INSERT INTO bibrec_bibdoc (id_bibrec, id_bibdoc, type, docname) VALUES (%s,%s,%s,%s)",
(str(recid), str(self.id), a_type, docname))
self._update_additional_info_files_p()
def __repr__(self):
"""
@return: the canonical string representation of the C{BibDoc}.
@rtype: string
"""
return 'BibDoc(%s, %s, %s)' % (repr(self.id), repr(self.doctype), repr(self.human_readable))
def format_recids(self):
"""Returns a string representation of related record ids"""
if len(self.bibrec_links) == 1:
return self.bibrec_links[0]["recid"]
return "[" + ",".join([str(el["recid"]) for el in self.bibrec_links]) + "]"
def __str__(self):
"""
@return: an easy to be I{grepped} string representation of the
whole C{BibDoc} content.
@rtype: string
"""
recids = self.format_recids()
out = '%s:%i:::doctype=%s\n' % (recids, self.id, self.doctype)
out += '%s:%i:::status=%s\n' % (recids, self.id, self.status)
out += '%s:%i:::basedir=%s\n' % (recids, self.id, self.basedir)
out += '%s:%i:::creation date=%s\n' % (recids, self.id, self.cd)
out += '%s:%i:::modification date=%s\n' % (recids, self.id, self.md)
out += '%s:%i:::text extraction date=%s\n' % (recids, self.id, self.td)
out += '%s:%i:::total file attached=%s\n' % (recids, self.id, len(self.docfiles))
if self.human_readable:
out += '%s:%i:::total size latest version=%s\n' % (recids, self.id, nice_size(self.get_total_size_latest_version()))
out += '%s:%i:::total size all files=%s\n' % (recids, self.id, nice_size(self.get_total_size()))
else:
out += '%s:%i:::total size latest version=%s\n' % (recids, self.id, self.get_total_size_latest_version())
out += '%s:%i:::total size all files=%s\n' % (recids, self.id, self.get_total_size())
for docfile in self.docfiles:
out += str(docfile)
return out
def get_md5s(self):
"""
@return: an instance of the Md5Folder class to access MD5 information
of the current BibDoc
@rtype: Md5Folder
"""
if self.__md5s is None:
self.__md5s = Md5Folder(self.basedir)
return self.__md5s
md5s = property(get_md5s)
def format_already_exists_p(self, docformat):
"""
@param format: a format to be checked.
@type format: string
@return: True if a file of the given format already exists among the
latest files.
@rtype: bool
"""
docformat = normalize_format(docformat)
for afile in self.list_latest_files():
if docformat == afile.get_format():
return True
return False
def get_status(self):
"""
@return: the status information.
@rtype: string
"""
return self.status
@staticmethod
def get_fileprefix(basedir, storagename=None):
fname = "%s" % (storagename or "content", )
return os.path.join(basedir, fname )
def get_filepath(self, docformat, version):
""" Generaters the path inside of the filesystem where the document should be stored.
@param format The format of the document
@type format string
@param version version to be stored in the file
@type version string
TODO: this should be completely replaced. File storage (and so, also path building)
should be abstracted from BibDoc and be using loadable extensions
@param format Format of the document to be stored
@type format string
@param version Version of the document to be stored
@type version String
@return Full path to the file encoding a particular version and format of the document
@trype string
"""
return "%s%s;%i" % (BibDoc.get_fileprefix(self.basedir, self.storagename), docformat, version)
def get_docname(self):
"""Obsolete !! (will return empty String for new format documents"""
return self.storagename
def get_doctype(self, recid):
"""Retrieves the type of this document in the scope of a given recid"""
link_types = [attachement["doctype"] for attachement in \
filter(lambda x: str(x["recid"]) == str(recid), \
self.bibrec_links)]
if link_types:
return link_types[0]
return ""
def touch(self):
"""
Update the modification time of the bibdoc (as in the UNIX command
C{touch}).
"""
run_sql('UPDATE bibdoc SET modification_date=NOW() WHERE id=%s', (self.id, ))
#if self.recid:
#run_sql('UPDATE bibrec SET modification_date=NOW() WHERE id=%s', (self.recid, ))
def change_doctype(self, new_doctype):
"""
Modify the doctype of a BibDoc
"""
run_sql('UPDATE bibdoc SET doctype=%s WHERE id=%s', (new_doctype, self.id))
run_sql('UPDATE bibrec_bibdoc SET type=%s WHERE id_bibdoc=%s', (new_doctype, self.id))
def set_status(self, new_status):
"""
Set a new status. A document with status information is a restricted
document that can be accessed only by users who have an authorization
for the I{viewrestrdoc} WebAccess action with keyword status set to
C{new_status}.
@param new_status: the new status. If empty the document will be
unrestricted.
@type new_status: string
@raise InvenioBibDocFileError: in case the reserved word
'DELETED' is used.
"""
if new_status != KEEP_OLD_VALUE:
if new_status == 'DELETED':
raise InvenioBibDocFileError('DELETED is a reserved word and can not be used for setting the status')
run_sql('UPDATE bibdoc SET status=%s WHERE id=%s', (new_status, self.id))
self.status = new_status
self.touch()
self._build_file_list()
def add_file_new_version(self, filename, description=None, comment=None, docformat=None, flags=None, modification_date=None):
"""
Add a new version of a file. If no physical file is already attached
to the document, the given file will have version 1. Otherwise the
new file will have the current version number plus one.
@param filename: the local path of the file.
@type filename: string
@param description: an optional description for the file.
@type description: string
@param comment: an optional comment to the file.
@type comment: string
@param format: the extension of the file. If not specified it will
be retrieved from the filename (see L{decompose_file}).
@type format: string
@param flags: a set of flags to be associated with the file (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS})
@type flags: list of string
@raise InvenioBibDocFileError: in case of error.
"""
try:
latestVersion = self.get_latest_version()
if latestVersion == 0:
myversion = 1
else:
myversion = latestVersion + 1
if os.path.exists(filename):
if not os.path.getsize(filename) > 0:
raise InvenioBibDocFileError, "%s seems to be empty" % filename
if docformat is None:
docformat = decompose_file(filename)[2]
else:
docformat = normalize_format(docformat)
destination = self.get_filepath(docformat, myversion)
if run_sql("SELECT id_bibdoc FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, myversion, docformat)):
raise InvenioBibDocFileError("According to the database a file of format %s is already attached to the docid %s" % (docformat, self.id))
try:
shutil.copyfile(filename, destination)
os.chmod(destination, 0644)
if modification_date: # if the modification time of the file needs to be changed
update_modification_date_of_file(destination, modification_date)
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e)
self.more_info.set_description(description, docformat, myversion)
self.more_info.set_comment(comment, docformat, myversion)
if flags is None:
flags = []
if 'pdfa' in get_subformat_from_format(docformat).split(';') and not 'PDF/A' in flags:
flags.append('PDF/A')
for flag in flags:
if flag == 'PERFORM_HIDE_PREVIOUS':
for afile in self.list_all_files():
docformat = afile.get_format()
version = afile.get_version()
if version < myversion:
self.more_info.set_flag('HIDDEN', docformat, version)
else:
self.more_info.set_flag(flag, docformat, myversion)
else:
raise InvenioBibDocFileError, "'%s' does not exist!" % filename
finally:
self.touch()
Md5Folder(self.basedir).update()
self._build_file_list()
just_added_file = self.get_file(docformat, myversion)
run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, true, %s, %s, %s, %s, %s)", (self.id, myversion, docformat, just_added_file.cd, just_added_file.md, just_added_file.get_checksum(), just_added_file.get_size(), just_added_file.mime))
run_sql("UPDATE bibdocfsinfo SET last_version=false WHERE id_bibdoc=%s AND version<%s", (self.id, myversion))
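The version bookkeeping used by `add_file_new_version` is simple: the first physical file attached to a document gets version 1, and every later upload increments the latest version. As a standalone sketch:

```python
def next_version(latest_version):
    # First physical file attached to a document gets version 1;
    # afterwards each new upload is latest + 1 (mirrors the
    # get_latest_version() handling in add_file_new_version above).
    return 1 if latest_version == 0 else latest_version + 1
```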
def add_file_new_format(self, filename, version=None, description=None, comment=None, docformat=None, flags=None, modification_date=None):
"""
Add a file as a new format.
@param filename: the local path of the file.
@type filename: string
@param version: an optional specific version to which the new format
should be added. If None, the last version will be used.
@type version: integer
@param description: an optional description for the file.
@type description: string
@param comment: an optional comment to the file.
@type comment: string
@param format: the extension of the file. If not specified it will
be retrieved from the filename (see L{decompose_file}).
@type format: string
@param flags: a set of flags to be associated with the file (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS})
@type flags: list of string
@raise InvenioBibDocFileError: if the given format already exists.
"""
try:
if version is None:
version = self.get_latest_version()
if version == 0:
version = 1
if os.path.exists(filename):
if not os.path.getsize(filename) > 0:
raise InvenioBibDocFileError, "%s seems to be empty" % filename
if docformat is None:
docformat = decompose_file(filename)[2]
else:
docformat = normalize_format(docformat)
if run_sql("SELECT id_bibdoc FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, version, docformat)):
raise InvenioBibDocFileError("According to the database a file of format %s is already attached to the docid %s" % (docformat, self.id))
destination = self.get_filepath(docformat, version)
if os.path.exists(destination):
raise InvenioBibDocFileError, "A file for docid '%s' already exists for the format '%s'" % (str(self.id), docformat)
try:
shutil.copyfile(filename, destination)
os.chmod(destination, 0644)
if modification_date: # if the modification time of the file needs to be changed
update_modification_date_of_file(destination, modification_date)
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "Encountered an exception while copying '%s' to '%s': '%s'" % (filename, destination, e)
self.more_info.set_comment(comment, docformat, version)
self.more_info.set_description(description, docformat, version)
if flags is None:
flags = []
if 'pdfa' in get_subformat_from_format(docformat).split(';') and not 'PDF/A' in flags:
flags.append('PDF/A')
for flag in flags:
if flag != 'PERFORM_HIDE_PREVIOUS':
self.more_info.set_flag(flag, docformat, version)
else:
raise InvenioBibDocFileError, "'%s' does not exist!" % filename
finally:
Md5Folder(self.basedir).update()
self.touch()
self._build_file_list()
just_added_file = self.get_file(docformat, version)
run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, true, %s, %s, %s, %s, %s)", (self.id, version, docformat, just_added_file.cd, just_added_file.md, just_added_file.get_checksum(), just_added_file.get_size(), just_added_file.mime))
def change_docformat(self, oldformat, newformat):
"""
Renames a format name on disk and in all BibDoc structures.
The change will touch only the files of the latest version.
The change will take place only if the newformat doesn't already exist.
@param oldformat: the format that needs to be renamed
@type oldformat: string
@param newformat: the format new name
@type newformat: string
"""
oldformat = normalize_format(oldformat)
newformat = normalize_format(newformat)
if self.format_already_exists_p(newformat):
# same format already exists in the latest files, abort
return
for bibdocfile in self.list_latest_files():
if bibdocfile.get_format() == oldformat:
# change format -> rename x.oldformat -> x.newformat
dirname, base, docformat, version = decompose_file_with_version(bibdocfile.get_full_path())
os.rename(bibdocfile.get_full_path(), os.path.join(dirname, '%s%s;%i' %(base, newformat, version)))
Md5Folder(self.basedir).update()
self.touch()
self._build_file_list('rename')
self._sync_to_db()
return
def purge(self):
"""
Physically removes all the previous versions of the given bibdoc.
Everything but the formats of the latest version will be erased.
"""
version = self.get_latest_version()
if version > 1:
for afile in self.docfiles:
if afile.get_version() < version:
self.more_info.unset_comment(afile.get_format(), afile.get_version())
self.more_info.unset_description(afile.get_format(), afile.get_version())
for flag in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
self.more_info.unset_flag(flag, afile.get_format(), afile.get_version())
try:
os.remove(afile.get_full_path())
except Exception, dummy:
register_exception()
Md5Folder(self.basedir).update()
self.touch()
self._build_file_list()
run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s AND version<%s", (self.id, version))
def expunge(self):
"""
Physically remove all the traces of a given document.
@note: an expunged BibDoc object shouldn't be used anymore, or the
result might be unpredictable.
"""
del self.__md5s
self.more_info.delete()
del self.more_info
os.system('rm -rf %s' % escape_shell_arg(self.basedir))
run_sql('DELETE FROM bibrec_bibdoc WHERE id_bibdoc=%s', (self.id, ))
run_sql('DELETE FROM bibdoc_bibdoc WHERE id_bibdoc1=%s OR id_bibdoc2=%s', (self.id, self.id))
run_sql('DELETE FROM bibdoc WHERE id=%s', (self.id, ))
run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, doctimestamp) VALUES("EXPUNGE", %s, NOW())', (self.id, ))
run_sql('DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s', (self.id, ))
del self.docfiles
del self.id
del self.cd
del self.md
del self.td
del self.basedir
del self.doctype
del self.bibrec_links
def revert(self, version):
"""
Revert the document to a given version. All the formats corresponding
to that version are copied forward to a new version.
@param version: the version to revert to.
@type version: integer
@raise InvenioBibDocFileError: in case of errors
"""
version = int(version)
docfiles = self.list_version_files(version)
if docfiles:
self.add_file_new_version(docfiles[0].get_full_path(), description=docfiles[0].get_description(), comment=docfiles[0].get_comment(), docformat=docfiles[0].get_format(), flags=docfiles[0].flags)
for docfile in docfiles[1:]:
self.add_file_new_format(docfile.filename, description=docfile.get_description(), comment=docfile.get_comment(), docformat=docfile.get_format(), flags=docfile.flags)
def import_descriptions_and_comments_from_marc(self, record=None):
"""
Import descriptions and comments from the corresponding MARC metadata.
@param record: the record (if None it will be calculated).
@type record: bibrecord recstruct
@note: If record is passed it is directly used, otherwise it is retrieved
from the MARCXML stored in the database.
"""
## Let's get the record
from invenio.legacy.search_engine import get_record
if record is None:
record = get_record(self.id)
fields = record_get_field_instances(record, '856', '4', ' ')
global_comment = None
global_description = None
local_comment = {}
local_description = {}
for field in fields:
url = field_get_subfield_values(field, 'u')
if url:
## Given a url
url = url[0]
if re.match('%s/%s/[0-9]+/files/' % (CFG_SITE_URL, CFG_SITE_RECORD), url):
## If it is a traditional /CFG_SITE_RECORD/1/files/ one
## We have global description/comment for all the formats
description = field_get_subfield_values(field, 'y')
if description:
global_description = description[0]
comment = field_get_subfield_values(field, 'z')
if comment:
global_comment = comment[0]
elif bibdocfile_url_p(url):
## Otherwise we have description/comment per format
dummy, docname, docformat = decompose_bibdocfile_url(url)
brd = BibRecDocs(self.id)
if docname == brd.get_docname(self.id):
description = field_get_subfield_values(field, 'y')
if description:
local_description[docformat] = description[0]
comment = field_get_subfield_values(field, 'z')
if comment:
local_comment[docformat] = comment[0]
## Let's update the tables
version = self.get_latest_version()
for docfile in self.list_latest_files():
docformat = docfile.get_format()
if docformat in local_comment:
self.set_comment(local_comment[docformat], docformat, version)
else:
self.set_comment(global_comment, docformat, version)
if docformat in local_description:
self.set_description(local_description[docformat], docformat, version)
else:
self.set_description(global_description, docformat, version)
self._build_file_list('init')
def get_icon(self, subformat_re=CFG_BIBDOCFILE_ICON_SUBFORMAT_RE, display_hidden=True):
"""
@param subformat_re: by default the convention is that
L{CFG_BIBDOCFILE_ICON_SUBFORMAT_RE} is used as a subformat indicator to
mean that a particular format is to be used as an icon.
Specify a different subformat if you need to use a different
convention.
@type subformat_re: compiled regular expression
@return: the bibdocfile corresponding to the icon of this document, or
None if no icon exists for this document.
@rtype: BibDocFile
@warning: before I{subformat}s were introduced this method returned
a BibDoc, while it now returns a BibDocFile. Check that
your client code is compatible with this.
"""
for docfile in self.list_latest_files(list_hidden=display_hidden):
if subformat_re.match(docfile.get_subformat()):
return docfile
return None
def add_icon(self, filename, docformat=None, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT, modification_date=None):
"""
Attaches an icon to this document.
@param filename: the local filesystem path to the icon.
@type filename: string
@param docformat: an optional format for the icon. If not specified it
will be derived from the filesystem path.
@type docformat: string
@param subformat: by default the convention is that
CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT is used as a subformat indicator to
mean that a particular format is to be used as an icon.
Specify a different subformat if you need to use a different
convention.
@type subformat: string
@raise InvenioBibDocFileError: in case of errors.
"""
## Compute the icon format from the filename when not explicitly given
if not docformat:
docformat = decompose_file(filename)[2]
if subformat:
docformat += ";%s" % subformat
self.add_file_new_format(filename, docformat=docformat, modification_date=modification_date)
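## Example (illustrative sketch; 'bibdoc' is a hypothetical BibDoc
## instance): attaching a GIF icon stores it under the icon subformat,
## e.g. '.gif;icon' with the default subformat setting:
##   bibdoc.add_icon('/tmp/icon.gif')
##   bibdoc.get_icon()  # the BibDocFile wrapping the icon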
def delete_icon(self, subformat_re=CFG_BIBDOCFILE_ICON_SUBFORMAT_RE):
"""
Removes the icon attached to the document if it exists.
@param subformat_re: by default the convention is that
L{CFG_BIBDOCFILE_ICON_SUBFORMAT_RE} is used as a subformat indicator to
mean that a particular format is to be used as an icon.
Specify a different subformat if you need to use a different
convention.
@type subformat_re: compiled regular expression
"""
for docfile in self.list_latest_files():
if subformat_re.match(docfile.get_subformat()):
self.delete_file(docfile.get_format(), docfile.get_version())
def change_name(self, recid, newname):
"""
Renames this document in connection with a given record.
@param recid: the identifier of the record owning this document.
@type recid: integer
@param newname: the new name.
@type newname: string
@raise InvenioBibDocFileError: if the new name corresponds to
a document already attached to the record owning this document.
"""
try:
newname = normalize_docname(newname)
res = run_sql("SELECT id_bibdoc FROM bibrec_bibdoc WHERE id_bibrec=%s AND docname=%s", (recid, newname))
if res:
raise InvenioBibDocFileError, "A bibdoc called %s already exists for recid %s" % (newname, recid)
run_sql("update bibrec_bibdoc set docname=%s where id_bibdoc=%s and id_bibrec=%s", (newname, self.id, recid))
finally:
self.touch()
def set_comment(self, comment, docformat, version=None):
"""
Updates the comment of a specific format/version of the document.
@param comment: the new comment.
@type comment: string
@param docformat: the specific format for which the comment should be
updated.
@type docformat: string
@param version: the specific version for which the comment should be
updated. If not specified the last version will be used.
@type version: integer
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
self.more_info.set_comment(comment, docformat, version)
self.touch()
self._build_file_list('init')
def set_description(self, description, docformat, version=None):
"""
Updates the description of a specific format/version of the document.
@param description: the new description.
@type description: string
@param docformat: the specific format for which the description should be
updated.
@type docformat: string
@param version: the specific version for which the description should be
updated. If not specified the last version will be used.
@type version: integer
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
self.more_info.set_description(description, docformat, version)
self.touch()
self._build_file_list('init')
def set_flag(self, flagname, docformat, version=None):
"""
Sets a flag for a specific format/version of the document.
@param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}.
@type flagname: string
@param docformat: the specific format for which the flag should be
set.
@type docformat: string
@param version: the specific version for which the flag should be
set. If not specified the last version will be used.
@type version: integer
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
self.more_info.set_flag(flagname, docformat, version)
self.touch()
self._build_file_list('init')
def has_flag(self, flagname, docformat, version=None):
"""
Checks if a particular flag for a format/version is set.
@param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}.
@type flagname: string
@param docformat: the specific format for which the flag should be
checked.
@type docformat: string
@param version: the specific version for which the flag should be
checked. If not specified the last version will be used.
@type version: integer
@return: True if the flag is set.
@rtype: bool
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
return self.more_info.has_flag(flagname, docformat, version)
def unset_flag(self, flagname, docformat, version=None):
"""
Unsets a flag for a specific format/version of the document.
@param flagname: a flag from L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}.
@type flagname: string
@param docformat: the specific format for which the flag should be
unset.
@type docformat: string
@param version: the specific version for which the flag should be
unset. If not specified the last version will be used.
@type version: integer
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
self.more_info.unset_flag(flagname, docformat, version)
self.touch()
self._build_file_list('init')
def get_comment(self, docformat, version=None):
"""
Retrieve the comment of a specific format/version of the document.
@param docformat: the specific format for which the comment should be
retrieved.
@type docformat: string
@param version: the specific version for which the comment should be
retrieved. If not specified the last version will be used.
@type version: integer
@return: the comment.
@rtype: string
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
return self.more_info.get_comment(docformat, version)
def get_description(self, docformat, version=None):
"""
Retrieve the description of a specific format/version of the document.
@param docformat: the specific format for which the description should be
retrieved.
@type docformat: string
@param version: the specific version for which the description should
be retrieved. If not specified the last version will be used.
@type version: integer
@return: the description.
@rtype: string
"""
if version is None:
version = self.get_latest_version()
docformat = normalize_format(docformat)
return self.more_info.get_description(docformat, version)
def hidden_p(self, docformat, version=None):
"""
Returns True if the file specified by the given format/version is
hidden.
@param docformat: the specific format for which the hidden status
should be checked.
@type docformat: string
@param version: the specific version for which the hidden status
should be checked. If not specified the last version will be used.
@type version: integer
@return: True if hidden.
@rtype: bool
"""
if version is None:
version = self.get_latest_version()
return self.more_info.has_flag('HIDDEN', docformat, version)
def get_base_dir(self):
"""
@return: the base directory on the local filesystem for this document
(e.g. C{/soft/cdsweb/var/data/files/g0/123})
@rtype: string
"""
return self.basedir
def get_type(self):
"""
@return: the type of this document.
@rtype: string"""
return self.doctype
def get_id(self):
"""
@return: the id of this document.
@rtype: integer
"""
return self.id
def get_file(self, docformat, version=""):
"""
Returns a L{BibDocFile} instance of this document corresponding to the
specific format and version.
@param docformat: the specific format.
@type docformat: string
@param version: the specific version to be retrieved. If not specified
the last version will be used.
@type version: integer
@return: the L{BibDocFile} instance.
@rtype: BibDocFile
"""
if version == "":
docfiles = self.list_latest_files()
else:
version = int(version)
docfiles = self.list_version_files(version)
docformat = normalize_format(docformat)
for docfile in docfiles:
if (docfile.get_format() == docformat or not docformat):
return docfile
## Let's skip the subformat specification and consider just the
## superformat
superformat = get_superformat_from_format(docformat)
for docfile in docfiles:
if get_superformat_from_format(docfile.get_format()) == superformat:
return docfile
raise InvenioBibDocFileError, "No file for doc %i of format '%s', version '%s'" % (self.id, docformat, version)
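## Example (illustrative sketch; 'bibdoc' is a hypothetical BibDoc
## instance): fetching the latest PDF; the lookup falls back on the
## superformat when the exact format (including subformat) does not match:
##   docfile = bibdoc.get_file('.pdf')
##   docfile = bibdoc.get_file('.pdf', version=2)  # a specific version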
def list_versions(self):
"""
@return: the list of existing version numbers for this document.
@rtype: list of integer
"""
versions = []
for docfile in self.docfiles:
if not docfile.get_version() in versions:
versions.append(docfile.get_version())
versions.sort()
return versions
def delete(self, recid = None):
"""
Delete this document.
@see: L{undelete} for how to undelete the document.
@raise InvenioBibDocFileError: in case of errors.
"""
try:
today = datetime.today()
recids = []
if recid:
recids = [recid]
else:
recids = [link["recid"] for link in self.bibrec_links]
for rid in recids:
brd = BibRecDocs(rid)
docname = brd.get_docname(self.id)
# if the document is attached to some records
brd.change_name(docid=self.id, newname = 'DELETED-%s%s-%s' % (today.strftime('%Y%m%d%H%M%S'), today.microsecond, docname))
run_sql("UPDATE bibdoc SET status='DELETED' WHERE id=%s", (self.id,))
self.status = 'DELETED'
except Exception, e:
register_exception()
raise InvenioBibDocFileError, "It's impossible to delete bibdoc %s: %s" % (self.id, e)
def deleted_p(self):
"""
@return: True if this document has been deleted.
@rtype: bool
"""
return self.status == 'DELETED'
def empty_p(self):
"""
@return: True if this document is empty, i.e. it has no bibdocfile
connected.
@rtype: bool
"""
return len(self.docfiles) == 0
def undelete(self, previous_status='', recid=None):
"""
Undelete a deleted file (only if it was actually deleted via L{delete}).
The previous C{status}, i.e. the restriction key, can be provided.
Otherwise the undeleted document will be public.
@param previous_status: the previous status that should be restored.
@type previous_status: string
@raise InvenioBibDocFileError: in case of any error.
"""
try:
run_sql("UPDATE bibdoc SET status=%s WHERE id=%s AND status='DELETED'", (previous_status, self.id))
except Exception, e:
raise InvenioBibDocFileError, "It's impossible to undelete bibdoc %s: %s" % (self.id, e)
if recid:
bibrecdocs = BibRecDocs(recid)
docname = bibrecdocs.get_docname(self.id)
if docname.startswith('DELETED-'):
try:
# Let's remove DELETED-20080214144322- in front of the docname
original_name = '-'.join(docname.split('-')[2:])
original_name = bibrecdocs.propose_unique_docname(original_name)
bibrecdocs.change_name(docid=self.id, newname=original_name)
except Exception, e:
raise InvenioBibDocFileError, "It's impossible to restore the previous docname %s. %s kept as docname because: %s" % (original_name, docname, e)
else:
raise InvenioBibDocFileError, "Strange, the just undeleted docname is not of the form DELETED-somedate-docname but %s" % docname
def delete_file(self, docformat, version):
"""
Delete a specific format/version of this document on the filesystem.
@param docformat: the particular format to be deleted.
@type docformat: string
@param version: the particular version to be deleted.
@type version: integer
@note: this operation is not reversible!"""
try:
afile = self.get_file(docformat, version)
except InvenioBibDocFileError:
return
try:
os.remove(afile.get_full_path())
run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=%s AND format=%s", (self.id, afile.get_version(), afile.get_format()))
last_version = run_sql("SELECT max(version) FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.id, ))[0][0]
if last_version:
## Updating information about last version
run_sql("UPDATE bibdocfsinfo SET last_version=true WHERE id_bibdoc=%s AND version=%s", (self.id, last_version))
run_sql("UPDATE bibdocfsinfo SET last_version=false WHERE id_bibdoc=%s AND version<>%s", (self.id, last_version))
except OSError:
pass
self.touch()
self._build_file_list()
def get_history(self):
"""
@return: a list of human readable and parsable strings representing
the history of this document.
@rtype: list of string
"""
ret = []
hst = run_sql("""SELECT action, docname, docformat, docversion,
docsize, docchecksum, doctimestamp
FROM hstDOCUMENT
WHERE id_bibdoc=%s ORDER BY doctimestamp ASC""", (self.id, ))
for row in hst:
ret.append("%s %s '%s', format: '%s', version: %i, size: %s, checksum: '%s'" % (row[6].strftime('%Y-%m-%d %H:%M:%S'), row[0], row[1], row[2], row[3], nice_size(row[4]), row[5]))
return ret
def _build_file_list(self, context=''):
"""
Lists all files attached to the bibdoc. This function should be
called every time the bibdoc is modified.
As a side effect it logs everything that has happened to the
bibdocfiles in the log facility, according to the context:
"init": means that the function has been called for the first time
by a constructor, hence no logging is performed;
"": the default, meaning that every deleted file is logged as
deleted and every added file as added;
"rename": means that every apparently deleted file is logged as
RENAMEDFROM and every new file as RENAMEDTO.
"""
def log_action(action, docid, docname, docformat, version, size, checksum, timestamp=''):
"""Log an action into the hstDOCUMENT table."""
try:
if timestamp:
run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, %s)', (action, docid, docname, docformat, version, size, checksum, timestamp))
else:
run_sql('INSERT DELAYED INTO hstDOCUMENT(action, id_bibdoc, docname, docformat, docversion, docsize, docchecksum, doctimestamp) VALUES(%s, %s, %s, %s, %s, %s, %s, NOW())', (action, docid, docname, docformat, version, size, checksum))
except DatabaseError:
register_exception()
def make_removed_added_bibdocfiles(previous_file_list):
"""Internal function for building the log of changed files."""
# Let's rebuild the previous situation
old_files = {}
for bibdocfile in previous_file_list:
old_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md)
# Let's rebuild the new situation
new_files = {}
for bibdocfile in self.docfiles:
new_files[(bibdocfile.name, bibdocfile.format, bibdocfile.version)] = (bibdocfile.size, bibdocfile.checksum, bibdocfile.md)
# Let's remove from the added files all the files that were already
# present in the old list, and add to the deleted files those that
# are no longer present.
added_files = dict(new_files)
deleted_files = {}
for key, value in old_files.iteritems():
if added_files.has_key(key):
del added_files[key]
else:
deleted_files[key] = value
return (added_files, deleted_files)
if context not in ('init', 'init_from_disk'):
previous_file_list = list(self.docfiles)
res = run_sql("SELECT status, creation_date,"
"modification_date FROM bibdoc WHERE id=%s", (self.id,))
self.cd = res[0][1]
self.md = res[0][2]
self.status = res[0][0]
self.more_info = BibDocMoreInfo(self.id)
self.docfiles = []
if CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE and context == 'init':
## In normal init context we read from DB
res = run_sql("SELECT version, format, cd, md, checksum, filesize FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.id, ))
for version, docformat, cd, md, checksum, size in res:
filepath = self.get_filepath(docformat, version)
self.docfiles.append(BibDocFile(
filepath, self.bibrec_types,
version, docformat, self.id, self.status, checksum,
self.more_info, human_readable=self.human_readable, cd=cd, md=md, size=size, bibdoc=self))
else:
if os.path.exists(self.basedir):
files = os.listdir(self.basedir)
files.sort()
for afile in files:
if not afile.startswith('.'):
try:
filepath = os.path.join(self.basedir, afile)
dummy, dummy, docformat, fileversion = decompose_file_with_version(filepath)
checksum = self.md5s.get_checksum(afile)
self.docfiles.append(BibDocFile(filepath, self.bibrec_types,
fileversion, docformat,
self.id, self.status, checksum,
self.more_info, human_readable=self.human_readable, bibdoc=self))
except Exception, e:
register_exception()
raise InvenioBibDocFileError, e
if context in ('init', 'init_from_disk'):
return
else:
added_files, deleted_files = make_removed_added_bibdocfiles(previous_file_list)
deletedstr = "DELETED"
addedstr = "ADDED"
if context == 'rename':
deletedstr = "RENAMEDFROM"
addedstr = "RENAMEDTO"
for (docname, docformat, version), (size, checksum, md) in added_files.iteritems():
if context == 'rename':
md = '' # No modification time
log_action(addedstr, self.id, docname, docformat, version, size, checksum, md)
for (docname, docformat, version), (size, checksum, md) in deleted_files.iteritems():
if context == 'rename':
md = '' # No modification time
log_action(deletedstr, self.id, docname, docformat, version, size, checksum, md)
def _sync_to_db(self):
"""
Update the content of the bibdocfile table by taking what is available on the filesystem.
"""
self._build_file_list('init_from_disk')
run_sql("DELETE FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.id,))
for afile in self.docfiles:
run_sql("INSERT INTO bibdocfsinfo(id_bibdoc, version, format, last_version, cd, md, checksum, filesize, mime) VALUES(%s, %s, %s, false, %s, %s, %s, %s, %s)", (self.id, afile.get_version(), afile.get_format(), afile.cd, afile.md, afile.get_checksum(), afile.get_size(), afile.mime))
run_sql("UPDATE bibdocfsinfo SET last_version=true WHERE id_bibdoc=%s AND version=%s", (self.id, self.get_latest_version()))
def _build_related_file_list(self):
"""Lists all files attached to the bibdoc. This function should be
called every time the bibdoc is modified, e.g. when its icon changes.
@deprecated: use subformats instead.
"""
self.related_files = {}
res = run_sql("SELECT ln.id_bibdoc2,ln.rel_type,bibdoc.status FROM "
"bibdoc_bibdoc AS ln,bibdoc WHERE bibdoc.id=ln.id_bibdoc2 AND "
"ln.id_bibdoc1=%s", (str(self.id),))
for row in res:
docid = row[0]
doctype = row[1]
if row[2] != 'DELETED':
if not self.related_files.has_key(doctype):
self.related_files[doctype] = []
cur_doc = BibDoc.create_instance(docid=docid, human_readable=self.human_readable)
self.related_files[doctype].append(cur_doc)
def get_total_size_latest_version(self):
"""Return the total size used on disk of all the files belonging
to this bibdoc and corresponding to the latest version."""
ret = 0
for bibdocfile in self.list_latest_files():
ret += bibdocfile.get_size()
return ret
def get_total_size(self):
"""Return the total size used on disk of all the files belonging
to this bibdoc."""
ret = 0
for bibdocfile in self.list_all_files():
ret += bibdocfile.get_size()
return ret
def list_all_files(self, list_hidden=True):
"""Returns all the docfiles linked with the given bibdoc."""
if list_hidden:
return self.docfiles
else:
return [afile for afile in self.docfiles if not afile.hidden_p()]
def list_latest_files(self, list_hidden=True):
"""Returns all the docfiles within the last version."""
return self.list_version_files(self.get_latest_version(), list_hidden=list_hidden)
def list_version_files(self, version, list_hidden=True):
"""Return all the docfiles of a particular version."""
version = int(version)
return [docfile for docfile in self.docfiles if docfile.get_version() == version and (list_hidden or not docfile.hidden_p())]
def get_latest_version(self):
""" Returns the latest existing version number for the given bibdoc.
If no file is associated to this bibdoc, returns 0.
"""
version = 0
for bibdocfile in self.docfiles:
if bibdocfile.get_version() > version:
version = bibdocfile.get_version()
return version
def get_file_number(self):
"""Return the total number of files."""
return len(self.docfiles)
def register_download(self, ip_address, version, docformat, userid=0, recid=0):
"""Register the information about a download of a particular file."""
docformat = normalize_format(docformat)
if docformat[:1] == '.':
docformat = docformat[1:]
docformat = docformat.upper()
if not version:
version = self.get_latest_version()
return run_sql("INSERT DELAYED INTO rnkDOWNLOADS "
"(id_bibrec,id_bibdoc,file_version,file_format,"
"id_user,client_host,download_time) VALUES "
"(%s,%s,%s,%s,%s,INET_ATON(%s),NOW())",
(recid, self.id, version, docformat,
userid, ip_address,))
def get_incoming_relations(self, rel_type=None):
"""Return all relations in which this BibDoc appears on target position
@param rel_type: the type of relation to limit the search to; None means any type.
@type rel_type: string
@return: List of BibRelation instances
@rtype: list
"""
return BibRelation.get_relations(rel_type = rel_type,
bibdoc2_id = self.id)
def get_outgoing_relations(self, rel_type=None):
"""Return all relations in which this BibDoc appears on source position
@param rel_type: the type of relation to limit the search to; None means any type.
@type rel_type: string
@return: List of BibRelation instances
@rtype: list
"""
return BibRelation.get_relations(rel_type = rel_type,
bibdoc1_id = self.id)
def create_outgoing_relation(self, bibdoc2, rel_type):
"""
Create an outgoing relation between current BibDoc and a different one
"""
return BibRelation.create(bibdoc1_id = self.id, bibdoc2_id = bibdoc2.id, rel_type = rel_type)
def create_incoming_relation(self, bibdoc1, rel_type):
"""
Create an incoming relation between another BibDoc and the
current one.
"""
return BibRelation.create(bibdoc1_id = bibdoc1.id, bibdoc2_id = self.id, rel_type = rel_type)
def generic_path2bidocfile(fullpath):
"""
Returns a BibDocFile object that wraps the given fullpath.
@note: the object will contain the minimum information that can be
guessed from the fullpath (e.g. docname, format, subformat, version,
md5, creation_date, modification_date). It won't contain for example
a comment, a description, a doctype, a restriction.
"""
fullpath = os.path.abspath(fullpath)
try:
path, name, docformat, version = decompose_file_with_version(fullpath)
except ValueError:
## There is no version
version = 0
path, name, docformat = decompose_file(fullpath)
md5folder = Md5Folder(path)
checksum = md5folder.get_checksum(os.path.basename(fullpath))
return BibDocFile(fullpath=fullpath,
recid_doctypes=[(0, None, name)],
version=version,
docformat=docformat,
docid=0,
status=None,
checksum=checksum,
more_info=None)
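## Example (illustrative sketch; the path is hypothetical): wrapping a
## standalone file that lives outside the Invenio document store:
##   docfile = generic_path2bidocfile('/tmp/report.pdf')
##   docfile.get_format()    # e.g. '.pdf'
##   docfile.get_checksum()  # MD5 checksum computed via Md5Folder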
class BibDocFile(object):
"""This class represents a physical file in the Invenio filesystem.
It should never be instantiated directly"""
def __init__(self, fullpath, recid_doctypes, version, docformat, docid, status, checksum, more_info=None, human_readable=False, cd=None, md=None, size=None, bibdoc = None):
self.fullpath = os.path.abspath(fullpath)
self.docid = docid
self.recids_doctypes = recid_doctypes
self.version = version
self.status = status
self.checksum = checksum
self.human_readable = human_readable
self.name = recid_doctypes[0][2]
self.bibdoc = bibdoc
if more_info:
self.description = more_info.get_description(docformat, version)
self.comment = more_info.get_comment(docformat, version)
self.flags = more_info.get_flags(docformat, version)
else:
self.description = None
self.comment = None
self.flags = []
self.format = normalize_format(docformat)
self.superformat = get_superformat_from_format(self.format)
self.subformat = get_subformat_from_format(self.format)
if docformat:
self.recids_doctypes = [(a,b,c+self.superformat) for (a,b,c) in self.recids_doctypes]
self.mime, self.encoding = _mimes.guess_type(self.recids_doctypes[0][2])
if self.mime is None:
self.mime = "application/octet-stream"
self.more_info = more_info
self.hidden = 'HIDDEN' in self.flags
self.size = size or os.path.getsize(fullpath)
self.md = md or datetime.fromtimestamp(os.path.getmtime(fullpath))
try:
self.cd = cd or datetime.fromtimestamp(os.path.getctime(fullpath))
except OSError:
self.cd = self.md
self.dir = os.path.dirname(fullpath)
if self.subformat:
self.url = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recids_doctypes[0][0], self.name, self.superformat), {'subformat' : self.subformat})
self.fullurl = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recids_doctypes[0][0], self.name, self.superformat), {'subformat' : self.subformat, 'version' : self.version})
else:
self.url = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recids_doctypes[0][0], self.name, self.superformat), {})
self.fullurl = create_url('%s/%s/%s/files/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recids_doctypes[0][0], self.name, self.superformat), {'version' : self.version})
self.etag = '"%i%s%i"' % (self.docid, self.format, self.version)
self.magic = None
def __repr__(self):
return ('BibDocFile(%s, %i, %s, %s, %i, %i, %s, %s, %s, %s)' % (repr(self.fullpath), self.version, repr(self.name), repr(self.format), self.recids_doctypes[0][0], self.docid, repr(self.status), repr(self.checksum), repr(self.more_info), repr(self.human_readable)))
def format_recids(self):
if self.bibdoc:
return self.bibdoc.format_recids()
return "0"
def __str__(self):
recids = self.format_recids()
out = '%s:%s:%s:%s:fullpath=%s\n' % (recids, self.docid, self.version, self.format, self.fullpath)
out += '%s:%s:%s:%s:name=%s\n' % (recids, self.docid, self.version, self.format, self.name)
out += '%s:%s:%s:%s:subformat=%s\n' % (recids, self.docid, self.version, self.format, get_subformat_from_format(self.format))
out += '%s:%s:%s:%s:status=%s\n' % (recids, self.docid, self.version, self.format, self.status)
out += '%s:%s:%s:%s:checksum=%s\n' % (recids, self.docid, self.version, self.format, self.checksum)
if self.human_readable:
out += '%s:%s:%s:%s:size=%s\n' % (recids, self.docid, self.version, self.format, nice_size(self.size))
else:
out += '%s:%s:%s:%s:size=%s\n' % (recids, self.docid, self.version, self.format, self.size)
out += '%s:%s:%s:%s:creation time=%s\n' % (recids, self.docid, self.version, self.format, self.cd)
out += '%s:%s:%s:%s:modification time=%s\n' % (recids, self.docid, self.version, self.format, self.md)
out += '%s:%s:%s:%s:magic=%s\n' % (recids, self.docid, self.version, self.format, self.get_magic())
out += '%s:%s:%s:%s:mime=%s\n' % (recids, self.docid, self.version, self.format, self.mime)
out += '%s:%s:%s:%s:encoding=%s\n' % (recids, self.docid, self.version, self.format, self.encoding)
out += '%s:%s:%s:%s:url=%s\n' % (recids, self.docid, self.version, self.format, self.url)
out += '%s:%s:%s:%s:fullurl=%s\n' % (recids, self.docid, self.version, self.format, self.fullurl)
out += '%s:%s:%s:%s:description=%s\n' % (recids, self.docid, self.version, self.format, self.description)
out += '%s:%s:%s:%s:comment=%s\n' % (recids, self.docid, self.version, self.format, self.comment)
out += '%s:%s:%s:%s:hidden=%s\n' % (recids, self.docid, self.version, self.format, self.hidden)
out += '%s:%s:%s:%s:flags=%s\n' % (recids, self.docid, self.version, self.format, self.flags)
out += '%s:%s:%s:%s:etag=%s\n' % (recids, self.docid, self.version, self.format, self.etag)
return out
def is_restricted(self, user_info):
"""Returns restriction state. (see acc_authorize_action return values)"""
if self.status not in ('', 'DELETED'):
return check_bibdoc_authorization(user_info, status=self.status)
elif self.status == 'DELETED':
return (1, 'File has been deleted')
else:
return (0, '')
def is_icon(self, subformat_re=CFG_BIBDOCFILE_ICON_SUBFORMAT_RE):
"""
@param subformat_re: by default the convention is that
L{CFG_BIBDOCFILE_ICON_SUBFORMAT_RE} is used as a subformat indicator to
mean that a particular format is to be used as an icon.
Specify a different subformat if you need to use a different
convention.
@type subformat_re: compiled regular expression
@return: True if this file is an icon.
@rtype: bool
"""
return bool(subformat_re.match(self.subformat))
def hidden_p(self):
return self.hidden
def get_url(self):
return self.url
def get_type(self):
"""Returns the first type connected with the bibdoc of this file."""
return self.recids_doctypes[0][1]
def get_path(self):
return self.fullpath
def get_bibdocid(self):
return self.docid
def get_name(self):
return self.name
def get_full_name(self):
"""Returns the first name connected with the bibdoc of this file."""
return self.recids_doctypes[0][2]
def get_full_path(self):
return self.fullpath
def get_format(self):
return self.format
def get_subformat(self):
return self.subformat
def get_superformat(self):
return self.superformat
def get_size(self):
return self.size
def get_version(self):
return self.version
def get_checksum(self):
return self.checksum
def get_description(self):
return self.description
def get_comment(self):
return self.comment
def get_content(self):
"""Returns the binary content of the file."""
content_fd = open(self.fullpath, 'rb')
content = content_fd.read()
content_fd.close()
return content
def get_recid(self):
"""Returns the first recid connected with the bibdoc of this file."""
return self.recids_doctypes[0][0]
def get_status(self):
"""Returns the status of the file, i.e. either '', 'DELETED' or a
restriction keyword."""
return self.status
def get_magic(self):
"""Return all the possible guesses from the magic library about
the content of the file."""
if self.magic is None:
if CFG_HAS_MAGIC == 1:
magic_cookies = _get_magic_cookies()
magic_result = []
for key in magic_cookies.keys():
magic_result.append(magic_cookies[key].file(self.fullpath))
self.magic = tuple(magic_result)
elif CFG_HAS_MAGIC == 2:
magic_result = []
for key in ({'mime': False, 'mime_encoding': False},
{'mime': True, 'mime_encoding': False},
{'mime': False, 'mime_encoding': True}):
magic_result.append(_magic_wrapper(self.fullpath, **key))
self.magic = tuple(magic_result)
return self.magic
def check(self):
"""Return True if the checksum corresponds to the file."""
return calculate_md5(self.fullpath) == self.checksum
def stream(self, req, download=False):
"""Stream the file. Note that no restriction check is being
done here, since restrictions have been checked previously
inside websubmit_webinterface.py."""
if os.path.exists(self.fullpath):
if random.random() < CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY and calculate_md5(self.fullpath) != self.checksum:
raise InvenioBibDocFileError, "File %s, version %i, is corrupted!" % (self.recids_doctypes[0][2], self.version)
stream_file(req, self.fullpath, "%s%s" % (self.name, self.superformat), self.mime, self.encoding, self.etag, self.checksum, self.fullurl, download=download)
raise apache.SERVER_RETURN, apache.DONE
else:
req.status = apache.HTTP_NOT_FOUND
raise InvenioBibDocFileError, "%s does not exist!" % self.fullpath
_RE_STATUS_PARSER = re.compile(r'^(?P<type>email|group|egroup|role|firerole|status):\s*(?P<value>.*)$', re.S + re.I)
def check_bibdoc_authorization(user_info, status):
"""
Check if the user is authorized to access a document protected with the given status.
L{status} is a string of the form::
auth_type: auth_value
where C{auth_type} can have values in::
email, group, role, firerole, status
and C{auth_value} is a value interpreted against C{auth_type}:
- C{email}: the user can access the document if his/her email matches C{auth_value}
- C{group}: the user can access the document if one of the groups (local or
external) of which he/she is member matches C{auth_value}
- C{role}: the user can access the document if he/she belongs to the WebAccess
role specified in C{auth_value}
- C{firerole}: the user can access the document if he/she is implicitly matched
by the role described by the firewall like role definition in C{auth_value}
- C{status}: the user can access the document if he/she is authorized
for the action C{viewrestrdoc} with the C{status} parameter having value
C{auth_value}
@note: If no C{auth_type} is specified or if C{auth_type} is not one of the
above, C{auth_value} will be set to the value contained in the
parameter C{status}, and C{auth_type} will be considered to be C{status}.
@param user_info: the user_info dictionary
@type: dict
@param status: the status of the document.
@type status: string
@return: a tuple, of the form C{(auth_code, auth_message)} where auth_code is 0
if the authorization is granted and greater than 0 otherwise.
@rtype: (int, string)
@raise ValueError: in case of unexpected parsing error.
"""
if not status:
return (0, CFG_WEBACCESS_WARNING_MSGS[0])
def parse_status(status):
g = _RE_STATUS_PARSER.match(status)
if g:
return (g.group('type').lower(), g.group('value'))
else:
return ('status', status)
if acc_is_user_in_role(user_info, acc_get_role_id(SUPERADMINROLE)):
return (0, CFG_WEBACCESS_WARNING_MSGS[0])
auth_type, auth_value = parse_status(status)
if auth_type == 'status':
return acc_authorize_action(user_info, 'viewrestrdoc', status=auth_value)
elif auth_type == 'email':
if not auth_value.lower().strip() == user_info['email'].lower().strip():
return (1, 'You must have the email address %s in order to access this document' % repr(auth_value))
elif auth_type == 'group':
if not auth_value in user_info['group']:
return (1, 'You must be member of the group %s in order to access this document' % repr(auth_value))
elif auth_type == 'role':
if not acc_is_user_in_role(user_info, acc_get_role_id(auth_value)):
return (1, 'You must be member in the role %s in order to access this document' % repr(auth_value))
elif auth_type == 'firerole':
if not acc_firerole_check_user(user_info, compile_role_definition(auth_value)):
return (1, 'You must be authorized in order to access this document')
else:
raise ValueError, 'Unexpected authorization type %s for %s' % (repr(auth_type), repr(auth_value))
return (0, CFG_WEBACCESS_WARNING_MSGS[0])
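The `auth_type: auth_value` splitting performed by the nested `parse_status` above can be sketched as a standalone Python 3 snippet (an illustrative re-implementation of the `_RE_STATUS_PARSER` logic, not part of the module):

```python
import re

# Same pattern as _RE_STATUS_PARSER: a recognized "type:" prefix selects
# the authorization scheme; anything else is treated as a plain status.
_STATUS = re.compile(
    r'^(?P<type>email|group|egroup|role|firerole|status):\s*(?P<value>.*)$',
    re.S | re.I)

def parse_status(status):
    """Split a restriction string into an (auth_type, auth_value) pair."""
    g = _STATUS.match(status)
    if g:
        return (g.group('type').lower(), g.group('value'))
    # No recognized prefix: the whole string is a 'viewrestrdoc' status.
    return ('status', status)

# parse_status('email: jane@example.org') -> ('email', 'jane@example.org')
# parse_status('thesis')                  -> ('status', 'thesis')
```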
## TODO for future reimplementation of stream_file
#class StreamFileException(Exception):
# def __init__(self, value):
# self.value = value
_RE_BAD_MSIE = re.compile("MSIE\s+(\d+\.\d+)")
def stream_file(req, fullpath, fullname=None, mime=None, encoding=None, etag=None, md5str=None, location=None, download=False):
"""This is a generic function to stream a file to the user.
If fullname, mime, encoding, and location are not provided they will be
guessed based on req and fullpath.
md5str should be passed as a hexadecimal string.
"""
## TODO for future reimplementation of stream_file
# from flask import send_file
# if fullname is None:
# fullname = fullpath.split('/')[-1]
# response = send_file(fullpath,
# attachment_filename=fullname.replace('"', '\\"'),
# as_attachment=False)
# if not download:
# response.headers['Content-Disposition'] = 'inline; filename="%s"' % fullname.replace('"', '\\"')
#
# raise StreamFileException(response)
def normal_streaming(size):
req.set_content_length(size)
req.send_http_header()
if req.method != 'HEAD':
req.sendfile(fullpath)
return ""
def single_range(size, the_range):
req.set_content_length(the_range[1])
req.headers_out['Content-Range'] = 'bytes %d-%d/%d' % (the_range[0], the_range[0] + the_range[1] - 1, size)
req.status = apache.HTTP_PARTIAL_CONTENT
req.send_http_header()
if req.method != 'HEAD':
req.sendfile(fullpath, the_range[0], the_range[1])
return ""
def multiple_ranges(size, ranges, mime):
req.status = apache.HTTP_PARTIAL_CONTENT
boundary = '%s%04d' % (time.strftime('THIS_STRING_SEPARATES_%Y%m%d%H%M%S'), random.randint(0, 9999))
req.content_type = 'multipart/byteranges; boundary=%s' % boundary
content_length = 0
for arange in ranges:
content_length += len('--%s\r\n' % boundary)
content_length += len('Content-Type: %s\r\n' % mime)
content_length += len('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size))
content_length += len('\r\n')
content_length += arange[1]
content_length += len('\r\n')
content_length += len('--%s--\r\n' % boundary)
req.set_content_length(content_length)
req.send_http_header()
if req.method != 'HEAD':
for arange in ranges:
req.write('--%s\r\n' % boundary, 0)
req.write('Content-Type: %s\r\n' % mime, 0)
req.write('Content-Range: bytes %d-%d/%d\r\n' % (arange[0], arange[0] + arange[1] - 1, size), 0)
req.write('\r\n', 0)
req.sendfile(fullpath, arange[0], arange[1])
req.write('\r\n', 0)
req.write('--%s--\r\n' % boundary)
req.flush()
return ""
def parse_date(date):
"""According to <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3>
a date can come in three formats (in order of preference):
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
Moreover, IE adds some trailing information after a ';'.
Malformed dates are simply ignored.
This function returns the time in seconds since the epoch (GMT) or None
in case of errors."""
if not date:
return None
try:
date = date.split(';')[0].strip() # Because of IE
## Sun, 06 Nov 1994 08:49:37 GMT
return time.mktime(time.strptime(date, '%a, %d %b %Y %X %Z'))
except:
try:
## Sunday, 06-Nov-94 08:49:37 GMT (RFC 850)
return time.mktime(time.strptime(date, '%A, %d-%b-%y %H:%M:%S %Z'))
except:
try:
## Sun Nov  6 08:49:37 1994 (ANSI C asctime())
return time.mktime(time.strptime(date, '%c'))
except:
return None
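The three formats can be tried in order with `time.strptime`; a minimal Python 3 sketch, using `calendar.timegm` so the result does not depend on the server's local timezone (unlike the `time.mktime` calls above):

```python
import calendar
import time

# The three date formats allowed by RFC 2616 sec. 3.3, in order of preference.
_HTTP_DATE_FORMATS = (
    '%a, %d %b %Y %H:%M:%S %Z',  # RFC 822/1123: Sun, 06 Nov 1994 08:49:37 GMT
    '%A, %d-%b-%y %H:%M:%S %Z',  # RFC 850:      Sunday, 06-Nov-94 08:49:37 GMT
    '%a %b %d %H:%M:%S %Y',      # asctime():    Sun Nov  6 08:49:37 1994
)

def parse_http_date(date):
    """Return seconds since the epoch (UTC), or None for malformed dates."""
    if not date:
        return None
    date = date.split(';')[0].strip()  # IE appends junk after a ';'
    for fmt in _HTTP_DATE_FORMATS:
        try:
            return calendar.timegm(time.strptime(date, fmt))
        except ValueError:
            pass
    return None
```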
def parse_ranges(ranges):
"""According to <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35>
a (multiple) range request comes in the form:
bytes=20-30,40-60,70-,-80
with the meaning:
from byte 20 to byte 30 inclusive (11 bytes)
from byte 40 to byte 60 inclusive (21 bytes)
from byte 70 to (size - 1) inclusive (size - 70 bytes)
from byte size - 80 to (size - 1) inclusive (80 bytes)
This function will return the list of ranges in the form:
[[first_byte, last_byte], ...]
If first_byte or last_byte aren't specified they'll be set to None
If the list is not well formatted it will return None
"""
try:
if ranges.startswith('bytes') and '=' in ranges:
ranges = ranges.split('=')[1].strip()
else:
return None
ret = []
for arange in ranges.split(','):
arange = arange.strip()
if arange.startswith('-'):
ret.append([None, int(arange[1:])])
elif arange.endswith('-'):
ret.append([int(arange[:-1]), None])
else:
ret.append(map(int, arange.split('-')))
return ret
except:
return None
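For illustration, here is `parse_ranges` transcribed to Python 3 (the only behavioural change is replacing `map` with a list comprehension and catching specific exceptions):

```python
def parse_ranges(ranges):
    """Parse an HTTP Range header value into [[first, last], ...] pairs,
    with None marking an unspecified end; return None if malformed."""
    try:
        if ranges.startswith('bytes') and '=' in ranges:
            ranges = ranges.split('=')[1].strip()
        else:
            return None
        ret = []
        for arange in ranges.split(','):
            arange = arange.strip()
            if arange.startswith('-'):       # suffix range: the last N bytes
                ret.append([None, int(arange[1:])])
            elif arange.endswith('-'):       # open range: from N to the end
                ret.append([int(arange[:-1]), None])
            else:
                ret.append([int(x) for x in arange.split('-')])
        return ret
    except (AttributeError, ValueError):
        return None

# parse_ranges('bytes=20-30,40-60,70-,-80')
#   -> [[20, 30], [40, 60], [70, None], [None, 80]]
```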
def parse_tags(tags):
"""Return a list of tags starting from a comma separated list."""
return [tag.strip() for tag in tags.split(',')]
def fix_ranges(ranges, size):
"""Complementary to parse_ranges: transform all the ranges
into (first_byte, length) pairs, adjusting all the values based on the
actual size provided.
"""
ret = []
for arange in ranges:
if (arange[0] is None and arange[1] > 0) or arange[0] < size:
if arange[0] is None:
arange[0] = size - arange[1]
elif arange[1] is None:
arange[1] = size - arange[0]
else:
arange[1] = arange[1] - arange[0] + 1
arange[0] = max(0, arange[0])
arange[1] = min(size - arange[0], arange[1])
if arange[1] > 0:
ret.append(arange)
return ret
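The effect of `fix_ranges` can be illustrated with a Python 3 transcription (note the extra `is not None` guard, needed because Python 3 no longer allows comparing `None` with an integer):

```python
def fix_ranges(ranges, size):
    """Turn [first, last] pairs (None = unspecified end) into satisfiable
    [offset, length] pairs for a resource of the given size."""
    ret = []
    for arange in ranges:
        satisfiable = ((arange[0] is None and arange[1] > 0) or
                       (arange[0] is not None and arange[0] < size))
        if satisfiable:
            if arange[0] is None:
                arange[0] = size - arange[1]           # suffix: last N bytes
            elif arange[1] is None:
                arange[1] = size - arange[0]           # open end: up to EOF
            else:
                arange[1] = arange[1] - arange[0] + 1  # inclusive last -> length
            arange[0] = max(0, arange[0])
            arange[1] = min(size - arange[0], arange[1])
            if arange[1] > 0:
                ret.append(arange)
    return ret

# fix_ranges([[20, 30], [None, 80], [70, None]], size=100)
#   -> [[20, 11], [20, 80], [70, 30]]
```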
def get_normalized_headers():
"""Strip and lowercase all the keys of the headers dictionary; also
strip, lowercase and parse the values of known headers into their
typed representation."""
ret = {
'if-match' : None,
'unless-modified-since' : None,
'if-modified-since' : None,
'range' : None,
'if-range' : None,
'if-none-match' : None,
}
for key, value in req.headers_in.iteritems():
key = key.strip().lower()
value = value.strip()
if key in ('unless-modified-since', 'if-modified-since'):
value = parse_date(value)
elif key == 'range':
value = parse_ranges(value)
elif key == 'if-range':
value = parse_date(value) or parse_tags(value)
elif key in ('if-match', 'if-none-match'):
value = parse_tags(value)
if value:
ret[key] = value
return ret
headers = get_normalized_headers()
g = _RE_BAD_MSIE.search(headers.get('user-agent', "MSIE 6.0"))
bad_msie = g and float(g.group(1)) < 9.0
if CFG_BIBDOCFILE_USE_XSENDFILE:
## If XSendFile is supported by the server, let's use it.
if os.path.exists(fullpath):
if fullname is None:
fullname = os.path.basename(fullpath)
if bad_msie:
## IE is confused by quotes
req.headers_out["Content-Disposition"] = 'attachment; filename=%s' % fullname.replace('"', '\\"')
elif download:
req.headers_out["Content-Disposition"] = 'attachment; filename="%s"' % fullname.replace('"', '\\"')
else:
## IE is confused by inline
req.headers_out["Content-Disposition"] = 'inline; filename="%s"' % fullname.replace('"', '\\"')
req.headers_out["X-Sendfile"] = fullpath
if mime is None:
(mime, encoding) = _mimes.guess_type(fullpath)
if mime is None:
mime = "application/octet-stream"
if not bad_msie:
## IE is confused by not supported mimetypes
req.content_type = mime
return ""
else:
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
if headers['if-match']:
if etag is not None and etag not in headers['if-match']:
raise apache.SERVER_RETURN, apache.HTTP_PRECONDITION_FAILED
if os.path.exists(fullpath):
mtime = os.path.getmtime(fullpath)
if fullname is None:
fullname = os.path.basename(fullpath)
if mime is None:
(mime, encoding) = _mimes.guess_type(fullpath)
if mime is None:
mime = "application/octet-stream"
if location is None:
location = req.uri
if not bad_msie:
## IE is confused by not supported mimetypes
req.content_type = mime
req.encoding = encoding
req.filename = fullname
req.headers_out["Last-Modified"] = time.strftime('%a, %d %b %Y %X GMT', time.gmtime(mtime))
if CFG_ENABLE_HTTP_RANGE_REQUESTS:
req.headers_out["Accept-Ranges"] = "bytes"
else:
req.headers_out["Accept-Ranges"] = "none"
req.headers_out["Content-Location"] = location
if etag is not None:
req.headers_out["ETag"] = etag
if md5str is not None:
req.headers_out["Content-MD5"] = base64.encodestring(binascii.unhexlify(md5str.upper()))[:-1]
if bad_msie:
## IE is confused by quotes
req.headers_out["Content-Disposition"] = 'attachment; filename=%s' % fullname.replace('"', '\\"')
elif download:
req.headers_out["Content-Disposition"] = 'attachment; filename="%s"' % fullname.replace('"', '\\"')
else:
## IE is confused by inline
req.headers_out["Content-Disposition"] = 'inline; filename="%s"' % fullname.replace('"', '\\"')
size = os.path.getsize(fullpath)
if not size:
try:
raise Exception, '%s exists but is empty' % fullpath
except Exception:
register_exception(req=req, alert_admin=True)
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
if headers['if-modified-since'] and headers['if-modified-since'] >= mtime:
raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED
if headers['if-none-match']:
if etag is not None and etag in headers['if-none-match']:
raise apache.SERVER_RETURN, apache.HTTP_NOT_MODIFIED
if headers['unless-modified-since'] and headers['unless-modified-since'] < mtime:
return normal_streaming(size)
if CFG_ENABLE_HTTP_RANGE_REQUESTS and headers['range']:
try:
if headers['if-range']:
if etag is None or etag not in headers['if-range']:
return normal_streaming(size)
ranges = fix_ranges(headers['range'], size)
except:
return normal_streaming(size)
if len(ranges) > 1:
return multiple_ranges(size, ranges, mime)
elif ranges:
return single_range(size, ranges[0])
else:
raise apache.SERVER_RETURN, apache.HTTP_RANGE_NOT_SATISFIABLE
else:
return normal_streaming(size)
else:
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
def stream_restricted_icon(req):
"""Return the content of the "Restricted Icon" file."""
stream_file(req, '%s/img/restricted.gif' % CFG_WEBDIR)
raise apache.SERVER_RETURN, apache.DONE
#def list_versions_from_array(docfiles):
# """Retrieve the list of existing versions from the given docfiles list."""
# versions = []
# for docfile in docfiles:
# if not docfile.get_version() in versions:
# versions.append(docfile.get_version())
# versions.sort()
# versions.reverse()
# return versions
def _make_base_dir(docid):
"""Given a docid it returns the complete path that should host its files."""
group = "g" + str(int(int(docid) / CFG_BIBDOCFILE_FILESYSTEM_BIBDOC_GROUP_LIMIT))
return os.path.join(CFG_BIBDOCFILE_FILEDIR, group, str(docid))
class Md5Folder(object):
"""Manage all the MD5 checksums of the files in a folder."""
def __init__(self, folder):
"""Initialize the class from the md5 checksum of a given path"""
self.folder = folder
self.load()
def update(self, only_new=True):
"""Update the .md5 file with the current files. If only_new
is True, only files whose checksum has not yet been calculated are processed."""
if not only_new:
self.md5s = {}
if os.path.exists(self.folder):
for filename in os.listdir(self.folder):
if filename not in self.md5s and not filename.startswith('.'):
self.md5s[filename] = calculate_md5(os.path.join(self.folder, filename))
self.store()
def store(self):
"""Store the current md5 dictionary into .md5"""
try:
old_umask = os.umask(022)
md5file = open(os.path.join(self.folder, ".md5"), "w")
for key, value in self.md5s.items():
md5file.write('%s *%s\n' % (value, key))
md5file.close()
os.umask(old_umask)
except Exception, e:
register_exception(alert_admin=True)
raise InvenioBibDocFileError("Encountered an exception while storing .md5 for folder '%s': '%s'" % (self.folder, e))
def load(self):
"""Load .md5 into the md5 dictionary"""
self.md5s = {}
md5_path = os.path.join(self.folder, ".md5")
if os.path.exists(md5_path):
for row in open(md5_path, "r"):
md5hash = row[:32]
filename = row[34:].strip()
self.md5s[filename] = md5hash
else:
self.update()
def check(self, filename=''):
"""Check that the specified file (or, if no filename is given, every
file with a stored hash) is coherent with its stored hash."""
if filename and filename in self.md5s.keys():
try:
return self.md5s[filename] == calculate_md5(os.path.join(self.folder, filename))
except Exception, e:
register_exception(alert_admin=True)
raise InvenioBibDocFileError("Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e))
else:
for filename, md5hash in self.md5s.items():
try:
if calculate_md5(os.path.join(self.folder, filename)) != md5hash:
return False
except Exception, e:
register_exception(alert_admin=True)
raise InvenioBibDocFileError("Encountered an exception while loading '%s': '%s'" % (os.path.join(self.folder, filename), e))
return True
def get_checksum(self, filename):
"""Return the checksum of a physical file."""
md5hash = self.md5s.get(filename, None)
if md5hash is None:
self.update()
# Now it should not fail!
md5hash = self.md5s[filename]
return md5hash
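The `.md5` manifest read and written by `Md5Folder` uses md5sum's binary-mode line format: 32 hex digits, a space, `*`, then the filename. A minimal round-trip sketch with hypothetical helper names (`store_md5s`/`load_md5s` are not part of the module):

```python
import hashlib
import os
import tempfile

def store_md5s(folder, md5s):
    """Write a .md5 manifest as Md5Folder.store() does: 'digest *name'."""
    with open(os.path.join(folder, '.md5'), 'w') as fd:
        for name, digest in md5s.items():
            fd.write('%s *%s\n' % (digest, name))

def load_md5s(folder):
    """Read the manifest back, mirroring the slicing in Md5Folder.load()."""
    md5s = {}
    with open(os.path.join(folder, '.md5')) as fd:
        for row in fd:
            md5s[row[34:].strip()] = row[:32]  # 32 hex chars, then ' *'
    return md5s

folder = tempfile.mkdtemp()
digest = hashlib.md5(b'hello').hexdigest()
store_md5s(folder, {'hello.txt': digest})
assert load_md5s(folder) == {'hello.txt': digest}
```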
def calculate_md5_external(filename):
"""Calculate the md5 of a physical file through the md5sum command-line
tool. This is suitable for files larger than 256 KB."""
try:
md5_result = os.popen(CFG_PATH_MD5SUM + ' -b %s' % escape_shell_arg(filename))
ret = md5_result.read()[:32]
md5_result.close()
if len(ret) != 32:
# Error in running md5sum. Let's fallback to internal
# algorithm.
return calculate_md5(filename, force_internal=True)
else:
return ret
except Exception, e:
raise InvenioBibDocFileError("Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e))
def calculate_md5(filename, force_internal=False):
"""Calculate the md5 of a physical file. This is suitable for files smaller
than 256 KB."""
if not CFG_PATH_MD5SUM or force_internal or os.path.getsize(filename) < CFG_BIBDOCFILE_MD5_THRESHOLD:
try:
to_be_read = open(filename, "rb")
computed_md5 = md5()
while True:
buf = to_be_read.read(CFG_BIBDOCFILE_MD5_BUFFER)
if buf:
computed_md5.update(buf)
else:
break
to_be_read.close()
return computed_md5.hexdigest()
except Exception, e:
register_exception(alert_admin=True)
raise InvenioBibDocFileError("Encountered an exception while calculating md5 for file '%s': '%s'" % (filename, e))
else:
return calculate_md5_external(filename)
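The internal branch of `calculate_md5` streams the file through the digest in fixed-size chunks; a Python 3 sketch (`bufsize` stands in for `CFG_BIBDOCFILE_MD5_BUFFER`):

```python
import hashlib

def md5_chunked(path, bufsize=256 * 1024):
    """Compute a file's md5 without loading it into memory at once,
    as calculate_md5 does internally."""
    digest = hashlib.md5()
    with open(path, 'rb') as fd:
        # iter() with a sentinel yields chunks until read() returns b''.
        for chunk in iter(lambda: fd.read(bufsize), b''):
            digest.update(chunk)
    return digest.hexdigest()
```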
def bibdocfile_url_to_bibrecdocs(url):
"""Given a URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns
a BibRecDocs object for the corresponding recid."""
recid = decompose_bibdocfile_url(url)[0]
return BibRecDocs(recid)
def bibdocfile_url_to_bibdoc(url):
"""Given a URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns
a BibDoc object for the corresponding recid/docname."""
docname = decompose_bibdocfile_url(url)[1]
return bibdocfile_url_to_bibrecdocs(url).get_bibdoc(docname)
def bibdocfile_url_to_bibdocfile(url):
"""Given a URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns
a BibDocFile object for the corresponding recid/docname/format."""
docformat = decompose_bibdocfile_url(url)[2]
return bibdocfile_url_to_bibdoc(url).get_file(docformat)
def bibdocfile_url_to_fullpath(url):
"""Given a URL in the form CFG_SITE_[SECURE_]URL/CFG_SITE_RECORD/xxx/files/... it returns
the fullpath for the corresponding recid/docname/format."""
return bibdocfile_url_to_bibdocfile(url).get_full_path()
def bibdocfile_url_p(url):
"""Return True when the url is a potentially valid URL pointing to a
fulltext owned by the system."""
if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL):
return True
if not (url.startswith('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)) or url.startswith('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD))):
return False
splitted_url = url.split('/files/')
return len(splitted_url) == 2 and splitted_url[0] != '' and splitted_url[1] != ''
def get_docid_from_bibdocfile_fullpath(fullpath):
"""Given a bibdocfile fullpath (e.g. "CFG_BIBDOCFILE_FILEDIR/g0/123/bar.pdf;1")
returns the docid (e.g. 123)."""
if not fullpath.startswith(os.path.join(CFG_BIBDOCFILE_FILEDIR, 'g')):
raise InvenioBibDocFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath
dirname = decompose_file_with_version(fullpath)[0]
try:
return int(dirname.split('/')[-1])
except:
raise InvenioBibDocFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath
def decompose_bibdocfile_fullpath(fullpath):
"""Given a bibdocfile fullpath (e.g. "CFG_BIBDOCFILE_FILEDIR/g0/123/bar.pdf;1")
returns a quadruple (recid, docname, format, version)."""
if not fullpath.startswith(os.path.join(CFG_BIBDOCFILE_FILEDIR, 'g')):
raise InvenioBibDocFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath
dirname, dummy, extension, version = decompose_file_with_version(fullpath)
try:
docid = int(dirname.split('/')[-1])
return {"doc_id" : docid, "extension": extension, "version": version}
except:
raise InvenioBibDocFileError, "Fullpath %s doesn't correspond to a valid bibdocfile fullpath" % fullpath
def decompose_bibdocfile_url(url):
"""Given a bibdocfile_url return a triple (recid, docname, format)."""
if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL):
return decompose_bibdocfile_very_old_url(url)
if url.startswith('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)):
recid_file = url[len('%s/%s/' % (CFG_SITE_URL, CFG_SITE_RECORD)):]
elif url.startswith('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD)):
recid_file = url[len('%s/%s/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD)):]
else:
raise InvenioBibDocFileError, "Url %s doesn't correspond to a valid record inside the system." % url
recid_file = recid_file.replace('/files/', '/')
recid, docname, docformat = decompose_file(urllib.unquote(recid_file)) # this will work in the case of URL... not file !
if not recid and docname.isdigit():
## If the URL was something similar to CFG_SITE_URL/CFG_SITE_RECORD/123
return (int(docname), '', '')
return (int(recid), docname, docformat)
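The URL decomposition above can be sketched for the common modern URL shape. `SITE_URL` and `RECORD` are assumed stand-ins for `CFG_SITE_URL` and `CFG_SITE_RECORD`, and this simplified version splits on the last dot instead of using `decompose_file`'s known-extension logic:

```python
from urllib.parse import unquote

SITE_URL = 'https://example.org'  # assumed stand-in for CFG_SITE_URL
RECORD = 'record'                 # assumed stand-in for CFG_SITE_RECORD

def decompose(url):
    """Return (recid, docname, format) for a .../record/<recid>/files/<doc> URL."""
    prefix = '%s/%s/' % (SITE_URL, RECORD)
    if not url.startswith(prefix):
        raise ValueError('%s is not a record URL' % url)
    recid_file = unquote(url[len(prefix):]).replace('/files/', '/')
    recid, _, rest = recid_file.partition('/')
    if not rest:                  # URL like .../record/123
        return (int(recid), '', '')
    name, dot, ext = rest.rpartition('.')
    if not dot:                   # no extension at all
        return (int(recid), rest, '')
    return (int(recid), name, '.' + ext)

# decompose('https://example.org/record/123/files/paper.pdf')
#   -> (123, 'paper', '.pdf')
```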
re_bibdocfile_old_url = re.compile(r'/%s/(\d*)/files/' % CFG_SITE_RECORD)
def decompose_bibdocfile_old_url(url):
"""Given a bibdocfile old url (e.g. CFG_SITE_URL/CFG_SITE_RECORD/123/files)
it returns the recid."""
g = re_bibdocfile_old_url.search(url)
if g:
return int(g.group(1))
raise InvenioBibDocFileError('%s is not a valid old bibdocfile url' % url)
def decompose_bibdocfile_very_old_url(url):
"""Decompose an old /getfile.py? URL"""
if url.startswith('%s/getfile.py' % CFG_SITE_URL) or url.startswith('%s/getfile.py' % CFG_SITE_SECURE_URL):
params = urllib.splitquery(url)[1]
if params:
try:
params = cgi.parse_qs(params)
if 'docid' in params:
docid = int(params['docid'][0])
bibdoc = BibDoc.create_instance(docid)
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["rec_id"]
docname = bibdoc.bibrec_links[0]["doc_name"]
else:
raise InvenioBibDocFileError("Old style URL pointing to an unattached document")
elif 'recid' in params:
recid = int(params['recid'][0])
if 'name' in params:
docname = params['name'][0]
else:
docname = ''
else:
raise InvenioBibDocFileError('%s does not have enough params to correspond to a bibdocfile.' % url)
docformat = normalize_format(params.get('format', [''])[0])
return (recid, docname, docformat)
except Exception, e:
raise InvenioBibDocFileError('Problem with %s: %s' % (url, e))
else:
raise InvenioBibDocFileError('%s has no params to correspond to a bibdocfile.' % url)
else:
raise InvenioBibDocFileError('%s is not a valid very old bibdocfile url' % url)
def get_docname_from_url(url):
"""Return a potential docname given a url"""
path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2]
filename = os.path.split(path)[-1]
return file_strip_ext(filename)
def get_format_from_url(url):
"""Return a potential format given a url"""
path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2]
filename = os.path.split(path)[-1]
return filename[len(file_strip_ext(filename)):]
def clean_url(url):
"""Given a local URL (e.g. a local path), return its absolute real path;
otherwise return the URL unchanged."""
if is_url_a_local_file(url):
path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2]
return os.path.abspath(path)
else:
return url
def is_url_a_local_file(url):
"""Return True if the given URL is pointing to a local file."""
protocol = urllib2.urlparse.urlsplit(url)[0]
return protocol in ('', 'file')
def check_valid_url(url):
"""
Check for validity of a url or a file.
@param url: the URL to check
@type url: string
@raise StandardError: if the URL is not a valid URL.
"""
try:
if is_url_a_local_file(url):
path = urllib2.urlparse.urlsplit(urllib.unquote(url))[2]
if os.path.abspath(path) != path:
raise StandardError, "%s is not a normalized path (would be %s)." % (path, os.path.normpath(path))
for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR, CFG_TMPSHAREDDIR, CFG_WEBSUBMIT_STORAGEDIR]:
if path.startswith(allowed_path):
dummy_fd = open(path)
dummy_fd.close()
return
raise StandardError, "%s is not in one of the allowed paths." % path
else:
try:
open_url(url)
except InvenioBibdocfileUnauthorizedURL, e:
raise StandardError, str(e)
except Exception, e:
raise StandardError, "%s is not a correct url: %s" % (url, e)
def safe_mkstemp(suffix, prefix='bibdocfile_'):
"""Create a temporary filename that does not contain any '.' apart
from the one introduced by the suffix."""
tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=CFG_TMPDIR)
# Close the file and leave the responsibility to the client code to
# correctly open/close it.
os.close(tmpfd)
if '.' not in suffix:
# Just in case format is empty
return tmppath
while '.' in os.path.basename(tmppath)[:-len(suffix)]:
os.remove(tmppath)
tmpfd, tmppath = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=CFG_TMPDIR)
os.close(tmpfd)
return tmppath
def download_local_file(filename, docformat=None):
"""
Copies a local file to Invenio's temporary directory.
@param filename: the name of the file to copy
@type filename: string
@param docformat: the format of the file to copy (will be found if not
specified)
@type docformat: string
@return: the path of the temporary file created
@rtype: string
@raise StandardError: if something went wrong
"""
# Make sure the format is OK.
if docformat is None:
docformat = guess_format_from_url(filename)
else:
docformat = normalize_format(docformat)
tmppath = ''
# Now try to copy.
try:
path = urllib2.urlparse.urlsplit(urllib.unquote(filename))[2]
if os.path.abspath(path) != path:
raise StandardError, "%s is not a normalized path (would be %s)." \
% (path, os.path.normpath(path))
for allowed_path in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS + [CFG_TMPDIR,
CFG_WEBSUBMIT_STORAGEDIR]:
if path.startswith(allowed_path):
tmppath = safe_mkstemp(docformat)
shutil.copy(path, tmppath)
if os.path.getsize(tmppath) == 0:
os.remove(tmppath)
raise StandardError, "%s seems to be empty" % filename
break
else:
raise StandardError, "%s is not in one of the allowed paths." % path
except Exception, e:
raise StandardError, "Impossible to copy the local file '%s': %s" % \
(filename, str(e))
return tmppath
def download_external_url(url, docformat=None, progress_callback=None):
"""
Download a url (if it corresponds to a remote file) and return a
local url to it.
@param url: the URL to download
@type url: string
@param docformat: the format of the file (will be found if not specified)
@type docformat: string
@return: the path to the downloaded local file
@rtype: string
@raise StandardError: if the download failed
"""
tmppath = None
# Make sure the format is OK.
if docformat is None:
# First try to find a known extension to the URL
docformat = decompose_file(url, skip_version=True,
only_known_extensions=True)[2]
if not docformat:
# No correct format could be found. Will try to get it from the
# HTTP message headers.
docformat = ''
else:
docformat = normalize_format(docformat)
from_file, to_file, tmppath = None, None, ''
try:
from_file = open_url(url)
except InvenioBibdocfileUnauthorizedURL, e:
raise StandardError, str(e)
except urllib2.URLError, e:
raise StandardError, 'URL could not be opened: %s' % str(e)
if not docformat:
# We could not determine the format from the URL, so let's try
# to read it from the HTTP headers.
docformat = get_format_from_http_response(from_file)
try:
tmppath = safe_mkstemp(docformat)
if progress_callback:
total_size = int(from_file.info().getheader('Content-Length').strip())
progress_size = 0
to_file = open(tmppath, 'w')
while True:
block = from_file.read(CFG_BIBDOCFILE_BLOCK_SIZE)
if not block:
break
to_file.write(block)
if progress_callback:
progress_size += CFG_BIBDOCFILE_BLOCK_SIZE
progress_callback(progress_size, CFG_BIBDOCFILE_BLOCK_SIZE,
total_size)
to_file.close()
from_file.close()
if os.path.getsize(tmppath) == 0:
raise StandardError, "%s seems to be empty" % url
except Exception, e:
# Try to close and remove the temporary file.
try:
to_file.close()
except Exception:
pass
try:
os.remove(tmppath)
except Exception:
pass
raise StandardError, "Error when downloading %s into %s: %s" % \
(url, tmppath, e)
return tmppath
def get_format_from_http_response(response):
"""
Tries to retrieve the format of the file from the message headers of the
HTTP response.
@param response: the HTTP response
@type response: file-like object (as returned by urllib.urlopen)
@return: the format of the remote resource
@rtype: string
"""
def parse_content_type(text):
return text.split(';')[0].strip()
def parse_content_disposition(text):
for item in text.split(';'):
item = item.strip()
if item.startswith('filename='):
return item[len('filename='):].strip('"')
info = response.info()
docformat = ''
content_disposition = info.getheader('Content-Disposition')
if content_disposition:
filename = parse_content_disposition(content_disposition)
if filename:
docformat = decompose_file(filename, only_known_extensions=False)[2]
if docformat:
return docformat
content_type = info.getheader('Content-Type')
if content_type:
content_type = parse_content_type(content_type)
if content_type not in ('text/plain', 'application/octet-stream'):
## We actually ignore these mimetypes since they are the
## defaults often returned by Apache in case the mimetype
## was not known
ext = _mimes.guess_extension(content_type)
if ext:
## Normalize some common magic mis-interpretations
ext = {'.asc': '.txt', '.obj': '.bin'}.get(ext, ext)
docformat = normalize_format(ext)
return docformat
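The header-based format guessing can be sketched on plain header strings. This is a simplified Python 3 version of the function above: `decompose_file` and `normalize_format` are replaced by naive last-dot extension handling, so it is an illustration rather than a drop-in replacement:

```python
import mimetypes

def parse_content_disposition(text):
    """Extract the filename=... parameter, quoted or not."""
    for item in text.split(';'):
        item = item.strip()
        if item.startswith('filename='):
            return item[len('filename='):].strip('"')
    return None

def format_from_headers(content_disposition, content_type):
    """Guess an extension from Content-Disposition first, then Content-Type."""
    if content_disposition:
        filename = parse_content_disposition(content_disposition)
        if filename and '.' in filename:
            return '.' + filename.rsplit('.', 1)[1]
    mime = content_type.split(';')[0].strip()
    # text/plain and application/octet-stream are common server fallbacks
    # for unknown types, so they carry no real information.
    if mime not in ('text/plain', 'application/octet-stream'):
        ext = mimetypes.guess_extension(mime)
        if ext:
            # Normalize common magic mis-interpretations.
            return {'.asc': '.txt', '.obj': '.bin'}.get(ext, ext)
    return ''
```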
def download_url(url, docformat=None):
"""
Download a url (if it corresponds to a remote file) and return a
local url to it.
"""
tmppath = None
try:
if is_url_a_local_file(url):
tmppath = download_local_file(url, docformat = docformat)
else:
tmppath = download_external_url(url, docformat = docformat)
except StandardError:
raise
return tmppath
class MoreInfo(object):
"""This class represents a generic MoreInfo dictionary.
A MoreInfo object can be attached to a bibdoc, bibversion, format or BibRelation.
The entity to which a particular MoreInfo object is attached has to be specified using the
constructor parameters.
This class is a thin wrapper around the database table.
"""
def __init__(self, docid = None, version = None, docformat = None,
relation = None, cache_only = False, cache_reads = True, initial_data = None):
"""
@param cache_only: Determines if the MoreInfo object should be created in
memory only or reflected in the database
@type cache_only: boolean
@param cache_reads: Determines if reads should be executed on the
in-memory cache or should be redirected to the
database. If this is true, the cache can be entirely
regenerated from the database only upon an explicit
request. If the value is not present in the cache,
the database is queried
@type cache_reads: boolean
@param initial_data: Allows specifying the initial content of the cache.
This parameter is useful when we create an in-memory
instance from a serialised value
@type initial_data: string
"""
self.docid = docid
self.version = version
self.format = docformat
self.relation = relation
self.cache_only = cache_only
if initial_data != None:
self.cache = initial_data
self.dirty = initial_data
if not self.cache_only:
self._flush_cache() #inserts new entries
else:
self.cache = {}
self.dirty = {}
self.cache_reads = cache_reads
if not self.cache_only:
self.populate_from_database()
@staticmethod
def create_from_serialised(ser_str, docid = None, version = None, docformat = None,
relation = None, cache_only = False, cache_reads = True):
"""Creates an instance of MoreInfo
using serialised data as the cache content"""
data = cPickle.loads(base64.b64decode(ser_str))
return MoreInfo(docid = docid, version = version, docformat = docformat,
relation = relation, cache_only = cache_only,
cache_reads = cache_reads, initial_data = data)
def serialise_cache(self):
"""Returns a serialised representation of the cache"""
return base64.b64encode(cPickle.dumps(self.get_cache()))
def populate_from_database(self):
"""Retrieves all values of MoreInfo and places them in the cache"""
where_str, where_args = self._generate_where_query_args()
query_str = "SELECT namespace, data_key, data_value FROM bibdocmoreinfo WHERE %s" % (where_str, )
res = run_sql(query_str, where_args)
if res:
for row in res:
namespace, data_key, data_value_ser = row
data_value = cPickle.loads(data_value_ser)
if not namespace in self.cache:
self.cache[namespace] = {}
self.cache[namespace][data_key] = data_value
def _mark_dirty(self, namespace, data_key):
"""Marks a data key dirty - that should be saved into the database"""
if not namespace in self.dirty:
self.dirty[namespace] = {}
self.dirty[namespace][data_key] = True
def _database_get_distinct_string_list(self, column, namespace = None):
"""A private method reading a unique list of strings from the
moreinfo database table"""
where_str, where_args = self._generate_where_query_args(
namespace = namespace)
query_str = "SELECT DISTINCT %s FROM bibdocmoreinfo WHERE %s" % \
( column, where_str, )
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(where_args))
print "Executing query: " + query_str + " ARGS: " + repr(where_args)
res = run_sql(query_str, where_args)
return (res and [x[0] for x in res]) or [] # after migrating to python 2.6, can be rewritten using x if y else z syntax: return [x[0] for x in res] if res else []
def _database_get_namespaces(self):
"""Read the database to discover namespaces declared in a given MoreInfo"""
return self._database_get_distinct_string_list("namespace")
def _database_get_keys(self, namespace):
"""Returns all keys assigned in a given namespace of a MoreInfo instance"""
return self._database_get_distinct_string_list("data_key", namespace=namespace)
def _database_contains_key(self, namespace, key):
return self._database_read_value(namespace, key) != None
def _database_save_value(self, namespace, key, value):
"""Write changes into the database"""
#TODO: this should happen within one transaction
serialised_val = cPickle.dumps(value)
# ON DUPLICATE KEY will not work here as multiple NULL values are permitted by the index
if not self._database_contains_key(namespace, key):
#insert new value
query_parts = []
query_args = []
to_process = [(self.docid, "id_bibdoc"), (self.version, "version"),
(self.format, "format"), (self.relation, "id_rel"),
(str(namespace), "namespace"), (str(key), "data_key"),
(str(serialised_val), "data_value")]
for entry in to_process:
_val_or_null(entry[0], q_str = query_parts, q_args = query_args)
columns_str = ", ".join(map(lambda x: x[1], to_process))
values_str = ", ".join(query_parts)
query_str = "INSERT INTO bibdocmoreinfo (%s) VALUES(%s)" % \
(columns_str, values_str)
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(query_args))
print "Executing query: " + query_str + " ARGS: " + repr(query_args)
run_sql(query_str, query_args)
else:
#Update existing value
where_str, where_args = self._generate_where_query_args(namespace, key)
query_str = "UPDATE bibdocmoreinfo SET data_value=%s WHERE " + where_str
query_args = [str(serialised_val)] + where_args
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(query_args))
print "Executing query: " + query_str + " ARGS: " + repr(query_args)
run_sql(query_str, query_args )
def _database_read_value(self, namespace, key):
"""Reads a value directly from the database
@param namespace - namespace of the data to be read
@param key - key of the data to be read
"""
where_str, where_args = self._generate_where_query_args(namespace = namespace, data_key = key)
query_str = "SELECT data_value FROM bibdocmoreinfo WHERE " + where_str
res = run_sql(query_str, where_args)
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(where_args) + "WITH THE RESULT: " + str(res))
s_ = ""
if res:
s_ = cPickle.loads(res[0][0])
print "Executing query: " + query_str + " ARGS: " + repr(where_args) + " WITH THE RESULT: " + str(s_)
if res and res[0][0]:
try:
return cPickle.loads(res[0][0])
except:
raise Exception("Error when deserialising value for %s key=%s retrieved value=%s" % (repr(self), str(key), str(res[0][0])))
return None
def _database_remove_value(self, namespace, key):
"""Removes an entry directly in the database"""
where_str, where_args = self._generate_where_query_args(namespace = namespace, data_key = key)
query_str = "DELETE FROM bibdocmoreinfo WHERE " + where_str
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(where_args))
print "Executing query: " + query_str + " ARGS: " + repr(where_args)
run_sql(query_str, where_args)
return None
def _flush_cache(self):
"""Writes all the dirty cache entries into the database"""
for namespace in self.dirty:
for data_key in self.dirty[namespace]:
if namespace in self.cache and data_key in self.cache[namespace]\
and not self.cache[namespace][data_key] is None:
self._database_save_value(namespace, data_key, self.cache[namespace][data_key])
else:
# This might happen if a value has been removed from the cache
self._database_remove_value(namespace, data_key)
self.dirty = {}
def _generate_where_query_args(self, namespace = None, data_key = None):
"""Private method generating WHERE clause of SQL statements"""
ns = []
if namespace != None:
ns = [(namespace, "namespace")]
dk = []
if data_key != None:
dk = [(data_key, "data_key")]
to_process = [(self.docid, "id_bibdoc"), (self.version, "version"),
(self.format, "format"), (self.relation, "id_rel")] + \
ns + dk
return _sql_generate_conjunctive_where(to_process)
def set_data(self, namespace, key, value):
"""Sets a value in the cache and marks it dirty (flushed to the database unless cache_only)"""
if not namespace in self.cache:
self.cache[namespace] = {}
self.cache[namespace][key] = value
self._mark_dirty(namespace, key)
if not self.cache_only:
self._flush_cache()
def get_data(self, namespace, key):
"""Retrieves a value, preferring the cache when cache reads are enabled"""
if self.cache_reads or self.cache_only:
if namespace in self.cache and key in self.cache[namespace]:
return self.cache[namespace][key]
if not self.cache_only:
# we have a permission to read from the database
value = self._database_read_value(namespace, key)
if value:
if not namespace in self.cache:
self.cache[namespace] = {}
self.cache[namespace][key] = value
return value
return None
def del_key(self, namespace, key):
"""Removes a key from the cache (and from the database unless cache_only)"""
if not namespace in self.cache:
return None
del self.cache[namespace][key]
self._mark_dirty(namespace, key)
if not self.cache_only:
self._flush_cache()
def contains_key(self, namespace, key):
return self.get_data(namespace, key) != None
# the dictionary interface -> updating the default namespace
def __setitem__(self, key, value):
self.set_data("", key, value) #the default value
def __getitem__(self, key):
return self.get_data("", key)
def __delitem__(self, key):
self.del_key("", key)
def __contains__(self, key):
return self.contains_key("", key)
def __repr__(self):
return "MoreInfo(docid=%s, version=%s, docformat=%s, relation=%s)" % \
(self.docid, self.version, self.format, self.relation)
def delete(self):
"""Remove all entries associated with this MoreInfo"""
self.cache = {}
if not self.cache_only:
where_str, query_args = self._generate_where_query_args()
query_str = "DELETE FROM bibdocmoreinfo WHERE %s" % (where_str, )
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Executing query: " + query_str + " ARGS: " + repr(query_args))
print "Executing query: " + query_str + " ARGS: " + repr(query_args)
run_sql(query_str, query_args)
def get_cache(self):
"""Returns the content of the cache
@return The content of the MoreInfo cache
@rtype dictionary {namespace: {key1: value1, ... }, namespace2: {}}
"""
return self.cache
def get_namespaces(self):
"""Returns a list of namespaces present in the MoreInfo structure.
If the object is permitted access to the database, the data should
always be read from there. Unlike when reading a particular value,
we cannot tell whether a value is missing from the cache
"""
if self.cache_only and self.cache_reads:
return self.cache.keys()
return self._database_get_namespaces()
def get_keys(self, namespace):
"""Returns a list of keys present in a given namespace"""
if self.cache_only and self.cache_reads:
res = []
if namespace in self.cache:
res = self.cache[namespace].keys()
return res
else:
return self._database_get_keys(namespace)
def flush(self):
"""Flush the content into the database"""
self._flush_cache()
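In cache_only mode the class above reduces to a two-level dictionary with dirty-flag tracking. A minimal in-memory sketch of that pattern (an illustration, not the real class, and with no database behind it):

```python
class CacheOnlyMoreInfo:
    """Two-level {namespace: {key: value}} cache with dirty tracking."""
    def __init__(self):
        self.cache = {}
        self.dirty = {}

    def set_data(self, namespace, key, value):
        self.cache.setdefault(namespace, {})[key] = value
        # Remember which entries would need flushing to the database.
        self.dirty.setdefault(namespace, {})[key] = True

    def get_data(self, namespace, key):
        return self.cache.get(namespace, {}).get(key)

    # Dictionary interface delegating to the default ("") namespace.
    def __setitem__(self, key, value):
        self.set_data("", key, value)

    def __getitem__(self, key):
        return self.get_data("", key)

mi = CacheOnlyMoreInfo()
mi["comments"] = {1: {".pdf": "first upload"}}
assert mi["comments"][1][".pdf"] == "first upload"
assert mi.dirty == {"": {"comments": True}}
```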
class BibDocMoreInfo(MoreInfo):
"""
This class wraps contextual information of the documents, such as the
- comments
- descriptions
- flags.
Such information is kept separately for every format/version instance of
the corresponding document and is serialized in the database, ready
to be retrieved (but not searched).
@param docid: the document identifier.
@type docid: integer
@param more_info: a serialized version of an already existing more_info
object. If not specified, this information will be read from the
database; failing that, an empty dictionary will be allocated.
@raise ValueError: if docid is not a positive integer.
@ivar docid: the document identifier as passed to the constructor.
@type docid: integer
@ivar more_info: the more_info dictionary that will hold all the
additional document information.
@type more_info: dict of dict of dict
@note: in general this class is never instantiated in client code and
never used outside the bibdocfile module.
@note: this class will be extended in the future to hold all the new auxiliary
information about a document.
"""
def __init__(self, docid, cache_only = False, initial_data = None):
if not (type(docid) in (long, int) and docid > 0):
raise ValueError("docid is not a positive integer, but %s." % docid)
MoreInfo.__init__(self, docid, cache_only = cache_only, initial_data = initial_data)
if 'descriptions' not in self:
self['descriptions'] = {}
if 'comments' not in self:
self['comments'] = {}
if 'flags' not in self:
self['flags'] = {}
if DBG_LOG_QUERIES:
- from invenio.bibtask import write_message
+ from invenio.legacy.bibsched.bibtask import write_message
write_message("Creating BibDocMoreInfo :" + repr(self["comments"]))
print "Creating BibdocMoreInfo :" + repr(self["comments"])
def __repr__(self):
"""
@return: the canonical string representation of the C{BibDocMoreInfo}.
@rtype: string
"""
return 'BibDocMoreInfo(%i, %s)' % (self.docid, repr(cPickle.dumps(self)))
def set_flag(self, flagname, docformat, version):
"""
Sets a flag.
@param flagname: the flag to set (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}).
@type flagname: string
@param format: the format for which the flag should be set.
@type format: string
@param version: the version for which the flag should be set.
@type version: integer
@raise ValueError: if the flag is not in
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}
"""
if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
flags = self['flags']
if not flagname in flags:
flags[flagname] = {}
if not version in flags[flagname]:
flags[flagname][version] = {}
if not docformat in flags[flagname][version]:
flags[flagname][version][docformat] = {}
flags[flagname][version][docformat] = True
self['flags'] = flags
else:
raise ValueError, "%s is not in %s" % \
(flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS)
def get_comment(self, docformat, version):
"""
Returns the specified comment.
@param format: the format for which the comment should be
retrieved.
@type format: string
@param version: the version for which the comment should be
retrieved.
@type version: integer
@return: the specified comment.
@rtype: string
"""
try:
assert(type(version) is int)
docformat = normalize_format(docformat)
return self['comments'].get(version, {}).get(docformat)
except:
register_exception()
raise
def get_description(self, docformat, version):
"""
Returns the specified description.
@param format: the format for which the description should be
retrieved.
@type format: string
@param version: the version for which the description should be
retrieved.
@type version: integer
@return: the specified description.
@rtype: string
"""
try:
assert(type(version) is int)
docformat = normalize_format(docformat)
return self['descriptions'].get(version, {}).get(docformat)
except:
register_exception()
raise
def has_flag(self, flagname, docformat, version):
"""
Return True if the corresponding flag has been set.
@param flagname: the name of the flag (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}).
@type flagname: string
@param format: the format for which the flag should be checked.
@type format: string
@param version: the version for which the flag should be checked.
@type version: integer
@return: True if the flag is set for the given format/version.
@rtype: bool
@raise ValueError: if the flagname is not in
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}
"""
if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
return self['flags'].get(flagname, {}).get(version, {}).get(docformat, False)
else:
raise ValueError, "%s is not in %s" % (flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS)
def get_flags(self, docformat, version):
"""
Return the list of all the enabled flags.
@param format: the format for which the list should be returned.
@type format: string
@param version: the version for which the list should be returned.
@type version: integer
@return: the list of enabled flags (from
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}).
@rtype: list of string
"""
return [flag for flag in self['flags'] if docformat in self['flags'][flag].get(version, {})]
def set_comment(self, comment, docformat, version):
"""
Set a comment.
@param comment: the comment to be set.
@type comment: string
@param format: the format for which the comment should be set.
@type format: string
@param version: the version for which the comment should be set.
@type version: integer
"""
try:
assert(type(version) is int and version > 0)
docformat = normalize_format(docformat)
if comment == KEEP_OLD_VALUE:
comment = self.get_comment(docformat, version) or self.get_comment(docformat, version - 1)
if not comment:
self.unset_comment(docformat, version)
return
if not version in self['comments']:
comments = self['comments']
comments[version] = {}
self['comments'] = comments
comments = self['comments']
comments[version][docformat] = comment
self['comments'] = comments
except:
register_exception()
raise
def set_description(self, description, docformat, version):
"""
Set a description.
@param description: the description to be set.
@type description: string
@param format: the format for which the description should be set.
@type format: string
@param version: the version for which the description should be set.
@type version: integer
"""
try:
assert(type(version) is int and version > 0)
docformat = normalize_format(docformat)
if description == KEEP_OLD_VALUE:
description = self.get_description(docformat, version) or self.get_description(docformat, version - 1)
if not description:
self.unset_description(docformat, version)
return
descriptions = self['descriptions']
if not version in descriptions:
descriptions[version] = {}
descriptions[version][docformat] = description
self.set_data("", 'descriptions', descriptions)
except:
register_exception()
raise
def unset_comment(self, docformat, version):
"""
Unset a comment.
@param format: the format for which the comment should be unset.
@type format: string
@param version: the version for which the comment should be unset.
@type version: integer
"""
try:
assert(type(version) is int and version > 0)
comments = self['comments']
del comments[version][docformat]
self['comments'] = comments
except KeyError:
pass
except:
register_exception()
raise
def unset_description(self, docformat, version):
"""
Unset a description.
@param format: the format for which the description should be unset.
@type format: string
@param version: the version for which the description should be unset.
@type version: integer
"""
try:
assert(type(version) is int and version > 0)
descriptions = self['descriptions']
del descriptions[version][docformat]
self['descriptions'] = descriptions
except KeyError:
pass
except:
register_exception()
raise
def unset_flag(self, flagname, docformat, version):
"""
Unset a flag.
@param flagname: the flag to be unset (see
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}).
@type flagname: string
@param format: the format for which the flag should be unset.
@type format: string
@param version: the version for which the flag should be unset.
@type version: integer
@raise ValueError: if the flag is not in
L{CFG_BIBDOCFILE_AVAILABLE_FLAGS}
"""
if flagname in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
try:
flags = self['flags']
del flags[flagname][version][docformat]
self['flags'] = flags
except KeyError:
pass
else:
raise ValueError, "%s is not in %s" % (flagname, CFG_BIBDOCFILE_AVAILABLE_FLAGS)
_bib_relation__any_value = -1
class BibRelation(object):
"""
A representation of a relation between documents or their particular versions
"""
def __init__(self, rel_type = None,
bibdoc1_id = None, bibdoc2_id = None,
bibdoc1_ver = None, bibdoc2_ver = None,
bibdoc1_fmt = None, bibdoc2_fmt = None,
rel_id = None):
"""
The constructor of the class representing a relation between two
documents.
If the more_info parameter is specified, no data is retrieved from
the database and the internal dictionary is initialised with
the passed value. If more_info is not provided, the value is
read from the database; in the case of a non-existing record, an
empty dictionary is assigned.
If a version of either document is not specified, the resulting
object describes a relation between all versions of a given BibDoc.
@param bibdoc1
@type bibdoc1 BibDoc
@param bibdoc1_ver
@type bibdoc1_ver int
@param bibdoc2
@type bibdoc2 BibDoc
@param bibdoc2_ver
@type bibdoc2_ver int
@param bibdoc1_fmt format of the first document
@type bibdoc1_fmt string
@param bibdoc2_fmt format of the second document
@type bibdoc2_fmt string
@param rel_type
@type rel_type string
@param more_info The serialised representation of the more_info
@type more_info string
@param rel_id allows specifying the identifier of the newly created relation
@type rel_id unsigned int
"""
self.id = rel_id
self.bibdoc1_id = bibdoc1_id
self.bibdoc2_id = bibdoc2_id
self.bibdoc1_ver = bibdoc1_ver
self.bibdoc2_ver = bibdoc2_ver
self.bibdoc1_fmt = bibdoc1_fmt
self.bibdoc2_fmt = bibdoc2_fmt
self.rel_type = rel_type
if rel_id == None:
self._fill_id_from_data()
else:
self._fill_data_from_id()
self.more_info = MoreInfo(relation = self.id)
def _fill_data_from_id(self):
"""Fill all the relation data from the relation identifier
"""
query = "SELECT id_bibdoc1, version1, format1, id_bibdoc2, version2, format2, rel_type FROM bibdoc_bibdoc WHERE id=%s"
res = run_sql(query, (str(self.id), ))
if res != None and res[0] != None:
self.bibdoc1_id = res[0][0]
self.bibdoc1_ver = res[0][1]
self.bibdoc1_fmt = res[0][2]
self.bibdoc2_id = res[0][3]
self.bibdoc2_ver = res[0][4]
self.bibdoc2_fmt = res[0][5]
self.rel_type = res[0][6]
def _fill_id_from_data(self):
"""Fill the relation identifier based on the data provided"""
where_str, where_args = self._get_where_clauses()
query = "SELECT id FROM bibdoc_bibdoc WHERE %s" % (where_str, )
res = run_sql(query, where_args)
if res and res[0][0]:
self.id = int(res[0][0])
def _get_value_column_mapping(self):
"""
Returns a list of tuples; each tuple consists of a value and the name
of the database column where this value should fit
"""
return [(self.rel_type, "rel_type"), (self.bibdoc1_id, "id_bibdoc1"),
(self.bibdoc1_ver, "version1"),
(self.bibdoc1_fmt, "format1"),
(self.bibdoc2_id, "id_bibdoc2"),
(self.bibdoc2_ver, "version2"),
(self.bibdoc2_fmt, "format2")]
def _get_where_clauses(self):
"""Private function returning part of the SQL statement identifying
current relation
@return a tuple (where_str, query_arguments)
@rtype tuple
"""
return _sql_generate_conjunctive_where(self._get_value_column_mapping())
@staticmethod
def create(bibdoc1_id = None, bibdoc1_ver = None,
bibdoc1_fmt = None, bibdoc2_id = None,
bibdoc2_ver = None, bibdoc2_fmt = None,
rel_type = ""):
"""
Create a relation and return the instance.
Omitting an argument means that the relation matches any value of that parameter
"""
# check if there is already entry corresponding to parameters
existing = BibRelation.get_relations(rel_type = rel_type,
bibdoc1_id = bibdoc1_id,
bibdoc2_id = bibdoc2_id,
bibdoc1_ver = bibdoc1_ver,
bibdoc2_ver = bibdoc2_ver,
bibdoc1_fmt = bibdoc1_fmt,
bibdoc2_fmt = bibdoc2_fmt)
if len(existing) > 0:
return existing[0]
# build the insert query and execute it
to_process = [(rel_type, "rel_type"), (bibdoc1_id, "id_bibdoc1"),
(bibdoc1_ver, "version1"), (bibdoc1_fmt, "format1"),
(bibdoc2_id, "id_bibdoc2"), (bibdoc2_ver, "version2"),
(bibdoc2_fmt, "format2")]
values_list = []
args_list = []
columns_list = []
for entry in to_process:
columns_list.append(entry[1])
if entry[0] == None:
values_list.append("NULL")
else:
values_list.append("%s")
args_list.append(entry[0])
query = "INSERT INTO bibdoc_bibdoc (%s) VALUES (%s)" % (", ".join(columns_list), ", ".join(values_list))
# print "Query: %s Args: %s" % (query, str(args_list))
rel_id = run_sql(query, args_list)
return BibRelation(rel_id = rel_id)
def delete(self):
""" Removes a relation between objects from the database.
Executing the flush function on the same object will restore
the relation.
"""
where_str, where_args = self._get_where_clauses()
run_sql("DELETE FROM bibdoc_bibdoc WHERE %s" % (where_str,), where_args) # kwalitee: disable=sql
# removing associated MoreInfo
self.more_info.delete()
def get_more_info(self):
return self.more_info
@staticmethod
def get_relations(rel_type = _bib_relation__any_value,
bibdoc1_id = _bib_relation__any_value,
bibdoc2_id = _bib_relation__any_value,
bibdoc1_ver = _bib_relation__any_value,
bibdoc2_ver = _bib_relation__any_value,
bibdoc1_fmt = _bib_relation__any_value,
bibdoc2_fmt = _bib_relation__any_value):
"""Retrieves the list of relations satisfying the given conditions.
If a parameter is specified, its value has to match exactly.
If a parameter is omitted, any of its values will be accepted"""
to_process = [(rel_type, "rel_type"), (bibdoc1_id, "id_bibdoc1"),
(bibdoc1_ver, "version1"), (bibdoc1_fmt, "format1"),
(bibdoc2_id, "id_bibdoc2"), (bibdoc2_ver, "version2"),
(bibdoc2_fmt, "format2")]
where_str, where_args = _sql_generate_conjunctive_where(
filter(lambda x: x[0] != _bib_relation__any_value, to_process))
if where_str:
where_str = "WHERE " + where_str # in case of nonempty where, we need a where clause
query_str = "SELECT id FROM bibdoc_bibdoc %s" % (where_str, )
# print "running query : %s with arguments %s on the object %s" % (query_str, str(where_args), repr(self))
try:
res = run_sql(query_str, where_args)
except:
raise Exception(query_str + " " + str(where_args))
results = []
if res != None:
for res_row in res:
results.append(BibRelation(rel_id=res_row[0]))
return results
# Access to MoreInfo
def set_data(self, category, key, value):
"""assign additional information to this relation"""
self.more_info.set_data(category, key, value)
def get_data(self, category, key):
"""read additional information assigned to this relation"""
return self.more_info.get_data(category, key)
#the dictionary interface allowing to set data bypassing the namespaces
def __setitem__(self, key, value):
self.more_info[key] = value
def __getitem__(self, key):
return self.more_info[key]
def __contains__(self, key):
return self.more_info.__contains__(key)
def __repr__(self):
return "BibRelation(id_bibdoc1 = %s, version1 = %s, format1 = %s, id_bibdoc2 = %s, version2 = %s, format2 = %s, rel_type = %s)" % \
(self.bibdoc1_id, self.bibdoc1_ver, self.bibdoc1_fmt,
self.bibdoc2_id, self.bibdoc2_ver, self.bibdoc2_fmt,
self.rel_type)
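The WHERE clauses used throughout these classes come from `_sql_generate_conjunctive_where`, defined elsewhere in this module. Its apparent contract, judging from the call sites, is that None values become `col IS NULL` (with no placeholder argument) and everything else becomes a `col=%s` placeholder plus an argument. A sketch of that behaviour (an assumed reconstruction, not the actual helper):

```python
def conjunctive_where(value_column_pairs):
    """Build (where_str, args) from [(value, column), ...] pairs.

    None maps to "col IS NULL" (no placeholder arg); anything else maps
    to "col=%s" with the value appended to the argument list.
    """
    parts, args = [], []
    for value, column in value_column_pairs:
        if value is None:
            parts.append("%s IS NULL" % column)
        else:
            parts.append("%s=%%s" % column)
            args.append(value)
    return " AND ".join(parts), args

where_str, args = conjunctive_where([(5, "id_bibdoc1"), (None, "version1")])
assert where_str == "id_bibdoc1=%s AND version1 IS NULL"
assert args == [5]
```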
def readfile(filename):
"""
Read a file.
@param filename: the name of the file to be read.
@type filename: string
@return: the text contained in the file.
@rtype: string
@note: Returns empty string in case of any error.
@note: this function is useful for quick implementation of websubmit
functions.
"""
try:
return open(filename).read()
except Exception:
return ''
class HeadRequest(urllib2.Request):
"""
A request object to perform a HEAD request.
"""
def get_method(self):
return 'HEAD'
def read_cookie(cookiefile):
"""
Parses a cookie file and returns a string as needed for the urllib2 headers.
The file should respect the Netscape cookie specification
"""
cookie_data = ''
cfile = open(cookiefile, 'r')
for line in cfile.readlines():
tokens = line.split('\t')
if len(tokens) == 7: # we are on a cookie line
cookie_data += '%s=%s; ' % (tokens[5], tokens[6].replace('\n', ''))
cfile.close()
return cookie_data
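A Netscape cookie file is tab-separated with seven fields per cookie line, the last two being the cookie name and value; read_cookie joins those into a header-ready string. The per-line parse can be sketched in isolation on an in-memory line (a standalone illustration, not the function above):

```python
def cookie_line_to_header(line):
    # Netscape format fields: domain, flag, path, secure, expiry, name, value.
    tokens = line.split('\t')
    if len(tokens) != 7:
        return ''  # comments and blank lines are skipped
    return '%s=%s; ' % (tokens[5], tokens[6].rstrip('\n'))

line = ".example.org\tTRUE\t/\tFALSE\t0\tsession\tabc123\n"
assert cookie_line_to_header(line) == 'session=abc123; '
```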
def open_url(url, headers=None, head_request=False):
"""
Opens a URL. If headers are passed as argument, no check is performed and
the URL will be opened. Otherwise checks if the URL is present in
CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS and uses the headers specified in
the config variable.
@param url: the URL to open
@type url: string
@param headers: the headers to use
@type headers: dictionary
@param head_request: if True, perform a HEAD request, otherwise a GET
request
@type head_request: boolean
@return: a file-like object as returned by urllib2.urlopen.
"""
headers_to_use = None
if headers is None:
for regex, headers in _CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS:
if regex.match(url) is not None:
headers_to_use = headers
break
if headers_to_use is None:
# URL is not allowed.
raise InvenioBibdocfileUnauthorizedURL, "%s is not an authorized " \
"external URL." % url
else:
headers_to_use = headers
request_obj = head_request and HeadRequest or urllib2.Request
request = request_obj(url)
request.add_header('User-Agent', make_user_agent_string('bibdocfile'))
for key, value in headers_to_use.items():
try:
value = globals()[value['fnc']](**value['args'])
except (KeyError, TypeError):
pass
request.add_header(key, value)
return urllib2.urlopen(request)
def update_modification_date_of_file(filepath, modification_date):
"""Update the modification time and date of the file with the modification_date
@param filepath: the full path of the file that needs to be updated
@type filepath: string
@param modification_date: the new modification date and time
@type modification_date: datetime.datetime object
"""
try:
modif_date_in_seconds = time.mktime(modification_date.timetuple()) # try to get the time in seconds
except (AttributeError, TypeError):
modif_date_in_seconds = 0
if modif_date_in_seconds:
statinfo = os.stat(filepath) # we need to keep the same access time
os.utime(filepath, (statinfo.st_atime, modif_date_in_seconds)) #update the modification time
diff --git a/invenio/legacy/bibdocfile/cli.py b/invenio/legacy/bibdocfile/cli.py
index 7f30870c1..fc7bf32e3 100644
--- a/invenio/legacy/bibdocfile/cli.py
+++ b/invenio/legacy/bibdocfile/cli.py
@@ -1,1259 +1,1260 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibDocAdmin CLI administration tool
"""
__revision__ = "$Id$"
import sys
import re
import os
import time
import fnmatch
import time
from datetime import datetime
from logging import getLogger, debug, DEBUG
from optparse import OptionParser, OptionGroup, OptionValueError
from tempfile import mkstemp
+from invenio.base.factory import with_app_context
+
from invenio.ext.logging import register_exception
from invenio.config import CFG_SITE_URL, CFG_BIBDOCFILE_FILEDIR, \
CFG_SITE_RECORD, CFG_TMPSHAREDDIR
-from invenio.bibdocfile import BibRecDocs, BibDoc, InvenioBibDocFileError, \
+from invenio.legacy.bibdocfile.api import BibRecDocs, BibDoc, InvenioBibDocFileError, \
nice_size, check_valid_url, clean_url, get_docname_from_url, \
guess_format_from_url, KEEP_OLD_VALUE, decompose_bibdocfile_fullpath, \
bibdocfile_url_to_bibdoc, decompose_bibdocfile_url, CFG_BIBDOCFILE_AVAILABLE_FLAGS
from invenio.intbitset import intbitset
from invenio.legacy.search_engine import perform_request_search
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.utils.text import encode_for_xml
-from invenio.websubmit_file_converter import can_perform_ocr
+from invenio.legacy.websubmit.file_converter import can_perform_ocr
def _xml_mksubfield(key, subfield, fft):
return fft.get(key, None) is not None and '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(str(fft[key]))) or ''
def _xml_mksubfields(key, subfield, fft):
ret = ""
for value in fft.get(key, []):
ret += '\t\t<subfield code="%s">%s</subfield>\n' % (subfield, encode_for_xml(str(value)))
return ret
def _xml_fft_creator(fft):
"""Transform an fft dictionary (made by keys url, docname, format,
new_docname, comment, description, restriction, doctype, into an xml
string."""
debug('Input FFT structure: %s' % fft)
out = '\t<datafield tag ="FFT" ind1=" " ind2=" ">\n'
out += _xml_mksubfield('url', 'a', fft)
out += _xml_mksubfield('docname', 'n', fft)
out += _xml_mksubfield('format', 'f', fft)
out += _xml_mksubfield('new_docname', 'm', fft)
out += _xml_mksubfield('doctype', 't', fft)
out += _xml_mksubfield('description', 'd', fft)
out += _xml_mksubfield('comment', 'z', fft)
out += _xml_mksubfield('restriction', 'r', fft)
out += _xml_mksubfields('options', 'o', fft)
out += _xml_mksubfield('version', 'v', fft)
out += '\t</datafield>\n'
debug('FFT created: %s' % out)
return out
def ffts_to_xml(ffts_dict):
"""Transform a dictionary: recid -> ffts where ffts is a list of fft dictionary
into xml.
"""
debug('Input FFTs dictionary: %s' % ffts_dict)
out = ''
recids = ffts_dict.keys()
recids.sort()
for recid in recids:
ffts = ffts_dict[recid]
if ffts:
out += '<record>\n'
out += '\t<controlfield tag="001">%i</controlfield>\n' % recid
for fft in ffts:
out += _xml_fft_creator(fft)
out += '</record>\n'
debug('MARC to Upload: %s' % out)
return out
_shift_re = re.compile("([-\+]{0,1})([\d]+)([dhms])")
def _parse_datetime(var):
"""Returns a date string according to the format string.
It can handle normal date strings and shifts with respect
to now."""
if not var:
return None
date = time.time()
factors = {"d":24*3600, "h":3600, "m":60, "s":1}
m = _shift_re.match(var)
if m:
sign = m.groups()[0] == "-" and -1 or 1
factor = factors[m.groups()[2]]
value = float(m.groups()[1])
return datetime.fromtimestamp(date + sign * factor * value)
else:
return datetime(*(time.strptime(var, "%Y-%m-%d %H:%M:%S")[0:6]))
# The code above is Python 2.4 compatible. The following is the 2.5
# version.
# return datetime.strptime(var, "%Y-%m-%d %H:%M:%S")
def _parse_date_range(var):
"""Returns the two dates contained as a low,high tuple"""
limits = var.split(",")
if len(limits)==1:
low = _parse_datetime(limits[0])
return low, None
if len(limits)==2:
low = _parse_datetime(limits[0])
high = _parse_datetime(limits[1])
return low, high
return None, None
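_parse_datetime accepts either an absolute "%Y-%m-%d %H:%M:%S" string or a relative shift such as "-2d" or "+30m". The shift branch can be exercised with a standalone copy of the regex and factor table (a sketch mirroring the code above, not an import of it):

```python
import re
from datetime import datetime, timedelta

_shift_re = re.compile(r"([-+]?)(\d+)([dhms])")
_factors = {"d": 24 * 3600, "h": 3600, "m": 60, "s": 1}

def parse_shift(var, now):
    """Return now shifted by e.g. "-2d" or "+30m", or None if no match."""
    m = _shift_re.match(var)
    if not m:
        return None
    sign = -1 if m.group(1) == "-" else 1
    seconds = sign * _factors[m.group(3)] * int(m.group(2))
    return now + timedelta(seconds=seconds)

now = datetime(2013, 1, 10, 12, 0, 0)
assert parse_shift("-2d", now) == datetime(2013, 1, 8, 12, 0, 0)
assert parse_shift("+30m", now) == datetime(2013, 1, 10, 12, 30, 0)
```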
def cli_quick_match_all_recids(options):
"""Quickly return an approximate, by-excess list of matching recids."""
url = getattr(options, 'url', None)
if url:
return intbitset([decompose_bibdocfile_url(url)[0]])
path = getattr(options, 'path', None)
if path:
docid = decompose_bibdocfile_fullpath(path)["doc_id"]
bd = BibDoc(docid)
ids = []
for rec_link in bd.bibrec_links:
ids.append(rec_link["recid"])
return intbitset(ids)
docids = getattr(options, 'docids', None)
if docids:
ids = []
for docid in docids:
bd = BibDoc(docid)
for rec_link in bd.bibrec_links:
ids.append(rec_link["recid"])
return intbitset(ids)
collection = getattr(options, 'collection', None)
pattern = getattr(options, 'pattern', None)
recids = getattr(options, 'recids', None)
md_rec = getattr(options, 'md_rec', None)
cd_rec = getattr(options, 'cd_rec', None)
tmp_date_query = []
tmp_date_params = []
if recids is None:
debug('Initially considering all the recids')
recids = intbitset(run_sql('SELECT id FROM bibrec'))
if not recids:
print >> sys.stderr, 'WARNING: No record in the database'
if md_rec[0] is not None:
tmp_date_query.append('modification_date>=%s')
tmp_date_params.append(md_rec[0])
if md_rec[1] is not None:
tmp_date_query.append('modification_date<=%s')
tmp_date_params.append(md_rec[1])
if cd_rec[0] is not None:
tmp_date_query.append('creation_date>=%s')
tmp_date_params.append(cd_rec[0])
if cd_rec[1] is not None:
tmp_date_query.append('creation_date<=%s')
tmp_date_params.append(cd_rec[1])
if tmp_date_query:
tmp_date_query = ' AND '.join(tmp_date_query)
tmp_date_params = tuple(tmp_date_params)
query = 'SELECT id FROM bibrec WHERE %s' % tmp_date_query
debug('Query: %s, param: %s' % (query, tmp_date_params))
recids &= intbitset(run_sql(query, tmp_date_params))
debug('After applying dates we obtain recids: %s' % recids)
if not recids:
print >> sys.stderr, 'WARNING: Time constraints for records are too strict'
if collection or pattern:
recids &= intbitset(perform_request_search(cc=collection or '', p=pattern or ''))
debug('After applying pattern and collection we obtain recids: %s' % recids)
debug('Quick recids: %s' % recids)
return recids
def cli_quick_match_all_docids(options, recids=None):
"""Return an quickly an approximate but (by excess) list of good docids."""
url = getattr(options, 'url', None)
if url:
return intbitset([bibdocfile_url_to_bibdoc(url).get_id()])
path = getattr(options, 'path', None)
if path:
docid = decompose_bibdocfile_fullpath(path)["doc_id"]
bd = BibDoc(docid)
ids = []
for rec_link in bd.bibrec_links:
ids.append(rec_link["recid"])
return intbitset(ids)
deleted_docs = getattr(options, 'deleted_docs', None)
action_undelete = getattr(options, 'action', None) == 'undelete'
docids = getattr(options, 'docids', None)
md_doc = getattr(options, 'md_doc', None)
cd_doc = getattr(options, 'cd_doc', None)
if docids is None:
debug('Initially considering all the docids')
if recids is None:
recids = cli_quick_match_all_recids(options)
docids = intbitset()
for id_bibrec, id_bibdoc in run_sql('SELECT id_bibrec, id_bibdoc FROM bibrec_bibdoc'):
if id_bibrec in recids:
docids.add(id_bibdoc)
else:
debug('Initially considering this docids: %s' % docids)
tmp_query = []
tmp_params = []
if deleted_docs is None and action_undelete:
deleted_docs = 'only'
if deleted_docs == 'no':
tmp_query.append('status<>"DELETED"')
elif deleted_docs == 'only':
tmp_query.append('status="DELETED"')
if md_doc[0] is not None:
tmp_query.append('modification_date>=%s')
tmp_params.append(md_doc[0])
if md_doc[1] is not None:
tmp_query.append('modification_date<=%s')
tmp_params.append(md_doc[1])
if cd_doc[0] is not None:
tmp_query.append('creation_date>=%s')
tmp_params.append(cd_doc[0])
if cd_doc[1] is not None:
tmp_query.append('creation_date<=%s')
tmp_params.append(cd_doc[1])
if tmp_query:
tmp_query = ' AND '.join(tmp_query)
tmp_params = tuple(tmp_params)
query = 'SELECT id FROM bibdoc WHERE %s' % tmp_query
debug('Query: %s, param: %s' % (query, tmp_params))
docids &= intbitset(run_sql(query, tmp_params))
debug('After applying dates we obtain docids: %s' % docids)
return docids
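Both quick-match functions assemble their date window the same way: optional clauses collected into a list, AND-joined, and paired with a parameter tuple. A minimal sketch of that pattern (hypothetical helper, plain strings, no `run_sql`):

```python
def build_query(base, clauses, params):
    """Append optional AND-joined clauses to a base SELECT; return (sql, params)."""
    if not clauses:
        return base, ()
    return base + ' WHERE ' + ' AND '.join(clauses), tuple(params)
```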
def cli_slow_match_single_recid(options, recid, docids=None):
"""Apply all the given queries in order to assert wethever a recid
match or not.
if with_docids is True, the recid is matched if it has at least one docid that is matched"""
debug('cli_slow_match_single_recid checking: %s' % recid)
deleted_docs = getattr(options, 'deleted_docs', None)
deleted_recs = getattr(options, 'deleted_recs', None)
empty_recs = getattr(options, 'empty_recs', None)
docname = cli2docname(options)
bibrecdocs = BibRecDocs(recid, deleted_too=(deleted_docs != 'no'))
if bibrecdocs.deleted_p() and (deleted_recs == 'no'):
return False
elif not bibrecdocs.deleted_p() and (deleted_recs != 'only'):
if docids:
for bibdoc in bibrecdocs.list_bibdocs():
if bibdoc.get_id() in docids:
break
else:
return False
if docname:
for other_docname in bibrecdocs.get_bibdoc_names():
if docname and fnmatch.fnmatchcase(other_docname, docname):
break
else:
return False
if bibrecdocs.empty_p() and (empty_recs != 'no'):
return True
elif not bibrecdocs.empty_p() and (empty_recs != 'only'):
return True
return False
def cli_slow_match_single_docid(options, docid, recids=None):
"""Apply all the given queries in order to assert wethever a recid
match or not."""
debug('cli_slow_match_single_docid checking: %s' % docid)
empty_docs = getattr(options, 'empty_docs', None)
docname = cli2docname(options)
if recids is None:
recids = cli_quick_match_all_recids(options)
bibdoc = BibDoc.create_instance(docid)
dn = None
if bibdoc.bibrec_links:
dn = bibdoc.bibrec_links[0]["docname"]
if docname and not fnmatch.fnmatchcase(dn, docname):
debug('docname %s does not match the pattern %s' % (repr(dn), repr(docname)))
return False
# elif bibdoc.get_recid() and bibdoc.get_recid() not in recids:
# debug('recid %s is not in pattern %s' % (repr(bibdoc.get_recid()), repr(recids)))
# return False
elif empty_docs == 'no' and bibdoc.empty_p():
debug('bibdoc is empty')
return False
elif empty_docs == 'only' and not bibdoc.empty_p():
debug('bibdoc is not empty')
return False
else:
return True
def cli2recid(options, recids=None, docids=None):
"""Given the command line options return a recid."""
recids = list(cli_recids_iterator(options, recids=recids, docids=docids))
if len(recids) == 1:
return recids[0]
if recids:
raise StandardError, "More than one recid has been matched: %s" % recids
else:
raise StandardError, "No recids matched"
def cli2docid(options, recids=None, docids=None):
"""Given the command line options return a docid."""
docids = list(cli_docids_iterator(options, recids=recids, docids=docids))
if len(docids) == 1:
return docids[0]
if docids:
raise StandardError, "More than one docid has been matched: %s" % docids
else:
raise StandardError, "No docids matched"
def cli2flags(options):
"""
Transform a comma separated list of flags into a list of valid flags.
"""
flags = getattr(options, 'flags', None)
if flags:
flags = [flag.strip().upper() for flag in flags.split(',')]
for flag in flags:
if flag not in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
raise StandardError("%s is not among the valid flags: %s" % (flag, ', '.join(CFG_BIBDOCFILE_AVAILABLE_FLAGS)))
return flags
return []
def cli2description(options):
"""Return a good value for the description."""
description = getattr(options, 'set_description', None)
if description is None:
description = KEEP_OLD_VALUE
return description
def cli2restriction(options):
"""Return a good value for the restriction."""
restriction = getattr(options, 'set_restriction', None)
if restriction is None:
restriction = KEEP_OLD_VALUE
return restriction
def cli2comment(options):
"""Return a good value for the comment."""
comment = getattr(options, 'set_comment', None)
if comment is None:
comment = KEEP_OLD_VALUE
return comment
def cli2doctype(options):
"""Return a good value for the doctype."""
doctype = getattr(options, 'set_doctype', None)
if not doctype:
return 'Main'
return doctype
def cli2docname(options, url=None):
"""Given the command line options and optional precalculated docid
returns the corresponding docname."""
docname = getattr(options, 'docname', None)
if docname is not None:
return docname
if url is not None:
return get_docname_from_url(url)
else:
return None
def cli2format(options, url=None):
"""Given the command line options returns the corresponding format."""
docformat = getattr(options, 'format', None)
if docformat is not None:
return docformat
elif url is not None:
## FIXME: to deploy once conversion-tools branch is merged
#return guess_format_from_url(url)
return guess_format_from_url(url)
else:
raise OptionValueError("Not enough information to retrieve a valid format")
def cli_recids_iterator(options, recids=None, docids=None):
"""Slow iterator over all the matched recids.
if with_docids is True, the recid must be attached to at least a matched docid"""
debug('cli_recids_iterator')
if recids is None:
recids = cli_quick_match_all_recids(options)
debug('working on recids: %s, docids: %s' % (recids, docids))
for recid in recids:
if cli_slow_match_single_recid(options, recid, docids):
yield recid
raise StopIteration
def cli_docids_iterator(options, recids=None, docids=None):
"""Slow iterator over all the matched docids."""
if recids is None:
recids = cli_quick_match_all_recids(options)
if docids is None:
docids = cli_quick_match_all_docids(options, recids)
for docid in docids:
if cli_slow_match_single_docid(options, docid, recids):
yield docid
raise StopIteration
def cli_get_stats(dummy):
"""Print per every collection some stats"""
def print_table(title, table):
if table:
print "=" * 20, title, "=" * 20
for row in table:
print "\t".join(str(elem) for elem in row)
for collection, reclist in run_sql("SELECT name, reclist FROM collection ORDER BY name"):
print "-" * 79
print "Statistic for: %s " % collection
reclist = intbitset(reclist)
if reclist:
sqlreclist = "(" + ','.join(str(elem) for elem in reclist) + ')'
print_table("Formats", run_sql("SELECT COUNT(format) as c, format FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true GROUP BY format ORDER BY c DESC" % sqlreclist)) # kwalitee: disable=sql
print_table("Mimetypes", run_sql("SELECT COUNT(mime) as c, mime FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true GROUP BY mime ORDER BY c DESC" % sqlreclist)) # kwalitee: disable=sql
print_table("Sizes", run_sql("SELECT SUM(filesize) AS c FROM bibrec_bibdoc AS bb JOIN bibdocfsinfo AS fs ON bb.id_bibdoc=fs.id_bibdoc WHERE id_bibrec in %s AND last_version=true" % sqlreclist)) # kwalitee: disable=sql
class OptionParserSpecial(OptionParser):
def format_help(self, *args, **kwargs):
result = OptionParser.format_help(self, *args, **kwargs)
if hasattr(self, 'trailing_text'):
return "%s\n%s\n" % (result, self.trailing_text)
else:
return result
def prepare_option_parser():
"""Parse the command line options."""
def _ids_ranges_callback(option, opt, value, parser):
"""Callback for optparse to parse a set of ids ranges in the form
nnn1-nnn2,mmm1-mmm2... returning the corresponding intbitset.
"""
try:
debug('option: %s, opt: %s, value: %s, parser: %s' % (option, opt, value, parser))
debug('Parsing range: %s' % value)
value = ranges2ids(value)
setattr(parser.values, option.dest, value)
except Exception, e:
raise OptionValueError("It's impossible to parse the range '%s' for option %s: %s" % (value, opt, e))
def _date_range_callback(option, opt, value, parser):
"""Callback for optparse to parse a range of dates in the form
[date1],[date2]. Both date1 and date2 could be optional.
the date can be expressed absolutely ("%Y-%m-%d %H:%M:%S")
or relatively (([-\+]{0,1})([\d]+)([dhms])) to the current time."""
try:
value = _parse_date_range(value)
setattr(parser.values, option.dest, value)
except Exception, e:
raise OptionValueError("It's impossible to parse the range '%s' for option %s: %s" % (value, opt, e))
parser = OptionParserSpecial(usage="usage: %prog [options]",
#epilog="""With <query> you select the range of record/docnames/single files to work on. Note that some actions e.g. delete, append, revise etc. work at the docname level, while others like --set-comment, --set-description, at single file level and others can be applied in an iterative way to many records in a single run. Note that specifying docid(2) takes precedence over recid(2) which in turn takes precedence over pattern/collection search.""",
version=__revision__)
parser.trailing_text = """
Examples:
$ bibdocfile --append foo.tar.gz --recid=1
$ bibdocfile --revise http://foo.com?search=123 --with-docname='sam'
--format=pdf --recid=3 --set-docname='pippo' # revise for record 3
# the document sam, renaming it to pippo.
$ bibdocfile --delete --with-docname="*sam" --all # delete all documents
# whose docname ends
# with "sam"
$ bibdocfile --undelete -c "Test Collection" # undelete documents for
# the collection
$ bibdocfile --get-info --recids=1-4,6-8 # obtain information
$ bibdocfile -r 1 --with-docname=foo --set-docname=bar # Rename a document
$ bibdocfile -r 1 --set-restriction "firerole: deny until '2011-01-01'
allow any" # set an embargo to all the documents attached to record 1
# (note the ^M or \\n before 'allow any')
# See also $r subfield in <%(site)s/help/admin/bibupload-admin-guide#3.6>
# and Firerole in <%(site)s/help/admin/webaccess-admin-guide#6>
$ bibdocfile --append x.pdf --recid=1 --with-flags='PDF/A,OCRED' # append
# to record 1 the file x.pdf specifying the PDF/A and OCRED flags
""" % {'site': CFG_SITE_URL}
query_options = OptionGroup(parser, 'Query options')
query_options.add_option('-r', '--recids', action="callback", callback=_ids_ranges_callback, type='string', dest='recids', help='matches records by recids, e.g.: --recids=1-3,5-7')
query_options.add_option('-d', '--docids', action="callback", callback=_ids_ranges_callback, type='string', dest='docids', help='matches documents by docids, e.g.: --docids=1-3,5-7')
query_options.add_option('-a', '--all', action='store_true', dest='all', help='Select all the records')
query_options.add_option("--with-deleted-recs", choices=['yes', 'no', 'only'], type="choice", dest="deleted_recs", help="'Yes' to also match deleted records, 'no' to exclude them, 'only' to match only deleted ones", metavar="yes/no/only", default='no')
query_options.add_option("--with-deleted-docs", choices=['yes', 'no', 'only'], type="choice", dest="deleted_docs", help="'Yes' to also match deleted documents, 'no' to exclude them, 'only' to match only deleted ones (e.g. for undeletion)", metavar="yes/no/only", default='no')
query_options.add_option("--with-empty-recs", choices=['yes', 'no', 'only'], type="choice", dest="empty_recs", help="'Yes' to also match records without attached documents, 'no' to exclude them, 'only' to consider only such records (e.g. for statistics)", metavar="yes/no/only", default='no')
query_options.add_option("--with-empty-docs", choices=['yes', 'no', 'only'], type="choice", dest="empty_docs", help="'Yes' to also match documents without attached files, 'no' to exclude them, 'only' to consider only such documents (e.g. for sanity checking)", metavar="yes/no/only", default='no')
query_options.add_option("--with-record-modification-date", action="callback", callback=_date_range_callback, dest="md_rec", nargs=1, type="string", default=(None, None), help="matches records modified date1 and date2; dates can be expressed relatively, e.g.:\"-5m,2030-2-23 04:40\" # matches records modified since 5 minutes ago until the 2030...", metavar="date1,date2")
query_options.add_option("--with-record-creation-date", action="callback", callback=_date_range_callback, dest="cd_rec", nargs=1, type="string", default=(None, None), help="matches records created between date1 and date2; dates can be expressed relatively", metavar="date1,date2")
query_options.add_option("--with-document-modification-date", action="callback", callback=_date_range_callback, dest="md_doc", nargs=1, type="string", default=(None, None), help="matches documents modified between date1 and date2; dates can be expressed relatively", metavar="date1,date2")
query_options.add_option("--with-document-creation-date", action="callback", callback=_date_range_callback, dest="cd_doc", nargs=1, type="string", default=(None, None), help="matches documents created between date1 and date2; dates can be expressed relatively", metavar="date1,date2")
query_options.add_option("--url", dest="url", help='matches the document referred by the URL, e.g. "%s/%s/1/files/foobar.pdf?version=2"' % (CFG_SITE_URL, CFG_SITE_RECORD))
query_options.add_option("--path", dest="path", help='matches the document referred by the internal filesystem path, e.g. %s/g0/1/foobar.pdf\\;1' % CFG_BIBDOCFILE_FILEDIR)
query_options.add_option("--with-docname", dest="docname", help='matches documents with the given docname (accept wildcards)')
query_options.add_option("--with-doctype", dest="doctype", help='matches documents with the given doctype')
query_options.add_option('-p', '--pattern', dest='pattern', help='matches records by pattern')
query_options.add_option('-c', '--collection', dest='collection', help='matches records by collection')
query_options.add_option('--force', dest='force', help='force an action even when it\'s not necessary e.g. textify on an already textified bibdoc.', action='store_true', default=False)
parser.add_option_group(query_options)
getting_information_options = OptionGroup(parser, 'Actions for getting information')
getting_information_options.add_option('--get-info', dest='action', action='store_const', const='get-info', help='print all the information about the matched records/documents')
getting_information_options.add_option('--get-disk-usage', dest='action', action='store_const', const='get-disk-usage', help='print disk usage statistics of the matched documents')
getting_information_options.add_option('--get-history', dest='action', action='store_const', const='get-history', help='print the matched documents history')
getting_information_options.add_option('--get-stats', dest='action', action='store_const', const='get-stats', help='print some statistics of file properties grouped by collections')
parser.add_option_group(getting_information_options)
setting_information_options = OptionGroup(parser, 'Actions for setting information')
setting_information_options.add_option('--set-doctype', dest='set_doctype', help='specify the new doctype', metavar='doctype')
setting_information_options.add_option('--set-description', dest='set_description', help='specify a description', metavar='description')
setting_information_options.add_option('--set-comment', dest='set_comment', help='specify a comment', metavar='comment')
setting_information_options.add_option('--set-restriction', dest='set_restriction', help='specify a restriction tag', metavar='restriction')
setting_information_options.add_option('--set-docname', dest='new_docname', help='specifies a new docname for renaming', metavar='docname')
setting_information_options.add_option("--unset-comment", action="store_const", const='', dest="set_comment", help="remove any comment")
setting_information_options.add_option("--unset-descriptions", action="store_const", const='', dest="set_description", help="remove any description")
setting_information_options.add_option("--unset-restrictions", action="store_const", const='', dest="set_restriction", help="remove any restriction")
setting_information_options.add_option("--hide", dest="action", action='store_const', const='hide', help="hides matched documents and revisions")
setting_information_options.add_option("--unhide", dest="action", action='store_const', const='unhide', help="hides matched documents and revisions")
parser.add_option_group(setting_information_options)
revising_options = OptionGroup(parser, 'Action for revising content')
revising_options.add_option("--append", dest='append_path', help='specify the URL/path of the file that will appended to the bibdoc (implies --with-empty-recs=yes)', metavar='PATH/URL')
revising_options.add_option("--revise", dest='revise_path', help='specify the URL/path of the file that will revise the bibdoc', metavar='PATH/URL')
revising_options.add_option("--revert", dest='action', action='store_const', const='revert', help='reverts a document to the specified version')
revising_options.add_option("--delete", action='store_const', const='delete', dest='action', help='soft-delete the matched documents')
revising_options.add_option("--hard-delete", action='store_const', const='hard-delete', dest='action', help='hard-delete the single matched document with a specific format and a specific revision (this operation is not revertible)')
revising_options.add_option("--undelete", action='store_const', const='undelete', dest='action', help='undelete previosuly soft-deleted documents')
revising_options.add_option("--purge", action='store_const', const='purge', dest='action', help='purge (i.e. hard-delete any format of any version prior to the latest version of) the matched documents')
revising_options.add_option("--expunge", action='store_const', const='expunge', dest='action', help='expunge (i.e. hard-delete any version and formats of) the matched documents')
revising_options.add_option("--with-version", dest="version", help="specifies the version(s) to be used with hide, unhide, e.g.: 1-2,3 or ALL. Specifies the version to be used with hard-delete and revert, e.g. 2")
revising_options.add_option("--with-format", dest="format", help='to specify a format when appending/revising/deleting/reverting a document, e.g. "pdf"', metavar='FORMAT')
revising_options.add_option("--with-hide-previous", dest='hide_previous', action='store_true', help='when revising, hides previous versions', default=False)
revising_options.add_option("--with-flags", dest='flags', help='comma-separated optional list of flags used when appending/revising a document. Valid flags are: %s' % ', '.join(CFG_BIBDOCFILE_AVAILABLE_FLAGS), default=None)
parser.add_option_group(revising_options)
housekeeping_options = OptionGroup(parser, 'Actions for housekeeping')
housekeeping_options.add_option("--check-md5", action='store_const', const='check-md5', dest='action', help='check md5 checksum validity of files')
housekeeping_options.add_option("--check-format", action='store_const', const='check-format', dest='action', help='check if any format-related inconsistences exists')
housekeeping_options.add_option("--check-duplicate-docnames", action='store_const', const='check-duplicate-docnames', dest='action', help='check for duplicate docnames associated with the same record')
housekeeping_options.add_option("--update-md5", action='store_const', const='update-md5', dest='action', help='update md5 checksum of files')
housekeeping_options.add_option("--fix-all", action='store_const', const='fix-all', dest='action', help='fix inconsistences in filesystem vs database vs MARC')
housekeeping_options.add_option("--fix-marc", action='store_const', const='fix-marc', dest='action', help='synchronize MARC after filesystem/database')
housekeeping_options.add_option("--fix-format", action='store_const', const='fix-format', dest='action', help='fix format related inconsistences')
housekeeping_options.add_option("--fix-duplicate-docnames", action='store_const', const='fix-duplicate-docnames', dest='action', help='fix duplicate docnames associated with the same record')
housekeeping_options.add_option("--fix-bibdocfsinfo-cache", action='store_const', const='fix-bibdocfsinfo-cache', dest='action', help='fix bibdocfsinfo cache related inconsistences')
parser.add_option_group(housekeeping_options)
experimental_options = OptionGroup(parser, 'Experimental options (do not expect to find them in the next release)')
experimental_options.add_option('--textify', dest='action', action='store_const', const='textify', help='extract text from matched documents and store it for later indexing')
experimental_options.add_option('--with-ocr', dest='perform_ocr', action='store_true', default=False, help='when used with --textify, whether to perform OCR')
parser.add_option_group(experimental_options)
parser.add_option('-D', '--debug', action='store_true', dest='debug', default=False)
parser.add_option('-H', '--human-readable', dest='human_readable', action='store_true', default=False, help='print sizes in human readable format (e.g., 1KB 234MB 2GB)')
parser.add_option('--yes-i-know', action='store_true', dest='yes-i-know', help='use with care!')
return parser
def print_info(docid, info):
"""Nicely print info about a docid."""
print '%i:%s' % (docid, info)
def bibupload_ffts(ffts, append=False, do_debug=False, interactive=True):
"""Given an ffts dictionary it creates the xml and submit it."""
xml = ffts_to_xml(ffts)
if xml:
if interactive:
print xml
tmp_file_fd, tmp_file_name = mkstemp(suffix='.xml', prefix="bibdocfile_%s" % time.strftime("%Y-%m-%d_%H:%M:%S"), dir=CFG_TMPSHAREDDIR)
os.write(tmp_file_fd, xml)
os.close(tmp_file_fd)
os.chmod(tmp_file_name, 0644)
if append:
if interactive:
wait_for_user("This will be appended via BibUpload")
if do_debug:
task = task_low_level_submission('bibupload', 'bibdocfile', '-a', tmp_file_name, '-N', 'FFT', '-S2', '-v9')
else:
task = task_low_level_submission('bibupload', 'bibdocfile', '-a', tmp_file_name, '-N', 'FFT', '-S2')
if interactive:
print "BibUpload append submitted with id %s" % task
else:
if interactive:
wait_for_user("This will be corrected via BibUpload")
if do_debug:
task = task_low_level_submission('bibupload', 'bibdocfile', '-c', tmp_file_name, '-N', 'FFT', '-S2', '-v9')
else:
task = task_low_level_submission('bibupload', 'bibdocfile', '-c', tmp_file_name, '-N', 'FFT', '-S2')
if interactive:
print "BibUpload correct submitted with id %s" % task
elif interactive:
print >> sys.stderr, "WARNING: no MARC to upload."
return True
def ranges2ids(parse_string):
"""Parse a string and return the intbitset of the corresponding ids."""
ids = intbitset()
ranges = parse_string.split(",")
for arange in ranges:
tmp_ids = arange.split("-")
if len(tmp_ids)==1:
ids.add(int(tmp_ids[0]))
else:
if int(tmp_ids[0]) > int(tmp_ids[1]): # sanity check
tmp = tmp_ids[0]
tmp_ids[0] = tmp_ids[1]
tmp_ids[1] = tmp
ids += xrange(int(tmp_ids[0]), int(tmp_ids[1]) + 1)
return ids
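The range grammar parsed by `ranges2ids` (comma-separated values and dash ranges, with reversed bounds tolerated) can be sketched with a plain `set` instead of `intbitset`. `ranges_to_set` is a hypothetical helper, not part of the module:

```python
def ranges_to_set(spec):
    """Parse a spec such as '1-3,5-7,10' into a set of integer ids."""
    ids = set()
    for part in spec.split(","):
        bounds = part.split("-")
        if len(bounds) == 1:
            ids.add(int(bounds[0]))
        else:
            # tolerate reversed bounds, mirroring the sanity check above
            low, high = sorted(int(b) for b in bounds)
            ids.update(range(low, high + 1))
    return ids
```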
def cli_append(options, append_path):
"""Create a bibupload FFT task submission for appending a format."""
recid = cli2recid(options)
comment = cli2comment(options)
description = cli2description(options)
restriction = cli2restriction(options)
doctype = cli2doctype(options)
docname = cli2docname(options, url=append_path)
flags = cli2flags(options)
if not docname:
raise OptionValueError, 'Not enough information to retrieve a valid docname'
docformat = cli2format(options, append_path)
url = clean_url(append_path)
check_valid_url(url)
bibrecdocs = BibRecDocs(recid)
if bibrecdocs.has_docname_p(docname) and bibrecdocs.get_bibdoc(docname).format_already_exists_p(docformat):
new_docname = bibrecdocs.propose_unique_docname(docname)
wait_for_user("WARNING: a document with name %s and format %s already exists for recid %s. A new document with name %s will be created instead." % (repr(docname), repr(docformat), repr(recid), repr(new_docname)))
docname = new_docname
ffts = {recid: [{
'docname' : docname,
'comment' : comment,
'description' : description,
'restriction' : restriction,
'doctype' : doctype,
'format' : docformat,
'url' : url,
'options': flags
}]}
return bibupload_ffts(ffts, append=True)
def cli_revise(options, revise_path):
"""Create aq bibupload FFT task submission for appending a format."""
recid = cli2recid(options)
comment = cli2comment(options)
description = cli2description(options)
restriction = cli2restriction(options)
docname = cli2docname(options, url=revise_path)
hide_previous = getattr(options, 'hide_previous', None)
flags = cli2flags(options)
if hide_previous and 'PERFORM_HIDE_PREVIOUS' not in flags:
flags.append('PERFORM_HIDE_PREVIOUS')
if not docname:
raise OptionValueError, 'Not enough information to retrieve a valid docname'
docformat = cli2format(options, revise_path)
doctype = cli2doctype(options)
url = clean_url(revise_path)
new_docname = getattr(options, 'new_docname', None)
check_valid_url(url)
ffts = {recid : [{
'docname' : docname,
'new_docname' : new_docname,
'comment' : comment,
'description' : description,
'restriction' : restriction,
'doctype' : doctype,
'format' : docformat,
'url' : url,
'options' : flags
}]}
return bibupload_ffts(ffts)
def cli_set_batch(options):
"""Change in batch the doctype, description, comment and restriction."""
ffts = {}
doctype = getattr(options, 'set_doctype', None)
description = cli2description(options)
comment = cli2comment(options)
restriction = cli2restriction(options)
with_format = getattr(options, 'format', None)
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
recid = None
docname = None
if bibdoc.bibrec_links:
# pick a sample recid from those to which a BibDoc is attached
recid = bibdoc.bibrec_links[0]["recid"]
docname = bibdoc.bibrec_links[0]["docname"]
fft = []
if description is not None or comment is not None:
for bibdocfile in bibdoc.list_latest_files():
docformat = bibdocfile.get_format()
if not with_format or with_format == docformat:
fft.append({
'docname': docname,
'restriction': restriction,
'comment': comment,
'description': description,
'format': docformat,
'doctype': doctype
})
else:
fft.append({
'docname': docname,
'restriction': restriction,
'doctype': doctype,
})
ffts[recid] = fft
return bibupload_ffts(ffts, append=False)
def cli_textify(options):
"""Extract text to let indexing on fulltext be possible."""
force = getattr(options, 'force', None)
perform_ocr = getattr(options, 'perform_ocr', None)
if perform_ocr:
if not can_perform_ocr():
print >> sys.stderr, "WARNING: OCR requested but OCR is not possible"
perform_ocr = False
if perform_ocr:
additional = ' using OCR (this might take some time)'
else:
additional = ''
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
print 'Extracting text for docid %s%s...' % (docid, additional),
sys.stdout.flush()
#pylint: disable=E1103
if force or (hasattr(bibdoc, "has_text") and not bibdoc.has_text(require_up_to_date=True)):
try:
#pylint: disable=E1103
bibdoc.extract_text(perform_ocr=perform_ocr)
print "DONE"
except InvenioBibDocFileError, e:
print >> sys.stderr, "WARNING: %s" % e
else:
print "not needed"
def cli_rename(options):
"""Rename a docname within a recid."""
new_docname = getattr(options, 'new_docname', None)
docid = cli2docid(options)
bibdoc = BibDoc.create_instance(docid)
docname = None
if bibdoc.bibrec_links:
docname = bibdoc.bibrec_links[0]["docname"]
recid = cli2recid(options) # now we read the recid from options
ffts = {recid : [{'docname' : docname, 'new_docname' : new_docname}]}
return bibupload_ffts(ffts, append=False)
def cli_fix_bibdocfsinfo_cache(options):
"""Rebuild the bibdocfsinfo table according to what is available on filesystem"""
to_be_fixed = intbitset()
for docid in intbitset(run_sql("SELECT id FROM bibdoc")):
print "Fixing bibdocfsinfo table for docid %s..." % docid,
sys.stdout.flush()
try:
bibdoc = BibDoc(docid)
except InvenioBibDocFileError, err:
print err
continue
try:
bibdoc._sync_to_db()
except Exception, err:
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
if recid:
to_be_fixed.add(recid)
print "ERROR: %s, scheduling a fix for recid %s" % (err, recid)
else:
print "ERROR %s" % (err, )
print "DONE"
if to_be_fixed:
cli_fix_format(options, recids=to_be_fixed)
print "You can now add CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE=1 to your invenio-local.conf file."
def cli_fix_all(options):
"""Fix all the records of a recid_set."""
ffts = {}
for recid in cli_recids_iterator(options):
ffts[recid] = []
for docname in BibRecDocs(recid).get_bibdoc_names():
ffts[recid].append({'docname' : docname, 'doctype' : 'FIX-ALL'})
return bibupload_ffts(ffts, append=False)
def cli_fix_marc(options, explicit_recid_set=None, interactive=True):
"""Fix all the records of a recid_set."""
ffts = {}
if explicit_recid_set is not None:
for recid in explicit_recid_set:
ffts[recid] = [{'doctype' : 'FIX-MARC'}]
else:
for recid in cli_recids_iterator(options):
ffts[recid] = [{'doctype' : 'FIX-MARC'}]
return bibupload_ffts(ffts, append=False, interactive=interactive)
def cli_check_format(options):
"""Check if any format-related inconsistences exists."""
count = 0
tot = 0
duplicate = False
for recid in cli_recids_iterator(options):
tot += 1
bibrecdocs = BibRecDocs(recid)
if not bibrecdocs.check_duplicate_docnames():
print >> sys.stderr, "recid %s has duplicate docnames!" % recid
broken = True
duplicate = True
else:
broken = False
for docname in bibrecdocs.get_bibdoc_names():
if not bibrecdocs.check_format(docname):
print >> sys.stderr, "recid %s with docname %s need format fixing" % (recid, docname)
broken = True
if broken:
count += 1
if count:
result = "%d out of %d records need their formats to be fixed." % (count, tot)
else:
result = "All records appear to be correct with respect to formats."
if duplicate:
result += " Note however that at least one record appear to have duplicate docnames. You should better fix this situation by using --fix-duplicate-docnames."
print wrap_text_in_a_box(result, style="conclusion")
return not(duplicate or count)
def cli_check_duplicate_docnames(options):
"""Check if some record is connected with bibdoc having the same docnames."""
count = 0
tot = 0
for recid in cli_recids_iterator(options):
tot += 1
bibrecdocs = BibRecDocs(recid)
if not bibrecdocs.check_duplicate_docnames():
count += 1
print >> sys.stderr, "recid %s has duplicate docnames!" % recid
if count:
print "%d out of %d records have duplicate docnames." % (count, tot)
return False
else:
print "All records appear to be correct with respect to duplicate docnames."
return True
def cli_fix_format(options, recids=None):
"""Fix format-related inconsistences."""
fixed = intbitset()
tot = 0
if not recids:
recids = cli_recids_iterator(options)
for recid in recids:
tot += 1
bibrecdocs = BibRecDocs(recid)
for docname in bibrecdocs.get_bibdoc_names():
if not bibrecdocs.check_format(docname):
if bibrecdocs.fix_format(docname, skip_check=True):
print >> sys.stderr, "%i has been fixed for docname %s" % (recid, docname)
else:
print >> sys.stderr, "%i has been fixed for docname %s. However note that a new bibdoc might have been created." % (recid, docname)
fixed.add(recid)
if fixed:
print "Now we need to synchronize MARC to reflect current changes."
cli_fix_marc(options, explicit_recid_set=fixed)
print wrap_text_in_a_box("%i out of %i record needed to be fixed." % (tot, len(fixed)), style="conclusion")
return not fixed
def cli_fix_duplicate_docnames(options):
"""Fix duplicate docnames."""
fixed = intbitset()
tot = 0
for recid in cli_recids_iterator(options):
tot += 1
bibrecdocs = BibRecDocs(recid)
if not bibrecdocs.check_duplicate_docnames():
bibrecdocs.fix_duplicate_docnames(skip_check=True)
print >> sys.stderr, "%i has been fixed for duplicate docnames." % recid
fixed.add(recid)
if fixed:
print "Now we need to synchronize MARC to reflect current changes."
cli_fix_marc(options, explicit_recid_set=fixed)
print wrap_text_in_a_box("%i out of %i record needed to be fixed." % (tot, len(fixed)), style="conclusion")
return not fixed
def cli_delete(options):
"""Delete the given docid_set."""
ffts = {}
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
docname = None
recid = None
# retrieve the 1st recid
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
docname = bibdoc.bibrec_links[0]["docname"]
if recid not in ffts:
ffts[recid] = [{'docname' : docname, 'doctype' : 'DELETE'}]
else:
ffts[recid].append({'docname' : docname, 'doctype' : 'DELETE'})
return bibupload_ffts(ffts)
def cli_delete_file(options):
"""Delete the given file irreversibely."""
docid = cli2docid(options)
recid = cli2recid(options, docids=intbitset([docid]))
docformat = cli2format(options)
bdr = BibRecDocs(recid)
docname = bdr.get_docname(docid)
version = getattr(options, 'version', None)
try:
version_int = int(version)
if 0 >= version_int:
raise ValueError
except (TypeError, ValueError):
raise OptionValueError, 'when hard-deleting, version should be valid positive integer, not %s' % version
ffts = {recid : [{'docname' : docname, 'version' : version, 'format' : docformat, 'doctype' : 'DELETE-FILE'}]}
return bibupload_ffts(ffts)
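The version check above can be factored into a small standalone helper. This is a sketch for illustration only; the name `parse_positive_version` is invented here and is not part of the Invenio API:

```python
def parse_positive_version(version):
    """Return version as an int, raising ValueError unless it is >= 1."""
    try:
        version_int = int(version)
    except (TypeError, ValueError):
        raise ValueError("version should be a valid positive integer, not %r" % version)
    if version_int <= 0:
        raise ValueError("version should be a valid positive integer, not %r" % version)
    return version_int

print(parse_positive_version("3"))  # 3
```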
def cli_revert(options):
"""Revert a bibdoc to a given version."""
docid = cli2docid(options)
recid = cli2recid(options, docids=intbitset([docid]))
bdr = BibRecDocs(recid)
docname = bdr.get_docname(docid)
version = getattr(options, 'version', None)
try:
version_int = int(version)
if 0 >= version_int:
raise ValueError
except (TypeError, ValueError):
raise OptionValueError, 'when reverting, version should be valid positive integer, not %s' % version
ffts = {recid : [{'docname' : docname, 'version' : version, 'doctype' : 'REVERT'}]}
return bibupload_ffts(ffts)
def cli_undelete(options):
"""Delete the given docname"""
docname = cli2docname(options)
restriction = getattr(options, 'restriction', None)
count = 0
if not docname:
docname = 'DELETED-*-*'
if not docname.startswith('DELETED-'):
docname = 'DELETED-*-' + docname
to_be_undeleted = intbitset()
fix_marc = intbitset()
setattr(options, 'deleted_docs', 'only')
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
dnold = None
if bibdoc.bibrec_links:
dnold = bibdoc.bibrec_links[0]["docname"]
if bibdoc.get_status() == 'DELETED' and fnmatch.fnmatch(dnold, docname):
to_be_undeleted.add(docid)
# get the 1st recid to which the document is attached
recid = None
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
fix_marc.add(recid)
count += 1
print '%s (docid %s from recid %s) will be undeleted to restriction: %s' % (dnold, docid, recid, restriction)
wait_for_user("I'll proceed with the undeletion")
for docid in to_be_undeleted:
bibdoc = BibDoc.create_instance(docid)
bibdoc.undelete(restriction)
cli_fix_marc(options, explicit_recid_set=fix_marc)
print wrap_text_in_a_box("%s bibdoc successfuly undeleted with status '%s'" % (count, restriction), style="conclusion")
def cli_get_info(options):
"""Print all the info of the matched docids or recids."""
debug('Getting info!')
human_readable = bool(getattr(options, 'human_readable', None))
debug('human_readable: %s' % human_readable)
deleted_docs = getattr(options, 'deleted_docs', None) in ('yes', 'only')
debug('deleted_docs: %s' % deleted_docs)
if getattr(options, 'docids', None):
for docid in cli_docids_iterator(options):
sys.stdout.write(str(BibDoc.create_instance(docid, human_readable=human_readable)))
else:
for recid in cli_recids_iterator(options):
sys.stdout.write(str(BibRecDocs(recid, deleted_too=deleted_docs, human_readable=human_readable)))
def cli_purge(options):
"""Purge the matched docids."""
ffts = {}
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
recid = None
docname = None
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
docname = bibdoc.bibrec_links[0]["docname"]
if recid:
if recid not in ffts:
ffts[recid] = []
ffts[recid].append({
'docname' : docname,
'doctype' : 'PURGE',
})
return bibupload_ffts(ffts)
def cli_expunge(options):
"""Expunge the matched docids."""
ffts = {}
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
recid = None
docname = None
if bibdoc.bibrec_links:
#TODO: If we have a syntax for manipulating completely standalone objects,
# this has to be modified
recid = bibdoc.bibrec_links[0]["recid"]
docname = bibdoc.bibrec_links[0]["docname"]
if recid:
if recid not in ffts:
ffts[recid] = []
ffts[recid].append({
'docname' : docname,
'doctype' : 'EXPUNGE',
})
return bibupload_ffts(ffts)
def cli_get_history(options):
"""Print the history of a docid_set."""
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
history = bibdoc.get_history()
for row in history:
print_info(docid, row)
def cli_get_disk_usage(options):
"""Print the space usage of a docid_set."""
human_readable = getattr(options, 'human_readable', None)
total_size = 0
total_latest_size = 0
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
size = bibdoc.get_total_size()
total_size += size
latest_size = bibdoc.get_total_size_latest_version()
total_latest_size += latest_size
if human_readable:
print_info(docid, 'size=%s' % nice_size(size))
print_info(docid, 'latest version size=%s' % nice_size(latest_size))
else:
print_info(docid, 'size=%s' % size)
print_info( docid, 'latest version size=%s' % latest_size)
if human_readable:
print wrap_text_in_a_box('total size: %s\n\nlatest version total size: %s'
% (nice_size(total_size), nice_size(total_latest_size)),
style='conclusion')
else:
print wrap_text_in_a_box('total size: %s\n\nlatest version total size: %s'
% (total_size, total_latest_size),
style='conclusion')
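`nice_size` here comes from Invenio's utilities; for experimenting outside Invenio, a minimal stand-in might look like the following (this implementation is an assumption for illustration, not Invenio's own):

```python
def nice_size(size):
    """Render a byte count with a binary-prefix unit, e.g. 2048 -> '2.0 KB'."""
    for unit in ('B', 'KB', 'MB', 'GB', 'TB'):
        if size < 1024.0 or unit == 'TB':
            return '%.1f %s' % (size, unit)
        size /= 1024.0

print(nice_size(2048))  # 2.0 KB
```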
def cli_check_md5(options):
"""Check the md5 sums of a docid_set."""
failures = 0
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
if bibdoc.md5s.check():
print_info(docid, 'checksum OK')
else:
for afile in bibdoc.list_all_files():
if not afile.check():
failures += 1
print_info(docid, '%s failing checksum!' % afile.get_full_path())
if failures:
print wrap_text_in_a_box('%i files failing' % failures , style='conclusion')
else:
print wrap_text_in_a_box('All files are correct', style='conclusion')
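Under the hood the checksum verification boils down to hashing each file on disk and comparing it with the digest recorded at upload time. A self-contained sketch with `hashlib` (the function name is illustrative, not Invenio's API):

```python
import hashlib

def file_md5(path, chunk_size=8192):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

A bibdocfile's `check()` would then amount to comparing `file_md5(fullpath)` against the stored checksum.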
def cli_update_md5(options):
"""Update the md5 sums of a docid_set."""
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
if bibdoc.md5s.check():
print_info(docid, 'checksum OK')
else:
for afile in bibdoc.list_all_files():
if not afile.check():
print_info(docid, '%s failing checksum!' % afile.get_full_path())
wait_for_user('Updating the md5s of this document can hide real problems.')
bibdoc.md5s.update(only_new=False)
bibdoc._sync_to_db()
def cli_hide(options):
"""Hide the matched versions of documents."""
documents_to_be_hidden = {}
to_be_fixed = intbitset()
versions = getattr(options, 'version', 'all')
if versions != 'all':
try:
versions = ranges2ids(versions)
except Exception:
raise OptionValueError, 'You should specify correct versions. Not %s' % versions
else:
versions = intbitset(trailing_bits=True)
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
recid = None
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
if recid:
for bibdocfile in bibdoc.list_all_files():
this_version = bibdocfile.get_version()
this_format = bibdocfile.get_format()
if this_version in versions:
if docid not in documents_to_be_hidden:
documents_to_be_hidden[docid] = []
documents_to_be_hidden[docid].append((this_version, this_format))
to_be_fixed.add(recid)
print '%s (docid: %s, recid: %s) will be hidden' % (bibdocfile.get_full_name(), docid, recid)
wait_for_user('Proceeding to hide the matched documents...')
for docid, documents in documents_to_be_hidden.iteritems():
bibdoc = BibDoc.create_instance(docid)
for version, docformat in documents:
bibdoc.set_flag('HIDDEN', docformat, version)
return cli_fix_marc(options, to_be_fixed)
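`ranges2ids` parses version specifications such as `1-3,7` into a set of ids. A plausible standalone stand-in is sketched below (this is a guess at the behaviour for illustration; Invenio's own implementation returns an intbitset):

```python
def ranges2ids(ranges):
    """Expand a string like '1-3,7' into a set of integer ids."""
    ids = set()
    for chunk in ranges.split(','):
        if '-' in chunk:
            low, high = chunk.split('-', 1)
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(chunk))
    return ids

print(sorted(ranges2ids('1-3,7')))  # [1, 2, 3, 7]
```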
def cli_unhide(options):
"""Unhide the matched versions of documents."""
documents_to_be_unhidden = {}
to_be_fixed = intbitset()
versions = getattr(options, 'version', 'all')
if versions != 'all':
try:
versions = ranges2ids(versions)
except Exception:
raise OptionValueError, 'You should specify correct versions. Not %s' % versions
else:
versions = intbitset(trailing_bits=True)
for docid in cli_docids_iterator(options):
bibdoc = BibDoc.create_instance(docid)
recid = None
if bibdoc.bibrec_links:
recid = bibdoc.bibrec_links[0]["recid"]
if recid:
for bibdocfile in bibdoc.list_all_files():
this_version = bibdocfile.get_version()
this_format = bibdocfile.get_format()
if this_version in versions:
if docid not in documents_to_be_unhidden:
documents_to_be_unhidden[docid] = []
documents_to_be_unhidden[docid].append((this_version, this_format))
to_be_fixed.add(recid)
print '%s (docid: %s, recid: %s) will be unhidden' % (bibdocfile.get_full_name(), docid, recid)
wait_for_user('Proceeding to unhide the matched documents...')
for docid, documents in documents_to_be_unhidden.iteritems():
bibdoc = BibDoc.create_instance(docid)
for version, docformat in documents:
bibdoc.unset_flag('HIDDEN', docformat, version)
return cli_fix_marc(options, to_be_fixed)
+
+@with_app_context()
def main():
parser = prepare_option_parser()
(options, args) = parser.parse_args()
if getattr(options, 'debug', None):
getLogger().setLevel(DEBUG)
debug('test')
debug('options: %s, args: %s' % (options, args))
try:
if not getattr(options, 'action', None) and \
not getattr(options, 'append_path', None) and \
not getattr(options, 'revise_path', None):
if getattr(options, 'set_doctype', None) is not None or \
getattr(options, 'set_comment', None) is not None or \
getattr(options, 'set_description', None) is not None or \
getattr(options, 'set_restriction', None) is not None:
cli_set_batch(options)
elif getattr(options, 'new_docname', None):
cli_rename(options)
else:
print >> sys.stderr, "ERROR: no action specified"
sys.exit(1)
elif getattr(options, 'append_path', None):
options.empty_recs = 'yes'
options.empty_docs = 'yes'
cli_append(options, getattr(options, 'append_path', None))
elif getattr(options, 'revise_path', None):
cli_revise(options, getattr(options, 'revise_path', None))
elif options.action == 'textify':
cli_textify(options)
elif getattr(options, 'action', None) == 'get-history':
cli_get_history(options)
elif getattr(options, 'action', None) == 'get-info':
cli_get_info(options)
elif getattr(options, 'action', None) == 'get-disk-usage':
cli_get_disk_usage(options)
elif getattr(options, 'action', None) == 'check-md5':
cli_check_md5(options)
elif getattr(options, 'action', None) == 'update-md5':
cli_update_md5(options)
elif getattr(options, 'action', None) == 'fix-all':
cli_fix_all(options)
elif getattr(options, 'action', None) == 'fix-marc':
cli_fix_marc(options)
elif getattr(options, 'action', None) == 'delete':
cli_delete(options)
elif getattr(options, 'action', None) == 'hard-delete':
cli_delete_file(options)
elif getattr(options, 'action', None) == 'fix-duplicate-docnames':
cli_fix_duplicate_docnames(options)
elif getattr(options, 'action', None) == 'fix-format':
cli_fix_format(options)
elif getattr(options, 'action', None) == 'check-duplicate-docnames':
cli_check_duplicate_docnames(options)
elif getattr(options, 'action', None) == 'check-format':
cli_check_format(options)
elif getattr(options, 'action', None) == 'undelete':
cli_undelete(options)
elif getattr(options, 'action', None) == 'purge':
cli_purge(options)
elif getattr(options, 'action', None) == 'expunge':
cli_expunge(options)
elif getattr(options, 'action', None) == 'revert':
cli_revert(options)
elif getattr(options, 'action', None) == 'hide':
cli_hide(options)
elif getattr(options, 'action', None) == 'unhide':
cli_unhide(options)
elif getattr(options, 'action', None) == 'fix-bibdocfsinfo-cache':
options.empty_docs = 'yes'
cli_fix_bibdocfsinfo_cache(options)
elif getattr(options, 'action', None) == 'get-stats':
cli_get_stats(options)
else:
print >> sys.stderr, "ERROR: Action %s is not valid" % getattr(options, 'action', None)
sys.exit(1)
except Exception, e:
register_exception()
print >> sys.stderr, 'ERROR: %s' % e
sys.exit(1)
-
-if __name__ == '__main__':
- main()
diff --git a/invenio/legacy/bibdocfile/fulltext_files_migration_kit.py b/invenio/legacy/bibdocfile/fulltext_files_migration_kit.py
index 7d0897854..96d075880 100644
--- a/invenio/legacy/bibdocfile/fulltext_files_migration_kit.py
+++ b/invenio/legacy/bibdocfile/fulltext_files_migration_kit.py
@@ -1,142 +1,142 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
"""This script updates the filesystem structure of fulltext files in order
to make it coherent with bibdocfile implementation (bibdocfile.py structure is backward
compatible with file.py structure, but the viceversa is not true).
"""
import sys
from invenio.intbitset import intbitset
from invenio.utils.text import wrap_text_in_a_box
from invenio.config import CFG_LOGDIR, CFG_SITE_SUPPORT_EMAIL
from invenio.legacy.dbquery import run_sql, OperationalError
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
from datetime import datetime
def retrieve_fulltext_recids():
"""Returns the list of all the recid number linked with at least a fulltext
file."""
res = run_sql('SELECT DISTINCT id_bibrec FROM bibrec_bibdoc')
return intbitset(res)
def fix_recid(recid, logfile):
"""Fix a given recid."""
print "Upgrading record %s ->" % recid,
print >> logfile, "Upgrading record %s:" % recid
bibrec = BibRecDocs(recid)
print >> logfile, bibrec
docnames = bibrec.get_bibdoc_names()
try:
for docname in docnames:
print docname,
new_bibdocs = bibrec.fix(docname)
new_bibdocnames = [bibrec.get_docname(bibdoc.id) for bibdoc in new_bibdocs]
if new_bibdocnames:
print "(created bibdocs: '%s')" % "', '".join(new_bibdocnames),
print >> logfile, "(created bibdocs: '%s')" % "', '".join(new_bibdocnames)
except InvenioBibDocFileError, e:
print >> logfile, BibRecDocs(recid)
print "%s -> ERROR", e
return False
else:
print >> logfile, BibRecDocs(recid)
print "-> OK"
return True
def backup_tables(drop=False):
"""This function create a backup of bibrec_bibdoc, bibdoc and bibdoc_bibdoc tables. Returns False in case dropping of previous table is needed."""
if drop:
run_sql('DROP TABLE bibrec_bibdoc_backup')
run_sql('DROP TABLE bibdoc_backup')
run_sql('DROP TABLE bibdoc_bibdoc_backup')
try:
run_sql("""CREATE TABLE bibrec_bibdoc_backup (KEY id_bibrec(id_bibrec),
KEY id_bibdoc(id_bibdoc)) SELECT * FROM bibrec_bibdoc""")
run_sql("""CREATE TABLE bibdoc_backup (PRIMARY KEY id(id))
SELECT * FROM bibdoc""")
run_sql("""CREATE TABLE bibdoc_bibdoc_backup (KEY id_bibdoc1(id_bibdoc1),
KEY id_bibdoc2(id_bibdoc2)) SELECT * FROM bibdoc_bibdoc""")
except OperationalError, e:
if not drop:
return False
raise
return True
def check_yes():
"""Return True if the user types 'yes'."""
try:
return raw_input().strip() == 'yes'
except KeyboardInterrupt:
return False
def main():
"""Core loop."""
logfilename = '%s/fulltext_files_migration_kit-%s.log' % (CFG_LOGDIR, datetime.today().strftime('%Y%m%d%H%M%S'))
try:
logfile = open(logfilename, 'w')
except IOError, e:
print wrap_text_in_a_box('NOTE: it\'s impossible to create the log:\n\n %s\n\nbecause of:\n\n %s\n\nPlease run this migration kit as the same user who runs Invenio (e.g. Apache)' % (logfilename, e), style='conclusion', break_long=False)
sys.exit(1)
recids = retrieve_fulltext_recids()
print wrap_text_in_a_box ("""This script migrate the filesystem structure used to store fulltext files to the new stricter structure.
This script must not be run during normal Invenio operations.
It is safe to run this script. No file will be deleted.
Anyway it is recommended to run a backup of the filesystem structure just in case.
A backup of the database tables involved will be automatically performed.""", style='important')
print "%s records will be migrated/fixed." % len(recids)
print "Please type yes if you want to go further:",
if not check_yes():
print "INTERRUPTED"
sys.exit(1)
print "Backing up database tables"
try:
if not backup_tables():
print wrap_text_in_a_box("""It appears that is not the first time that you run this script.
Backup tables have been already created by a previous run.
In order for the script to go further they need to be removed.""", style='important')
print "Please, type yes if you agree to remove them and go further:",
if not check_yes():
print wrap_text_in_a_box("INTERRUPTED", style='conclusion')
sys.exit(1)
print "Backing up database tables (after dropping previous backup)",
backup_tables(drop=True)
print "-> OK"
else:
print "-> OK"
except Exception, e:
print wrap_text_in_a_box("Unexpected error while backing up tables. Please, do your checks: %s" % e, style='conclusion')
sys.exit(1)
print "Created a complete log file into %s" % logfilename
for recid in recids:
if not fix_recid(recid, logfile):
logfile.close()
print wrap_text_in_a_box(title="INTERRUPTED BECAUSE OF ERROR!", body="""Please see the log file %s for what was the status of record %s prior to the error. Contact %s in case of problems, attaching the log.""" % (logfilename, recid, CFG_SITE_SUPPORT_EMAIL),
style='conclusion')
sys.exit(1)
print wrap_text_in_a_box("DONE", style='conclusion')
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibdocfile/icon_migration_kit.py b/invenio/legacy/bibdocfile/icon_migration_kit.py
index e99428d26..a21f55ee3 100644
--- a/invenio/legacy/bibdocfile/icon_migration_kit.py
+++ b/invenio/legacy/bibdocfile/icon_migration_kit.py
@@ -1,163 +1,163 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This script updates the filesystem and database structure with respect to icons.
In particular it will move all the icon information out of the bibdoc_bibdoc
table and into the normal bibdoc + subformat infrastructure.
"""
import sys
from datetime import datetime
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
-from invenio.bibtask import check_running_process_user
+from invenio.legacy.bibsched.bibtask import check_running_process_user
from invenio.legacy.dbquery import run_sql, OperationalError
-from invenio.bibdocfile import BibDoc
+from invenio.legacy.bibdocfile.api import BibDoc
from invenio.config import CFG_LOGDIR, CFG_SITE_SUPPORT_EMAIL
-from invenio.bibdocfilecli import cli_fix_marc
+from invenio.legacy.bibdocfile.cli import cli_fix_marc
from invenio.ext.logging import register_exception
from invenio.intbitset import intbitset
from invenio.legacy.search_engine import record_exists
def retrieve_bibdoc_bibdoc():
return run_sql('SELECT id_bibdoc1, id_bibdoc2 from bibdoc_bibdoc')
def get_recid_from_docid(docid):
return run_sql('SELECT id_bibrec FROM bibrec_bibdoc WHERE id_bibdoc=%s', (docid, ))
def backup_tables(drop=False):
"""This function create a backup of bibrec_bibdoc, bibdoc and bibdoc_bibdoc tables. Returns False in case dropping of previous table is needed."""
if drop:
run_sql('DROP TABLE bibdoc_bibdoc_backup_for_icon')
try:
run_sql("""CREATE TABLE bibdoc_bibdoc_backup_for_icon (KEY id_bibdoc1(id_bibdoc1),
KEY id_bibdoc2(id_bibdoc2)) SELECT * FROM bibdoc_bibdoc""")
except OperationalError, e:
if not drop:
return False
raise e
return True
def fix_bibdoc_bibdoc(id_bibdoc1, id_bibdoc2, logfile):
"""
Migrate an icon.
"""
try:
the_bibdoc = BibDoc.create_instance(id_bibdoc1)
except Exception, err:
msg = "WARNING: when opening docid %s: %s" % (id_bibdoc1, err)
print >> logfile, msg
print msg
return True
try:
msg = "Fixing icon for the document %s" % (id_bibdoc1, )
print msg,
print >> logfile, msg,
the_icon = BibDoc.create_instance(id_bibdoc2)
for a_file in the_icon.list_latest_files():
the_bibdoc.add_icon(a_file.get_full_path(), format=a_file.get_format())
the_icon.delete()
run_sql("DELETE FROM bibdoc_bibdoc WHERE id_bibdoc1=%s AND id_bibdoc2=%s", (id_bibdoc1, id_bibdoc2))
print "OK"
print >> logfile, "OK"
return True
except Exception, err:
print "ERROR: %s" % err
print >> logfile, "ERROR: %s" % err
register_exception()
return False
def main():
"""Core loop."""
check_running_process_user()
logfilename = '%s/icon_migration_kit-%s.log' % (CFG_LOGDIR, datetime.today().strftime('%Y%m%d%H%M%S'))
try:
logfile = open(logfilename, 'w')
except IOError, e:
print wrap_text_in_a_box('NOTE: it\'s impossible to create the log:\n\n %s\n\nbecause of:\n\n %s\n\nPlease run this migration kit as the same user who runs Invenio (e.g. Apache)' % (logfilename, e), style='conclusion', break_long=False)
sys.exit(1)
bibdoc_bibdoc = retrieve_bibdoc_bibdoc()
print wrap_text_in_a_box ("""This script migrate the filesystem structure used to store icons files to the new stricter structure.
This script must not be run during normal Invenio operations.
It is safe to run this script. No file will be deleted.
Anyway it is recommended to run a backup of the filesystem structure just in case.
A backup of the database tables involved will be automatically performed.""", style='important')
if not bibdoc_bibdoc:
print wrap_text_in_a_box("No need for migration", style='conclusion')
return
print "%s icons will be migrated/fixed." % len(bibdoc_bibdoc)
wait_for_user()
print "Backing up database tables"
try:
if not backup_tables():
print wrap_text_in_a_box("""It appears that is not the first time that you run this script.
Backup tables have been already created by a previous run.
In order for the script to go further they need to be removed.""", style='important')
wait_for_user()
print "Backing up database tables (after dropping previous backup)",
backup_tables(drop=True)
print "-> OK"
else:
print "-> OK"
except Exception, e:
print wrap_text_in_a_box("Unexpected error while backing up tables. Please, do your checks: %s" % e, style='conclusion')
sys.exit(1)
to_fix_marc = intbitset()
print "Created a complete log file into %s" % logfilename
try:
try:
for id_bibdoc1, id_bibdoc2 in bibdoc_bibdoc:
try:
record_does_exist = True
recids = get_recid_from_docid(id_bibdoc1)
if not recids:
print "Skipping %s" % id_bibdoc1
continue
for recid in recids:
if record_exists(recid[0]) > 0:
to_fix_marc.add(recid[0])
else:
record_does_exist = False
if not fix_bibdoc_bibdoc(id_bibdoc1, id_bibdoc2, logfile):
if record_does_exist:
raise StandardError("Error when correcting document ID %s" % id_bibdoc1)
except Exception, err:
print >> logfile, "ERROR: %s" % err
print wrap_text_in_a_box("DONE", style='conclusion')
except:
logfile.close()
register_exception()
print wrap_text_in_a_box(
title = "INTERRUPTED BECAUSE OF ERROR!",
body = """Please see the log file %s for what was the status prior to the error. Contact %s in case of problems, attaching the log.""" % (logfilename, CFG_SITE_SUPPORT_EMAIL),
style = 'conclusion')
sys.exit(1)
finally:
print "Scheduling FIX-MARC to synchronize MARCXML for updated records."
cli_fix_marc(options={}, explicit_recid_set=to_fix_marc)
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibdocfile/managedocfiles.py b/invenio/legacy/bibdocfile/managedocfiles.py
index 08bbaacb0..156b4f2b5 100644
--- a/invenio/legacy/bibdocfile/managedocfiles.py
+++ b/invenio/legacy/bibdocfile/managedocfiles.py
@@ -1,2938 +1,2938 @@
## $Id: Revise_Files.py,v 1.37 2009/03/26 15:11:05 jerome Exp $
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibDocFile Upload File Interface utils
=====================================
Tools to help with creation of file management interfaces.
Contains the two main functions `create_file_upload_interface' and
`move_uploaded_files_to_storage', which must be run one after the
other:
- create_file_upload_interface: Generates the HTML of an interface to
revise files of a given record. The actions on the files are
recorded in a working directory, but not applied to the record.
- move_uploaded_files_to_storage: Applies/executes the modifications
on files as recorded by the `create_file_upload_interface'
function.
These functions are a complex interplay of HTML, JavaScript and HTTP
requests. They are not meant to be used in just any scenario; they
must be used in extremely specific contexts (currently in
WebSubmit Response Elements, WebSubmit functions and the BibDocFile
File Management interface).
NOTES:
======
- Comments are not considered a property of bibdocfiles, but of
bibdocs: this conflicts with the APIs
FIXME:
======
- refactor into smaller components. E.g. the form processing in
create_file_upload_interface could be moved outside the function.
- better differentiate between revised file, and added format
(currently when adding a format, the whole bibdoc is marked as
updated, and all links are removed)
- After a file has been revised or added, add a 'check' icon
- One issue: if we allow deletion or renaming, we might lose track of
a bibdoc: someone adds X, renames X->Y, and adds again another file
with name X: when executing actions, we will add the second X, and
rename it to Y
-> need to go back in previous action when renaming... or check
that name has never been used..
DEPENDENCIES:
=============
- jQuery Form plugin U{http://jquery.malsup.com/form/}
"""
import cPickle
import os
import time
import cgi
from urllib import urlencode
from invenio.config import \
CFG_SITE_LANG, \
CFG_SITE_URL, \
CFG_WEBSUBMIT_STORAGEDIR, \
CFG_TMPSHAREDDIR, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_CERN_SITE, \
CFG_SITE_RECORD
from invenio.base.i18n import gettext_set_language
-from invenio.bibdocfilecli import cli_fix_marc
-from invenio.bibdocfile import BibRecDocs, \
+from invenio.legacy.bibdocfile.cli import cli_fix_marc
+from invenio.legacy.bibdocfile.api import BibRecDocs, \
decompose_file, calculate_md5, BibDocFile, \
InvenioBibDocFileError, BibDocMoreInfo
from invenio.legacy.websubmit.functions.Shared_Functions import \
createRelatedFormats
from invenio.ext.logging import register_exception
from invenio.legacy.dbquery import run_sql
from invenio.websubmit_icon_creator import \
create_icon, InvenioWebSubmitIconCreatorError
from invenio.utils.url import create_html_mailto
from invenio.utils.html import escape_javascript_string
-from invenio.bibdocfile_config import CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
+from invenio.legacy.bibdocfile.config import CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
CFG_ALLOWED_ACTIONS = ['revise', 'delete', 'add', 'addFormat']
params_id = 0
def create_file_upload_interface(recid,
form=None,
print_outside_form_tag=True,
print_envelope=True,
include_headers=False,
ln=CFG_SITE_LANG,
minsize='', maxsize='',
doctypes_and_desc=None,
can_delete_doctypes=None,
can_revise_doctypes=None,
can_describe_doctypes=None,
can_comment_doctypes=None,
can_keep_doctypes=None,
can_rename_doctypes=None,
can_add_format_to_doctypes=None,
create_related_formats=True,
can_name_new_files=True,
keep_default=True, show_links=True,
file_label=None, filename_label=None,
description_label=None, comment_label=None,
restrictions_and_desc=None,
can_restrict_doctypes=None,
restriction_label=None,
doctypes_to_default_filename=None,
max_files_for_doctype=None,
sbm_indir=None, sbm_doctype=None, sbm_access=None,
uid=None, sbm_curdir=None,
display_hidden_files=False, protect_hidden_files=True):
"""
Returns the HTML for the file upload interface.
@param recid: the id of the record to edit files
@type recid: int or None
@param form: the form sent by the user's browser in response to a
user action. This is used to read and record user's
actions.
@type form: dict, as returned by the interface handler.
@param print_outside_form_tag: display encapsulating <form> tag or
not
@type print_outside_form_tag: boolean
@param print_envelope: (internal parameter) if True, return the
encapsulating initial markup, otherwise
skip it.
@type print_envelope: boolean
@param include_headers: include javascript and css headers in the
body of the page. If you set this to
False, you must take care of including
these headers in your page header. Setting
this parameter to True is useful if you
cannot change the page header.
@type include_headers: boolean
@param ln: language
@type ln: string
@param minsize: the minimum size (in bytes) allowed for the
uploaded files. Files not big enough are
discarded.
@type minsize: int
@param maxsize: the maximum size (in bytes) allowed for the
uploaded files. Files too big are discarded.
@type maxsize: int
@param doctypes_and_desc: the list of doctypes (like 'Main' or
'Additional') and their description that users
can choose from when adding new files.
- When no value is provided, users cannot add new
files (they can only revise/delete/add format)
- When a single value is given, it is used as
default doctype for all new documents
Order is relevant
Eg:
[('main', 'Main document'), ('additional', 'Figure, schema, etc.')]
@type doctypes_and_desc: list(tuple(string, string))
@param restrictions_and_desc: the list of restrictions (like 'Restricted' or
'No Restriction') and their description that
users can choose from when adding or revising
files. Restrictions can then be configured at
the level of WebAccess.
- When no value is provided, no restriction is
applied
- When a single value is given, it is used as
default restriction for all documents.
- The first value of the list is used as default
restriction if the user is not given the
choice of the restriction. Order is relevant
Eg:
[('', 'No restriction'), ('restr', 'Restricted')]
@type restrictions_and_desc: list(tuple(string, string))
@param can_delete_doctypes: the list of doctypes that users are
allowed to delete.
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_delete_doctypes: list(string)
@param can_revise_doctypes: the list of doctypes that users are
allowed to revise
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_revise_doctypes: list(string)
@param can_describe_doctypes: the list of doctypes that users are
allowed to describe
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_describe_doctypes: list(string)
@param can_comment_doctypes: the list of doctypes that users are
allowed to comment
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_comment_doctypes: list(string)
@param can_keep_doctypes: the list of doctypes for which users can
choose to keep previous versions visible when
revising a file (i.e. 'Keep previous version'
checkbox). See also parameter 'keepDefault'.
Note that this parameter is ignored when
revising the attributes of a file (comment,
description) without uploading a new
file. See also parameter
Move_Uploaded_Files_to_Storage.force_file_revision
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_keep_doctypes: list(string)
@param can_add_format_to_doctypes: the list of doctypes for which users can
add new formats. If there is no value,
then no 'add format' link nor warning
about losing old formats are displayed.
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_add_format_to_doctypes: list(string)
@param can_restrict_doctypes: the list of doctypes for which users can
choose the access restrictions when adding or
revising a file. If no value is given:
- no restriction is applied if none is defined
in the 'restrictions' parameter.
- else the *first* value of the 'restrictions'
parameter is used as default restriction.
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_restrict_doctypes: list(string)
@param can_rename_doctypes: the list of doctypes that users are allowed
to rename (when revising)
Eg: ['main', 'additional']
Use ['*'] for "all doctypes"
@type can_rename_doctypes: list(string)
@param can_name_new_files: if user can choose the name of the files they
upload or not
@type can_name_new_files: boolean
@param doctypes_to_default_filename: Rename uploaded files to admin-chosen
values. To rename to a value found in a file in curdir,
use 'file:' prefix to specify the file to read from.
Eg:
{'main': 'file:RN', 'additional': 'foo'}
If the same doctype is submitted
several times, a "-%i" suffix is added
to the name defined in the file.
When using 'file:' prefix, the name
is only resolved at the end of the
submission, when attaching the file.
The default filenames are overridden
by user-chosen names if you allow
'can_name_new_files' or
'can_rename_doctypes', except if the
name is prefixed with 'file:'.
@type doctypes_to_default_filename: dict
@param max_files_for_doctype: the maximum number of files that users can
upload for each doctype.
Eg: {'main': 1, 'additional': 2}
Do not specify the doctype here to have an
unlimited number of files for a given
doctype.
@type max_files_for_doctype: dict
@param create_related_formats: if uploaded files get converted to
whatever format we can or not
@type create_related_formats: boolean
@param keep_default: the default behaviour for keeping or not previous
version of files when users cannot choose (no
value in can_keep_doctypes).
Note that this parameter is ignored when revising
the attributes of a file (comment, description)
without uploading a new file. See also parameter
Move_Uploaded_Files_to_Storage.force_file_revision
@type keep_default: boolean
@param show_links: if we display links to files when possible or
not
@type show_links: boolean
@param file_label: the label for the file field
@type file_label: string
@param filename_label: the label for the file name field
@type filename_label: string
@param description_label: the label for the description field
@type description_label: string
@param comment_label: the label for the comments field
@type comment_label: string
@param restriction_label: the label in front of the restrictions list
@type restriction_label: string
@param sbm_indir: the submission indir parameter, in case the
function is used in a WebSubmit submission
context.
This value will be used to retrieve where to
read the current state of the interface and
store uploaded files
@type sbm_indir: string
@param sbm_doctype: the submission doctype parameter, in case the
function is used in a WebSubmit submission
context.
This value will be used to retrieve where to
read the current state of the interface and
store uploaded files
@type sbm_doctype: string
@param sbm_access: the submission access parameter. Must be
specified in the context of WebSubmit
submission, as well when used in the
WebSubmit Admin file management interface.
This value will be used to retrieve where to
read the current state of the interface and
store uploaded files
@type sbm_access: string
@param sbm_curdir: the submission curdir parameter. Must be
specified in the context of WebSubmit
function Create_Upload_File_Interface.
This value will be used to retrieve where to
read the current state of the interface and
store uploaded files.
@type sbm_curdir: string
@param uid: the user id
@type uid: int
@param display_hidden_files: if bibdoc containing bibdocfiles
flagged as 'HIDDEN' should be
displayed or not.
@type display_hidden_files: boolean
@param protect_hidden_files: if bibdoc containing bibdocfiles
flagged as 'HIDDEN' can be edited
(revise, delete, add format) or not.
@type protect_hidden_files: boolean
@return: tuple (errorcode, html)
"""
# Clean and set up a few parameters
_ = gettext_set_language(ln)
body = ''
if not file_label:
file_label = _('Choose a file')
if not filename_label:
filename_label = _('Name')
if not description_label:
description_label = _('Description')
if not comment_label:
comment_label = _('Comment')
if not restriction_label:
restriction_label = _('Access')
if not doctypes_and_desc:
doctypes_and_desc = []
if not can_delete_doctypes:
can_delete_doctypes = []
if not can_revise_doctypes:
can_revise_doctypes = []
if not can_describe_doctypes:
can_describe_doctypes = []
if not can_comment_doctypes:
can_comment_doctypes = []
if not can_keep_doctypes:
can_keep_doctypes = []
if not can_rename_doctypes:
can_rename_doctypes = []
if not can_add_format_to_doctypes:
can_add_format_to_doctypes = []
if not restrictions_and_desc:
restrictions_and_desc = []
if not can_restrict_doctypes:
can_restrict_doctypes = []
if not doctypes_to_default_filename:
doctypes_to_default_filename = {}
if not max_files_for_doctype:
max_files_for_doctype = {}
doctypes = [doctype for (doctype, desc) in doctypes_and_desc]
# Retrieve/build a working directory to save uploaded files and
# states + configuration.
working_dir = None
if sbm_indir and sbm_doctype and sbm_access:
# Write/read configuration to/from working_dir (WebSubmit mode).
# Retrieve the interface configuration from the current
# submission directory.
working_dir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
sbm_indir,
sbm_doctype,
sbm_access)
try:
assert(working_dir == os.path.abspath(working_dir))
except AssertionError:
register_exception(prefix='Cannot create file upload interface: ' + \
'missing parameter',
alert_admin=True)
return (1, "Unauthorized parameters")
form_url_params = "?" + urlencode({'access': sbm_access,
'indir': sbm_indir,
'doctype': sbm_doctype})
elif uid and sbm_access:
# WebSubmit File Management (admin) interface mode.
# Working directory is in CFG_TMPSHAREDDIR
working_dir = os.path.join(CFG_TMPSHAREDDIR,
'websubmit_upload_interface_config_' + str(uid),
sbm_access)
try:
assert(working_dir == os.path.abspath(working_dir))
except AssertionError:
register_exception(prefix='Some user tried to access ' \
+ working_dir + \
' which is different than ' + \
os.path.abspath(working_dir),
alert_admin=True)
return (1, "Unauthorized parameters")
if not os.path.exists(working_dir):
os.makedirs(working_dir)
form_url_params = "?" + urlencode({'access': sbm_access})
elif sbm_curdir:
# WebSubmit Create_Upload_File_Interface.py function
working_dir = sbm_curdir
form_url_params = None
else:
# Cannot determine the working directory: the mandatory
# parameters identifying the submission context are missing
register_exception(prefix='Cannot create file upload interface: ' + \
'missing parameters to determine the working directory',
alert_admin=True)
return (1, "Unauthorized parameters")
# Save interface configuration, if this is the first time we come
# here, or else load parameters
try:
parameters = _read_file_revision_interface_configuration_from_disk(working_dir)
(minsize, maxsize, doctypes_and_desc, doctypes,
can_delete_doctypes, can_revise_doctypes,
can_describe_doctypes,
can_comment_doctypes, can_keep_doctypes,
can_rename_doctypes,
can_add_format_to_doctypes, create_related_formats,
can_name_new_files, keep_default, show_links,
file_label, filename_label, description_label,
comment_label, restrictions_and_desc,
can_restrict_doctypes,
restriction_label, doctypes_to_default_filename,
max_files_for_doctype, print_outside_form_tag,
display_hidden_files, protect_hidden_files) = parameters
except:
# Initial display of the interface: save configuration to
# disk for later reuse
parameters = (minsize, maxsize, doctypes_and_desc, doctypes,
can_delete_doctypes, can_revise_doctypes,
can_describe_doctypes,
can_comment_doctypes, can_keep_doctypes,
can_rename_doctypes,
can_add_format_to_doctypes, create_related_formats,
can_name_new_files, keep_default, show_links,
file_label, filename_label, description_label,
comment_label, restrictions_and_desc,
can_restrict_doctypes,
restriction_label, doctypes_to_default_filename,
max_files_for_doctype, print_outside_form_tag,
display_hidden_files, protect_hidden_files)
_write_file_revision_interface_configuration_to_disk(working_dir, parameters)
# Get the existing bibdocs as well as the actions performed during
# the former revise sessions of the user, to build an updated list
# of documents. We will use it to check if last action performed
# by user is allowed.
performed_actions = read_actions_log(working_dir)
if recid:
bibrecdocs = BibRecDocs(recid)
# Create the list of files based on current files and performed
# actions
bibdocs = bibrecdocs.list_bibdocs()
else:
bibdocs = []
# "merge":
abstract_bibdocs = build_updated_files_list(bibdocs,
performed_actions,
recid or -1,
display_hidden_files)
# If any, process form submitted by user
if form:
## Get and clean parameters received from user
(file_action, file_target, file_target_doctype,
keep_previous_files, file_description, file_comment, file_rename,
file_doctype, file_restriction, uploaded_filename, uploaded_filepath) = \
wash_form_parameters(form, abstract_bibdocs, can_keep_doctypes,
keep_default, can_describe_doctypes, can_comment_doctypes,
can_rename_doctypes, can_name_new_files, can_restrict_doctypes,
doctypes_to_default_filename, working_dir)
if protect_hidden_files and \
(file_action in ['revise', 'addFormat', 'delete']) and \
is_hidden_for_docname(file_target, abstract_bibdocs):
# Sanity check. We should not let editing
file_action = ''
body += '<script>alert("%s");</script>' % \
_("The file you want to edit is protected against modifications. Your action has not been applied")
## Check the last action performed by user, and log it if
## everything is ok
if uploaded_filepath and \
((file_action == 'add' and (file_doctype in doctypes)) or \
(file_action == 'revise' and \
((file_target_doctype in can_revise_doctypes) or \
'*' in can_revise_doctypes)) or
(file_action == 'addFormat' and \
((file_target_doctype in can_add_format_to_doctypes) or \
'*' in can_add_format_to_doctypes))):
# A file has been uploaded (user has revised or added a file,
# or a format)
dirname, filename, extension = decompose_file(uploaded_filepath)
os.unlink(os.path.join(working_dir, "myfile"))
if minsize.isdigit() and os.path.getsize(uploaded_filepath) < int(minsize):
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
(_("The uploaded file is too small (< %i bytes) and has therefore not been considered") % \
int(minsize)).replace('"', '\\"')
elif maxsize.isdigit() and os.path.getsize(uploaded_filepath) > int(maxsize):
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
(_("The uploaded file is too big (> %i bytes) and has therefore not been considered") % \
int(maxsize)).replace('"', '\\"')
elif len(filename) + len(extension) + 4 > 255:
# Maximum filename length is 255, including extension and the
# version suffix that will be appended later by BibDoc
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
_("The uploaded file name is too long and has therefore not been considered").replace('"', '\\"')
elif file_action == 'add' and \
file_doctype in max_files_for_doctype and \
max_files_for_doctype[file_doctype] < \
(len([bibdoc for bibdoc in abstract_bibdocs \
if bibdoc['get_type'] == file_doctype]) + 1):
# User has tried to upload more than allowed for this
# doctype. Should never happen, unless the user did some
# nasty things
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
_("You have already reached the maximum number of files for this type of document").replace('"', '\\"')
else:
# Prepare to move file to
# working_dir/files/updated/doctype/bibdocname/
folder_doctype = file_doctype or \
bibrecdocs.get_bibdoc(file_target).get_type()
folder_bibdocname = file_rename or file_target or filename
new_uploaded_filepath = os.path.join(working_dir, 'files', 'updated',
folder_doctype,
folder_bibdocname, uploaded_filename)
# First check that we do not conflict with an already
# existing bibdoc name
if file_action == "add" and \
((filename in [bibdoc['get_docname'] for bibdoc \
in abstract_bibdocs] and not file_rename) or \
file_rename in [bibdoc['get_docname'] for bibdoc \
in abstract_bibdocs]):
# A file with that name already exist. Cancel action
# and tell user.
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
(_("A file named %s already exists. Please choose another name.") % \
(file_rename or filename)).replace('"', '\\"')
elif file_action == "revise" and \
file_rename != file_target and \
file_rename in [bibdoc['get_docname'] for bibdoc \
in abstract_bibdocs]:
# A file different from the one to revise already has
# the same bibdocname
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
(_("A file named %s already exists. Please choose another name.") % \
file_rename).replace('"', '\\"')
elif file_action == "addFormat" and \
(extension in \
get_extensions_for_docname(file_target,
abstract_bibdocs)):
# A file with that extension already exists. Cancel
# action and tell user.
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
(_("A file with format '%s' already exists. Please upload another format.") % \
extension).replace('"', '\\"')
elif '.' in file_rename or '/' in file_rename or "\\" in file_rename or \
not os.path.abspath(new_uploaded_filepath).startswith(os.path.join(working_dir, 'files', 'updated')):
# We forbid usage of a few characters, for the good of
# everybody...
os.unlink(uploaded_filepath)
body += '<script>alert("%s");</script>' % \
_("You are not allowed to use dot '.', slash '/', or backslash '\\\\' in file names. Choose a different name and upload your file again. In particular, note that you should not include the extension in the renaming field.").replace('"', '\\"')
else:
# No conflict with file name
# When revising, delete previously uploaded files for
# this entry, so that we do not execute the
# corresponding action
if file_action == "revise":
for path_to_delete in \
get_uploaded_files_for_docname(working_dir, file_target):
delete_file(working_dir, path_to_delete)
# Move uploaded file to working_dir/files/updated/doctype/bibdocname/
os.renames(uploaded_filepath, new_uploaded_filepath)
if file_action == "add":
# no need to check bibrecdocs.check_file_exists(new_uploaded_filepath, new_uploaded_format): was done before
# Log
if file_rename != '':
# at this point, bibdocname is specified
# name, no need to 'rename'
filename = file_rename
log_action(working_dir, file_action, filename,
new_uploaded_filepath, file_rename,
file_description, file_comment,
file_doctype, keep_previous_files,
file_restriction)
# Automatically create additional formats when
# possible.
additional_formats = []
if create_related_formats:
additional_formats = createRelatedFormats(new_uploaded_filepath,
overwrite=False)
for additional_format in additional_formats:
# Log
log_action(working_dir, 'addFormat', filename,
additional_format, file_rename,
file_description, file_comment,
file_doctype, True, file_restriction)
if file_action == "revise" and file_target != "":
# Log
log_action(working_dir, file_action, file_target,
new_uploaded_filepath, file_rename,
file_description, file_comment,
file_target_doctype, keep_previous_files,
file_restriction)
# Automatically create additional formats when
# possible.
additional_formats = []
if create_related_formats:
additional_formats = createRelatedFormats(new_uploaded_filepath,
overwrite=False)
for additional_format in additional_formats:
# Log
log_action(working_dir, 'addFormat',
(file_rename or file_target),
additional_format, file_rename,
file_description, file_comment,
file_target_doctype, True,
file_restriction)
if file_action == "addFormat" and file_target != "":
# We have already checked above that this format does
# not already exist.
# Log
log_action(working_dir, file_action, file_target,
new_uploaded_filepath, file_rename,
file_description, file_comment,
file_target_doctype, keep_previous_files,
file_restriction)
elif file_action in ["add", "addFormat"]:
# No file found, but action involved adding file: ask user to
# select a file
body += """<script>
alert("You did not specify a file. Please choose one before uploading.");
</script>"""
elif file_action == "revise" and file_target != "":
# User has chosen to revise attributes of a file (comment,
# name, etc.) without revising the file itself.
if file_rename != file_target and \
file_rename in [bibdoc['get_docname'] for bibdoc \
in abstract_bibdocs]:
# A file different from the one to revise already has
# the same bibdocname
body += '<script>alert("%s");</script>' % \
(_("A file named %s already exists. Please choose another name.") % \
file_rename).replace('"', '\\"')
elif file_rename != file_target and \
('.' in file_rename or '/' in file_rename or "\\" in file_rename):
# We forbid usage of a few characters, for the good of
# everybody...
body += '<script>alert("%s");</script>' % \
_("You are not allowed to use dot '.', slash '/', or backslash '\\\\' in file names. Choose a different name and upload your file again. In particular, note that you should not include the extension in the renaming field.").replace('"', '\\"')
else:
# Log
log_action(working_dir, file_action, file_target,
"", file_rename,
file_description, file_comment,
file_target_doctype, keep_previous_files,
file_restriction)
elif file_action == "delete" and file_target != "" and \
((file_target_doctype in can_delete_doctypes) or \
'*' in can_delete_doctypes):
# Delete previously uploaded files for this entry
for path_to_delete in get_uploaded_files_for_docname(working_dir, file_target):
delete_file(working_dir, path_to_delete)
# Log
log_action(working_dir, file_action, file_target, "", file_rename,
file_description, file_comment, "",
keep_previous_files, file_restriction)
## Display
performed_actions = read_actions_log(working_dir)
#performed_actions = []
if recid:
bibrecdocs = BibRecDocs(recid)
# Create the list of files based on current files and performed
# actions
bibdocs = bibrecdocs.list_bibdocs()
else:
bibdocs = []
abstract_bibdocs = build_updated_files_list(bibdocs, performed_actions,
recid or -1, display_hidden_files)
abstract_bibdocs.sort(key=lambda x: x['order'])
# Display form and necessary CSS + JavaScript
#body += '<div>'
#body += css
js_can_describe_doctypes = repr({}.fromkeys(can_describe_doctypes, ''))
js_can_comment_doctypes = repr({}.fromkeys(can_comment_doctypes, ''))
js_can_restrict_doctypes = repr({}.fromkeys(can_restrict_doctypes, ''))
# Prepare to display file revise panel "balloon". Check if we
# should display the list of doctypes or if it is not necessary (0
# or 1 doctype). Also make sure that we do not exceed the maximum
# number of files specified per doctype. The markup of the list of
# doctypes is prepared here, and will be passed as parameter to
# the display_revise_panel function
cleaned_doctypes = [doctype for doctype in doctypes if
doctype not in max_files_for_doctype or
(max_files_for_doctype[doctype] > \
len([bibdoc for bibdoc in abstract_bibdocs \
if bibdoc['get_type'] == doctype]))]
doctypes_list = ""
if len(cleaned_doctypes) > 1:
doctypes_list = '<select id="fileDoctype" name="fileDoctype" onchange="var idx=this.selectedIndex;var doctype=this.options[idx].value;updateForm(doctype,'+','.join([js_can_describe_doctypes, js_can_comment_doctypes, js_can_restrict_doctypes])+');">' + \
'\n'.join(['<option value="' + cgi.escape(doctype, True) + '">' + \
cgi.escape(description) + '</option>' \
for (doctype, description) \
in doctypes_and_desc if \
doctype in cleaned_doctypes]) + \
'</select>'
elif len(cleaned_doctypes) == 1:
doctypes_list = '<input id="fileDoctype" name="fileDoctype" type="hidden" value="%s" />' % cleaned_doctypes[0]
# Check if we should display the list of access restrictions or if
# it is not necessary
restrictions_list = ""
if len(restrictions_and_desc) > 1:
restrictions_list = '<select id="fileRestriction" name="fileRestriction">' + \
'\n'.join(['<option value="' + cgi.escape(restriction, True) + '">' + \
cgi.escape(description) + '</option>' \
for (restriction, description) \
in restrictions_and_desc]) + \
'</select>'
restrictions_list = '''<label for="fileRestriction">%(restriction_label)s:</label>&nbsp;%(restrictions_list)s&nbsp;<small>[<a href="" onclick="alert('%(restriction_help)s');return false;">?</a>]</small>''' % \
{'restrictions_list': restrictions_list,
'restriction_label': restriction_label,
'restriction_help': _('Choose how you want to restrict access to this file.').replace("'", "\\'")}
elif len(restrictions_and_desc) == 1:
restrictions_list = '<select style="display:none" id="fileRestriction" name="fileRestriction"><option value="%(restriction_attr)s">%(restriction)s</option></select>' % {
'restriction': cgi.escape(restrictions_and_desc[0][0]),
'restriction_attr': cgi.escape(restrictions_and_desc[0][0], True)
}
else:
restrictions_list = '<select style="display:none" id="fileRestriction" name="fileRestriction"></select>'
# List the files
body += '''
<div id="reviseControl">
<table class="reviseControlBrowser">'''
i = 0
for bibdoc in abstract_bibdocs:
if bibdoc['list_latest_files']:
i += 1
body += create_file_row(bibdoc, can_delete_doctypes,
can_rename_doctypes,
can_revise_doctypes,
can_describe_doctypes,
can_comment_doctypes,
can_keep_doctypes,
can_add_format_to_doctypes,
doctypes_list,
show_links,
can_restrict_doctypes,
even=not (i % 2),
ln=ln,
form_url_params=form_url_params,
protect_hidden_files=protect_hidden_files)
body += '</table>'
if len(cleaned_doctypes) > 0:
(revise_panel, javascript_prefix) = javascript_display_revise_panel(action='add', target='', show_doctypes=True, show_keep_previous_versions=False, show_rename=can_name_new_files, show_description=True, show_comment=True, bibdocname='', description='', comment='', show_restrictions=True, restriction=len(restrictions_and_desc) > 0 and restrictions_and_desc[0][0] or '', doctypes=doctypes_list)
body += '''%(javascript_prefix)s<input type="button" onclick="%(display_revise_panel)s;updateForm('%(defaultSelectedDoctype)s', %(can_describe_doctypes)s, %(can_comment_doctypes)s, %(can_restrict_doctypes)s);return false;" value="%(add_new_file)s"/>''' % \
{'display_revise_panel': revise_panel,
'javascript_prefix': javascript_prefix,
'defaultSelectedDoctype': escape_javascript_string(cleaned_doctypes[0], escape_quote_for_html=True),
'add_new_file': _("Add new file"),
'can_describe_doctypes':js_can_describe_doctypes,
'can_comment_doctypes': repr({}.fromkeys(can_comment_doctypes, '')),
'can_restrict_doctypes': repr({}.fromkeys(can_restrict_doctypes, ''))}
body += '</div>'
if print_envelope:
# We should print this only if we display for the first time
body = '<div id="uploadFileInterface">' + body + '</div>'
if include_headers:
body = get_upload_file_interface_javascript(form_url_params) + \
get_upload_file_interface_css() + \
body
# Display markup of the revision panel. This one is also
# printed only at the beginning, so that it does not need to
# be returned with each response
body += revise_balloon % \
{'CFG_SITE_URL': CFG_SITE_URL,
'file_label': file_label,
'filename_label': filename_label,
'description_label': description_label,
'comment_label': comment_label,
'restrictions': restrictions_list,
'previous_versions_help': _('You can decide to hide or not previous version(s) of this file.').replace("'", "\\'"),
'revise_format_help': _('When you revise a file, the additional formats that you might have previously uploaded are removed, since they are no longer up-to-date with the new file.').replace("'", "\\'"),
'revise_format_warning': _('Alternative formats uploaded for current version of this file will be removed'),
'previous_versions_label': _('Keep previous versions'),
'cancel': _('Cancel'),
'upload': _('Upload'),
'uploading_label': _('Uploading...'),
'postprocess_label': _('Please wait...'),
'submit_or_button': form_url_params and 'button' or 'submit'}
body += '''
<input type="hidden" name="recid" value="%(recid)i"/>
<input type="hidden" name="ln" value="%(ln)s"/>
''' % \
{'recid': recid or -1,
'ln': ln}
# End submission button
if sbm_curdir:
body += '''<br /><div style="font-size:small">
<input type="button" class="adminbutton" name="Submit" id="applyChanges" value="%(apply_changes)s" onClick="nextStep();"></div>''' % \
{'apply_changes': _("Apply changes")}
# Display a link to support email in case users have problem
# revising/adding files
mailto_link = create_html_mailto(email=CFG_SITE_SUPPORT_EMAIL,
subject=_("Need help revising or adding files to record %(recid)s") % \
{'recid': recid or ''},
body=_("""Dear Support,
I would need help to revise or add a file to record %(recid)s.
I have attached the new version to this email.
Best regards""") % {'recid': recid or ''})
problem_revising = _('Having a problem revising a file? Send the revised version to %(mailto_link)s.') % {'mailto_link': mailto_link}
if len(cleaned_doctypes) > 0:
# We can add files, so change note
problem_revising = _('Having a problem adding or revising a file? Send the new/revised version to %(mailto_link)s.') % {'mailto_link': mailto_link}
body += '<br />'
body += problem_revising
if print_envelope and print_outside_form_tag:
body = '<form method="post" action="/%s/managedocfilesasync" id="uploadFileForm">' % CFG_SITE_RECORD + body + '</form>'
return (0, body)
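# Throughout this module, per-doctype permissions are tested with the pattern
# "doctype in allowed_list or '*' in allowed_list", where '*' means "all
# doctypes", combined with the hidden-file guard (hidden_p /
# protect_hidden_files). A minimal standalone sketch of that rule follows;
# the helper name is hypothetical and not part of the Invenio API:

```python
def doctype_action_allowed(doctype, allowed_doctypes,
                           hidden_p=False, protect_hidden_files=True):
    """Illustrative helper: is an action permitted for this doctype?

    '*' in allowed_doctypes grants the action for every doctype;
    hidden files are protected unless protect_hidden_files is False.
    """
    if hidden_p and protect_hidden_files:
        # Hidden files may not be revised/deleted/reformatted
        return False
    return doctype in allowed_doctypes or '*' in allowed_doctypes
```

# Note the grouping: the membership test and the wildcard test belong
# together, evaluated before the hidden-file guard; since "and" binds more
# tightly than "or" in Python, leaving out the parentheses would let
# explicitly listed doctypes bypass the guard.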
def create_file_row(abstract_bibdoc, can_delete_doctypes,
can_rename_doctypes, can_revise_doctypes,
can_describe_doctypes, can_comment_doctypes,
can_keep_doctypes, can_add_format_to_doctypes,
doctypes_list, show_links, can_restrict_doctypes,
even=False, ln=CFG_SITE_LANG, form_url_params='',
protect_hidden_files=True):
"""
Creates a row in the files list representing the given abstract_bibdoc
@param abstract_bibdoc: list of "fake" BibDocs: it is a list of dictionaries
with keys 'list_latest_files' and 'get_docname' with
values corresponding to what you would expect to receive
when calling their counterpart function on a real BibDoc
object.
@param can_delete_doctypes: list of doctypes for which we allow users to delete
documents
@param can_revise_doctypes: the list of doctypes that users are
allowed to revise.
@param can_describe_doctypes: the list of doctypes that users are
allowed to describe.
@param can_comment_doctypes: the list of doctypes that users are
allowed to comment.
@param can_keep_doctypes: the list of doctypes for which users can
choose to keep previous versions visible
when revising a file (i.e. 'Keep previous
version' checkbox).
@param can_rename_doctypes: the list of doctypes that users are
allowed to rename (when revising)
@param can_add_format_to_doctypes: the list of doctypes for which users can
add new formats
@param show_links: if we display links to files
@param even: if the row is even or odd on the list
@type even: boolean
@param ln: language
@type ln: string
@param form_url_params: the query string (URL parameters) to append
to the form action URL when submitting
@type form_url_params: string
@param protect_hidden_files: if bibdoc containing bibdocfiles
flagged as 'HIDDEN' can be edited
(revise, delete, add format) or not.
@type protect_hidden_files: boolean
@return: an HTML formatted "file" row
@rtype: string
"""
_ = gettext_set_language(ln)
# Try to retrieve "main format", to display as link for the
# file. There is no such concept in BibDoc, but let's just try to
# get the pdf file if it exists
main_bibdocfile = [bibdocfile for bibdocfile in abstract_bibdoc['list_latest_files'] \
if bibdocfile.get_format().strip('.').lower() == 'pdf']
if len(main_bibdocfile) > 0:
main_bibdocfile = main_bibdocfile[0]
else:
main_bibdocfile = abstract_bibdoc['list_latest_files'][0]
main_bibdocfile_description = main_bibdocfile.get_description()
if main_bibdocfile_description is None:
main_bibdocfile_description = ''
updated = abstract_bibdoc['updated'] # Has BibDoc been updated?
hidden_p = abstract_bibdoc['hidden_p']
# Main file row
out = '<tr%s>' % (even and ' class="even"' or '')
out += '<td class="reviseControlFileColumn"%s>' % (hidden_p and ' style="color:#99F"' or '')
if not updated and show_links and not hidden_p:
out += '<a target="_blank" href="' + main_bibdocfile.get_url() \
+ '">'
out += cgi.escape(abstract_bibdoc['get_docname'])
if hidden_p:
out += ' <span style="font-size:small;font-style:italic;color:#888">(hidden)</span>'
if not updated and show_links and not hidden_p:
out += '</a>'
if main_bibdocfile_description:
out += ' (<em>' + cgi.escape(main_bibdocfile_description) + '</em>)'
out += '</td>'
(description, comment) = get_description_and_comment(abstract_bibdoc['list_latest_files'])
restriction = abstract_bibdoc['get_status']
# Revise link
out += '<td class="reviseControlActionColumn">'
if (main_bibdocfile.get_type() in can_revise_doctypes or \
'*' in can_revise_doctypes) and not (hidden_p and protect_hidden_files):
(revise_panel, javascript_prefix) = javascript_display_revise_panel(
action='revise',
target=abstract_bibdoc['get_docname'],
show_doctypes=False,
show_keep_previous_versions=(main_bibdocfile.get_type() in can_keep_doctypes) or '*' in can_keep_doctypes,
show_rename=(main_bibdocfile.get_type() in can_rename_doctypes) or '*' in can_rename_doctypes,
show_description=(main_bibdocfile.get_type() in can_describe_doctypes) or '*' in can_describe_doctypes,
show_comment=(main_bibdocfile.get_type() in can_comment_doctypes) or '*' in can_comment_doctypes,
bibdocname=abstract_bibdoc['get_docname'],
description=description,
comment=comment,
show_restrictions=(main_bibdocfile.get_type() in can_restrict_doctypes) or '*' in can_restrict_doctypes,
restriction=restriction,
doctypes=doctypes_list)
out += '%(javascript_prefix)s[<a href="" onclick="%(display_revise_panel)s;return false;">%(revise)s</a>]' % \
{'display_revise_panel': revise_panel,
'javascript_prefix': javascript_prefix,
'revise': _("revise")
}
# Delete link
if (main_bibdocfile.get_type() in can_delete_doctypes or \
'*' in can_delete_doctypes) and not (hidden_p and protect_hidden_files):
global params_id
params_id += 1
out += '''
<script type="text/javascript">
/*<![CDATA[*/
var delete_panel_params_%(id)i = "%(bibdocname)s";
/*]]>*/
</script>
[<a href="" onclick="return askDelete(delete_panel_params_%(id)i, '%(form_url_params)s')">%(delete)s</a>]
''' % {'bibdocname': escape_javascript_string(abstract_bibdoc['get_docname'], escape_for_html=False),
'delete': _("delete"),
'form_url_params': form_url_params or '',
'id': params_id}
out += '''</td></tr>'''
# Format row
out += '''<tr%s>
<td class="reviseControlFormatColumn"%s>
<img src="%s/img/tree_branch.gif" alt="">
''' % (even and ' class="even"' or '', hidden_p and ' style="color:#999"' or '', CFG_SITE_URL)
for bibdocfile in abstract_bibdoc['list_latest_files']:
if not updated and show_links and not hidden_p:
out += '<a target="_blank" href="' + bibdocfile.get_url() + '">'
out += bibdocfile.get_format().strip('.')
if not updated and show_links and not hidden_p:
out += '</a>'
out += ' '
# Close the format column, then add format link
out += '</td><td class="reviseControlActionColumn">'
if (main_bibdocfile.get_type() in can_add_format_to_doctypes or \
'*' in can_add_format_to_doctypes) and not (hidden_p and protect_hidden_files):
(revise_panel, javascript_prefix) = javascript_display_revise_panel(
action='addFormat',
target=abstract_bibdoc['get_docname'],
show_doctypes=False,
show_keep_previous_versions=False,
show_rename=False,
show_description=False,
show_comment=False,
bibdocname='',
description='',
comment='',
show_restrictions=False,
restriction=restriction,
doctypes=doctypes_list)
out += '%(javascript_prefix)s[<a href="" onclick="%(display_revise_panel)s;return false;">%(add_format)s</a>]' % \
{'display_revise_panel': revise_panel,
'javascript_prefix': javascript_prefix,
'add_format':_("add format")}
out += '</td></tr>'
return out
def build_updated_files_list(bibdocs, actions, recid, display_hidden_files=False):
"""
Parses the list of BibDocs and builds an updated version reflecting
the changes performed by the user on the files.
It is necessary to abstract the BibDocs since the actions the user
performs on the files are committed only at the end of the session.
@param bibdocs: the original list of bibdocs on which we want to
build a new updated list
@param actions: the list of actions performed by the user on the
files, and that we want to consider to build an
updated file list
@param recid: the record ID to which the files belong
@param display_hidden_files: if bibdoc containing bibdocfiles
flagged as 'HIDDEN' should be
displayed or not.
@type display_hidden_files: boolean
"""
abstract_bibdocs = {}
i = 0
for bibdoc in bibdocs:
hidden_p = True in [bibdocfile.hidden_p() for bibdocfile in bibdoc.list_latest_files()]
if CFG_CERN_SITE:
hidden_p = False # Temporary workaround. See Ticket #846
if not display_hidden_files and hidden_p:
# Do not consider hidden files
continue
i += 1
status = bibdoc.get_status()
if status == "DELETED":
status = ''
brd = BibRecDocs(recid)
abstract_bibdocs[brd.get_docname(bibdoc.id)] = \
{'list_latest_files': bibdoc.list_latest_files(),
'get_docname': brd.get_docname(bibdoc.id),
'updated': False,
'get_type': bibdoc.get_type(),
'get_status': status,
'order': i,
'hidden_p': hidden_p}
for action, bibdoc_name, file_path, rename, description, \
comment, doctype, keep_previous_versions, \
file_restriction in actions:
dirname, filename, fileformat = decompose_file(file_path)
i += 1
if action in ["add", "revise"] and \
os.path.exists(file_path):
checksum = calculate_md5(file_path)
order = i
if action == "revise" and \
abstract_bibdocs.has_key(bibdoc_name):
# Keep previous values
order = abstract_bibdocs[bibdoc_name]['order']
doctype = abstract_bibdocs[bibdoc_name]['get_type']
if bibdoc_name.strip() == '' and rename.strip() == '':
bibdoc_name = os.path.extsep.join(filename.split(os.path.extsep)[:-1])
elif rename.strip() != '' and \
abstract_bibdocs.has_key(bibdoc_name):
# Keep previous position
del abstract_bibdocs[bibdoc_name]
# First instantiate a fake BibDocMoreInfo object, without any side effect
more_info = BibDocMoreInfo(1, cache_only = False, initial_data = {})
if description is not None:
more_info['descriptions'] = {1: {fileformat:description}}
if comment is not None:
more_info['comments'] = {1: {fileformat:comment}}
abstract_bibdocs[(rename or bibdoc_name)] = \
{'list_latest_files': [BibDocFile(file_path, [(int(recid), doctype,(rename or bibdoc_name))], version=1,
docformat=fileformat,
docid=-1,
status=file_restriction,
checksum=checksum,
more_info=more_info)],
'get_docname': rename or bibdoc_name,
'get_type': doctype,
'updated': True,
'get_status': file_restriction,
'order': order,
'hidden_p': False}
abstract_bibdocs[(rename or bibdoc_name)]['updated'] = True
elif action == "revise" and not file_path:
# revision of attributes of a file (description, name,
# comment or restriction) but no new file.
abstract_bibdocs[bibdoc_name]['get_docname'] = rename or bibdoc_name
abstract_bibdocs[bibdoc_name]['get_status'] = file_restriction
set_description_and_comment(abstract_bibdocs[bibdoc_name]['list_latest_files'],
description, comment)
abstract_bibdocs[bibdoc_name]['updated'] = True
elif action == "delete":
if abstract_bibdocs.has_key(bibdoc_name):
del abstract_bibdocs[bibdoc_name]
elif action == "addFormat" and \
os.path.exists(file_path):
checksum = calculate_md5(file_path)
# Preserve type and status
doctype = abstract_bibdocs[bibdoc_name]['get_type']
file_restriction = abstract_bibdocs[bibdoc_name]['get_status']
# First instantiate a fake BibDocMoreInfo object, without any side effect
more_info = BibDocMoreInfo(1, cPickle.dumps({}))
if description is not None:
more_info['descriptions'] = {1: {fileformat:description}}
if comment is not None:
more_info['comments'] = {1: {fileformat:comment}}
abstract_bibdocs[bibdoc_name]['list_latest_files'].append(\
BibDocFile(file_path, [(int(recid), doctype, (rename or bibdoc_name))], version=1,
docformat=fileformat,
docid=-1, status='',
checksum=checksum, more_info=more_info))
abstract_bibdocs[bibdoc_name]['updated'] = True
return abstract_bibdocs.values()
def _read_file_revision_interface_configuration_from_disk(working_dir):
"""
Read the configuration of the file revision interface from disk
@param working_dir: the path to the working directory where we can find
the configuration file
"""
input_file = open(os.path.join(working_dir, 'upload_interface.config'), 'rb')
configuration = cPickle.load(input_file)
input_file.close()
return configuration
def _write_file_revision_interface_configuration_to_disk(working_dir, parameters):
"""
Write the configuration of the file revision interface to disk
@param working_dir: the path to the working directory where we should
write the configuration.
@param parameters: the parameters to write to disk
"""
output = open(os.path.join(working_dir, 'upload_interface.config'), 'wb')
cPickle.dump(parameters, output)
output.close()
def log_action(log_dir, action, bibdoc_name, file_path, rename,
description, comment, doctype, keep_previous_versions,
file_restriction):
"""
Logs a new action performed by user on a BibDoc file.
The log file record one action per line, each column being split
by '<--->' ('---' is escaped from values 'rename', 'description',
'comment' and 'bibdoc_name'). The original request for this
format was motivated by the need to have it easily readable by
other scripts. Not sure it still makes sense nowadays...
Newlines are also reserved, and are escaped from the input values
(necessary for the 'comment' field, which is the only one allowing
newlines from the browser)
Each line starts with the time of the action in the following
format: '2008-06-20 08:02:04 --> '
@param log_dir: directory where to save the log (ie. working_dir)
@param action: the performed action (one of 'revise', 'delete',
'add', 'addFormat')
@param bibdoc_name: the name of the bibdoc on which the change is
applied
@param file_path: the path to the file that is going to be
integrated as bibdoc, if any (should be ""
in case of action="delete", or action="revise"
when revising only attributes of a file)
@param rename: the name used to display the bibdoc, instead of the
filename (can be None for no renaming)
@param description: a description associated with the file
@param comment: a comment associated with the file
@param doctype: the category in which the file is going to be
integrated
@param keep_previous_versions: if the previous versions of this
file are to be kept visible (1) or hidden (0)
@param file_restriction: the restriction applied to the
file. Empty string if no restriction
"""
log_file = os.path.join(log_dir, 'bibdocactions.log')
try:
file_desc = open(log_file, "a+")
# We must escape new lines from comments in some way:
comment = str(comment).replace('\\', '\\\\').replace('\r\n', '\\n\\r')
msg = action + '<--->' + \
bibdoc_name.replace('---', '___') + '<--->' + \
file_path + '<--->' + \
str(rename).replace('---', '___') + '<--->' + \
str(description).replace('---', '___') + '<--->' + \
comment.replace('---', '___') + '<--->' + \
doctype + '<--->' + \
str(int(keep_previous_versions)) + '<--->' + \
file_restriction + '\n'
file_desc.write("%s --> %s" %(time.strftime("%Y-%m-%d %H:%M:%S"), msg))
file_desc.close()
except Exception:
raise
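As a standalone illustration of the '<--->' log format described in the docstring above, the following sketch escapes and parses one action line the same way log_action() and read_actions_log() do. The helper names here are hypothetical, not part of this module:

```python
def escape_field(value):
    # '---' is reserved as part of the '<--->' separator
    return str(value).replace('---', '___')

def make_log_line(action, bibdoc_name, file_path, rename, description,
                  comment, doctype, keep_previous_versions, file_restriction):
    # Escape backslashes first, then newlines, as log_action() does
    comment = str(comment).replace('\\', '\\\\').replace('\r\n', '\\n\\r')
    fields = [action,
              escape_field(bibdoc_name),
              file_path,
              escape_field(rename),
              escape_field(description),
              escape_field(comment),
              doctype,
              str(int(keep_previous_versions)),
              file_restriction]
    return '<--->'.join(fields)

def parse_log_line(line):
    fields = line.rstrip('\n').split('<--->')
    # Undo the newline escaping applied to the comment field (index 5)
    fields[5] = fields[5].replace('\\n\\r', '\r\n').replace('\\\\', '\\')
    return fields

line = make_log_line('revise', 'main', '/tmp/f.pdf', 'report', 'desc',
                     'line1\r\nline2', 'Main', 1, '')
fields = parse_log_line(line)
```

The round trip preserves all nine fields, including multi-line comments.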
def read_actions_log(log_dir):
"""
Reads the logs of action to be performed on files
See log_action(..) for more information about the structure of the
log file.
@param log_dir: the path to the directory from which to read the
log file
@type log_dir: string
"""
actions = []
log_file = os.path.join(log_dir, 'bibdocactions.log')
try:
file_desc = open(log_file, "r")
for line in file_desc.readlines():
(timestamp, action) = line.split(' --> ', 1)
try:
(action, bibdoc_name, file_path, rename, description,
comment, doctype, keep_previous_versions,
file_restriction) = action.rstrip('\n').split('<--->')
except ValueError:
# Malformed action log: skip this line
continue
# Clean newline-escaped comment:
comment = comment.replace('\\n\\r', '\r\n').replace('\\\\', '\\')
# Perform some checking
if action not in CFG_ALLOWED_ACTIONS:
# Malformed action log: skip this line
continue
try:
keep_previous_versions = int(keep_previous_versions)
except ValueError:
# Malformed action log: default to keeping previous versions
keep_previous_versions = 1
actions.append((action, bibdoc_name, file_path, rename, \
description, comment, doctype,
keep_previous_versions, file_restriction))
file_desc.close()
except IOError:
# No log file yet: no actions have been performed
pass
return actions
def javascript_display_revise_panel(action, target, show_doctypes, show_keep_previous_versions, show_rename, show_description, show_comment, bibdocname, description, comment, show_restrictions, restriction, doctypes):
"""
Returns a correctly encoded call to the javascript function to
display the revision panel.
"""
global params_id
params_id += 1
javascript_prefix = '''
<script type="text/javascript">
/*<![CDATA[*/
var revise_panel_params_%(id)i = {"action": "%(action)s",
"target": "%(target)s",
"showDoctypes": %(showDoctypes)s,
"showKeepPreviousVersions": %(showKeepPreviousVersions)s,
"showRename": %(showRename)s,
"showDescription": %(showDescription)s,
"showComment": %(showComment)s,
"bibdocname": "%(bibdocname)s",
"description": "%(description)s",
"comment": "%(comment)s",
"showRestrictions": %(showRestrictions)s,
"restriction": "%(restriction)s",
"doctypes": "%(doctypes)s"}
/*]]>*/
</script>''' % {'id': params_id,
'action': action,
'showDoctypes': show_doctypes and 'true' or 'false',
'target': escape_javascript_string(target, escape_for_html=False),
'bibdocname': escape_javascript_string(bibdocname, escape_for_html=False),
'showRename': show_rename and 'true' or 'false',
'showKeepPreviousVersions': show_keep_previous_versions and 'true' or 'false',
'showComment': show_comment and 'true' or 'false',
'showDescription': show_description and 'true' or 'false',
'description': description and escape_javascript_string(description, escape_for_html=False) or '',
'comment': comment and escape_javascript_string(comment, escape_for_html=False) or '',
'showRestrictions': show_restrictions and 'true' or 'false',
'restriction': escape_javascript_string(restriction, escape_for_html=False),
'doctypes': escape_javascript_string(doctypes, escape_for_html=False)}
return ('display_revise_panel(this, revise_panel_params_%(id)i)' % {'id': params_id},
javascript_prefix)
def get_uploaded_files_for_docname(log_dir, docname):
"""
Given a docname, returns the paths to the files uploaded for this
revision session.
@param log_dir: the path to the directory that should contain the
uploaded files.
@param docname: the name of the bibdoc for which we want to
retrieve files.
"""
return [file_path for action, bibdoc_name, file_path, rename, \
description, comment, doctype, keep_previous_versions, \
file_restriction in read_actions_log(log_dir) \
if bibdoc_name == docname and os.path.exists(file_path)]
def get_bibdoc_for_docname(docname, abstract_bibdocs):
"""
Given a docname, returns the corresponding bibdoc from the
'abstract' bibdocs.
Return None if not found
@param docname: the name of the bibdoc we want to retrieve
@param abstract_bibdocs: the list of bibdocs from which we want to
retrieve the bibdoc
"""
bibdocs = [bibdoc for bibdoc in abstract_bibdocs \
if bibdoc['get_docname'] == docname]
if len(bibdocs) > 0:
return bibdocs[0]
else:
return None
def get_extensions_for_docname(docname, abstract_bibdocs):
"""
Returns the list of extensions that exist for the given bibdoc
name in the given 'abstract' bibdocs.
@param docname: the name of the bibdoc for which we want to
retrieve the available extensions
@param abstract_bibdocs: the list of bibdocs from which we want to
retrieve the bibdoc extensions
"""
bibdocfiles = [bibdoc['list_latest_files'] for bibdoc \
in abstract_bibdocs \
if bibdoc['get_docname'] == docname]
if len(bibdocfiles) > 0:
# There should always be at most 1 matching docname, or 0 if
# it is a new file
return [bibdocfile.get_format() for bibdocfile \
in bibdocfiles[0]]
return []
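The lookup above can be exercised in isolation with plain dicts standing in for the abstract bibdocs. The 'formats' key below is a simplified stand-in for 'list_latest_files', not the real structure:

```python
def extensions_for_docname(docname, abstract_bibdocs):
    # 'formats' here stands in for the extensions of the latest files
    matches = [bibdoc['formats'] for bibdoc in abstract_bibdocs
               if bibdoc['get_docname'] == docname]
    if matches:
        # At most one bibdoc matches a given docname
        return matches[0]
    return []

docs = [{'get_docname': 'main', 'formats': ['.pdf', '.ps']},
        {'get_docname': 'figure', 'formats': ['.png']}]
exts = extensions_for_docname('main', docs)
```

An unknown docname yields an empty list, matching the behaviour for files that are new in the session.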
def is_hidden_for_docname(docname, abstract_bibdocs):
"""
Returns True if the bibdoc with given docname in abstract_bibdocs
should be hidden. Also return True if docname cannot be found in
abstract_bibdocs.
@param docname: the name of the bibdoc for which we want to
check if it is hidden or not
@param abstract_bibdocs: the list of bibdocs from which we want to
look for the given docname
"""
bibdocs = [bibdoc for bibdoc in abstract_bibdocs \
if bibdoc['get_docname'] == docname]
if len(bibdocs) > 0:
return bibdocs[0]['hidden_p']
return True
def get_description_and_comment(bibdocfiles):
"""
Returns the first description and comment found in the given list
of bibdocfiles, as a tuple (description, comment).
Description and/or comment can be None.
This function is needed since we consider that there is one
comment/description per bibdoc, and not one per bibdocfile as the
APIs allow.
@param bibdocfiles: the list of files of a given bibdoc for which
we want to extract the description and comment.
"""
description = None
comment = None
all_descriptions = [bibdocfile.get_description() for bibdocfile \
in bibdocfiles
if bibdocfile.get_description() not in ['', None]]
if len(all_descriptions) > 0:
description = all_descriptions[0]
all_comments = [bibdocfile.get_comment() for bibdocfile \
in bibdocfiles
if bibdocfile.get_comment() not in ['', None]]
if len(all_comments) > 0:
comment = all_comments[0]
return (description, comment)
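A minimal sketch of the "first non-empty value wins" rule implemented above, with plain (description, comment) tuples standing in for BibDocFile objects:

```python
def first_description_and_comment(pairs):
    # pairs: list of (description, comment) tuples, one per file.
    # The first non-empty description and the first non-empty comment
    # win, independently of each other.
    description = None
    comment = None
    descriptions = [d for (d, c) in pairs if d not in ['', None]]
    if descriptions:
        description = descriptions[0]
    comments = [c for (d, c) in pairs if c not in ['', None]]
    if comments:
        comment = comments[0]
    return (description, comment)

result = first_description_and_comment([('', 'note'),
                                        ('Main PDF', None),
                                        ('Other', 'x')])
```

Note that the winning description and comment may come from different files, exactly as in the function above.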
def set_description_and_comment(abstract_bibdocfiles, description, comment):
"""
Set the description and comment to the given (abstract)
bibdocfiles.
Description and/or comment can be None.
This function is needed since we consider that there is one
comment/description per bibdoc, and not one per bibdocfile as the
APIs allow.
@param abstract_bibdocfiles: the list of 'abstract' files of a
given bibdoc for which we want to set the
description and comment.
@param description: the new description
@param comment: the new comment
"""
for bibdocfile in abstract_bibdocfiles:
bibdocfile.description = description
bibdocfile.comment = comment
def delete_file(working_dir, file_path):
"""
Deletes the file at the given path.
In fact, we just move it to working_dir/files/trash
@param working_dir: the path to the working directory
@param file_path: the path to the file to delete
"""
if os.path.exists(file_path):
filename = os.path.split(file_path)[1]
move_to = os.path.join(working_dir, 'files', 'trash',
filename +'_' + str(time.time()))
os.renames(file_path, move_to)
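The soft-delete above only computes a unique trash destination and moves the file there; the naming scheme can be sketched on its own (helper name is hypothetical):

```python
import os
import time

def trash_destination(working_dir, file_path):
    # Mirror delete_file(): soft-delete by moving the file under
    # working_dir/files/trash, suffixing the name with a timestamp
    # so repeated deletions of the same filename never collide.
    filename = os.path.split(file_path)[1]
    return os.path.join(working_dir, 'files', 'trash',
                        filename + '_' + str(time.time()))

dest = trash_destination('/tmp/work', '/tmp/work/files/myfile/report.pdf')
```

os.renames() then creates any missing intermediate directories when performing the move.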
def wash_form_parameters(form, abstract_bibdocs, can_keep_doctypes,
keep_default, can_describe_doctypes,
can_comment_doctypes, can_rename_doctypes,
can_name_new_files, can_restrict_doctypes,
doctypes_to_default_filename, working_dir):
"""
Washes the (user-defined) form parameters, taking into account the
current state of the files and the admin defaults.
@param form: the form of the function
@param abstract_bibdocs: a representation of the current state of
the files, as returned by
build_updated_file_list(..)
@param can_keep_doctypes: the list of doctypes for which we allow
users to choose to keep or not the
previous versions when revising.
@type can_keep_doctypes: list
@param keep_default: the admin-defined default for when users
cannot choose to keep or not previous version
of a revised file
@type keep_default: boolean
@param can_describe_doctypes: the list of doctypes for which we
let users define descriptions.
@type can_describe_doctypes: list
@param can_comment_doctypes: the list of doctypes for which we let
users define comments.
@type can_comment_doctypes: list
@param can_rename_doctypes: the list of doctypes for which we let
users rename bibdoc when revising.
@type can_rename_doctypes: list
@param can_name_new_files: if we let users choose a name when
adding new files.
@type can_name_new_files: boolean
@param can_restrict_doctypes: the list of doctypes for which we
let users define access
restrictions.
@type can_restrict_doctypes: list
@param doctypes_to_default_filename: mapping from doctype to
admin-chosen name for
uploaded file.
@type doctypes_to_default_filename: dict
@param working_dir: the path to the current working directory
@type working_dir: string
@return: tuple (file_action, file_target, file_target_doctype,
keep_previous_files, file_description, file_comment,
file_rename, file_doctype, file_restriction) where::
file_action: *str* the performed action ('add',
'revise','addFormat' or 'delete')
file_target: *str* the bibdocname of the file on which the
action is performed (empty string when
file_action=='add')
file_target_doctype: *str* the doctype of the file we will
work on. Eg: ('main',
'additional'). Empty string with
file_action=='add'.
keep_previous_files: *bool* if we keep the previous version of
the file or not. Only useful when
revising files.
file_description: *str* the user-defined description to apply
to the file. Empty string when no
description defined or when not applicable
file_comment: *str* the user-defined comment to apply to the
file. Empty string when no comment defined or
when not applicable
file_rename: *str* the new name chosen by user for the
bibdoc. Empty string when not defined or when not
applicable.
file_doctype: *str* the user-chosen doctype for the bibdoc
when file_action=='add', or the current doctype
of the file_target in other cases (doctype must
be preserved).
file_restriction: *str* the user-selected restriction for the
file. Empty string if not defined or when
not applicable.
file_name: *str* the original name of the uploaded file. None
if no file uploaded
file_path: *str* the full path to the file
@rtype: tuple(string, string, string, boolean, string, string,
string, string, string, string, string)
"""
# Action performed ...
if form.has_key("fileAction") and \
form['fileAction'] in CFG_ALLOWED_ACTIONS:
file_action = str(form['fileAction']) # "add", "revise",
# "addFormat" or "delete"
else:
file_action = ""
# ... on file ...
if form.has_key("fileTarget"):
file_target = str(form['fileTarget']) # contains bibdocname
# Also remember its doctype to make sure we do valid actions
# on it
corresponding_bibdoc = get_bibdoc_for_docname(file_target,
abstract_bibdocs)
if corresponding_bibdoc is not None:
file_target_doctype = corresponding_bibdoc['get_type']
else:
file_target_doctype = ""
else:
file_target = ""
file_target_doctype = ""
# ... with doctype?
# Only useful when adding file: otherwise fileTarget doctype is
# preserved
file_doctype = file_target_doctype
if form.has_key("fileDoctype") and \
file_action == 'add':
file_doctype = str(form['fileDoctype'])
# ... keeping previous version? ...
if file_target_doctype != '' and \
not form.has_key("keepPreviousFiles"):
# no corresponding key. Two possibilities:
if file_target_doctype in can_keep_doctypes or \
'*' in can_keep_doctypes:
# User decided not to keep
keep_previous_files = 0
else:
# No choice for user. Use default admin has chosen
keep_previous_files = keep_default
else:
# Checkbox seems to be checked ...
if file_target_doctype in can_keep_doctypes or \
'*' in can_keep_doctypes:
# ...and this is allowed
keep_previous_files = 1
else:
# ...but this is not allowed
keep_previous_files = keep_default
# ... and description? ...
if form.has_key("description") and \
(((file_action == 'revise' and \
(file_target_doctype in can_describe_doctypes)) or \
(file_action == 'add' and \
(file_doctype in can_describe_doctypes))) \
or '*' in can_describe_doctypes):
file_description = str(form['description'])
else:
file_description = ''
# ... and comment? ...
if form.has_key("comment") and \
(((file_action == 'revise' and \
(file_target_doctype in can_comment_doctypes)) or \
(file_action == 'add' and \
(file_doctype in can_comment_doctypes))) \
or '*' in can_comment_doctypes):
file_comment = str(form['comment'])
else:
file_comment = ''
# ... and rename to ? ...
if form.has_key("rename") and \
((file_action == "revise" and \
((file_target_doctype in can_rename_doctypes) or \
'*' in can_rename_doctypes)) or \
(file_action == "add" and \
can_name_new_files)):
file_rename = str(form['rename']) # contains new bibdocname if applicable
elif file_action == "add" and \
doctypes_to_default_filename.has_key(file_doctype):
# Admin-chosen name.
file_rename = doctypes_to_default_filename[file_doctype]
if file_rename.lower().startswith('file:'):
# We will define name at a later stage, i.e. when
# submitting the file with bibdocfile. The name will be
# chosen by reading content of a file in curdir
file_rename = ''
else:
# Ensure name is unique, by appending a suffix
file_rename = doctypes_to_default_filename[file_doctype]
file_counter = 2
while get_bibdoc_for_docname(file_rename, abstract_bibdocs):
if file_counter == 2:
file_rename += '-2'
else:
file_rename = file_rename[:-len(str(file_counter))] + \
str(file_counter)
file_counter += 1
else:
file_rename = ''
# ... and file restriction ? ...
file_restriction = ''
if form.has_key("fileRestriction"):
# We cannot clean that value as it could be a restriction
# declared in another submission. We keep this value.
file_restriction = str(form['fileRestriction'])
# ... and the file itself ? ...
if form.has_key('myfile') and \
hasattr(form['myfile'], "filename") and \
form['myfile'].filename:
dir_to_open = os.path.join(working_dir, 'files', 'myfile')
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except:
pass
# Shall we continue?
if os.path.exists(dir_to_open):
form_field = form['myfile']
file_name = form_field.filename
form_file = form_field.file
## Before saving the file to disk, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
file_name = os.path.basename(file_name.split('\\')[-1])
file_name = file_name.strip()
if file_name != "":
# Write the file to disk in chunks, to avoid loading it
# entirely into memory
file_path = os.path.join(dir_to_open, file_name)
if not os.path.exists(file_path):
# If file already exists, it means that it was
# handled by WebSubmit
fp = open(file_path, "wb")
chunk = form_file.read(10240)
while chunk:
fp.write(chunk)
chunk = form_file.read(10240)
fp.close()
fp = open(os.path.join(working_dir, "lastuploadedfile"), "w")
fp.write(file_name)
fp.close()
fp = open(os.path.join(working_dir, 'myfile'), "w")
fp.write(file_name)
fp.close()
else:
file_name = None
file_path = None
return (file_action, file_target, file_target_doctype,
keep_previous_files, file_description, file_comment,
file_rename, file_doctype, file_restriction, file_name,
file_path)
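The filename washing step above (applied before saving the uploaded file to disk) can be reproduced as a standalone sketch. Assuming a POSIX host, it strips both Windows/DFS and UNIX path components; the helper name is hypothetical:

```python
import os

def wash_uploaded_filename(file_name):
    # Keep only the last path component: split on '\\' for
    # Windows/DFS paths, then os.path.basename for local paths,
    # then trim surrounding whitespace.
    return os.path.basename(file_name.split('\\')[-1]).strip()

washed = wash_uploaded_filename('C:\\Users\\me\\thesis.pdf ')
```

Browsers on some platforms submit the full client-side path, so this washing prevents path components from leaking into the stored filename.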
def move_uploaded_files_to_storage(working_dir, recid, icon_sizes,
create_icon_doctypes,
force_file_revision):
"""
Apply the modifications on files (add/remove/revise etc.) made by
users with one of the compatible interfaces (WebSubmit function
`Create_Upload_Files_Interface.py'; WebSubmit element or WebSubmit
File management interface using function
`create_file_upload_interface').
This function needs a "working directory" (working_dir) that contains a
bibdocactions.log file with the list of actions to perform.
@param working_dir: a path to the working directory containing actions to perform and files to attach
@type working_dir: string
@param recid: the recid to modify
@type recid: int
@param icon_sizes: the sizes of icons to create, as understood by
the websubmit icon creation tool
@type icon_sizes: list(string)
@param create_icon_doctypes: a list of doctype for which we want
to create icons
@type create_icon_doctypes: list(string)
@param force_file_revision: when revising attributes of a file
(comment, description) without
uploading a new file, force a revision
of the current version (so that old
comment, description, etc. is kept
or not)
@type force_file_revision: bool
"""
# We need to remember some actions that cannot be performed,
# because files have been deleted or moved after a renaming.
# Those pending actions must be applied when revising the bibdoc
# with a file that exists (which means that the bibdoc has not been
# deleted nor renamed by a later action)
pending_bibdocs = {}
newly_added_bibdocs = [] # Does not consider new formats/revisions
performed_actions = read_actions_log(working_dir)
for action, bibdoc_name, file_path, rename, description, \
comment, doctype, keep_previous_versions, \
file_restriction in performed_actions:
# FIXME: get this out of the loop once changes to bibrecdocs
# are immediately visible. For the moment, reload the
# structure from scratch at each step
bibrecdocs = BibRecDocs(recid)
if action == 'add':
new_bibdoc = \
add(file_path, bibdoc_name, rename, doctype, description,
comment, file_restriction, recid, working_dir, icon_sizes,
create_icon_doctypes, pending_bibdocs, bibrecdocs)
if new_bibdoc:
newly_added_bibdocs.append(new_bibdoc)
elif action == 'addFormat':
add_format(file_path, bibdoc_name, recid, doctype, working_dir,
icon_sizes, create_icon_doctypes,
pending_bibdocs, bibrecdocs)
elif action == 'revise':
new_bibdoc = \
revise(file_path, bibdoc_name, rename, doctype,
description, comment, file_restriction, icon_sizes,
create_icon_doctypes, keep_previous_versions,
recid, working_dir, pending_bibdocs,
bibrecdocs, force_file_revision)
if new_bibdoc:
newly_added_bibdocs.append(new_bibdoc)
elif action == 'delete':
delete(bibdoc_name, recid, working_dir, pending_bibdocs,
bibrecdocs)
# Finally rename bibdocs that should be named according to a file in
# curdir (eg. naming according to report number). Only consider
# files that have just been added.
parameters = _read_file_revision_interface_configuration_from_disk(working_dir)
new_names = []
doctypes_to_default_filename = parameters[22]
for bibdoc_to_rename in newly_added_bibdocs:
bibdoc_to_rename_doctype = bibdoc_to_rename.doctype
rename_to = doctypes_to_default_filename.get(bibdoc_to_rename_doctype, '')
if rename_to.startswith('file:'):
# This BibDoc must be renamed. Look for name in working dir
name_at_filepath = os.path.join(working_dir, rename_to[5:])
if os.path.exists(name_at_filepath) and \
os.path.abspath(name_at_filepath).startswith(working_dir):
try:
rename = open(name_at_filepath).read()
except:
rename = None
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'could not read file %s in curdir to rename bibdoc' % \
(name_at_filepath,),
alert_admin=True)
if rename:
file_counter = 2
new_filename = rename
while bibrecdocs.has_docname_p(new_filename) or (new_filename in new_names):
new_filename = rename + '_%i' % file_counter
file_counter += 1
bibdoc_to_rename.change_name(new_filename)
new_names.append(new_filename) # keep track of name, or we have to reload bibrecdoc...
_do_log(working_dir, 'Renamed ' + bibdoc_to_rename.get_docname())
# Delete the HB BibFormat cache in the DB, so that the fulltext
# links do not point to possible dead files
run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (recid,))
# Update the MARC
cli_fix_marc(None, [recid], interactive=False)
def add(file_path, bibdoc_name, rename, doctype, description, comment,
file_restriction, recid, working_dir, icon_sizes, create_icon_doctypes,
pending_bibdocs, bibrecdocs):
"""
Adds the file using bibdocfile CLI
Return the bibdoc that has been newly added.
"""
try:
brd = BibRecDocs(recid)
if os.path.exists(file_path):
# Add file
bibdoc = bibrecdocs.add_new_file(file_path,
doctype,
rename or bibdoc_name,
never_fail=True)
_do_log(working_dir, 'Added ' + brd.get_docname(bibdoc.id) + ': ' + \
file_path)
# Add icon
iconpath = ''
has_added_default_icon_subformat_p = False
for icon_size in icon_sizes:
if doctype in create_icon_doctypes or \
'*' in create_icon_doctypes:
iconpath = _create_icon(file_path, icon_size)
if iconpath is not None:
try:
if not has_added_default_icon_subformat_p:
bibdoc.add_icon(iconpath)
has_added_default_icon_subformat_p = True
else:
icon_suffix = icon_size.replace('>', '').replace('<', '').replace('^', '').replace('!', '')
bibdoc.add_icon(iconpath, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + "-" + icon_suffix)
_do_log(working_dir, 'Added icon to ' + \
brd.get_docname(bibdoc.id) + ': ' + iconpath)
except InvenioBibDocFileError, e:
# Most probably icon already existed.
pass
# Add description
if description:
bibdocfiles = bibdoc.list_latest_files()
for bibdocfile in bibdocfiles:
bibdoc.set_description(description,
bibdocfile.get_format())
_do_log(working_dir, 'Described ' + \
brd.get_docname(bibdoc.id) + ': ' + description)
# Add comment
if comment:
bibdocfiles = bibdoc.list_latest_files()
for bibdocfile in bibdocfiles:
bibdoc.set_comment(comment,
bibdocfile.get_format())
_do_log(working_dir, 'Commented ' + \
brd.get_docname(bibdoc.id) + ': ' + comment)
# Set restriction
bibdoc.set_status(file_restriction)
_do_log(working_dir, 'Set restriction of ' + \
brd.get_docname(bibdoc.id) + ': ' + \
(file_restriction or '(no restriction)'))
return bibdoc
else:
# File has been later renamed or deleted.
# Remember to add it later if file is found (ie
# it was renamed)
pending_bibdocs[bibdoc_name] = (doctype, comment, description, [])
except InvenioBibDocFileError, e:
# Format already existed. How come? We should
# have checked this in Create_Upload_Files_Interface.py
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to add already existing file %s ' \
'with name %s to record %i.' % \
(file_path, bibdoc_name, recid),
alert_admin=True)
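The restriction-logging calls in this function guard an empty `file_restriction` with `or`; because `+` binds tighter than `or` in Python, the fallback only fires when parenthesized. A minimal demonstration of the pitfall:

```python
file_restriction = ''
# Without parentheses the concatenation is evaluated first and the
# resulting non-empty string is truthy, so the fallback never fires:
wrong = 'Set restriction: ' + file_restriction or '(no restriction)'
# With parentheses the empty restriction falls back as intended:
right = 'Set restriction: ' + (file_restriction or '(no restriction)')
```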
def add_format(file_path, bibdoc_name, recid, doctype, working_dir,
icon_sizes, create_icon_doctypes, pending_bibdocs,
bibrecdocs):
"""
Adds a new format to a bibdoc using the bibdocfile API.
"""
try:
brd = BibRecDocs(recid)
if os.path.exists(file_path):
# We must retrieve previous description and comment as
# adding a file using the APIs reset these values
prev_desc, prev_comment = None, None
if bibrecdocs.has_docname_p(bibdoc_name):
(prev_desc, prev_comment) = \
get_description_and_comment(bibrecdocs.get_bibdoc(bibdoc_name).list_latest_files())
# Add file
bibdoc = bibrecdocs.add_new_format(file_path,
bibdoc_name,
prev_desc,
prev_comment)
_do_log(working_dir, 'Added new format to ' + \
brd.get_docname(bibdoc.id) + ': ' + file_path)
# Add icons
has_added_default_icon_subformat_p = False
for icon_size in icon_sizes:
iconpath = ''
if doctype in create_icon_doctypes or \
'*' in create_icon_doctypes:
iconpath = _create_icon(file_path, icon_size)
if iconpath is not None:
try:
if not has_added_default_icon_subformat_p:
bibdoc.add_icon(iconpath)
has_added_default_icon_subformat_p = True
else:
# We have already added the "default" icon subformat
icon_suffix = icon_size.replace('>', '').replace('<', '').replace('^', '').replace('!', '')
bibdoc.add_icon(iconpath, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + "-" + icon_suffix)
_do_log(working_dir, 'Added icon to ' + \
brd.get_docname(bibdoc.id) + ': ' + iconpath)
except InvenioBibDocFileError, e:
# Most probably icon already existed.
pass
else:
# File has been later renamed or deleted.
# Remember to add it later if file is found
if pending_bibdocs.has_key(bibdoc_name):
pending_bibdocs[bibdoc_name][3].append(file_path)
# else: we previously added a file by mistake. Do
# not care, it will be deleted
except InvenioBibDocFileError, e:
# Format already existed. How come? We should
# have checked this in Create_Upload_Files_Interface.py
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to add already existing format %s ' \
'named %s in record %i.' % \
(file_path, bibdoc_name, recid),
alert_admin=True)
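`pending_bibdocs` maps a docname to a 4-tuple `(doctype, comment, description, extra_formats)`, accessed by bare indices throughout these functions. A hedged sketch of the same bookkeeping with a namedtuple, which would make those index accesses self-documenting (the type name and sample values are illustrative):

```python
from collections import namedtuple

# Field order mirrors the tuple built in add():
# (doctype, comment, description, extra_formats)
PendingDoc = namedtuple('PendingDoc',
                        ['doctype', 'comment', 'description', 'extra_formats'])

pending_bibdocs = {}
pending_bibdocs['thesis.pdf'] = PendingDoc('Main', '', 'PhD thesis', [])
# Named fields replace opaque indices like pending_bibdocs[name][3]:
pending_bibdocs['thesis.pdf'].extra_formats.append('/tmp/thesis.ps')
```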
def revise(file_path, bibdoc_name, rename, doctype, description,
comment, file_restriction, icon_sizes, create_icon_doctypes,
keep_previous_versions, recid, working_dir, pending_bibdocs,
bibrecdocs, force_file_revision):
"""
Revises the given bibdoc with a new file.
Return the bibdoc that has been newly added. (later: if needed,
return as tuple the bibdoc that has been revised, or deleted,
etc.)
"""
added_bibdoc = None
try:
if os.path.exists(file_path) or not file_path:
brd = BibRecDocs(recid)
# Perform pending actions
if pending_bibdocs.has_key(bibdoc_name):
# We have some pending actions to apply before
# going further.
if description == '':
# Last revision did not include a description.
# Use the one of the pending actions
description = pending_bibdocs[bibdoc_name][2]
if comment == '':
# Last revision did not include a comment.
# Use the one of the pending actions
comment = pending_bibdocs[bibdoc_name][1]
original_bibdoc_name = pending_bibdocs[bibdoc_name][0]
if not bibrecdocs.has_docname_p(original_bibdoc_name) and file_path:
# the bibdoc did not originally exist, so it
# must be added first
bibdoc = bibrecdocs.add_new_file(file_path,
pending_bibdocs[bibdoc_name][0],
bibdoc_name,
never_fail=True)
_do_log(working_dir, 'Added ' + brd.get_docname(bibdoc.id) + ': ' + \
file_path)
added_bibdoc = bibdoc
# Set restriction
bibdoc.set_status(file_restriction)
_do_log(working_dir, 'Set restriction of ' + \
bibrecdocs.get_docname(bibdoc.id) + ': ' + \
(file_restriction or '(no restriction)'))
# We must retrieve previous description and comment as
# revising a file using the APIs reset these values
prev_desc, prev_comment = None, None
if bibrecdocs.has_docname_p(bibdoc_name):
(prev_desc, prev_comment) = \
get_description_and_comment(bibrecdocs.get_bibdoc(bibdoc_name).list_latest_files())
# Do we have additional formats?
for additional_format in pending_bibdocs[bibdoc_name][3]:
if os.path.exists(additional_format):
bibdoc.add_file_new_format(additional_format,
description=bibdoc.get_description(),
comment=bibdoc.get_comment())
_do_log(working_dir, 'Added new format to ' + \
brd.get_docname(bibdoc.id) + ': ' + additional_format)
# All pending modifications have been applied,
# so delete
del pending_bibdocs[bibdoc_name]
# We must retrieve previous description and comment as
# revising a file using the APIs reset these values
prev_desc, prev_comment = None, None
if bibrecdocs.has_docname_p(bibdoc_name):
(prev_desc, prev_comment) = \
get_description_and_comment(bibrecdocs.get_bibdoc(bibdoc_name).list_latest_files())
if keep_previous_versions and file_path:
# Standard procedure, keep previous version
bibdoc = bibrecdocs.add_new_version(file_path,
bibdoc_name,
prev_desc,
prev_comment)
_do_log(working_dir, 'Revised ' + brd.get_docname(bibdoc.id) + \
' with : ' + file_path)
elif file_path:
# Soft-delete previous versions, and add new file
# (we need to get the doctype before deleting)
if bibrecdocs.has_docname_p(bibdoc_name):
# Delete only if bibdoc originally
# existed
bibrecdocs.delete_bibdoc(bibdoc_name)
_do_log(working_dir, 'Deleted ' + bibdoc_name)
try:
bibdoc = bibrecdocs.add_new_file(file_path,
doctype,
bibdoc_name,
never_fail=True,
description=prev_desc,
comment=prev_comment)
_do_log(working_dir, 'Added ' + brd.get_docname(bibdoc.id) + ': ' + \
file_path)
except InvenioBibDocFileError, e:
_do_log(working_dir, str(e))
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to revise a file %s ' \
'named %s in record %i.' % \
(file_path, bibdoc_name, recid),
alert_admin=True)
else:
# User just wanted to change attribute of the file,
# not the file itself
bibdoc = bibrecdocs.get_bibdoc(bibdoc_name)
(prev_desc, prev_comment) = \
get_description_and_comment(bibdoc.list_latest_files())
if prev_desc is None:
prev_desc = ""
if prev_comment is None:
prev_comment = ""
if force_file_revision and \
(description != prev_desc or comment != prev_comment):
# FIXME: If we are going to create a new version,
# then we should honour the keep_previous_versions
# parameter (soft-delete, then add bibdoc, etc)
# But it is a bit complex right now...
# Trick: we revert to current version, which
# creates a revision of the BibDoc
bibdoc.revert(bibdoc.get_latest_version())
bibdoc = bibrecdocs.get_bibdoc(bibdoc_name)
# Rename
if rename and rename != bibdoc_name:
bibrecdocs.change_name(newname=rename, docid=bibdoc.id)
_do_log(working_dir, 'renamed ' + bibdoc_name +' to '+ rename)
# Add icons
if file_path:
has_added_default_icon_subformat_p = False
for icon_size in icon_sizes:
iconpath = ''
if doctype in create_icon_doctypes or \
'*' in create_icon_doctypes:
iconpath = _create_icon(file_path, icon_size)
if iconpath is not None:
try:
if not has_added_default_icon_subformat_p:
bibdoc.add_icon(iconpath)
has_added_default_icon_subformat_p = True
else:
# We have already added the "default" icon subformat
icon_suffix = icon_size.replace('>', '').replace('<', '').replace('^', '').replace('!', '')
bibdoc.add_icon(iconpath, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + "-" + icon_suffix)
_do_log(working_dir, 'Added icon to ' + \
brd.get_docname(bibdoc.id) + ': ' + iconpath)
except InvenioBibDocFileError, e:
# Most probably icon already existed.
pass
# Description
if description:
bibdocfiles = bibdoc.list_latest_files()
for bibdocfile in bibdocfiles:
bibdoc.set_description(description,
bibdocfile.get_format())
_do_log(working_dir, 'Described ' + \
brd.get_docname(bibdoc.id) + ': ' + description)
# Comment
if comment:
bibdocfiles = bibdoc.list_latest_files()
for bibdocfile in bibdocfiles:
bibdoc.set_comment(comment,
bibdocfile.get_format())
_do_log(working_dir, 'Commented ' + \
brd.get_docname(bibdoc.id) + ': ' + comment)
# Set restriction
bibdoc.set_status(file_restriction)
_do_log(working_dir, 'Set restriction of ' + \
brd.get_docname(bibdoc.id) + ': ' + \
(file_restriction or '(no restriction)'))
else:
# File has been later renamed or deleted.
# Remember it
if rename and rename != bibdoc_name:
pending_bibdocs[rename] = pending_bibdocs[bibdoc_name]
except InvenioBibDocFileError, e:
# Format already existed. How come? We should
# have checked this in Create_Upload_Files_Interface.py
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to revise a file %s ' \
'named %s in record %i.' % \
(file_path, bibdoc_name, recid),
alert_admin=True)
return added_bibdoc
def delete(bibdoc_name, recid, working_dir, pending_bibdocs,
bibrecdocs):
"""
Deletes the given bibdoc
"""
try:
if bibrecdocs.has_docname_p(bibdoc_name):
bibrecdocs.delete_bibdoc(bibdoc_name)
_do_log(working_dir, 'Deleted ' + bibdoc_name)
if pending_bibdocs.has_key(bibdoc_name):
del pending_bibdocs[bibdoc_name]
except InvenioBibDocFileError, e:
# Mmh, most probably we deleted two files within the same
# second. Sleep 1 second and retry... This might go
# away once bibdoc improves its way to delete files.
try:
time.sleep(1)
bibrecdocs.delete_bibdoc(bibdoc_name)
_do_log(working_dir, 'Deleted ' + bibdoc_name)
if pending_bibdocs.has_key(bibdoc_name):
del pending_bibdocs[bibdoc_name]
except InvenioBibDocFileError, e:
_do_log(working_dir, str(e))
_do_log(working_dir, repr(bibrecdocs.list_bibdocs()))
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to delete a file ' \
'named %s in record %i.' % \
(bibdoc_name, recid),
alert_admin=True)
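`delete` retries once after sleeping a second, since two deletions landing within the same second can collide in the current bibdoc storage layout. A generic sketch of that retry-once pattern (the helper name and the simulated failing action are illustrative):

```python
import time

def retry_once(action, exc_type, delay=1):
    """Run `action`; on `exc_type`, sleep `delay` seconds and retry once."""
    try:
        return action()
    except exc_type:
        time.sleep(delay)
        return action()

# Simulated action that fails on the first attempt only:
attempts = []
def flaky_delete():
    attempts.append(1)
    if len(attempts) < 2:
        raise ValueError('simulated transient failure')
    return 'deleted'

result = retry_once(flaky_delete, ValueError, delay=0)
```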
def _do_log(log_dir, msg):
"""
Log what we have done, in case something went wrong.
Nice to compare with bibdocactions.log
Should be removed when the development is over.
@param log_dir: the path to the working directory
@type log_dir: string
@param msg: the message to log
@type msg: string
"""
log_file = os.path.join(log_dir, 'performed_actions.log')
file_desc = open(log_file, "a+")
file_desc.write("%s --> %s\n" %(time.strftime("%Y-%m-%d %H:%M:%S"), msg))
file_desc.close()
def _create_icon(file_path, icon_size, docformat='gif', verbosity=9):
"""
Creates an icon of the given file.
Returns the path to the icon. If creation fails, returns None and
registers the exception (sends email to admin).
@param file_path: full path to the file for which the icon is created
@type file_path: string
@param icon_size: the scaling information to be used for the
creation of the new icon.
@type icon_size: int
@param docformat: the output file format of the icon.
@type docformat: string
@param verbosity: the verbosity level under which the program
is to run.
@type verbosity: int
"""
icon_path = None
try:
filename = os.path.splitext(os.path.basename(file_path))[0]
(icon_dir, icon_name) = create_icon(
{'input-file':file_path,
'icon-name': "icon-%s" % filename,
'multipage-icon': False,
'multipage-icon-delay': 0,
'icon-scale': icon_size,
'icon-file-format': docformat,
'verbosity': verbosity})
icon_path = icon_dir + os.sep + icon_name
except InvenioWebSubmitIconCreatorError, e:
register_exception(prefix='Icon for file %s could not be created: %s' % \
(file_path, str(e)),
alert_admin=False)
return icon_path
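Both `add` and `revise` derive an icon subformat suffix by stripping the ImageMagick geometry modifiers (`>`, `<`, `^`, `!`) from the icon size, leaving only the dimensions. A minimal sketch of that chain of `replace` calls as a single helper (the helper name is illustrative):

```python
def icon_size_to_suffix(icon_size):
    """Strip ImageMagick geometry modifiers, keeping only the dimensions."""
    for modifier in ('>', '<', '^', '!'):
        icon_size = icon_size.replace(modifier, '')
    return icon_size
```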
def get_upload_file_interface_javascript(form_url_params):
"""
Returns the Javascript code necessary to run the upload file
interface.
"""
javascript = '''
<script type="text/javascript" src="/js/jquery.form.js"></script>
<script type="text/javascript">
<!--
'''
if form_url_params:
javascript += '''
// prepare the form when the DOM is ready
$(document).ready(function() {
var progress = $('.progress');
var rotatingprogress = $('.rotatingprogress');
var bar = $('.bar');
var percent = $('.percent');
var options = {
target: '#uploadFileInterface', // target element(s) to be updated with server response
uploadProgress: function(event, position, total, percentComplete) {
update_progress(progress, bar, percent, percentComplete, rotatingprogress);},
beforeSubmit: function(arr, $form, options) {
show_upload_progress();
return true;},
success: showResponse, // post-submit callback
url: '/%(CFG_SITE_RECORD)s/managedocfilesasync%(form_url_params)s' // override for form's 'action' attribute
};
// bind form using 'ajaxForm'
var this_form = $('form:has(#balloonReviseFileInput)')
$('#bibdocfilemanagedocfileuploadbutton').click(function() {
this_form.bibdocfilemanagedocfileuploadbuttonpressed=true;
this_form.ajaxSubmit(options);
})
});
// post-submit callback
function showResponse(responseText, statusText) {
hide_upload_progress();
hide_revise_panel();
}
''' % {
'form_url_params': form_url_params,
'CFG_SITE_RECORD': CFG_SITE_RECORD}
javascript += '''
/* Record position of the last clicked link that triggered the display
* of the revise panel
*/
var last_clicked_link = null;
function display_revise_panel(link, params){
var action = params['action'];
var target = params['target'];
var showDoctypes = params['showDoctypes'];
var showKeepPreviousVersions = params['showKeepPreviousVersions'];
var showRename = params['showRename'];
var showDescription = params['showDescription'];
var showComment = params['showComment'];
var bibdocname = params['bibdocname'];
var description = params['description'];
var comment = params['comment'];
var showRestrictions = params['showRestrictions'];
var restriction = params['restriction'];
var doctypes = params['doctypes'];
var balloon = document.getElementById("balloon");
var file_input_block = document.getElementById("balloonReviseFileInputBlock");
var doctype = document.getElementById("fileDoctypesRow");
var warningFormats = document.getElementById("warningFormats");
var keepPreviousVersions = document.getElementById("keepPreviousVersions");
var renameBox = document.getElementById("renameBox");
var descriptionBox = document.getElementById("descriptionBox");
var commentBox = document.getElementById("commentBox");
var restrictionBox = document.getElementById("restrictionBox");
var apply_button = document.getElementById("applyChanges");
var mainForm = getMainForm();
last_clicked_link = link;
var pos;
/* Show/hide parts of the form */
if (showDoctypes) {
doctype.style.display = ''
} else {
doctype.style.display = 'none'
}
if (action == 'revise' && showKeepPreviousVersions == true){
warningFormats.style.display = ''
} else {
warningFormats.style.display = 'none'
}
if ((action == 'revise' || action == 'add') && showRename == true){
renameBox.style.display = ''
} else {
renameBox.style.display = 'none'
}
if ((action == 'revise' || action == 'add') && showDescription == true){
descriptionBox.style.display = ''
} else {
descriptionBox.style.display = 'none'
}
if ((action == 'revise' || action == 'add') && showComment == true){
commentBox.style.display = ''
} else {
commentBox.style.display = 'none'
}
if ((action == 'revise' || action == 'add') && showRestrictions == true){
restrictionBox.style.display = ''
} else {
restrictionBox.style.display = 'none'
}
if (action == 'revise' && showKeepPreviousVersions == true) {
keepPreviousVersions.style.display = ''
} else {
keepPreviousVersions.style.display = 'none'
}
if (action == 'add') {
updateForm();
}
/* Reset values */
file_input_block.innerHTML = file_input_block.innerHTML; // Trick to reset input field
doctype.innerHTML = doctypes;
mainForm.balloonReviseFileKeep.checked = true;
mainForm.rename.value = bibdocname;
mainForm.comment.value = comment;
mainForm.description.value = description;
var fileRestrictionFound = false;
for (var i=0; i < mainForm.fileRestriction.length; i++) {
if (mainForm.fileRestriction[i].value == restriction) {
mainForm.fileRestriction.selectedIndex = i;
fileRestrictionFound = true;
}
}
if (!fileRestrictionFound) {
var restrictionItem = new Option(restriction, restriction);
mainForm.fileRestriction.appendChild(restrictionItem);
var lastIndex = mainForm.fileRestriction.length - 1;
mainForm.fileRestriction.selectedIndex = lastIndex;
}
/* Display and move to correct position*/
pos = findPosition(link)
balloon.style.display = '';
balloon.style.position="absolute";
balloon.style.left = pos[0] + link.offsetWidth +"px";
balloon.style.top = pos[1] - Math.round(balloon.offsetHeight/2) + 5 + "px";
balloon.style.zIndex = 1001;
balloon.style.display = '';
/* Set the correct action and target file*/
mainForm.fileAction.value = action;
mainForm.fileTarget.value = target;
/* Disable other controls */
if (apply_button) {
apply_button.disabled = true;
}
/*gray_out(true);*/
}
function hide_revise_panel(){
var balloon = document.getElementById("balloon");
var apply_button = document.getElementById("applyChanges");
balloon.style.display = 'none';
if (apply_button) {
apply_button.disabled = false;
}
/*gray_out(false);*/
}
/* Intercept ESC key in order to close revise panel*/
document.onkeyup = keycheck;
function keycheck(e){
var KeyID = (window.event) ? event.keyCode : e.keyCode;
var upload_in_progress_p = $('.progress').is(":visible") || $('.rotatingprogress').is(":visible")
if(KeyID==27){
if (upload_in_progress_p) {
hide_upload_progress();
} else {
hide_revise_panel();
}
}
}
/* Update progress bar, show if necessary (and then hide rotating progress indicator) */
function update_progress(progress, bar, percent, percentComplete, rotatingprogress){
if (rotatingprogress.is(":visible")) {
$('.rotatingprogress').hide();
$('.progress').show();
}
var percentVal = percentComplete + '%%';
bar.width(percentVal)
percent.html(percentVal);
if (percentComplete == '100') {
// There might be some lengthy post-processing to do.
show_upload_progress(true); // show the post-processing label
}
}
/* Hide upload/cancel button, show rotating progress indicator */
function show_upload_progress(post_process_label_p) {
if (!post_process_label_p) { post_process_label_p = false;}
if (post_process_label_p) {
/* Show post-process label */
$('.progress').hide();
$('.rotatingprogress').hide();
$('.rotatingpostprocess').show();
} else {
/* Show uploading label */
$('#canceluploadbuttongroup').hide();
$('.rotatingprogress').show();
}
}
/* show upload/cancel button, hide any progress indicator */
function hide_upload_progress() {
$('.progress').hide();
$('.rotatingprogress').hide();
$('.rotatingpostprocess').hide();
$('#canceluploadbuttongroup').show();
$('.percent').html('0%%');
}
function findPosition( oElement ) {
/*Return the x,y position on page of the given object*/
if( typeof( oElement.offsetParent ) != 'undefined' ) {
for( var posX = 0, posY = 0; oElement; oElement = oElement.offsetParent ) {
posX += oElement.offsetLeft;
posY += oElement.offsetTop;
}
return [ posX, posY ];
} else {
return [ oElement.x, oElement.y ];
}
}
function getMainForm()
{
return $('form:has(#balloonReviseFileInput)')[0];
}
function nextStep()
{
if(confirm("You are about to submit the files and end the upload process."))
{
var mainForm = getMainForm();
mainForm.step.value = 2;
user_must_confirm_before_leaving_page = false;
mainForm.submit();
}
return true;
}
function updateForm(doctype, can_describe_doctypes, can_comment_doctypes, can_restrict_doctypes) {
/* Update the revision panel to hide or not part of the interface
* based on selected doctype
*
* Note: we use a small trick here to use the javascript 'in' operator, which
* does not work for arrays but for objects => we transform our arrays into
* object literals
*/
/* Get the elements we are going to affect */
var renameBox = document.getElementById("renameBox");
var descriptionBox = document.getElementById("descriptionBox");
var commentBox = document.getElementById("commentBox");
var restrictionBox = document.getElementById("restrictionBox");
if (!can_describe_doctypes) {var can_describe_doctypes = [];}
if (!can_comment_doctypes) {var can_comment_doctypes = [];}
if (!can_restrict_doctypes) {var can_restrict_doctypes = [];}
if ((doctype in can_describe_doctypes) ||
('*' in can_describe_doctypes)){
descriptionBox.style.display = ''
} else {
descriptionBox.style.display = 'none'
}
if ((doctype in can_comment_doctypes) ||
('*' in can_comment_doctypes)){
commentBox.style.display = ''
} else {
commentBox.style.display = 'none'
}
if ((doctype in can_restrict_doctypes) ||
('*' in can_restrict_doctypes)){
restrictionBox.style.display = ''
} else {
restrictionBox.style.display = 'none'
}
/* Move the revise panel accordingly */
var balloon = document.getElementById("balloon");
pos = findPosition(last_clicked_link)
balloon.style.display = '';
balloon.style.position="absolute";
balloon.style.left = pos[0] + last_clicked_link.offsetWidth +"px";
balloon.style.top = pos[1] - Math.round(balloon.offsetHeight/2) + 5 + "px";
balloon.style.zIndex = 1001;
balloon.style.display = '';
}
function askDelete(bibdocname, form_url_params){
/*
Ask user if she wants to delete file
*/
if (confirm('Are you sure you want to delete '+bibdocname+'?'))
{
if (form_url_params) {
var mainForm = getMainForm();
mainForm.fileTarget.value = bibdocname;
mainForm.fileAction.value='delete';
user_must_confirm_before_leaving_page = false;
var options = {
target: '#uploadFileInterface',
success: showResponse,
url: '/%(CFG_SITE_RECORD)s/managedocfilesasync' + form_url_params
};
$(mainForm).ajaxSubmit(options);
} else {
/*WebSubmit function*/
document.forms[0].fileTarget.value = bibdocname;
document.forms[0].fileAction.value='delete';
user_must_confirm_before_leaving_page = false;
document.forms[0].submit();
}
}
return false;
}
function gray_out(visible) {
/* Gray out the screen so that user cannot click anywhere else.
Based on <http://www.hunlock.com/blogs/Snippets:_Howto_Grey-Out_The_Screen>
*/
var modalShield = document.getElementById('modalShield');
if (!modalShield) {
var tbody = document.getElementsByTagName("body")[0];
var tnode = document.createElement('div');
tnode.style.position = 'absolute';
tnode.style.top = '0px';
tnode.style.left = '0px';
tnode.style.overflow = 'hidden';
tnode.style.display = 'none';
tnode.id = 'modalShield';
tbody.appendChild(tnode);
modalShield = document.getElementById('modalShield');
}
if (visible){
// Calculate the page width and height
var pageWidth = '100%%';
var pageHeight = '100%%';
//set the shader to cover the entire page and make it visible.
modalShield.style.opacity = 0.7;
modalShield.style.MozOpacity = 0.7;
modalShield.style.filter = 'alpha(opacity=70)';
modalShield.style.zIndex = 1000;
modalShield.style.backgroundColor = '#000000';
modalShield.style.width = pageWidth;
modalShield.style.height = pageHeight;
modalShield.style.display = 'block';
} else {
modalShield.style.display = 'none';
}
}
-->
</script>
''' % {'CFG_SITE_RECORD': CFG_SITE_RECORD}
return javascript
def get_upload_file_interface_css():
"""
Returns the CSS to embed in the page for the upload file interface.
"""
# The CSS embedded in the page for the revise panel
css = '''
<style type="text/css">
<!--
#reviseControl{
overflow:auto;
width: 600px;
padding:1px;
}
.reviseControlBrowser{
padding:5px;
background-color:#fff;
border-collapse:collapse;
border-spacing: 0px;
border: 1px solid #999;
}
.reviseControlFileColumn {
padding-right:60px;
padding-left:5px;
text-align: left;
color:#00f;
}
.reviseControlActionColumn,
.reviseControlFormatColumn{
font-size:small;
}
.reviseControlActionColumn,
.reviseControlActionColumn a,
.reviseControlActionColumn a:link,
.reviseControlActionColumn a:hover,
.reviseControlActionColumn a:visited{
font-size:small;
color: #060;
text-align:right;
}
.reviseControlFormatColumn,
.reviseControlFormatColumn a,
.reviseControlFormatColumn a:link,
.reviseControlFormatColumn a:hover,
.reviseControlFormatColumn a:visited{
font-size:small;
color: #555;
text-align:left;
}
.optional{
color: #555;
font-size:0.9em;
font-weight:normal
}
.even{
background-color:#ecf3fe;
}
/*
.buttonLikeLink, .buttonLikeLink:visited, .buttonLikeLink:hover{
background-color:#fff;
border:2px outset #555;
color:#000;
padding: 2px 5px;
display:inline-block;
margin:2px;
text-decoration:none;
font-size:small;
cursor: default
}
*/
#balloon table{
border-collapse:collapse;
border-spacing: 0px;
}
#balloon table td.topleft{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_top_left_shadow.png) no-repeat bottom right;
}
#balloon table td.bottomleft{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_bottom_left_shadow.png) no-repeat top right;
}
#balloon table td.topright{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_top_right_shadow.png) no-repeat bottom left;
}
#balloon table td.bottomright{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_bottom_right_shadow.png) no-repeat top left;
}
#balloon table td.top{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_top_shadow.png) repeat-x bottom left;
}
#balloon table td.bottom{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_bottom_shadow.png) repeat-x top left;
}
#balloon table td.left{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_left_shadow.png) repeat-y top right;
text-align:right;
padding:0;
}
#balloon table td.right{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_right_shadow.png) repeat-y top left;
}
#balloon table td.arrowleft{
background: transparent url(%(CFG_SITE_URL)s/img/balloon_arrow_left_shadow.png) no-repeat bottom right;
width:24px;
height:27px;
}
#balloon table td.center{
background-color:#ffffea;
}
#balloon label{
font-size:small;
}
#balloonReviseFile{
width:220px;
text-align:left;
}
#warningFormats{
color:#432e11;
font-size:x-small;
text-align:center;
margin: 4px auto 4px auto;
}
#fileDoctype {
margin-bottom:3px;
}
#renameBox, #descriptionBox, #commentBox, #keepPreviousVersions{
margin-top:6px;
}
#description, #comment, #rename {
width:90%%;
}
.rotatingprogress, .rotatingpostprocess {
position:relative;
float:right;
padding: 1px;
font-style:italic;
font-size:small;
margin-right: 5px;
display:none;
}
.progress {
position:relative;
width:100%%;
float:left;
border: 1px solid #ddd;
padding: 1px;
border-radius: 3px;
display:none;
}
.bar {
background-color: #dd9700;
width:0%%; height:20px;
border-radius: 3px; }
.percent {
position:absolute;
display:inline-block;
top:3px;
left:45%%;
font-size:small;
color: #514100;
}
-->
</style>
''' % {'CFG_SITE_URL': CFG_SITE_URL}
return css
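The embedded JavaScript and CSS above are emitted through Python `%`-formatting, so every literal percent sign (CSS `width:100%`, JS percent strings) must be written as `%%` to survive the substitution. A minimal illustration:

```python
# '%%' in a %-formatted template yields a single literal '%':
template = '.bar { width:%(width)s%%; }'
css = template % {'width': '100'}
```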
# The HTML markup of the revise panel
revise_balloon = '''
<div id="balloon" style="display:none;">
<input type="hidden" name="fileAction" value="" />
<input type="hidden" name="fileTarget" value="" />
<table>
<tr>
<td class="topleft">&nbsp;</td>
<td class="top">&nbsp;</td>
<td class="topright">&nbsp;</td>
</tr>
<tr>
<td class="left" vertical-align="center" width="24"><img alt=" " src="../img/balloon_arrow_left_shadow.png" /></td>
<td class="center">
<table id="balloonReviseFile">
<tr>
<td><label for="balloonReviseFileInput">%(file_label)s:</label><br/>
<div style="display:none" id="fileDoctypesRow"></div>
<div id="balloonReviseFileInputBlock"><input type="file" name="myfile" id="balloonReviseFileInput" size="20" /></div>
<!-- <input type="file" name="myfile" id="balloonReviseFileInput" size="20" onchange="var name=getElementById('rename');var filename=this.value.split('/').pop().split('.')[0];name.value=filename;"/> -->
<div id="renameBox" style=""><label for="rename">%(filename_label)s:</label><br/><input type="text" name="rename" id="rename" size="20" autocomplete="off"/></div>
<div id="descriptionBox" style=""><label for="description">%(description_label)s:</label><br/><input type="text" name="description" id="description" size="20" autocomplete="off"/></div>
<div id="commentBox" style=""><label for="comment">%(comment_label)s:</label><br/><textarea name="comment" id="comment" rows="3"/></textarea></div>
<div id="restrictionBox" style="display:none;white-space:nowrap;">%(restrictions)s</div>
<div id="keepPreviousVersions" style="display:none"><input type="checkbox" id="balloonReviseFileKeep" name="keepPreviousFiles" checked="checked" /><label for="balloonReviseFileKeep">%(previous_versions_label)s</label>&nbsp;<small>[<a href="" onclick="alert('%(previous_versions_help)s');return false;">?</a>]</small></div>
<p id="warningFormats" style="display:none"><img src="%(CFG_SITE_URL)s/img/warning.png" alt="Warning"/> %(revise_format_warning)s&nbsp;[<a href="" onclick="alert('%(revise_format_help)s');return false;">?</a>]</p>
<div class="progress"><div class="bar"></div ><div class="percent">0%%</div ></div>
<div class="rotatingprogress"><img src="/img/ui-anim_basic_16x16.gif" /> %(uploading_label)s</div><div class="rotatingpostprocess"><img src="/img/ui-anim_basic_16x16.gif" /> %(postprocess_label)s</div><div id="canceluploadbuttongroup" style="text-align:right;margin-top:5px"><input type="button" value="%(cancel)s" onclick="javascript:hide_revise_panel();"/> <input type="%(submit_or_button)s" id="bibdocfilemanagedocfileuploadbutton" onclick="show_upload_progress()" value="%(upload)s"/></div>
</td>
</tr>
</table>
</td>
<td class="right">&nbsp;</td>
</tr>
<tr>
<td class="bottomleft">&nbsp;</td>
<td class="bottom">&nbsp;</td>
<td class="bottomright">&nbsp;</td>
</tr>
</table>
</div>
'''
diff --git a/invenio/legacy/bibdocfile/plugins/bom_textdoc.py b/invenio/legacy/bibdocfile/plugins/bom_textdoc.py
index fcf19ac6e..79fde9588 100644
--- a/invenio/legacy/bibdocfile/plugins/bom_textdoc.py
+++ b/invenio/legacy/bibdocfile/plugins/bom_textdoc.py
@@ -1,142 +1,142 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibObject Module providing BibObject prividing features for documents containing text (not necessarily as the main part of the content)"""
-from invenio.bibdocfile import BibDoc, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibDoc, InvenioBibDocFileError
from invenio.legacy.dbquery import run_sql
from datetime import datetime
from invenio.ext.logging import register_exception
import os
class BibTextDoc(BibDoc):
def get_text(self, version=None):
"""
@param version: the requested version. If not set, the latest version
will be used.
@type version: integer
@return: the textual content corresponding to the specified version
of the document.
@rtype: string
"""
if version is None:
version = self.get_latest_version()
if self.has_text(version):
return open(os.path.join(self.basedir, '.text;%i' % version)).read()
else:
return ""
def get_text_path(self, version=None):
"""
@param version: the requested version. If not set, the latest version
will be used.
@type version: int
@return: the full path to the textual content corresponding to the specified version
of the document.
@rtype: string
"""
if version is None:
version = self.get_latest_version()
if self.has_text(version):
return os.path.join(self.basedir, '.text;%i' % version)
else:
return ""
def extract_text(self, version=None, perform_ocr=False, ln='en'):
"""
Try whatever is necessary to extract the textual content of a document.
@param version: the version of the document for which text is required.
If not specified the text will be retrieved from the last version.
@type version: integer
@param perform_ocr: whether to perform OCR.
@type perform_ocr: bool
@param ln: a two letter language code to give as a hint to the OCR
procedure.
@type ln: string
@raise InvenioBibDocFileError: in case of error.
@note: the text is extracted and cached for later use. Use L{get_text}
to retrieve it.
"""
- from invenio.websubmit_file_converter import get_best_format_to_extract_text_from, convert_file, InvenioWebSubmitFileConverterError
+ from invenio.legacy.websubmit.file_converter import get_best_format_to_extract_text_from, convert_file, InvenioWebSubmitFileConverterError
if version is None:
version = self.get_latest_version()
docfiles = self.list_version_files(version)
## We try to extract text only from original or OCRed documents.
filenames = [docfile.get_full_path() for docfile in docfiles if 'CONVERTED' not in docfile.flags or 'OCRED' in docfile.flags]
try:
filename = get_best_format_to_extract_text_from(filenames)
except InvenioWebSubmitFileConverterError:
## We fall back on considering all the documents
filenames = [docfile.get_full_path() for docfile in docfiles]
try:
filename = get_best_format_to_extract_text_from(filenames)
except InvenioWebSubmitFileConverterError:
open(os.path.join(self.basedir, '.text;%i' % version), 'w').write('')
return
try:
convert_file(filename, os.path.join(self.basedir, '.text;%i' % version), '.txt', perform_ocr=perform_ocr, ln=ln)
if version == self.get_latest_version():
run_sql("UPDATE bibdoc SET text_extraction_date=NOW() WHERE id=%s", (self.id, ))
except InvenioWebSubmitFileConverterError, e:
register_exception(alert_admin=True, prefix="Error in extracting text from bibdoc %i, version %i" % (self.id, version))
raise InvenioBibDocFileError, str(e)
def pdf_a_p(self):
"""
@return: True if this document contains a PDF in PDF/A format.
@rtype: bool"""
return self.has_flag('PDF/A', 'pdf')
def has_text(self, require_up_to_date=False, version=None):
"""
Return True if the text of this document has already been extracted.
@param require_up_to_date: if True check the text was actually
extracted after the most recent format of the given version.
@type require_up_to_date: bool
@param version: a version for which the text should have been
extracted. If not specified the latest version is considered.
@type version: integer
@return: True if the text has already been extracted.
@rtype: bool
"""
if version is None:
version = self.get_latest_version()
if os.path.exists(os.path.join(self.basedir, '.text;%i' % version)):
if not require_up_to_date:
return True
else:
docfiles = self.list_version_files(version)
text_md = datetime.fromtimestamp(os.path.getmtime(os.path.join(self.basedir, '.text;%i' % version)))
for docfile in docfiles:
if text_md <= docfile.md:
return False
return True
return False
def __repr__(self):
return 'BibTextDoc(%s, %s, %s)' % (repr(self.id), repr(self.doctype), repr(self.human_readable))
def supports(doctype, extensions):
return doctype == "Fulltext" or any(ext.startswith(".pdf") or ext.startswith(".ps") for ext in extensions)
def create_instance(docid=None, doctype='Main', human_readable=False, # pylint: disable=W0613
initial_data = None):
return BibTextDoc(docid=docid, human_readable=human_readable,
initial_data = initial_data)
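The `BibTextDoc` class above caches extracted text in a hidden `.text;<version>` file inside the document directory, and `has_text` decides freshness by comparing the cache's timestamp against the files of that version. A minimal, self-contained sketch of that freshness check in modern Python (the directory layout and helper names are illustrative, not Invenio's actual API; the real code compares `datetime` objects via each docfile's `md` attribute):

```python
import os
import tempfile

def cache_path(basedir, version):
    # Mirrors the '.text;%i' naming BibTextDoc uses for cached extracted text.
    return os.path.join(basedir, '.text;%i' % version)

def has_fresh_text(basedir, version, source_files):
    """True if the cached text exists and is newer than every source file."""
    path = cache_path(basedir, version)
    if not os.path.exists(path):
        return False
    text_mtime = os.path.getmtime(path)
    return all(text_mtime > os.path.getmtime(f) for f in source_files)

basedir = tempfile.mkdtemp()
src = os.path.join(basedir, 'doc.pdf')
open(src, 'w').close()
os.utime(src, (1000, 1000))            # source format modified at t=1000
with open(cache_path(basedir, 1), 'w') as f:
    f.write('extracted text')
os.utime(cache_path(basedir, 1), (2000, 2000))   # text extracted later
print(has_fresh_text(basedir, 1, [src]))  # True
```

If a new format of the same version is uploaded after extraction (a later mtime), the check fails and `extract_text` must be re-run, which is exactly the `require_up_to_date` path in `has_text` above.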
diff --git a/invenio/legacy/bibdocfile/scripts/bibdocfile.py b/invenio/legacy/bibdocfile/scripts/bibdocfile.py
index 571b78d5e..a73e9fb10 100644
--- a/invenio/legacy/bibdocfile/scripts/bibdocfile.py
+++ b/invenio/legacy/bibdocfile/scripts/bibdocfile.py
@@ -1,32 +1,26 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
-## Copyright (C) 2008, 2010, 2011, 2013 CERN.
+## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-"""
-BibDocFile CLI tool.
-"""
-
-__revision__ = "$Id$"
-
from invenio.base.factory import with_app_context
@with_app_context()
def main():
- from invenio.bibdocfilecli import main as bibdocfilecli_main
- return bibdocfilecli_main()
+ from invenio.legacy.bibdocfile.cli import main as bibdocfile_main
+ return bibdocfile_main()
diff --git a/invenio/legacy/bibdocfile/webinterface.py b/invenio/legacy/bibdocfile/webinterface.py
index af2e61f6d..7fed57380 100644
--- a/invenio/legacy/bibdocfile/webinterface.py
+++ b/invenio/legacy/bibdocfile/webinterface.py
@@ -1,540 +1,540 @@
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import cgi
import os
import time
import shutil
from invenio.config import \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_SITE_LANG, \
CFG_TMPSHAREDDIR, \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_WEBSUBMIT_STORAGEDIR, \
CFG_SITE_RECORD
-from invenio.bibdocfile_config import CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_DOCTYPES, \
+from invenio.legacy.bibdocfile.config import CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_DOCTYPES, \
CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_MISC, \
CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_RESTRICTIONS, \
CFG_BIBDOCFILE_ICON_SUBFORMAT_RE
from invenio.utils import apache
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import mail_cookie_create_authorize_action
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.control import acc_is_role
from invenio.legacy.webpage import page, pageheaderonly, \
pagefooteronly, warning_page, write_warning
from invenio.legacy.webuser import getUid, page_not_authorized, collect_user_info, isUserSuperAdmin, \
isGuestUser
from invenio import webjournal_utils
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import make_canonical_urlargd, redirect_to_url
from invenio.base.i18n import gettext_set_language
from invenio.legacy.search_engine import \
guess_primary_collection_of_a_record, get_colID, record_exists, \
create_navtrail_links, check_user_can_view_record, record_empty, \
is_user_owner_of_record
-from invenio.bibdocfile import BibRecDocs, normalize_format, file_strip_ext, \
+from invenio.legacy.bibdocfile.api import BibRecDocs, normalize_format, file_strip_ext, \
stream_restricted_icon, BibDoc, InvenioBibDocFileError, \
get_subformat_from_format
from invenio.ext.logging import register_exception
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
import invenio.legacy.template
bibdocfile_templates = invenio.legacy.template.load('bibdocfile')
webstyle_templates = invenio.legacy.template.load('webstyle')
websubmit_templates = invenio.legacy.template.load('websubmit')
websearch_templates = invenio.legacy.template.load('websearch')
-from invenio.bibdocfile_managedocfiles import \
+from invenio.legacy.bibdocfile.managedocfiles import \
create_file_upload_interface, \
get_upload_file_interface_javascript, \
get_upload_file_interface_css, \
move_uploaded_files_to_storage
bibdocfile_templates = invenio.legacy.template.load('bibdocfile')
class WebInterfaceFilesPages(WebInterfaceDirectory):
def __init__(self, recid):
self.recid = recid
def _lookup(self, component, path):
# after /<CFG_SITE_RECORD>/<recid>/files/ every part is used as the file
# name
filename = component
def getfile(req, form):
args = wash_urlargd(form, bibdocfile_templates.files_default_urlargd)
ln = args['ln']
_ = gettext_set_language(ln)
uid = getUid(req)
user_info = collect_user_info(req)
verbose = args['verbose']
if verbose >= 1 and not isUserSuperAdmin(user_info):
# Only SuperUser can see all the details!
verbose = 0
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE > 1:
return page_not_authorized(req, "/%s/%s" % (CFG_SITE_RECORD, self.recid),
navmenuid='submit')
if record_exists(self.recid) < 1:
msg = "<p>%s</p>" % _("Requested record does not seem to exist.")
return warning_page(msg, req, ln)
if record_empty(self.recid):
msg = "<p>%s</p>" % _("Requested record does not seem to have been integrated.")
return warning_page(msg, req, ln)
(auth_code, auth_message) = check_user_can_view_record(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
if webjournal_utils.is_recid_in_released_issue(self.recid):
# We can serve the file
pass
else:
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : ln, 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
if webjournal_utils.is_recid_in_released_issue(self.recid):
# We can serve the file
pass
else:
return page_not_authorized(req, "../", \
text = auth_message)
readonly = CFG_ACCESS_CONTROL_LEVEL_SITE == 1
# From now on: either the user provided a specific file
# name (and a possible version), or we return a list of
# all the available files. In no case are the docids
# visible.
try:
bibarchive = BibRecDocs(self.recid)
except InvenioBibDocFileError:
register_exception(req=req, alert_admin=True)
msg = "<p>%s</p><p>%s</p>" % (
_("The system has encountered an error in retrieving the list of files for this document."),
_("The error has been logged and will be taken in consideration as soon as possible."))
return warning_page(msg, req, ln)
if bibarchive.deleted_p():
req.status = apache.HTTP_GONE
return warning_page(_("Requested record does not seem to exist."), req, ln)
docname = ''
docformat = ''
version = ''
warn = ''
if filename:
# We know the complete file name, guess which docid it
# refers to
## TODO: Change the extension system according to ext.py from setlink
## and have a uniform extension mechanism...
docname = file_strip_ext(filename)
docformat = filename[len(docname):]
if docformat and docformat[0] != '.':
docformat = '.' + docformat
if args['subformat']:
docformat += ';%s' % args['subformat']
else:
docname = args['docname']
if not docformat:
docformat = args['format']
if args['subformat']:
docformat += ';%s' % args['subformat']
if not version:
version = args['version']
## Download as attachment
is_download = False
if args['download']:
is_download = True
# version could be either empty, or all or an integer
try:
int(version)
except ValueError:
if version != 'all':
version = ''
display_hidden = isUserSuperAdmin(user_info)
if version != 'all':
# search this filename in the complete list of files
for doc in bibarchive.list_bibdocs():
if docname == bibarchive.get_docname(doc.id):
try:
try:
docfile = doc.get_file(docformat, version)
except InvenioBibDocFileError, msg:
req.status = apache.HTTP_NOT_FOUND
if req.headers_in.get('referer'):
## There must be a broken link somewhere.
## Maybe it's good to alert the admin
register_exception(req=req, alert_admin=True)
warn += write_warning(_("The format %s does not exist for the given version: %s") % (cgi.escape(docformat), cgi.escape(str(msg))))
break
(auth_code, auth_message) = docfile.is_restricted(user_info)
if auth_code != 0 and not is_user_owner_of_record(user_info, self.recid):
if CFG_BIBDOCFILE_ICON_SUBFORMAT_RE.match(get_subformat_from_format(docformat)):
return stream_restricted_icon(req)
if user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action('viewrestrdoc', {'status' : docfile.get_status()})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : ln, 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
redirect_to_url(req, target)
else:
req.status = apache.HTTP_UNAUTHORIZED
warn += write_warning(_("This file is restricted: ") + str(auth_message))
break
if not docfile.hidden_p():
if not readonly:
ip = str(req.remote_ip)
doc.register_download(ip, docfile.get_version(), docformat, uid, self.recid)
try:
return docfile.stream(req, download=is_download)
except InvenioBibDocFileError, msg:
register_exception(req=req, alert_admin=True)
req.status = apache.HTTP_INTERNAL_SERVER_ERROR
warn += write_warning(_("An error has happened in trying to stream the request file."))
else:
req.status = apache.HTTP_UNAUTHORIZED
warn += write_warning(_("The requested file is hidden and can not be accessed."))
except InvenioBibDocFileError, msg:
register_exception(req=req, alert_admin=True)
if docname and docformat and not warn:
req.status = apache.HTTP_NOT_FOUND
warn += write_warning(_("Requested file does not seem to exist."))
# filelist = bibarchive.display("", version, ln=ln, verbose=verbose, display_hidden=display_hidden)
filelist = bibdocfile_templates.tmpl_display_bibrecdocs(bibarchive, "", version, ln=ln, verbose=verbose, display_hidden=display_hidden)
t = warn + bibdocfile_templates.tmpl_filelist(
ln=ln,
filelist=filelist)
cc = guess_primary_collection_of_a_record(self.recid)
unordered_tabs = get_detailed_page_tabs(get_colID(cc), self.recid, ln)
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if ln != CFG_SITE_LANG:
link_ln = '?ln=%s' % ln
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id == 'files',
unordered_tabs[tab_id]['enabled']) \
for (tab_id, dummy_order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
args['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
args['ln'])
title, description, keywords = websearch_templates.tmpl_record_page_header_content(req, self.recid, args['ln'])
return pageheaderonly(title=title,
navtrail=create_navtrail_links(cc=cc, aas=0, ln=ln) + \
''' &gt; <a class="navtrail" href="%s/%s/%s">%s</a>
&gt; %s''' % \
(CFG_SITE_URL, CFG_SITE_RECORD, self.recid, title, _("Access to Fulltext")),
description=description,
keywords=keywords,
uid=uid,
language=ln,
req=req,
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(ln) + \
top + t + bottom + \
websearch_templates.tmpl_search_pageend(ln) + \
pagefooteronly(language=ln, req=req)
return getfile, []
def __call__(self, req, form):
"""Called in case of URLs like /CFG_SITE_RECORD/123/files without
trailing slash.
"""
args = wash_urlargd(form, bibdocfile_templates.files_default_urlargd)
ln = args['ln']
link_ln = ''
if ln != CFG_SITE_LANG:
link_ln = '?ln=%s' % ln
return redirect_to_url(req, '%s/%s/%s/files/%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, link_ln))
def bibdocfile_legacy_getfile(req, form):
""" Handle legacy /getfile.py URLs """
args = wash_urlargd(form, {
'recid': (int, 0),
'docid': (int, 0),
'version': (str, ''),
'name': (str, ''),
'format': (str, ''),
'ln' : (str, CFG_SITE_LANG)
})
_ = gettext_set_language(args['ln'])
def _getfile_py(req, recid=0, docid=0, version="", name="", docformat="", ln=CFG_SITE_LANG):
if not recid:
## Let's obtain the recid from the docid
if docid:
try:
bibdoc = BibDoc(docid=docid)
recid = bibdoc.bibrec_links[0]["recid"]
except InvenioBibDocFileError:
return warning_page(_("An error has happened in trying to retrieve the requested file."), req, ln)
else:
return warning_page(_('Not enough information to retrieve the document'), req, ln)
else:
brd = BibRecDocs(recid)
if not name and docid:
## Let's obtain the name from the docid
try:
name = brd.get_docname(docid)
except InvenioBibDocFileError:
return warning_page(_("An error has happened in trying to retrieving the requested file."), req, ln)
docformat = normalize_format(docformat)
redirect_to_url(req, '%s/%s/%s/files/%s%s?ln=%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, recid, name, docformat, ln, version and 'version=%s' % version or ''), apache.HTTP_MOVED_PERMANENTLY)
return _getfile_py(req, **args)
# --------------------------------------------------
class WebInterfaceManageDocFilesPages(WebInterfaceDirectory):
_exports = ['', 'managedocfiles', 'managedocfilesasync']
def managedocfiles(self, req, form):
"""
Display admin interface to manage files of a record
"""
argd = wash_urlargd(form, {
'ln': (str, ''),
'access': (str, ''),
'recid': (int, None),
'do': (int, 0),
'cancel': (str, None),
})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
user_info = collect_user_info(req)
# Check authorization
(auth_code, auth_msg) = acc_authorize_action(req,
'runbibdocfile')
if auth_code and user_info['email'] == 'guest':
# Ask to login
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'ln' : argd['ln'],
'referer' : CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req, referer="/%s/managedocfiles" % CFG_SITE_RECORD,
uid=uid, text=auth_msg,
ln=argd['ln'],
navmenuid="admin")
# Prepare navtrail
navtrail = '''<a class="navtrail" href="%(CFG_SITE_URL)s/help/admin">Admin Area</a> &gt; %(manage_files)s''' \
% {'CFG_SITE_URL': CFG_SITE_URL,
'manage_files': _("Manage Document Files")}
body = ''
if argd['do'] != 0 and not argd['cancel']:
# Apply modifications
working_dir = os.path.join(CFG_TMPSHAREDDIR,
'websubmit_upload_interface_config_' + str(uid),
argd['access'])
move_uploaded_files_to_storage(working_dir=working_dir,
recid=argd['recid'],
icon_sizes=['180>','700>'],
create_icon_doctypes=['*'],
force_file_revision=False)
# Clean temporary directory
shutil.rmtree(working_dir)
# Confirm modifications
body += '<p style="color:#0f0">%s</p>' % \
(_('Your modifications to record #%i have been submitted') % argd['recid'])
elif argd['cancel']:
# Clean temporary directory
working_dir = os.path.join(CFG_TMPSHAREDDIR,
'websubmit_upload_interface_config_' + str(uid),
argd['access'])
shutil.rmtree(working_dir)
body += '<p style="color:#c00">%s</p>' % \
(_('Your modifications to record #%i have been cancelled') % argd['recid'])
if not argd['recid'] or argd['do'] != 0:
body += '''
<form method="post" action="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/managedocfiles">
<label for="recid">%(edit_record)s:</label>
<input type="text" name="recid" id="recid" />
<input type="submit" value="%(edit)s" class="adminbutton" />
</form>
''' % {'edit': _('Edit'),
'edit_record': _('Edit record'),
'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD}
access = time.strftime('%Y%m%d_%H%M%S')
if argd['recid'] and argd['do'] == 0:
# Displaying interface to manage files
# Prepare navtrail
title, dummy_description, dummy_keywords = websearch_templates.tmpl_record_page_header_content(req, argd['recid'],
argd['ln'])
navtrail = '''<a class="navtrail" href="%(CFG_SITE_URL)s/help/admin">Admin Area</a> &gt;
<a class="navtrail" href="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/managedocfiles">%(manage_files)s</a> &gt;
%(record)s: %(title)s
''' \
% {'CFG_SITE_URL': CFG_SITE_URL,
'title': title,
'manage_files': _("Document File Manager"),
'record': _("Record #%i") % argd['recid'],
'CFG_SITE_RECORD': CFG_SITE_RECORD}
body += create_file_upload_interface(\
recid=argd['recid'],
ln=argd['ln'],
uid=uid,
sbm_access=access,
display_hidden_files=True,
restrictions_and_desc=CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_RESTRICTIONS,
doctypes_and_desc=CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_DOCTYPES,
**CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_MISC)[1]
body += '''<br />
<form method="post" action="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/managedocfiles">
<input type="hidden" name="recid" value="%(recid)s" />
<input type="hidden" name="do" value="1" />
<input type="hidden" name="access" value="%(access)s" />
<input type="hidden" name="ln" value="%(ln)s" />
<div style="font-size:small">
<input type="submit" name="cancel" value="%(cancel_changes)s" />
<input type="submit" onclick="user_must_confirm_before_leaving_page=false;return true;" class="adminbutton" name="submit" id="applyChanges" value="%(apply_changes)s" />
</div></form>''' % \
{'apply_changes': _("Apply changes"),
'cancel_changes': _("Cancel all changes"),
'recid': argd['recid'],
'access': access,
'ln': argd['ln'],
'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD}
body += websubmit_templates.tmpl_page_do_not_leave_submission_js(argd['ln'], enabled=True)
return page(title = _("Document File Manager") + (argd['recid'] and (': ' + _("Record #%i") % argd['recid']) or ''),
navtrail=navtrail,
navtrail_append_title_p=0,
metaheaderadd = get_upload_file_interface_javascript(form_url_params='?access='+access) + \
get_upload_file_interface_css(),
body = body,
uid = uid,
language=argd['ln'],
req=req,
navmenuid='admin')
def managedocfilesasync(self, req, form):
"Upload file and returns upload interface"
argd = wash_urlargd(form, {
'ln': (str, ''),
'recid': (int, 1),
'doctype': (str, ''),
'access': (str, ''),
'indir': (str, ''),
})
user_info = collect_user_info(req)
include_headers = False
# User submitted either through WebSubmit, or admin interface.
if form.has_key('doctype') and form.has_key('indir') \
and form.has_key('access'):
# Submitted through WebSubmit. Check rights
include_headers = True
working_dir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
argd['indir'], argd['doctype'],
argd['access'])
try:
assert(working_dir == os.path.abspath(working_dir))
except AssertionError:
raise apache.SERVER_RETURN(apache.HTTP_UNAUTHORIZED)
try:
# Retrieve recid from working_dir, safer.
recid_fd = file(os.path.join(working_dir, 'SN'))
recid = int(recid_fd.read())
recid_fd.close()
except:
recid = ""
try:
act_fd = file(os.path.join(working_dir, 'act'))
action = act_fd.read()
act_fd.close()
except:
action = ""
# Is user authorized to perform this action?
auth_code = acc_authorize_action(user_info,
"submit",
authorized_if_no_roles=not isGuestUser(getUid(req)),
doctype=argd['doctype'],
act=action)[0]
if auth_code and not acc_is_role("submit", doctype=argd['doctype'], act=action):
# There is NO authorization plugged. User should have access
auth_code = 0
else:
# User must be allowed to attach files
auth_code = acc_authorize_action(user_info, 'runbibdocfile')[0]
recid = argd['recid']
if auth_code:
raise apache.SERVER_RETURN(apache.HTTP_UNAUTHORIZED)
return create_file_upload_interface(recid=recid,
ln=argd['ln'],
print_outside_form_tag=False,
print_envelope=False,
form=form,
include_headers=include_headers,
sbm_indir=argd['indir'],
sbm_access=argd['access'],
sbm_doctype=argd['doctype'],
uid=user_info['uid'])[1]
__call__ = managedocfiles
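`bibdocfile_legacy_getfile` above rewrites old `/getfile.py?recid=..&name=..&format=..` requests into the modern `/record/<recid>/files/<name><format>` URL and issues a permanent redirect. The URL mapping can be sketched on its own, independent of the request machinery (`normalize_format` is simplified here to just ensuring a leading dot, and the sketch inserts an `&` before the `version` parameter; the handler names and site constants are placeholders):

```python
def normalize_format(docformat):
    # Simplified stand-in for Invenio's normalize_format: ensure a leading dot.
    if docformat and not docformat.startswith('.'):
        return '.' + docformat
    return docformat

def legacy_getfile_target(site_url, site_record, recid, name, docformat,
                          ln='en', version=''):
    """Build the canonical files URL a legacy /getfile.py request redirects to."""
    docformat = normalize_format(docformat)
    return '%s/%s/%s/files/%s%s?ln=%s%s' % (
        site_url, site_record, recid, name, docformat, ln,
        version and '&version=%s' % version or '')

print(legacy_getfile_target('https://example.org', 'record',
                            123, 'thesis', 'pdf'))
# https://example.org/record/123/files/thesis.pdf?ln=en
```

The redirect is sent with `HTTP_MOVED_PERMANENTLY`, so clients and crawlers cache the new canonical location and the legacy endpoint can eventually be retired.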
diff --git a/invenio/legacy/bibedit/cli.py b/invenio/legacy/bibedit/cli.py
index d68d75141..f005720ae 100644
--- a/invenio/legacy/bibedit/cli.py
+++ b/invenio/legacy/bibedit/cli.py
@@ -1,299 +1,299 @@
## -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""
BibEdit CLI tool.
Usage: bibedit [options]
General options::
-h, --help print this help
-V, --version print version number
Options to inspect record history::
--list-revisions [recid] list all revisions of a record
--list-revisions-details [recid] list detailed revisions of a record
--get-revision [recid.revdate] print MARCXML of given record revision
--diff-revisions [recidA.revdateB] [recidC.revdateD] print MARCXML difference between
record A dated B and record C dated D
--revert-to-revision [recid.revdate] submit given record revision to
become current revision
--check-revisions [recid] check if revisions are not corrupted
(* stands for all records)
--fix-revisions [recid] fix revisions that are corrupted
(* stands for all records)
--clean-revisions [recid] clean duplicate revisions
(* stands for all records)
"""
__revision__ = "$Id$"
import sys
import zlib
from invenio.legacy.dbquery import run_sql
from invenio.intbitset import intbitset
-from invenio.bibedit_utils import get_marcxml_of_revision_id, \
+from invenio.legacy.bibedit.utils import get_marcxml_of_revision_id, \
get_record_revision_ids, get_xml_comparison, record_locked_by_other_user, \
record_locked_by_queue, revision_format_valid_p, save_xml_record, \
split_revid, get_info_of_revision_id, get_record_revisions
from invenio.legacy.bibrecord import create_record, records_identical
def print_usage():
"""Print help."""
print __doc__
def print_version():
"""Print version information."""
print __revision__
def cli_clean_revisions(recid, dry_run=True, verbose=True):
"""Clean revisions of the given recid, by removing duplicate revisions
that do not change the content of the record."""
if recid == '*':
recids = intbitset(run_sql("SELECT DISTINCT id_bibrec FROM hstRECORD"))
else:
try:
recids = [int(recid)]
except ValueError:
print 'ERROR: record ID must be integer, not %s.' % recid
sys.exit(1)
for recid in recids:
all_revisions = run_sql("SELECT marcxml, job_id, job_name, job_person, job_date FROM hstRECORD WHERE id_bibrec=%s ORDER BY job_date ASC", (recid,))
previous_rec = {}
deleted_revisions = 0
for marcxml, job_id, job_name, job_person, job_date in all_revisions:
try:
current_rec = create_record(zlib.decompress(marcxml))[0]
except Exception:
print >> sys.stderr, "ERROR: corrupted revisions found. Please run %s --fix-revisions '*'" % sys.argv[0]
sys.exit(1)
if records_identical(current_rec, previous_rec):
deleted_revisions += 1
if not dry_run:
run_sql("DELETE FROM hstRECORD WHERE id_bibrec=%s AND job_id=%s AND job_name=%s AND job_person=%s AND job_date=%s", (recid, job_id, job_name, job_person, job_date))
previous_rec = current_rec
if verbose and deleted_revisions:
print "record %s: deleted %s duplicate revisions out of %s" % (recid, deleted_revisions, len(all_revisions))
if verbose:
print "DONE"
def cli_list_revisions(recid, details=False):
"""Print list of all known record revisions (=RECID.REVDATE) for record
RECID.
"""
try:
recid = int(recid)
except ValueError:
print 'ERROR: record ID must be integer, not %s.' % recid
sys.exit(1)
record_rev_list = get_record_revision_ids(recid)
if not details:
out = '\n'.join(record_rev_list)
else:
out = "%s %s %s %s\n" % ("# Revision".ljust(22), "# Task ID".ljust(15),
"# Author".ljust(15), "# Job Details")
out += '\n'.join([get_info_of_revision_id(revid) for revid in record_rev_list])
if out:
print out
else:
print 'ERROR: Record %s not found.' % recid
def cli_get_revision(revid):
"""Return MARCXML for record revision REVID (=RECID.REVDATE) of a record."""
if not revision_format_valid_p(revid):
print 'ERROR: revision %s is invalid; ' \
'must be NNN.YYYYMMDDhhmmss.' % revid
sys.exit(1)
out = get_marcxml_of_revision_id(revid)
if out:
print out
else:
print 'ERROR: Revision %s not found.' % revid
def cli_diff_revisions(revid1, revid2):
"""Return diffs of MARCXML for record revisions REVID1, REVID2."""
for revid in [revid1, revid2]:
if not revision_format_valid_p(revid):
print 'ERROR: revision %s is invalid; ' \
'must be NNN.YYYYMMDDhhmmss.' % revid
sys.exit(1)
xml1 = get_marcxml_of_revision_id(revid1)
if not xml1:
print 'ERROR: Revision %s not found. ' % revid1
sys.exit(1)
xml2 = get_marcxml_of_revision_id(revid2)
if not xml2:
print 'ERROR: Revision %s not found. ' % revid2
sys.exit(1)
print get_xml_comparison(revid1, revid2, xml1, xml2)
def cli_revert_to_revision(revid):
"""Submit specified record revision REVID upload, to replace current
version.
"""
if not revision_format_valid_p(revid):
print 'ERROR: revision %s is invalid; ' \
'must be NNN.YYYYMMDDhhmmss.' % revid
sys.exit(1)
xml_record = get_marcxml_of_revision_id(revid)
if xml_record == '':
print 'ERROR: Revision %s does not exist. ' % revid
sys.exit(1)
recid = split_revid(revid)[0]
if record_locked_by_other_user(recid, -1):
print 'The record is currently being edited. ' \
'Please try again in a few minutes.'
sys.exit(1)
if record_locked_by_queue(recid):
print 'The record is locked because of unfinished upload tasks. ' \
'Please try again in a few minutes.'
sys.exit(1)
save_xml_record(recid, 0, xml_record)
print 'Your modifications have now been submitted. They will be ' \
'processed as soon as the task queue is empty.'
def check_rev(recid, verbose=True, fix=False):
revisions = get_record_revisions(recid)
for recid, job_date in revisions:
rev = '%s.%s' % (recid, job_date)
try:
get_marcxml_of_revision_id(rev)
if verbose:
print '%s: ok' % rev
except zlib.error:
print '%s: invalid' % rev
if fix:
fix_rev(recid, job_date, verbose)
def fix_rev(recid, job_date, verbose=True):
sql = 'DELETE FROM hstRECORD WHERE id_bibrec = %s AND job_date = "%s"'
run_sql(sql, (recid, job_date))
def cli_check_revisions(recid):
if recid == '*':
print 'Checking all records'
recids = intbitset(run_sql("SELECT id FROM bibrec ORDER BY id"))
for index, rec in enumerate(recids):
if index % 1000 == 0 and index:
print index, 'records processed'
check_rev(rec, verbose=False)
else:
check_rev(recid)
def cli_fix_revisions(recid):
if recid == '*':
print 'Fixing all records'
recids = intbitset(run_sql("SELECT id FROM bibrec ORDER BY id"))
for index, rec in enumerate(recids):
if index % 1000 == 0 and index:
print index, 'records processed'
check_rev(rec, verbose=False, fix=True)
else:
check_rev(recid, fix=True)
def main():
"""Main entry point."""
if '--help' in sys.argv or \
'-h' in sys.argv:
print_usage()
elif '--version' in sys.argv or \
'-V' in sys.argv:
print_version()
else:
try:
cmd = sys.argv[1]
opts = sys.argv[2:]
if not opts:
raise IndexError
except IndexError:
print_usage()
sys.exit(1)
if cmd == '--list-revisions':
try:
recid = opts[0]
except IndexError:
print_usage()
sys.exit(1)
cli_list_revisions(recid, details=False)
elif cmd == '--list-revisions-details':
try:
recid = opts[0]
except IndexError:
print_usage()
sys.exit(1)
cli_list_revisions(recid, details=True)
elif cmd == '--get-revision':
try:
revid = opts[0]
except IndexError:
print_usage()
sys.exit(1)
cli_get_revision(revid)
elif cmd == '--diff-revisions':
try:
revid1 = opts[0]
revid2 = opts[1]
except IndexError:
print_usage()
sys.exit(1)
cli_diff_revisions(revid1, revid2)
elif cmd == '--revert-to-revision':
try:
revid = opts[0]
except IndexError:
print_usage()
sys.exit(1)
cli_revert_to_revision(revid)
elif cmd == '--check-revisions':
try:
recid = opts[0]
except IndexError:
recid = '*'
cli_check_revisions(recid)
elif cmd == '--fix-revisions':
try:
recid = opts[0]
except IndexError:
recid = '*'
cli_fix_revisions(recid)
elif cmd == '--clean-revisions':
try:
recid = opts[0]
except IndexError:
recid = '*'
cli_clean_revisions(recid, dry_run=False)
else:
print "ERROR: Please specify a command. Please see '--help'."
sys.exit(1)
if __name__ == '__main__':
main()
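The command dispatch in main() above follows a first-argument-is-the-command pattern. As a minimal, standalone sketch (the handler table and option names here are hypothetical stand-ins, not the real BibEdit CLI):

```python
def dispatch(argv, handlers):
    """Route the first CLI argument to a handler; None signals bad usage."""
    if not argv or argv[0] not in handlers:
        return None
    # Everything after the command name is passed through as its options.
    return handlers[argv[0]](argv[1:])

# Hypothetical handler table mirroring the --list-revisions style of command.
handlers = {
    '--list-revisions': lambda opts: ('list', opts[0]) if opts else None,
    '--check-revisions': lambda opts: ('check', opts[0] if opts else '*'),
}
```

With this table, `dispatch(['--check-revisions'], handlers)` falls back to the `'*'` wildcard, matching the behaviour of the script above when no record id is given.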
diff --git a/invenio/legacy/bibedit/engine.py b/invenio/legacy/bibedit/engine.py
index 0d940a418..243942d31 100644
--- a/invenio/legacy/bibedit/engine.py
+++ b/invenio/legacy/bibedit/engine.py
@@ -1,1680 +1,1680 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""Invenio BibEdit Engine."""
__revision__ = "$Id$"
from datetime import datetime
import re
import difflib
import zlib
import copy
import urllib
import urllib2
import cookielib
import json
from invenio.modules import formatter as bibformat
from invenio.utils.json import CFG_JSON_AVAILABLE
from invenio.utils.url import auto_version_url
from invenio.legacy.bibrecord.scripts.xmlmarc2textmarc import create_marc_record
-from invenio.bibedit_config import CFG_BIBEDIT_AJAX_RESULT_CODES, \
+from invenio.legacy.bibedit.config import CFG_BIBEDIT_AJAX_RESULT_CODES, \
CFG_BIBEDIT_JS_CHECK_SCROLL_INTERVAL, CFG_BIBEDIT_JS_HASH_CHECK_INTERVAL, \
CFG_BIBEDIT_JS_CLONED_RECORD_COLOR, \
CFG_BIBEDIT_JS_CLONED_RECORD_COLOR_FADE_DURATION, \
CFG_BIBEDIT_JS_NEW_ADD_FIELD_FORM_COLOR, \
CFG_BIBEDIT_JS_NEW_ADD_FIELD_FORM_COLOR_FADE_DURATION, \
CFG_BIBEDIT_JS_NEW_CONTENT_COLOR, \
CFG_BIBEDIT_JS_NEW_CONTENT_COLOR_FADE_DURATION, \
CFG_BIBEDIT_JS_NEW_CONTENT_HIGHLIGHT_DELAY, \
CFG_BIBEDIT_JS_STATUS_ERROR_TIME, CFG_BIBEDIT_JS_STATUS_INFO_TIME, \
CFG_BIBEDIT_JS_TICKET_REFRESH_DELAY, CFG_BIBEDIT_MAX_SEARCH_RESULTS, \
CFG_BIBEDIT_TAG_FORMAT, CFG_BIBEDIT_AJAX_RESULT_CODES_REV, \
CFG_BIBEDIT_AUTOSUGGEST_TAGS, CFG_BIBEDIT_AUTOCOMPLETE_TAGS_KBS,\
CFG_BIBEDIT_KEYWORD_TAXONOMY, CFG_BIBEDIT_KEYWORD_TAG, \
CFG_BIBEDIT_KEYWORD_RDFLABEL, CFG_BIBEDIT_REQUESTS_UNTIL_SAVE, \
CFG_BIBEDIT_DOI_LOOKUP_FIELD, CFG_DOI_USER_AGENT, \
CFG_BIBEDIT_DISPLAY_REFERENCE_TAGS, CFG_BIBEDIT_DISPLAY_AUTHOR_TAGS
from invenio.config import CFG_SITE_LANG, CFG_DEVEL_SITE
-from invenio.bibedit_dblayer import get_name_tags_all, reserve_record_id, \
+from invenio.legacy.bibedit.db_layer import get_name_tags_all, reserve_record_id, \
get_related_hp_changesets, get_hp_update_xml, delete_hp_change, \
get_record_last_modification_date, get_record_revision_author, \
get_marcxml_of_record_revision, delete_related_holdingpen_changes, \
get_record_revisions
-from invenio.bibedit_utils import cache_exists, cache_expired, \
+from invenio.legacy.bibedit.utils import cache_exists, cache_expired, \
create_cache_file, delete_cache_file, get_bibrecord, \
get_cache_file_contents, get_cache_mtime, get_record_templates, \
get_record_template, latest_record_revision, record_locked_by_other_user, \
record_locked_by_queue, save_xml_record, touch_cache_file, \
update_cache_file_contents, get_field_templates, get_marcxml_of_revision, \
revision_to_timestamp, timestamp_to_revision, \
get_record_revision_timestamps, record_revision_exists, \
can_record_have_physical_copies, extend_record_with_template, \
replace_references, merge_record_with_template, record_xml_output, \
record_is_conference, add_record_cnum, get_xml_from_textmarc, \
record_locked_by_user_details, crossref_process_template, \
modify_record_timestamp
from invenio.legacy.bibrecord import create_record, print_rec, record_add_field, \
record_add_subfield_into, record_delete_field, \
record_delete_subfield_from, \
record_modify_subfield, record_move_subfield, \
create_field, record_replace_field, record_move_fields, \
record_modify_controlfield, record_get_field_values, \
record_get_subfields, record_get_field_instances, record_add_fields, \
record_strip_empty_fields, record_strip_empty_volatile_subfields, \
record_strip_controlfields, record_order_subfields, field_xml_output
from invenio.config import CFG_BIBEDIT_PROTECTED_FIELDS, CFG_CERN_SITE, \
CFG_SITE_URL, CFG_SITE_RECORD, CFG_BIBEDIT_KB_SUBJECTS, \
CFG_BIBEDIT_KB_INSTITUTIONS, CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS, \
CFG_INSPIRE_SITE
from invenio.legacy.search_engine import record_exists, perform_request_search
from invenio.legacy.webuser import session_param_get, session_param_set
-from invenio.bibcatalog import bibcatalog_system
+from invenio.legacy.bibcatalog.api import bibcatalog_system
from invenio.legacy.webpage import page
from invenio.utils.html import get_mathjax_header
from invenio.utils.text import wash_for_xml, show_diff
from invenio.modules.knowledge.api import get_kbd_values_for_bibedit, get_kbr_values, \
get_kbt_items_for_bibedit, kb_exists
from invenio.batchuploader_engine import perform_upload_check
-from invenio.bibcirculation_dblayer import get_number_copies, has_copies
-from invenio.bibcirculation_utils import create_item_details_url
+from invenio.legacy.bibcirculation.db_layer import get_number_copies, has_copies
+from invenio.legacy.bibcirculation.utils import create_item_details_url
from invenio.refextract_api import FullTextNotAvailable
from invenio.legacy.bibrecord.scripts import xmlmarc2textmarc as xmlmarc2textmarc
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
from invenio.crossrefutils import get_marcxml_for_doi, CrossrefError
import invenio.legacy.template
bibedit_templates = invenio.legacy.template.load('bibedit')
re_revdate_split = re.compile(r'^(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)')
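The regular expression above splits a 14-digit revision timestamp into date and time components. As a hedged, standalone illustration (independent of the Invenio helpers), the REVDATE suffix of a revision id can be converted into an SQL-style job date like this:

```python
import re

# A BibEdit revision date is a 14-digit timestamp: YYYYMMDDhhmmss.
_re_revdate = re.compile(r'^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$')

def revdate_to_job_date(revdate):
    """Convert e.g. '20130102030405' into '2013-01-02 03:04:05'."""
    match = _re_revdate.search(revdate)
    if match is None:
        raise ValueError('not a revision timestamp: %r' % revdate)
    return '%s-%s-%s %s:%s:%s' % match.groups()
```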
def get_empty_fields_templates():
"""
Return the templates of empty fields::
-an empty data field
-an empty control field
"""
return [{
"name": "Empty field",
"description": "An empty data field, with no " + \
"information filled in",
"tag" : "",
"ind1" : "",
"ind2" : "",
"subfields" : [("","")],
"isControlfield" : False
},{
"name" : "Empty control field",
"description" : "An empty control field, with no " + \
"data or tag description",
"isControlfield" : True,
"tag" : "",
"value" : ""
}]
def get_available_fields_templates():
"""
Return all the available field templates.
The result is a list of descriptors. Each descriptor has
the same structure as a full field descriptor inside the
record.
"""
templates = get_field_templates()
result = get_empty_fields_templates()
for template in templates:
tplTag = template[3].keys()[0]
field = template[3][tplTag][0]
if (field[0] == []):
# if the field is a controlField, add different structure
result.append({
"name" : template[1],
"description" : template[2],
"isControlfield" : True,
"tag" : tplTag,
"value" : field[3]
})
else:
result.append({
"name": template[1],
"description": template[2],
"tag" : tplTag,
"ind1" : field[1],
"ind2" : field[2],
"subfields" : field[0],
"isControlfield" : False
})
return result
def perform_request_init(uid, ln, req, lastupdated):
"""Handle the initial request by adding menu and JavaScript to the page."""
errors = []
warnings = []
body = ''
# Add script data.
record_templates = get_record_templates()
record_templates.sort()
tag_names = get_name_tags_all()
protected_fields = ['001']
protected_fields.extend(CFG_BIBEDIT_PROTECTED_FIELDS.split(','))
cern_site = 'false'
if not CFG_JSON_AVAILABLE:
title = 'Record Editor'
body = '''Sorry, the record editor cannot operate when the
`simplejson' module is not installed. Please see the INSTALL
file.'''
return page(title = title,
body = body,
errors = [],
warnings = [],
uid = uid,
language = ln,
navtrail = "",
lastupdated = lastupdated,
req = req)
body += '<link rel="stylesheet" type="text/css" href="/img/jquery-ui.css" />'
body += '<link rel="stylesheet" type="text/css" href="%s/%s" />' % (CFG_SITE_URL,
auto_version_url("img/" + 'bibedit.css'))
if CFG_CERN_SITE:
cern_site = 'true'
data = {'gRECORD_TEMPLATES': record_templates,
'gTAG_NAMES': tag_names,
'gPROTECTED_FIELDS': protected_fields,
'gSITE_URL': '"' + CFG_SITE_URL + '"',
'gSITE_RECORD': '"' + CFG_SITE_RECORD + '"',
'gCERN_SITE': cern_site,
'gHASH_CHECK_INTERVAL': CFG_BIBEDIT_JS_HASH_CHECK_INTERVAL,
'gCHECK_SCROLL_INTERVAL': CFG_BIBEDIT_JS_CHECK_SCROLL_INTERVAL,
'gSTATUS_ERROR_TIME': CFG_BIBEDIT_JS_STATUS_ERROR_TIME,
'gSTATUS_INFO_TIME': CFG_BIBEDIT_JS_STATUS_INFO_TIME,
'gCLONED_RECORD_COLOR':
'"' + CFG_BIBEDIT_JS_CLONED_RECORD_COLOR + '"',
'gCLONED_RECORD_COLOR_FADE_DURATION':
CFG_BIBEDIT_JS_CLONED_RECORD_COLOR_FADE_DURATION,
'gNEW_ADD_FIELD_FORM_COLOR':
'"' + CFG_BIBEDIT_JS_NEW_ADD_FIELD_FORM_COLOR + '"',
'gNEW_ADD_FIELD_FORM_COLOR_FADE_DURATION':
CFG_BIBEDIT_JS_NEW_ADD_FIELD_FORM_COLOR_FADE_DURATION,
'gNEW_CONTENT_COLOR': '"' + CFG_BIBEDIT_JS_NEW_CONTENT_COLOR + '"',
'gNEW_CONTENT_COLOR_FADE_DURATION':
CFG_BIBEDIT_JS_NEW_CONTENT_COLOR_FADE_DURATION,
'gNEW_CONTENT_HIGHLIGHT_DELAY':
CFG_BIBEDIT_JS_NEW_CONTENT_HIGHLIGHT_DELAY,
'gTICKET_REFRESH_DELAY': CFG_BIBEDIT_JS_TICKET_REFRESH_DELAY,
'gRESULT_CODES': CFG_BIBEDIT_AJAX_RESULT_CODES,
'gAUTOSUGGEST_TAGS' : CFG_BIBEDIT_AUTOSUGGEST_TAGS,
'gAUTOCOMPLETE_TAGS' : CFG_BIBEDIT_AUTOCOMPLETE_TAGS_KBS.keys(),
'gKEYWORD_TAG' : '"' + CFG_BIBEDIT_KEYWORD_TAG + '"',
'gREQUESTS_UNTIL_SAVE' : CFG_BIBEDIT_REQUESTS_UNTIL_SAVE,
'gAVAILABLE_KBS': get_available_kbs(),
'gTagsToAutocomplete': CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS,
'gDOILookupField': '"' + CFG_BIBEDIT_DOI_LOOKUP_FIELD + '"',
'gDisplayReferenceTags': CFG_BIBEDIT_DISPLAY_REFERENCE_TAGS,
'gDisplayAuthorTags': CFG_BIBEDIT_DISPLAY_AUTHOR_TAGS
}
body += '<script type="text/javascript">\n'
for key in data:
body += ' var %s = %s;\n' % (key, data[key])
body += ' </script>\n'
# Adding the information about field templates
fieldTemplates = get_available_fields_templates()
body += "<script>\n" + \
" var fieldTemplates = %s\n" % (json.dumps(fieldTemplates), ) + \
"</script>\n"
# Add scripts (the ordering matters).
scripts = ['jquery-ui.min.js', 'jquery.jeditable.mini.js', 'jquery.hotkeys.js',
'json2.js', 'bibedit_refextract.js', 'bibedit_display.js', 'bibedit_engine.js', 'bibedit_keys.js',
'bibedit_menu.js', 'bibedit_holdingpen.js', 'marcxml.js',
'bibedit_clipboard.js']
for script in scripts:
body += ' <script type="text/javascript" src="%s/%s">' \
'</script>\n' % (CFG_SITE_URL, auto_version_url("js/" + script))
# Init BibEdit
body += '<script>$(init_bibedit);</script>'
# Build page structure and menu.
# rec = create_record(format_record(235, "xm"))[0]
#oaiId = record_extract_oai_id(rec)
body += bibedit_templates.menu()
body += bibedit_templates.focuson()
body += """<div id="bibEditContent">
<div class="revisionLine"></div>
<div id="Toptoolbar"></div>
<div id="bibEditMessage"></div>
<div id="bibEditContentTable"></div>
</div>"""
return body, errors, warnings
def get_available_kbs():
"""
Return list of KBs that are available in the system to be used with
BibEdit
"""
kb_list = [CFG_BIBEDIT_KB_INSTITUTIONS, CFG_BIBEDIT_KB_SUBJECTS]
available_kbs = [kb for kb in kb_list if kb_exists(kb)]
return available_kbs
def record_has_pdf(recid):
""" Check if record has a pdf attached
"""
rec_info = BibRecDocs(recid)
docs = rec_info.list_bibdocs()
return bool(docs)
def get_marcxml_of_revision_id(recid, revid):
"""
Return the MARCXML string corresponding to revision REVID
(=RECID.REVDATE) of a record. Return None if the revision
does not exist.
"""
job_date = "%s-%s-%s %s:%s:%s" % re_revdate_split.search(revid).groups()
tmp_res = get_marcxml_of_record_revision(recid, job_date)
if tmp_res:
for row in tmp_res:
xml = zlib.decompress(row[0]) + "\n"
# xml contains marcxml of record
# now we create a record object from this xml and sort fields and subfields
# and return marcxml
rec = create_record(xml)[0]
record_order_subfields(rec)
marcxml = record_xml_output(rec, order_fn="_order_by_tags")
return marcxml
def perform_request_compare(ln, recid, rev1, rev2):
"""Handle a request for comparing two records"""
body = ""
errors = []
warnings = []
if (not record_revision_exists(recid, rev1)) or \
(not record_revision_exists(recid, rev2)):
body = "The requested record revision does not exist!"
else:
xml1 = get_marcxml_of_revision_id(recid, rev1)
xml2 = get_marcxml_of_revision_id(recid, rev2)
# Create MARC representations of the records
marc1 = create_marc_record(create_record(xml1)[0], '', {"text-marc": 1, "aleph-marc": 0})
marc2 = create_marc_record(create_record(xml2)[0], '', {"text-marc": 1, "aleph-marc": 0})
comparison = show_diff(marc1, marc2)
job_date1 = "%s-%s-%s %s:%s:%s" % re_revdate_split.search(rev1).groups()
job_date2 = "%s-%s-%s %s:%s:%s" % re_revdate_split.search(rev2).groups()
body += bibedit_templates.history_comparebox(ln, job_date1,
job_date2, comparison)
return body, errors, warnings
def perform_request_newticket(recid, uid):
"""Create a new ticket with this record's number.
@param recid: record id
@param uid: user id
@return: (error_msg, url)
"""
t_url = ""
errmsg = ""
if bibcatalog_system is not None:
t_id = bibcatalog_system.ticket_submit(uid, "", recid, "")
if t_id:
#get the ticket's URL
t_url = bibcatalog_system.ticket_get_attribute(uid, t_id, 'url_modify')
else:
errmsg = "ticket_submit failed"
else:
errmsg = "No ticket system configured"
return (errmsg, t_url)
def perform_request_ajax(req, recid, uid, data, isBulk = False):
"""Handle Ajax requests by redirecting to appropriate function."""
response = {}
request_type = data['requestType']
undo_redo = None
if data.has_key("undoRedo"):
undo_redo = data["undoRedo"]
# Call function based on request type.
if request_type == 'searchForRecord':
# Search request.
response.update(perform_request_bibedit_search(data, req))
elif request_type in ['changeTagFormat']:
# User related requests.
response.update(perform_request_user(req, request_type, recid, data))
elif request_type in ('getRecord', 'submit', 'cancel', 'newRecord',
'deleteRecord', 'deleteRecordCache', 'prepareRecordMerge', 'revert',
'updateCacheRef', 'submittextmarc'):
# 'Major' record related requests.
response.update(perform_request_record(req, request_type, recid, uid,
data))
elif request_type in ('addField', 'addSubfields', \
'addFieldsSubfieldsOnPositions', 'modifyContent', \
'modifySubfieldTag', 'modifyFieldTag', \
'moveSubfield', 'deleteFields', 'moveField', \
'modifyField', 'otherUpdateRequest', \
'disableHpChange', 'deactivateHoldingPenChangeset'):
# Record updates.
cacheMTime = data['cacheMTime']
if data.has_key('hpChanges'):
hpChanges = data['hpChanges']
else:
hpChanges = {}
response.update(perform_request_update_record(request_type, recid, \
uid, cacheMTime, data, \
hpChanges, undo_redo, \
isBulk))
elif request_type in ('autosuggest', 'autocomplete', 'autokeyword'):
response.update(perform_request_autocomplete(request_type, recid, uid, \
data))
elif request_type in ('getTickets', ):
# BibCatalog requests.
response.update(perform_request_bibcatalog(request_type, recid, uid))
elif request_type in ('getHoldingPenUpdates', ):
response.update(perform_request_holdingpen(request_type, recid))
elif request_type in ('getHoldingPenUpdateDetails', \
'deleteHoldingPenChangeset'):
updateId = data['changesetNumber']
response.update(perform_request_holdingpen(request_type, recid, \
updateId))
elif request_type in ('applyBulkUpdates', ):
# a general version of a bulk request
changes = data['requestsData']
cacheMTime = data['cacheMTime']
response.update(perform_bulk_request_ajax(req, recid, uid, changes, \
undo_redo, cacheMTime))
elif request_type in ('preview', ):
response.update(perform_request_preview_record(request_type, recid, uid, data))
elif request_type in ('get_pdf_url', ):
response.update(perform_request_get_pdf_url(recid))
elif request_type in ('refextract', ):
txt = None
if data.has_key('txt'):
txt = data["txt"]
response.update(perform_request_ref_extract(recid, uid, txt))
elif request_type in ('refextracturl', ):
response.update(perform_request_ref_extract_url(recid, uid, data['url']))
elif request_type == 'getTextMarc':
response.update(perform_request_get_textmarc(recid, uid))
elif request_type == "getTableView":
response.update(perform_request_get_tableview(recid, uid, data))
elif request_type == "DOISearch":
response.update(perform_doi_search(data['doi']))
return response
def perform_bulk_request_ajax(req, recid, uid, reqsData, undoRedo, cacheMTime):
""" An AJAX handler used when treating bulk updates """
lastResult = {}
lastTime = cacheMTime
isFirst = True
for data in reqsData:
assert data is not None
data['cacheMTime'] = lastTime
if isFirst and undoRedo is not None:
# Attach the undo/redo handler to the first operation so that
# the handler is saved on the server side only once.
data['undoRedo'] = undoRedo
isFirst = False
lastResult = perform_request_ajax(req, recid, uid, data, isBulk=True)
# Propagate the updated cacheMTime to the next request.
try:
lastTime = lastResult['cacheMTime']
except KeyError:
raise Exception(str(lastResult))
return lastResult
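The pattern in perform_bulk_request_ajax, threading the cache modification time from one response into the next request, can be sketched generically. The handler contract below is a simplified assumption for illustration, not the actual Invenio API:

```python
def apply_bulk(requests, initial_mtime, handler):
    """Apply requests sequentially, threading 'cacheMTime' through them.

    `handler` takes a request dict and returns a response dict that may
    carry an updated 'cacheMTime' (a hypothetical, simplified contract).
    """
    mtime = initial_mtime
    last_response = {}
    for request in requests:
        request['cacheMTime'] = mtime  # each request sees the latest mtime
        last_response = handler(request)
        mtime = last_response.get('cacheMTime', mtime)
    return last_response
```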
def perform_request_bibedit_search(data, req):
"""Handle search requests."""
response = {}
searchType = data['searchType']
if searchType is None:
searchType = "anywhere"
searchPattern = data['searchPattern']
if searchType == 'anywhere':
pattern = searchPattern
else:
pattern = searchType + ':' + searchPattern
result_set = list(perform_request_search(req=req, p=pattern))
response['resultCode'] = 1
response['resultSet'] = result_set[0:CFG_BIBEDIT_MAX_SEARCH_RESULTS]
return response
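The query construction in perform_request_bibedit_search reduces to a small pure function, sketched here on its own; the `field:pattern` index-prefix syntax follows the code above:

```python
def build_search_pattern(search_type, search_pattern):
    """Build a search pattern string; 'anywhere' means no index prefix."""
    if search_type is None or search_type == 'anywhere':
        return search_pattern
    return search_type + ':' + search_pattern
```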
def perform_request_user(req, request_type, recid, data):
"""Handle user related requests."""
response = {}
if request_type == 'changeTagFormat':
tagformat_settings = session_param_get(req, 'bibedit_tagformat', {})
tagformat_settings[recid] = data['tagFormat']
session_param_set(req, 'bibedit_tagformat', tagformat_settings)
response['resultCode'] = 2
return response
def perform_request_holdingpen(request_type, recId, changeId=None):
"""
Perform a holding pen Ajax request. The following request types
are supported::
-getHoldingPenUpdates: retrieve the holding pen updates pending
for a given record
"""
response = {}
if request_type == 'getHoldingPenUpdates':
changeSet = get_related_hp_changesets(recId)
changes = []
for change in changeSet:
changes.append((str(change[0]), str(change[1])))
response["changes"] = changes
elif request_type == 'getHoldingPenUpdateDetails':
# return the list of changes related to the holding pen update;
# the format is based on what the record difference xtool returns
assert changeId is not None
hpContent = get_hp_update_xml(changeId)
holdingPenRecord = create_record(hpContent[0], "xm")[0]
# order subfields alphabetically
record_order_subfields(holdingPenRecord)
# databaseRecord = get_record(hpContent[1])
response['record'] = holdingPenRecord
response['changeset_number'] = changeId
elif request_type == 'deleteHoldingPenChangeset':
assert changeId is not None
delete_hp_change(changeId)
return response
def perform_request_record(req, request_type, recid, uid, data, ln=CFG_SITE_LANG):
"""Handle 'major' record related requests like fetching, submitting or
deleting a record, cancel editing or preparing a record for merging.
"""
response = {}
if request_type == 'newRecord':
# Create a new record.
new_recid = reserve_record_id()
new_type = data['newType']
if new_type == 'empty':
# Create a new empty record cache under the reserved id.
create_cache_file(new_recid, uid)
response['resultCode'], response['newRecID'] = 6, new_recid
elif new_type == 'template':
# Create a new record from XML record template.
template_filename = data['templateFilename']
template = get_record_template(template_filename)
if not template:
response['resultCode'] = 108
else:
record = create_record(template)[0]
if not record:
response['resultCode'] = 109
else:
record_add_field(record, '001',
controlfield_value=str(new_recid))
create_cache_file(new_recid, uid, record, True)
response['resultCode'], response['newRecID'] = 7, new_recid
elif new_type == 'import':
# Import data from external source, using DOI
doi = data['doi']
if not doi:
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['error_no_doi_specified']
else:
try:
marcxml_template = get_marcxml_for_doi(doi)
except CrossrefError, inst:
response['resultCode'] = \
CFG_BIBEDIT_AJAX_RESULT_CODES_REV[inst.code]
except:
response['resultCode'] = 0
else:
record = crossref_process_template(marcxml_template, CFG_INSPIRE_SITE)
if not record:
response['resultCode'] = 109
else:
record_add_field(record, '001',
controlfield_value=str(new_recid))
create_cache_file(new_recid, uid, record, True)
response['resultCode'], response['newRecID'] = 7, new_recid
elif new_type == 'clone':
# Clone an existing record (from the users cache).
existing_cache = cache_exists(recid, uid)
if existing_cache:
try:
record = get_cache_file_contents(recid, uid)[2]
except:
# if, for example, the cache format was wrong (outdated)
record = get_bibrecord(recid)
else:
# Cache missing. Fall back to using original version.
record = get_bibrecord(recid)
record_delete_field(record, '001')
record_add_field(record, '001', controlfield_value=str(new_recid))
create_cache_file(new_recid, uid, record, True)
response['resultCode'], response['newRecID'] = 8, new_recid
elif request_type == 'getRecord':
# Fetch the record. Possible error situations:
# - Non-existing record
# - Deleted record
# - Record locked by other user
# - Record locked by queue
# A cache file will be created if it does not exist.
# If the cache is outdated (i.e., not based on the latest DB revision),
# cacheOutdated will be set to True in the response.
record_status = record_exists(recid)
existing_cache = cache_exists(recid, uid)
read_only_mode = False
if data.has_key("inReadOnlyMode"):
read_only_mode = data['inReadOnlyMode']
if record_status == 0:
response['resultCode'] = 102
elif not read_only_mode and not existing_cache and \
record_locked_by_other_user(recid, uid):
name, email, locked_since = record_locked_by_user_details(recid, uid)
response['locked_details'] = {'name': name,
'email': email,
'locked_since': locked_since}
response['resultCode'] = 104
elif not read_only_mode and existing_cache and \
cache_expired(recid, uid) and \
record_locked_by_other_user(recid, uid):
response['resultCode'] = 104
elif not read_only_mode and record_locked_by_queue(recid):
response['resultCode'] = 105
else:
if data.get('deleteRecordCache'):
delete_cache_file(recid, uid)
existing_cache = False
pending_changes = []
disabled_hp_changes = {}
if read_only_mode:
if data.has_key('recordRevision') and data['recordRevision'] != 'sampleValue':
record_revision_ts = data['recordRevision']
record_xml = get_marcxml_of_revision(recid, \
record_revision_ts)
record = create_record(record_xml)[0]
record_revision = timestamp_to_revision(record_revision_ts)
pending_changes = []
disabled_hp_changes = {}
else:
# a normal cacheless retrieval of a record
record = get_bibrecord(recid)
record_revision = get_record_last_modification_date(recid)
if record_revision is None:
record_revision = datetime.now().timetuple()
pending_changes = []
disabled_hp_changes = {}
cache_dirty = False
mtime = 0
undo_list = []
redo_list = []
elif not existing_cache:
record_revision, record = create_cache_file(recid, uid)
mtime = get_cache_mtime(recid, uid)
pending_changes = []
disabled_hp_changes = {}
undo_list = []
redo_list = []
cache_dirty = False
else:
#TODO: This try except should be replaced with something nicer,
# like an argument indicating if a new cache file is to
# be created
try:
cache_dirty, record_revision, record, pending_changes, \
disabled_hp_changes, undo_list, redo_list = \
get_cache_file_contents(recid, uid)
touch_cache_file(recid, uid)
mtime = get_cache_mtime(recid, uid)
if not latest_record_revision(recid, record_revision) and \
get_record_revisions(recid) != ():
# This should prevent using an old cache when viewing
# an old version. If there are no revisions, we skip
# this step because this is a new record.
response['cacheOutdated'] = True
except:
record_revision, record = create_cache_file(recid, uid)
mtime = get_cache_mtime(recid, uid)
pending_changes = []
disabled_hp_changes = {}
cache_dirty = False
undo_list = []
redo_list = []
if data.get('clonedRecord',''):
response['resultCode'] = 9
else:
response['resultCode'] = 3
revision_author = get_record_revision_author(recid, record_revision)
latest_revision = get_record_last_modification_date(recid)
if latest_revision is None:
latest_revision = datetime.now().timetuple()
last_revision_ts = revision_to_timestamp(latest_revision)
revisions_history = get_record_revision_timestamps(recid)
number_of_physical_copies = get_number_copies(recid)
bibcirc_details_URL = create_item_details_url(recid, ln)
can_have_copies = can_record_have_physical_copies(recid)
# For some collections, merge template with record
template_to_merge = extend_record_with_template(recid)
if template_to_merge:
merged_record = merge_record_with_template(record, template_to_merge)
if merged_record:
record = merged_record
create_cache_file(recid, uid, record, True)
if record_status == -1:
# The record was deleted
response['resultCode'] = 103
response['record_has_pdf'] = record_has_pdf(recid)
# order subfields alphabetically
record_order_subfields(record)
response['cacheDirty'], response['record'], \
response['cacheMTime'], response['recordRevision'], \
response['revisionAuthor'], response['lastRevision'], \
response['revisionsHistory'], response['inReadOnlyMode'], \
response['pendingHpChanges'], response['disabledHpChanges'], \
response['undoList'], response['redoList'] = cache_dirty, \
record, mtime, revision_to_timestamp(record_revision), \
revision_author, last_revision_ts, revisions_history, \
read_only_mode, pending_changes, disabled_hp_changes, \
undo_list, redo_list
response['numberOfCopies'] = number_of_physical_copies
response['bibCirculationUrl'] = bibcirc_details_URL
response['canRecordHavePhysicalCopies'] = can_have_copies
# Set tag format from user's session settings.
tagformat_settings = session_param_get(req, 'bibedit_tagformat')
if tagformat_settings is not None:
tagformat = tagformat_settings.get(recid, CFG_BIBEDIT_TAG_FORMAT)
else:
tagformat = CFG_BIBEDIT_TAG_FORMAT
response['tagFormat'] = tagformat
# KB information
response['KBSubject'] = CFG_BIBEDIT_KB_SUBJECTS
response['KBInstitution'] = CFG_BIBEDIT_KB_INSTITUTIONS
elif request_type == 'submit':
# Submit the record. Possible error situations:
# - Missing cache file
# - Cache file modified in other editor
# - Record locked by other user
# - Record locked by queue
# If the cache is outdated cacheOutdated will be set to True in the
# response.
if not cache_exists(recid, uid):
response['resultCode'] = 106
elif not get_cache_mtime(recid, uid) == data['cacheMTime']:
response['resultCode'] = 107
elif cache_expired(recid, uid) and \
record_locked_by_other_user(recid, uid):
response['resultCode'] = 104
elif record_locked_by_queue(recid):
response['resultCode'] = 105
else:
try:
tmp_result = get_cache_file_contents(recid, uid)
record_revision = tmp_result[1]
record = tmp_result[2]
pending_changes = tmp_result[3]
# disabled_changes = tmp_result[4]
xml_record = wash_for_xml(print_rec(record))
record, status_code, list_of_errors = create_record(xml_record)
# Simulate upload to catch errors
errors_upload = perform_upload_check(xml_record, '--replace')
if errors_upload:
response['resultCode'], response['errors'] = 113, \
errors_upload
return response
elif status_code == 0:
response['resultCode'], response['errors'] = 110, \
list_of_errors
if not data['force'] and not latest_record_revision(recid, record_revision):
response['cacheOutdated'] = True
else:
if record_is_conference(record):
new_cnum = add_record_cnum(recid, uid)
if new_cnum:
response["new_cnum"] = new_cnum
save_xml_record(recid, uid)
response['resultCode'] = 4
except Exception, e:
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV[ \
'error_wrong_cache_file_format']
if CFG_DEVEL_SITE: # return debug information in the request
response['exception_message'] = e.__str__()
elif request_type == 'revert':
revId = data['revId']
job_date = "%s-%s-%s %s:%s:%s" % re_revdate_split.search(revId).groups()
revision_xml = get_marcxml_of_revision(recid, job_date)
# Modify the 005 tag in order to merge with the latest version of record
last_revision_ts = data['lastRevId'] + ".0"
revision_xml = modify_record_timestamp(revision_xml, last_revision_ts)
save_xml_record(recid, uid, revision_xml)
if (cache_exists(recid, uid)):
delete_cache_file(recid, uid)
response['resultCode'] = 4
elif request_type == 'cancel':
# Cancel editing by deleting the cache file. Possible error situations:
# - Cache file modified in other editor
if cache_exists(recid, uid):
if get_cache_mtime(recid, uid) == data['cacheMTime']:
delete_cache_file(recid, uid)
response['resultCode'] = 5
else:
response['resultCode'] = 107
else:
response['resultCode'] = 5
elif request_type == 'deleteRecord':
# Submit the record. Possible error situations:
# - Record locked by other user
# - Record locked by queue
# As the user is requesting deletion we proceed even if the cache file
# is missing and we don't check if the cache is outdated or has
# been modified in another editor.
existing_cache = cache_exists(recid, uid)
pending_changes = []
if has_copies(recid):
response['resultCode'] = \
CFG_BIBEDIT_AJAX_RESULT_CODES_REV['error_physical_copies_exist']
elif existing_cache and cache_expired(recid, uid) and \
record_locked_by_other_user(recid, uid):
response['resultCode'] = \
CFG_BIBEDIT_AJAX_RESULT_CODES_REV['error_rec_locked_by_user']
elif record_locked_by_queue(recid):
response['resultCode'] = \
CFG_BIBEDIT_AJAX_RESULT_CODES_REV['error_rec_locked_by_queue']
else:
if not existing_cache:
record_revision, record, pending_changes, \
deactivated_hp_changes, undo_list, redo_list = \
create_cache_file(recid, uid)
else:
try:
record_revision, record, pending_changes, \
deactivated_hp_changes, undo_list, redo_list = \
get_cache_file_contents(recid, uid)[1:]
except:
record_revision, record, pending_changes, \
deactivated_hp_changes = create_cache_file(recid, uid)
record_add_field(record, '980', ' ', ' ', '', [('c', 'DELETED')])
undo_list = []
redo_list = []
update_cache_file_contents(recid, uid, record_revision, record, \
pending_changes, \
deactivated_hp_changes, undo_list, \
redo_list)
save_xml_record(recid, uid)
delete_related_holdingpen_changes(recid) # we don't need any changes
# related to a deleted record
response['resultCode'] = 10
elif request_type == 'deleteRecordCache':
# Delete the cache file. Ignore the request if the cache has been
# modified in another editor.
if data.has_key('cacheMTime'):
if cache_exists(recid, uid) and get_cache_mtime(recid, uid) == \
data['cacheMTime']:
delete_cache_file(recid, uid)
response['resultCode'] = 11
elif request_type == 'updateCacheRef':
# Update cache with the contents coming from BibEdit JS interface
# Used when updating references using ref extractor
record_revision, record, pending_changes, \
deactivated_hp_changes, undo_list, redo_list = \
get_cache_file_contents(recid, uid)[1:]
record = create_record(data['recXML'])[0]
response['cacheMTime'], response['cacheDirty'] = update_cache_file_contents(recid, uid, record_revision, record, \
pending_changes, \
deactivated_hp_changes, undo_list, \
redo_list), True
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['cache_updated_with_references']
elif request_type == 'prepareRecordMerge':
# We want to merge the cache with the current DB version of the record,
# so prepare an XML file from the file cache, to be used by BibMerge.
# Possible error situations:
# - Missing cache file
# - Record locked by other user
# - Record locked by queue
# We don't check if cache is outdated (a likely scenario for this
# request) or if it has been modified in another editor.
if not cache_exists(recid, uid):
response['resultCode'] = 106
elif cache_expired(recid, uid) and \
record_locked_by_other_user(recid, uid):
response['resultCode'] = 104
elif record_locked_by_queue(recid):
response['resultCode'] = 105
else:
save_xml_record(recid, uid, to_upload=False, to_merge=True)
response['resultCode'] = 12
elif request_type == 'submittextmarc':
# Textmarc content coming from the user
textmarc_record = data['textmarc']
xml_conversion_status = get_xml_from_textmarc(recid, textmarc_record)
if xml_conversion_status['resultMsg'] == "textmarc_parsing_error":
response.update(xml_conversion_status)
return response
# Simulate upload to catch errors
errors_upload = perform_upload_check(xml_conversion_status['resultXML'], '--replace')
if errors_upload:
response['resultCode'], response['errors'] = 113, \
errors_upload
return response
response.update(xml_conversion_status)
if xml_conversion_status['resultMsg'] == 'textmarc_parsing_success':
create_cache_file(recid, uid,
create_record(response['resultXML'])[0])
save_xml_record(recid, uid)
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV["record_submitted"]
return response
def perform_request_update_record(request_type, recid, uid, cacheMTime, data, \
hpChanges, undoRedoOp, isBulk=False):
"""
Handle record update requests such as adding, modifying, moving or deleting
fields or subfields. Possible common error situations:
- Missing cache file
- Cache file modified in another editor
@param undoRedoOp: Indicates whether an "undo"/"redo"/undo_descriptor
operation is performed by the current request.
"""
response = {}
if not cache_exists(recid, uid):
response['resultCode'] = 106
elif get_cache_mtime(recid, uid) != cacheMTime and not isBulk:
# In case of a bulk request, the changes are deliberately performed
# immediately one after another
response['resultCode'] = 107
else:
try:
record_revision, record, pending_changes, deactivated_hp_changes, \
undo_list, redo_list = get_cache_file_contents(recid, uid)[1:]
except:
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV[ \
'error_wrong_cache_file_format']
return response
# Process all the Holding Pen change operations, regardless of the
# request type
if hpChanges.has_key("toDisable"):
for changeId in hpChanges["toDisable"]:
pending_changes[changeId]["applied_change"] = True
if hpChanges.has_key("toEnable"):
for changeId in hpChanges["toEnable"]:
pending_changes[changeId]["applied_change"] = False
if hpChanges.has_key("toOverride"):
pending_changes = hpChanges["toOverride"]
if hpChanges.has_key("changesetsToDeactivate"):
for changesetId in hpChanges["changesetsToDeactivate"]:
deactivated_hp_changes[changesetId] = True
if hpChanges.has_key("changesetsToActivate"):
for changesetId in hpChanges["changesetsToActivate"]:
deactivated_hp_changes[changesetId] = False
# processing the undo/redo entries
if undoRedoOp == "undo":
try:
redo_list = [undo_list[-1]] + redo_list
undo_list = undo_list[:-1]
except:
raise Exception("An exception occurred when undoing the previous" + \
" operation. Undo list: " + str(undo_list) + \
" Redo list: " + str(redo_list))
elif undoRedoOp == "redo":
try:
undo_list = undo_list + [redo_list[0]]
redo_list = redo_list[1:]
except:
raise Exception("An exception occurred when redoing the previous" + \
" operation. Undo list: " + str(undo_list) + \
" Redo list: " + str(redo_list))
else:
# This is a genuine operation - we have to add a new descriptor
# to the undo list and cancel the redo unless the operation is
# a bulk operation
if undoRedoOp is not None:
undo_list = undo_list + [undoRedoOp]
redo_list = []
else:
assert isBulk
field_position_local = data.get('fieldPosition')
if field_position_local is not None:
field_position_local = int(field_position_local)
if request_type == 'otherUpdateRequest':
# An empty request. Might be useful if we want to perform
# operations that require only the actions performed globally,
# like modifying the holdingPen changes list
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV[ \
'editor_modifications_changed']
elif request_type == 'deactivateHoldingPenChangeset':
# The changeset has been marked as processed (the user applied it in
# the editor). Mark it as used in the cache file.
# CAUTION: This function has been implemented here because logically
# it fits with the modifications made to the cache file.
# No changes are made to the Holding Pen physically. The
# changesets are related to the cache because we want to
# cancel the removal every time the cache disappears for
# any reason
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV[ \
'disabled_hp_changeset']
elif request_type == 'addField':
if data['controlfield']:
record_add_field(record, data['tag'],
controlfield_value=data['value'])
response['resultCode'] = 20
else:
record_add_field(record, data['tag'], data['ind1'],
data['ind2'], subfields=data['subfields'],
field_position_local=field_position_local)
response['resultCode'] = 21
elif request_type == 'addSubfields':
subfields = data['subfields']
for subfield in subfields:
record_add_subfield_into(record, data['tag'], subfield[0],
subfield[1], subfield_position=None,
field_position_local=field_position_local)
if len(subfields) == 1:
response['resultCode'] = 22
else:
response['resultCode'] = 23
elif request_type == 'addFieldsSubfieldsOnPositions':
#1) Sorting the fields by their identifiers
fieldsToAdd = data['fieldsToAdd']
subfieldsToAdd = data['subfieldsToAdd']
for tag in fieldsToAdd.keys():
positions = fieldsToAdd[tag].keys()
positions.sort()
for position in positions:
# now adding fields at a position
isControlfield = (len(fieldsToAdd[tag][position][0]) == 0)
# if there are no subfields, this is a controlfield
if isControlfield:
controlfieldValue = fieldsToAdd[tag][position][3]
record_add_field(record, tag, field_position_local = \
int(position), \
controlfield_value = \
controlfieldValue)
else:
subfields = fieldsToAdd[tag][position][0]
ind1 = fieldsToAdd[tag][position][1]
ind2 = fieldsToAdd[tag][position][2]
record_add_field(record, tag, ind1, ind2, subfields = \
subfields, field_position_local = \
int(position))
# now adding the subfields
for tag in subfieldsToAdd.keys():
for fieldPosition in subfieldsToAdd[tag].keys():
# the order of the fields is not important here
subfieldsPositions = subfieldsToAdd[tag][fieldPosition]. \
keys()
subfieldsPositions.sort()
for subfieldPosition in subfieldsPositions:
subfield = subfieldsToAdd[tag][fieldPosition]\
[subfieldPosition]
record_add_subfield_into(record, tag, subfield[0], \
subfield[1], \
subfield_position = \
int(subfieldPosition), \
field_position_local = \
int(fieldPosition))
response['resultCode'] = \
CFG_BIBEDIT_AJAX_RESULT_CODES_REV['added_positioned_subfields']
elif request_type == 'modifyField': # changing the field structure
# first remove subfields and then add new... change the indices
subfields = data['subFields'] # parse the JSON representation of
# the subfields here
new_field = create_field(subfields, data['ind1'], data['ind2'])
record_replace_field(record, data['tag'], new_field, \
field_position_local = data['fieldPosition'])
response['resultCode'] = 26
elif request_type == 'modifyContent':
if data['subfieldIndex'] != None:
record_modify_subfield(record, data['tag'],
data['subfieldCode'], data['value'],
int(data['subfieldIndex']),
field_position_local=field_position_local)
else:
record_modify_controlfield(record, data['tag'], data["value"],
field_position_local=field_position_local)
response['resultCode'] = 24
elif request_type == 'modifySubfieldTag':
record_add_subfield_into(record, data['tag'], data['subfieldCode'],
data["value"], subfield_position= int(data['subfieldIndex']),
field_position_local=field_position_local)
record_delete_subfield_from(record, data['tag'], int(data['subfieldIndex']) + 1,
field_position_local=field_position_local)
response['resultCode'] = 24
elif request_type == 'modifyFieldTag':
subfields = record_get_subfields(record, data['oldTag'],
field_position_local=field_position_local)
record_add_field(record, data['newTag'], data['ind1'],
data['ind2'] , subfields=subfields)
record_delete_field(record, data['oldTag'], ind1=data['oldInd1'], \
ind2=data['oldInd2'], field_position_local=field_position_local)
response['resultCode'] = 32
elif request_type == 'moveSubfield':
record_move_subfield(record, data['tag'],
int(data['subfieldIndex']), int(data['newSubfieldIndex']),
field_position_local=field_position_local)
response['resultCode'] = 25
elif request_type == 'moveField':
if data['direction'] == 'up':
final_position_local = field_position_local-1
else: # direction is 'down'
final_position_local = field_position_local+1
record_move_fields(record, data['tag'], [field_position_local],
final_position_local)
response['resultCode'] = 32
elif request_type == 'deleteFields':
to_delete = data['toDelete']
deleted_fields = 0
deleted_subfields = 0
for tag in to_delete:
# Sort the fields in decreasing order of local position
fieldsOrder = to_delete[tag].keys()
fieldsOrder.sort(lambda a, b: int(b) - int(a))
for field_position_local in fieldsOrder:
if not to_delete[tag][field_position_local]:
# No subfields specified - delete entire field.
record_delete_field(record, tag,
field_position_local=int(field_position_local))
deleted_fields += 1
else:
for subfield_position in \
to_delete[tag][field_position_local][::-1]:
# Delete subfields in reverse order (to keep the
# indexing correct).
record_delete_subfield_from(record, tag,
int(subfield_position),
field_position_local=int(field_position_local))
deleted_subfields += 1
if deleted_fields == 1 and deleted_subfields == 0:
response['resultCode'] = 26
elif deleted_fields and deleted_subfields == 0:
response['resultCode'] = 27
elif deleted_subfields == 1 and deleted_fields == 0:
response['resultCode'] = 28
elif deleted_subfields and deleted_fields == 0:
response['resultCode'] = 29
else:
response['resultCode'] = 30
response['cacheMTime'], response['cacheDirty'] = \
update_cache_file_contents(recid, uid, record_revision,
record, \
pending_changes, \
deactivated_hp_changes, \
undo_list, redo_list), \
True
return response
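The undo/redo handling in perform_request_update_record maintains a pair of stacks: undoing moves the newest undo descriptor to the front of the redo list, redoing moves it back, and any genuine edit pushes its descriptor and clears the redo list. A minimal standalone sketch of that invariant (the descriptor values here are hypothetical placeholders, not real BibEdit descriptors):

```python
def apply_undo(undo_list, redo_list):
    """Move the newest undo descriptor to the front of the redo list."""
    if not undo_list:
        raise IndexError("nothing to undo")
    return undo_list[:-1], [undo_list[-1]] + redo_list

def apply_redo(undo_list, redo_list):
    """Move the first redo descriptor back onto the undo list."""
    if not redo_list:
        raise IndexError("nothing to redo")
    return undo_list + [redo_list[0]], redo_list[1:]

def apply_operation(undo_list, descriptor):
    """A genuine edit pushes its descriptor and invalidates the redo list."""
    return undo_list + [descriptor], []
```

Bulk requests are the one exception: they pass `undoRedoOp` as None and leave both lists untouched, since their changes are applied one after another under a single descriptor.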
def perform_request_autocomplete(request_type, recid, uid, data):
"""
Perform an AJAX request associated with the retrieval of autocomplete
data.
@param request_type: Type of the currently served request
@param recid: the identifier of the record
@param uid: the identifier of the currently logged-in user
@param data: the request data, possibly containing important additional
arguments
"""
response = {}
# get the values based on which one needs to search
searchby = data['value']
#we check if the data is properly defined
fulltag = ''
if data.has_key('maintag') and data.has_key('subtag1') and \
data.has_key('subtag2') and data.has_key('subfieldcode'):
maintag = data['maintag']
subtag1 = data['subtag1']
subtag2 = data['subtag2']
u_subtag1 = subtag1
u_subtag2 = subtag2
if (not subtag1) or (subtag1 == ' '):
u_subtag1 = '_'
if (not subtag2) or (subtag2 == ' '):
u_subtag2 = '_'
subfieldcode = data['subfieldcode']
fulltag = maintag+u_subtag1+u_subtag2+subfieldcode
if (request_type == 'autokeyword'):
#call the keyword-form-ontology function
if fulltag and searchby:
items = get_kbt_items_for_bibedit(CFG_BIBEDIT_KEYWORD_TAXONOMY, \
CFG_BIBEDIT_KEYWORD_RDFLABEL, \
searchby)
response['autokeyword'] = items
if (request_type == 'autosuggest'):
#call knowledge base function to put the suggestions in an array..
if fulltag and searchby and len(searchby) > 3:
#add trailing '*' wildcard for 'search_unit_in_bibxxx()' if not already present
suggest_values = get_kbd_values_for_bibedit(fulltag, "", searchby+"*")
#keep only suggestions that start with the searched string
new_suggest_vals = []
for sugg in suggest_values:
if sugg.startswith(searchby):
new_suggest_vals.append(sugg)
response['autosuggest'] = new_suggest_vals
if (request_type == 'autocomplete'):
#call the values function with the correct kb_name
if CFG_BIBEDIT_AUTOCOMPLETE_TAGS_KBS.has_key(fulltag):
kbname = CFG_BIBEDIT_AUTOCOMPLETE_TAGS_KBS[fulltag]
#check if the searchby field has semicolons; take all
#the semicolon-separated items
items = []
vals = []
if searchby:
if ';' in searchby:
items = searchby.split(';')
else:
items = [searchby.strip()]
for item in items:
item = item.strip()
kbrvals = get_kbr_values(kbname, item, '', 'e') #we want an exact match
if kbrvals and kbrvals[0]: #add the found val into vals
vals.append(kbrvals[0])
#check that the values are not already contained in other
#instances of this field
record = get_cache_file_contents(recid, uid)[2]
xml_rec = wash_for_xml(print_rec(record))
record, status_code, dummy_errors = create_record(xml_rec)
existing_values = []
if (status_code != 0):
existing_values = record_get_field_values(record,
maintag,
subtag1,
subtag2,
subfieldcode)
#get the new values, i.e. vals not already present in the record
new_vals = [val for val in vals if val not in existing_values]
response['autocomplete'] = new_vals
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['autosuggestion_scanned']
return response
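The autocomplete branch above splits a semicolon-separated query into items and then drops candidate values that are already present in other instances of the field. The core of that filtering, sketched standalone:

```python
def split_search_items(searchby):
    """Split a semicolon-separated search string into stripped, non-empty items."""
    if ';' in searchby:
        return [item.strip() for item in searchby.split(';') if item.strip()]
    return [searchby.strip()] if searchby.strip() else []

def filter_new_values(vals, existing_values):
    """Keep only candidate values not already stored in the record."""
    return [val for val in vals if val not in existing_values]
```

For example, `split_search_items('CERN; DESY ;')` yields `['CERN', 'DESY']`, and any of those already returned by record_get_field_values would be filtered out before being sent back to the client.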
def perform_request_bibcatalog(request_type, recid, uid):
"""Handle request to BibCatalog (RT).
"""
response = {}
if request_type == 'getTickets':
# Insert the ticket data in the response, if possible
if bibcatalog_system is None:
response['tickets'] = "<!--No ticket system configured-->"
elif bibcatalog_system and uid:
bibcat_resp = bibcatalog_system.check_system(uid)
if bibcat_resp == "":
tickets_found = bibcatalog_system.ticket_search(uid, \
status=['new', 'open'], recordid=recid)
t_url_str = '' #put ticket urls here, formatted for HTML display
for t_id in tickets_found:
#t_url = bibcatalog_system.ticket_get_attribute(uid, \
# t_id, 'url_display')
ticket_info = bibcatalog_system.ticket_get_info( \
uid, t_id, ['url_display', 'url_close'])
t_url = ticket_info['url_display']
t_close_url = ticket_info['url_close']
#format..
t_url_str += "#" + str(t_id) + '<a href="' + t_url + \
'">[read]</a> <a href="' + t_close_url + \
'">[close]</a><br/>'
#put ticket header and tickets links in the box
t_url_str = "<strong>Tickets</strong><br/>" + t_url_str + \
"<br/>" + '<a href="new_ticket?recid=' + str(recid) + \
'">[new ticket]</a>'
response['tickets'] = t_url_str
#add a new ticket link
else:
#put something in the tickets container, for debug
response['tickets'] = "<!--"+bibcat_resp+"-->"
response['resultCode'] = 31
return response
def _add_curated_references_to_record(recid, uid, bibrec):
"""
Adds references from the cache that have been curated (contain $$9CURATOR)
to the bibrecord object
@param recid: record id, used to retrieve cache
@param uid: id of the current user, used to retrieve cache
@param bibrec: bibrecord object to add references to
"""
dummy1, dummy2, record, dummy3, dummy4, dummy5, dummy6 = get_cache_file_contents(recid, uid)
for field_instance in record_get_field_instances(record, "999", "C", "5"):
for subfield_instance in field_instance[0]:
if subfield_instance[0] == '9' and subfield_instance[1] == 'CURATOR':
# Add reference field on top of references, removing first $$o
field_instance = ([subfield for subfield in field_instance[0]
if subfield[0] != 'o'], field_instance[1],
field_instance[2], field_instance[3],
field_instance[4])
record_add_fields(bibrec, '999', [field_instance],
field_position_local=0)
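_add_curated_references_to_record keeps only reference fields carrying a $$9 CURATOR marker and strips their $$o subfields before re-adding them on top. The per-field transformation can be sketched on the (subfields, ind1, ind2, value, position) tuple shape used by bibrecord:

```python
def is_curated(field_instance):
    """A reference field counts as curated when it carries $$9CURATOR."""
    return ('9', 'CURATOR') in field_instance[0]

def strip_o_subfields(field_instance):
    """Return a copy of the field with all $$o subfields removed."""
    subfields, ind1, ind2, value, position = field_instance
    return ([sf for sf in subfields if sf[0] != 'o'],
            ind1, ind2, value, position)
```

The reference data in any example field is illustrative only; the real subfield codes come from the record cache.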
def _xml_to_textmarc_references(bibrec):
"""
Convert XML record to textmarc and return the lines related to references
@param bibrec: bibrecord object to be converted
@return: textmarc lines with references
@rtype: string
"""
sysno = ""
options = {"aleph-marc":0, "correct-mode":1, "append-mode":0,
"delete-mode":0, "insert-mode":0, "replace-mode":0,
"text-marc":1}
# Using deepcopy as function create_marc_record() modifies the record passed
textmarc_references = [ line.strip() for line
in xmlmarc2textmarc.create_marc_record(copy.deepcopy(bibrec),
sysno, options).split('\n')
if '999C5' in line ]
return textmarc_references
def perform_request_ref_extract_url(recid, uid, url):
"""
Making use of the refextractor API, extract references from the url
received from the client
@param recid: opened record id
@param uid: active user id
@param url: URL to extract references from
@return response to be returned to the client code
"""
response = {}
try:
recordExtended = replace_references(recid, uid, url=url)
except FullTextNotAvailable:
response['ref_xmlrecord'] = False
response['ref_msg'] = "File not found. Server returned code 404"
return response
except:
response['ref_xmlrecord'] = False
response['ref_msg'] = """Error while fetching PDF. Bad URL or file could
not be retrieved """
return response
if not recordExtended:
response['ref_msg'] = """No references were found in the given PDF """
return response
ref_bibrecord = create_record(recordExtended)[0]
_add_curated_references_to_record(recid, uid, ref_bibrecord)
response['ref_bibrecord'] = ref_bibrecord
response['ref_xmlrecord'] = record_xml_output(ref_bibrecord)
textmarc_references = _xml_to_textmarc_references(ref_bibrecord)
response['ref_textmarc'] = '<div class="refextracted">' + '<br />'.join(textmarc_references) + "</div>"
return response
def perform_request_ref_extract(recid, uid, txt=None):
""" Handle request to extract references in the given record
@param recid: record id from which the references should be extracted
@type recid: str
@param txt: string containing references
@type txt: str
@param uid: user id
@type uid: int
@return: xml record with references extracted
@rtype: dictionary
"""
text_no_references_found_msg = """ No references extracted. The automatic
extraction did not recognize any reference in the
pasted text.<br /><br />If you want to add the references
manually, an easily recognizable format is:<br/><br/>
&nbsp;&nbsp;&nbsp;&nbsp;[1] Phys. Rev A71 (2005) 42<br />
&nbsp;&nbsp;&nbsp;&nbsp;[2] ATLAS-CMS-2007-333
"""
pdf_no_references_found_msg = """ No references were found in the attached
PDF.
"""
response = {}
response['ref_xmlrecord'] = False
recordExtended = None
try:
if txt:
recordExtended = replace_references(recid, uid,
txt=txt.decode('utf-8'))
if not recordExtended:
response['ref_msg'] = text_no_references_found_msg
else:
recordExtended = replace_references(recid, uid)
if not recordExtended:
response['ref_msg'] = pdf_no_references_found_msg
except FullTextNotAvailable:
response['ref_msg'] = """ The fulltext is not available.
"""
except:
response['ref_msg'] = """ An error occurred while extracting references.
"""
if not recordExtended:
return response
ref_bibrecord = create_record(recordExtended)[0]
_add_curated_references_to_record(recid, uid, ref_bibrecord)
response['ref_bibrecord'] = ref_bibrecord
response['ref_xmlrecord'] = record_xml_output(ref_bibrecord)
textmarc_references = _xml_to_textmarc_references(ref_bibrecord)
response['ref_textmarc'] = '<div class="refextracted">' + '<br />'.join(textmarc_references) + "</div>"
return response
def perform_request_preview_record(request_type, recid, uid, data):
""" Handle request to preview record with formatting
"""
response = {}
if request_type == "preview":
if data["submitMode"] == "textmarc":
textmarc_record = data['textmarc']
xml_conversion_status = get_xml_from_textmarc(recid, textmarc_record)
if xml_conversion_status['resultMsg'] == 'textmarc_parsing_error':
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['textmarc_parsing_error']
response.update(xml_conversion_status)
return response
record = create_record(xml_conversion_status["resultXML"])[0]
elif cache_exists(recid, uid):
dummy1, dummy2, record, dummy3, dummy4, dummy5, dummy6 = get_cache_file_contents(recid, uid)
else:
record = get_bibrecord(recid)
# clean the record from unfilled volatile fields
record_strip_empty_volatile_subfields(record)
record_strip_empty_fields(record)
response['html_preview'] = _get_formated_record(record, data['new_window'])
return response
def perform_request_get_pdf_url(recid):
""" Handle request to get the URL of the attached PDF
"""
response = {}
rec_info = BibRecDocs(recid)
docs = rec_info.list_bibdocs()
doc_pdf_url = ""
for doc in docs:
try:
doc_pdf_url = doc.get_file('pdf').get_url()
except InvenioBibDocFileError:
continue
if doc_pdf_url:
response['pdf_url'] = doc_pdf_url
break
if not doc_pdf_url:
response['pdf_url'] = ""
return response
def perform_request_get_textmarc(recid, uid):
""" Get record content from cache, convert it to textmarc and return it
"""
textmarc_options = {"aleph-marc":0, "correct-mode":1, "append-mode":0,
"delete-mode":0, "insert-mode":0, "replace-mode":0,
"text-marc":1}
bibrecord = get_cache_file_contents(recid, uid)[2]
record_strip_empty_fields(bibrecord)
record_strip_controlfields(bibrecord)
textmarc = xmlmarc2textmarc.create_marc_record(
copy.deepcopy(bibrecord), sysno="", options=textmarc_options)
return {'textmarc': textmarc}
def perform_request_get_tableview(recid, uid, data):
""" Convert textmarc input by the user to MARCXML and, if there are no
parsing errors, create the cache file
"""
response = {}
textmarc_record = data['textmarc']
xml_conversion_status = get_xml_from_textmarc(recid, textmarc_record)
response.update(xml_conversion_status)
if xml_conversion_status['resultMsg'] == 'textmarc_parsing_error':
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['textmarc_parsing_error']
else:
create_cache_file(recid, uid,
create_record(xml_conversion_status['resultXML'])[0], data['recordDirty'])
response['resultCode'] = CFG_BIBEDIT_AJAX_RESULT_CODES_REV['tableview_change_success']
return response
def _get_formated_record(record, new_window):
"""Returns a record in a given format
@param record: BibRecord object
@param new_window: Boolean, indicates if it is needed to add all the headers
to the page (used when clicking Preview button)
"""
from invenio.config import CFG_WEBSTYLE_TEMPLATE_SKIN
xml_record = wash_for_xml(record_xml_output(record))
result = ''
if new_window:
result = """ <html><head><title>Record preview</title>
<script type="text/javascript" src="%(site_url)s/js/jquery.min.js"></script>
<link rel="stylesheet" href="%(site_url)s/img/invenio%(cssskin)s.css" type="text/css"></head>
"""%{'site_url': CFG_SITE_URL,
'cssskin': CFG_WEBSTYLE_TEMPLATE_SKIN != 'default' and '_' + CFG_WEBSTYLE_TEMPLATE_SKIN or ''
}
result += get_mathjax_header(True) + '<body>'
result += "<h2> Brief format preview </h2><br />"
result += bibformat.format_record(recID=None,
of="hb",
xml_record=xml_record) + "<br />"
result += "<br /><h2> Detailed format preview </h2><br />"
result += bibformat.format_record(recID=None,
of="hd",
xml_record=xml_record)
#Preview references
result += "<br /><h2> References </h2><br />"
result += bibformat.format_record(0,
'hdref',
xml_record=xml_record)
result += """<script>
$('#referenceinp_link').hide();
$('#referenceinp_link_span').hide();
</script>
"""
if new_window:
result += "</body></html>"
return result
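_get_formated_record builds the stylesheet filename with the old `and ... or ...` idiom: a non-default skin adds an underscore-prefixed suffix to `invenio*.css`. The same logic as a modern conditional expression:

```python
def css_skin_suffix(skin):
    """Suffix appended to the invenio stylesheet name for non-default skins."""
    return '_' + skin if skin != 'default' else ''
```

So a skin named `inspire` would load `invenio_inspire.css`, while the default skin loads plain `invenio.css`.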
########### Functions related to templates web interface #############
def perform_request_init_template_interface():
"""Handle a request to manage templates"""
errors = []
warnings = []
body = ''
# Add script data.
record_templates = get_record_templates()
record_templates.sort()
data = {'gRECORD_TEMPLATES': record_templates,
'gSITE_RECORD': '"' + CFG_SITE_RECORD + '"',
'gSITE_URL': '"' + CFG_SITE_URL + '"'}
body += '<script type="text/javascript">\n'
for key in data:
body += ' var %s = %s;\n' % (key, data[key])
body += ' </script>\n'
# Add scripts (the ordering is important).
scripts = ['jquery-ui.min.js',
'json2.js', 'bibedit_display.js',
'bibedit_template_interface.js']
for script in scripts:
body += ' <script type="text/javascript" src="%s/js/%s">' \
'</script>\n' % (CFG_SITE_URL, script)
body += ' <div id="bibEditTemplateList"></div>\n'
body += ' <div id="bibEditTemplateEdit"></div>\n'
return body, errors, warnings
def perform_request_ajax_template_interface(data):
"""Handle Ajax requests by redirecting to appropriate function."""
response = {}
request_type = data['requestType']
if request_type == 'editTemplate':
# Edit a template request.
response.update(perform_request_edit_template(data))
return response
def perform_request_edit_template(data):
""" Handle request to edit a template """
response = {}
template_filename = data['templateFilename']
template = get_record_template(template_filename)
if not template:
response['resultCode'] = 1
else:
response['templateMARCXML'] = template
return response
def perform_doi_search(doi):
"""Search for DOI on the dx.doi.org page
@return: the url returned by this page"""
response = {}
url = "http://dx.doi.org/"
val = {'hdl': doi}
url_data = urllib.urlencode(val)
cj = cookielib.CookieJar()
header = [('User-Agent', CFG_DOI_USER_AGENT)]
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = header
try:
resp = opener.open(url, url_data)
except:
return response
else:
response['doi_url'] = resp.geturl()
return response
diff --git a/invenio/legacy/bibedit/utils.py b/invenio/legacy/bibedit/utils.py
index 788518502..74a6ab795 100644
--- a/invenio/legacy/bibedit/utils.py
+++ b/invenio/legacy/bibedit/utils.py
@@ -1,1037 +1,1037 @@
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""BibEdit Utilities.
This module contains support functions (i.e., those that are not called directly
by the web interface) that might be imported by other modules or that are called
by both the web and CLI interfaces.
"""
__revision__ = "$Id$"
import cPickle
import difflib
import fnmatch
import marshal
import os
import re
import time
import zlib
import tempfile
import sys
from datetime import datetime
try:
from cStringIO import StringIO
except ImportError:
from StringIO import StringIO
-from invenio.bibedit_config import CFG_BIBEDIT_FILENAME, \
+from invenio.legacy.bibedit.config import CFG_BIBEDIT_FILENAME, \
CFG_BIBEDIT_RECORD_TEMPLATES_PATH, CFG_BIBEDIT_TO_MERGE_SUFFIX, \
CFG_BIBEDIT_FIELD_TEMPLATES_PATH, CFG_BIBEDIT_AJAX_RESULT_CODES_REV, \
CFG_BIBEDIT_CACHEDIR
-from invenio.bibedit_dblayer import get_record_last_modification_date, \
+from invenio.legacy.bibedit.db_layer import get_record_last_modification_date, \
delete_hp_change
from invenio.legacy.bibrecord import create_record, create_records, \
record_get_field_value, record_has_field, record_xml_output, \
record_strip_empty_fields, record_strip_empty_volatile_subfields, \
record_order_subfields, record_get_field_instances, \
record_add_field, field_get_subfield_codes, field_add_subfield, \
field_get_subfield_values, record_delete_fields, record_add_fields, \
record_get_field_values, print_rec, record_modify_subfield, \
record_modify_controlfield
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.config import CFG_BIBEDIT_LOCKLEVEL, \
CFG_BIBEDIT_TIMEOUT, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG as OAIID_TAG, \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG as SYSNO_TAG, \
CFG_BIBEDIT_QUEUE_CHECK_METHOD, \
CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE, CFG_INSPIRE_SITE
from invenio.utils.date import convert_datetext_to_dategui
from invenio.utils.text import wash_for_xml
-from invenio.bibedit_dblayer import get_bibupload_task_opts, \
+from invenio.legacy.bibedit.db_layer import get_bibupload_task_opts, \
get_marcxml_of_record_revision, get_record_revisions, \
get_info_of_record_revision
from invenio.legacy.search_engine import print_record, record_exists, get_colID, \
guess_primary_collection_of_a_record, get_record, \
get_all_collections_of_a_record
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.legacy.webuser import get_user_info, getUid, get_email
from invenio.legacy.dbquery import run_sql
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
from invenio.modules.access.engine import acc_authorize_action
from invenio.refextract_api import extract_references_from_record_xml, \
extract_references_from_string_xml, \
extract_references_from_url_xml
from invenio.legacy.bibrecord.scripts.textmarc2xmlmarc import transform_file, ParseError
-from invenio.bibauthorid_name_utils import split_name_parts, \
+from invenio.legacy.bibauthorid.name_utils import split_name_parts, \
create_normalized_name
from invenio.modules.knowledge.api import get_kbr_values
# Precompile regexp:
re_file_option = re.compile(r'^%s' % CFG_BIBEDIT_CACHEDIR)
re_xmlfilename_suffix = re.compile(r'_(\d+)_\d+\.xml$')
re_revid_split = re.compile(r'^(\d+)\.(\d{14})$')
re_revdate_split = re.compile(r'^(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)')
re_taskid = re.compile(r'ID="(\d+)"')
re_tmpl_name = re.compile('<!-- BibEdit-Template-Name: (.*) -->')
re_tmpl_description = re.compile('<!-- BibEdit-Template-Description: (.*) -->')
re_ftmpl_name = re.compile('<!-- BibEdit-Field-Template-Name: (.*) -->')
re_ftmpl_description = re.compile('<!-- BibEdit-Field-Template-Description: (.*) -->')
VOLATILE_PREFIX = "VOLATILE:"
# Authorization
def user_can_edit_record_collection(req, recid):
""" Check if user has authorization to modify a collection
the recid belongs to
"""
def remove_volatile(field_value):
""" Remove volatile keyword from field value """
if field_value.startswith(VOLATILE_PREFIX):
field_value = field_value[len(VOLATILE_PREFIX):]
return field_value
# Get the collections the record belongs to
record_collections = get_all_collections_of_a_record(recid)
uid = getUid(req)
# In case we are creating a new record
if cache_exists(recid, uid):
dummy1, dummy2, record, dummy3, dummy4, dummy5, dummy6 = get_cache_file_contents(recid, uid)
values = record_get_field_values(record, '980', code="a")
record_collections.extend([remove_volatile(v) for v in values])
normalized_collections = []
for collection in record_collections:
# Get the normalized collection name present in the action table
res = run_sql("""SELECT value FROM accARGUMENT
WHERE keyword='collection'
AND value=%s;""", (collection,))
if res:
normalized_collections.append(res[0][0])
if not normalized_collections:
# Check if user has access to all collections
auth_code, auth_message = acc_authorize_action(req, 'runbibedit',
collection='')
if auth_code == 0:
return True
else:
for collection in normalized_collections:
auth_code, auth_message = acc_authorize_action(req, 'runbibedit',
collection=collection)
if auth_code == 0:
return True
return False
# Helper functions
def assert_undo_redo_lists_correctness(undo_list, redo_list):
for undoItem in undo_list:
assert undoItem is not None
for redoItem in redo_list:
assert redoItem is not None
def record_find_matching_fields(key, rec, tag="", ind1=" ", ind2=" ", \
exact_match=False):
"""
This utility function looks for any field values containing (or, if an exact
match is wanted, equal to) the given keyword string. The found fields are
returned as a list of field instances per tag. The fields to search can be
narrowed down to tag/indicator level.
@param key: keyword to search for
@type key: string
@param rec: a record structure as returned by bibrecord.create_record()
@type rec: dict
@param tag: a 3 characters long string
@type tag: string
@param ind1: a 1 character long string
@type ind1: string
@param ind2: a 1 character long string
@type ind2: string
    @return: a list of found fields as a tuple per tag: (tag, field_instances)
        where field_instances is a list of (subfields, ind1, ind2, value,
        field_position_global) and subfields is a list of (code, value)
@rtype: list
"""
if not tag:
all_field_instances = rec.items()
else:
all_field_instances = [(tag, record_get_field_instances(rec, tag, ind1, ind2))]
matching_field_instances = []
for current_tag, field_instances in all_field_instances:
found_fields = []
for field_instance in field_instances:
# Get values to match: controlfield_value + subfield values
values_to_match = [field_instance[3]] + \
[val for code, val in field_instance[0]]
if exact_match and key in values_to_match:
found_fields.append(field_instance)
else:
for value in values_to_match:
if value.find(key) > -1:
found_fields.append(field_instance)
break
if len(found_fields) > 0:
matching_field_instances.append((current_tag, found_fields))
return matching_field_instances
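The matching logic above can be sketched stand-alone. The toy field instances below mimic bibrecord's (subfields, ind1, ind2, value, field_position_global) layout; `find_matching_fields` here is a simplified stand-in for illustration, not the module's function:

```python
def find_matching_fields(key, field_instances, exact_match=False):
    """Keep instances whose controlfield value or any subfield value
    contains KEY (or equals it, when exact_match is requested)."""
    found = []
    for instance in field_instances:
        subfields, _ind1, _ind2, controlfield_value, _position = instance
        values_to_match = [controlfield_value] + [val for _code, val in subfields]
        if exact_match:
            if key in values_to_match:
                found.append(instance)
        elif any(key in value for value in values_to_match):
            found.append(instance)
    return found

# A toy 245 field instance in bibrecord's layout
fields = [([('a', 'Quantum mechanics'), ('b', 'an introduction')],
           ' ', ' ', '', 1)]
assert find_matching_fields('Quantum', fields) == fields
assert find_matching_fields('Quantum', fields, exact_match=True) == []
assert find_matching_fields('Quantum mechanics', fields, exact_match=True) == fields
```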
# Operations on the BibEdit cache file
def cache_exists(recid, uid):
"""Check if the BibEdit cache file exists."""
return os.path.isfile('%s.tmp' % _get_file_path(recid, uid))
def get_cache_mtime(recid, uid):
"""Get the last modified time of the BibEdit cache file. Check that the
cache exists before calling this function.
"""
try:
return int(os.path.getmtime('%s.tmp' % _get_file_path(recid, uid)))
except OSError:
pass
def cache_expired(recid, uid):
"""Has it been longer than the number of seconds given by
CFG_BIBEDIT_TIMEOUT since last cache update? Check that the
cache exists before calling this function.
"""
return get_cache_mtime(recid, uid) < int(time.time()) - CFG_BIBEDIT_TIMEOUT
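A minimal, self-contained sketch of the same mtime-based expiry check, with a local constant standing in for CFG_BIBEDIT_TIMEOUT:

```python
import os
import tempfile
import time

TIMEOUT = 3600  # stand-in for CFG_BIBEDIT_TIMEOUT (seconds)

def cache_expired(path, timeout=TIMEOUT):
    """True when the file was last modified more than TIMEOUT seconds ago."""
    return int(os.path.getmtime(path)) < int(time.time()) - timeout

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    assert not cache_expired(path)     # just created: still fresh
    old = time.time() - 2 * TIMEOUT
    os.utime(path, (old, old))         # backdate the mtime
    assert cache_expired(path)         # now stale
finally:
    os.remove(path)
```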
def create_cache_file(recid, uid, record='', cache_dirty=False, pending_changes=[], disabled_hp_changes = {}, undo_list = [], redo_list=[]):
"""Create a BibEdit cache file, and return revision and record. This will
overwrite any existing cache the user has for this record.
datetime.
"""
if not record:
record = get_bibrecord(recid)
if not record:
return
file_path = '%s.tmp' % _get_file_path(recid, uid)
record_revision = get_record_last_modification_date(recid)
    if record_revision is None:
record_revision = datetime.now().timetuple()
cache_file = open(file_path, 'w')
assert_undo_redo_lists_correctness(undo_list, redo_list)
# Order subfields alphabetically after loading the record
record_order_subfields(record)
cPickle.dump([cache_dirty, record_revision, record, pending_changes, disabled_hp_changes, undo_list, redo_list], cache_file)
cache_file.close()
return record_revision, record
def touch_cache_file(recid, uid):
    """Touch a BibEdit cache file. This should be used to indicate that the
    user has again accessed the record, so that locking will work correctly.
    """
    if cache_exists(recid, uid):
        # Update the mtime directly instead of shelling out to `touch`
        os.utime('%s.tmp' % _get_file_path(recid, uid), None)
def get_bibrecord(recid):
"""Return record in BibRecord wrapping."""
if record_exists(recid):
return create_record(print_record(recid, 'xm'))[0]
def get_cache_file_contents(recid, uid):
"""Return the contents of a BibEdit cache file."""
cache_file = _get_cache_file(recid, uid, 'r')
if cache_file:
cache_dirty, record_revision, record, pending_changes, disabled_hp_changes, undo_list, redo_list = cPickle.load(cache_file)
cache_file.close()
assert_undo_redo_lists_correctness(undo_list, redo_list)
return cache_dirty, record_revision, record, pending_changes, disabled_hp_changes, undo_list, redo_list
def update_cache_file_contents(recid, uid, record_revision, record, pending_changes, disabled_hp_changes, undo_list, redo_list):
"""Save updates to the record in BibEdit cache. Return file modificaton
time.
"""
cache_file = _get_cache_file(recid, uid, 'w')
if cache_file:
assert_undo_redo_lists_correctness(undo_list, redo_list)
cPickle.dump([True, record_revision, record, pending_changes, disabled_hp_changes, undo_list, redo_list], cache_file)
cache_file.close()
return get_cache_mtime(recid, uid)
def delete_cache_file(recid, uid):
"""Delete a BibEdit cache file."""
try:
os.remove('%s.tmp' % _get_file_path(recid, uid))
except OSError:
# File was probably already removed
pass
def delete_disabled_changes(used_changes):
for change_id in used_changes:
delete_hp_change(change_id)
def save_xml_record(recid, uid, xml_record='', to_upload=True, to_merge=False):
"""Write XML record to file. Default behaviour is to read the record from
a BibEdit cache file, filter out the unchanged volatile subfields,
write it back to an XML file and then pass this file to BibUpload.
    @param xml_record: give XML as string instead of reading cache file
@param to_upload: pass the XML file to BibUpload
@param to_merge: prepare an XML file for BibMerge to use
"""
if not xml_record:
# Read record from cache file.
cache = get_cache_file_contents(recid, uid)
if cache:
record = cache[2]
used_changes = cache[4]
xml_record = record_xml_output(record)
delete_cache_file(recid, uid)
delete_disabled_changes(used_changes)
else:
record = create_record(xml_record)[0]
# clean the record from unfilled volatile fields
record_strip_empty_volatile_subfields(record)
record_strip_empty_fields(record)
# order subfields alphabetically before saving the record
record_order_subfields(record)
xml_to_write = wash_for_xml(record_xml_output(record))
# Write XML file.
if not to_merge:
file_path = '%s.xml' % _get_file_path(recid, uid)
else:
file_path = '%s_%s.xml' % (_get_file_path(recid, uid),
CFG_BIBEDIT_TO_MERGE_SUFFIX)
xml_file = open(file_path, 'w')
xml_file.write(xml_to_write)
xml_file.close()
user_name = get_user_info(uid)[1]
if to_upload:
# Pass XML file to BibUpload.
task_low_level_submission('bibupload', 'bibedit', '-P', '5', '-r',
file_path, '-u', user_name)
return True
# Security: Locking and integrity
def latest_record_revision(recid, revision_time):
"""Check if timetuple REVISION_TIME matches latest modification date."""
latest = get_record_last_modification_date(recid)
    # This can be None if the record is new.
    return (latest is None) or (revision_time == latest)
def record_locked_by_other_user(recid, uid):
"""Return true if any other user than UID has active caches for record
RECID.
"""
active_uids = _uids_with_active_caches(recid)
try:
active_uids.remove(uid)
except ValueError:
pass
return bool(active_uids)
def get_record_locked_since(recid, uid):
""" Get modification time for the given recid and uid
"""
filename = "%s_%s_%s.tmp" % (CFG_BIBEDIT_FILENAME,
recid,
uid)
locked_since = ""
try:
locked_since = time.ctime(os.path.getmtime('%s%s%s' % (
CFG_BIBEDIT_CACHEDIR, os.sep, filename)))
except OSError:
pass
return locked_since
def record_locked_by_user_details(recid, uid):
""" Get the details about the user that has locked a record and the
time the record has been locked.
@return: user details and time when record was locked
@rtype: tuple
"""
active_uids = _uids_with_active_caches(recid)
try:
active_uids.remove(uid)
except ValueError:
pass
record_blocked_by_nickname = record_blocked_by_email = locked_since = ""
if active_uids:
record_blocked_by_uid = active_uids[0]
record_blocked_by_nickname = get_user_info(record_blocked_by_uid)[1]
record_blocked_by_email = get_email(record_blocked_by_uid)
locked_since = get_record_locked_since(recid, record_blocked_by_uid)
return record_blocked_by_nickname, record_blocked_by_email, locked_since
def record_locked_by_queue(recid):
"""Check if record should be locked for editing because of the current state
of the BibUpload queue. The level of checking is based on
CFG_BIBEDIT_LOCKLEVEL.
"""
# Check for *any* scheduled bibupload tasks.
if CFG_BIBEDIT_LOCKLEVEL == 2:
return _get_bibupload_task_ids()
filenames = _get_bibupload_filenames()
# Check for match between name of XML-files and record.
# Assumes that filename ends with _<recid>.xml.
if CFG_BIBEDIT_LOCKLEVEL == 1:
recids = []
for filename in filenames:
filename_suffix = re_xmlfilename_suffix.search(filename)
if filename_suffix:
recids.append(int(filename_suffix.group(1)))
return recid in recids
    # Check for match between content of files and record.
    if CFG_BIBEDIT_LOCKLEVEL == 3:
        while True:
            lock = _record_in_files_p(recid, filenames)
            if lock:
                return lock
            # Check if any new files were added while we were searching
            filenames_updated = _get_bibupload_filenames()
            for filename in filenames_updated:
                if filename not in filenames:
                    break
            else:
                return lock
            # New files appeared: search again against the updated list,
            # otherwise the loop would keep re-checking the stale one.
            filenames = filenames_updated
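At lock level 1, the recids are derived from filenames that end in `_<recid>.xml`. The module defines `re_xmlfilename_suffix` elsewhere, so the pattern below is an assumption that matches the comment above:

```python
import re

# Assumed pattern; the module defines re_xmlfilename_suffix elsewhere.
re_xmlfilename_suffix = re.compile(r'_(\d+)\.xml$')

filenames = ['upload_batch_7.xml', 'oai_123.xml', 'plain.xml']
recids = [int(m.group(1))
          for m in (re_xmlfilename_suffix.search(f) for f in filenames)
          if m]
assert recids == [7, 123]
assert 42 not in recids
```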
# History/revisions
def revision_to_timestamp(td):
"""
Converts the revision date to the timestamp
"""
return "%04i%02i%02i%02i%02i%02i" % (td.tm_year, td.tm_mon, td.tm_mday, \
td.tm_hour, td.tm_min, td.tm_sec)
def timestamp_to_revision(timestamp):
"""
Converts the timestamp to a correct revision date
"""
year = int(timestamp[0:4])
month = int(timestamp[4:6])
day = int(timestamp[6:8])
hour = int(timestamp[8:10])
minute = int(timestamp[10:12])
second = int(timestamp[12:14])
return datetime(year, month, day, hour, minute, second).timetuple()
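The two converters above are inverses of each other; a quick round trip (reusing the same bodies, stand-alone):

```python
import time
from datetime import datetime

def revision_to_timestamp(td):
    """Convert a time.struct_time revision date to a YYYYMMDDhhmmss string."""
    return "%04i%02i%02i%02i%02i%02i" % (td.tm_year, td.tm_mon, td.tm_mday,
                                         td.tm_hour, td.tm_min, td.tm_sec)

def timestamp_to_revision(timestamp):
    """Convert a YYYYMMDDhhmmss string back to a time.struct_time."""
    return datetime(int(timestamp[0:4]), int(timestamp[4:6]),
                    int(timestamp[6:8]), int(timestamp[8:10]),
                    int(timestamp[10:12]), int(timestamp[12:14])).timetuple()

ts = '20080812092646'
assert revision_to_timestamp(timestamp_to_revision(ts)) == ts
```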
def get_record_revision_timestamps(recid):
"""return list of timestamps describing teh revisions of a given record"""
rev_ids = get_record_revision_ids(recid)
result = []
for rev_id in rev_ids:
result.append(rev_id.split(".")[1])
return result
def get_record_revision_ids(recid):
"""Return list of all record revision IDs.
Return revision IDs in chronologically decreasing order (latest first).
"""
res = []
tmp_res = get_record_revisions(recid)
for row in tmp_res:
res.append('%s.%s' % (row[0], row[1]))
return res
def get_marcxml_of_revision(recid, revid):
"""Return MARCXML string of revision.
Return empty string if revision does not exist. REVID should be a string.
"""
res = ''
tmp_res = get_marcxml_of_record_revision(recid, revid)
if tmp_res:
for row in tmp_res:
res += zlib.decompress(row[0]) + '\n'
    return res
def get_marcxml_of_revision_id(revid):
"""Return MARCXML string of revision.
Return empty string if revision does not exist. REVID should be a string.
"""
recid, job_date = split_revid(revid, 'datetext')
    return get_marcxml_of_revision(recid, job_date)
def get_info_of_revision_id(revid):
"""Return info string regarding revision.
Return empty string if revision does not exist. REVID should be a string.
"""
recid, job_date = split_revid(revid, 'datetext')
res = ''
tmp_res = get_info_of_record_revision(recid, job_date)
if tmp_res:
task_id = str(tmp_res[0][0])
author = tmp_res[0][1]
if not author:
author = 'N/A'
res += '%s %s %s' % (revid.ljust(22), task_id.ljust(15), author.ljust(15))
job_details = tmp_res[0][2].split()
upload_mode = job_details[0] + job_details[1][:-1]
upload_file = job_details[2] + job_details[3][:-1]
res += '%s %s' % (upload_mode, upload_file)
return res
def revision_format_valid_p(revid):
"""Test validity of revision ID format (=RECID.REVDATE)."""
    return bool(re_revid_split.match(revid))
def record_revision_exists(recid, revid):
results = get_record_revisions(recid)
for res in results:
if res[1] == revid:
return True
return False
def split_revid(revid, dateformat=''):
"""Split revid and return tuple (recid, revdate).
Optional dateformat can be datetext or dategui.
"""
recid, revdate = re_revid_split.search(revid).groups()
if dateformat:
datetext = '%s-%s-%s %s:%s:%s' % re_revdate_split.search(
revdate).groups()
if dateformat == 'datetext':
revdate = datetext
elif dateformat == 'dategui':
revdate = convert_datetext_to_dategui(datetext, secs=True)
return recid, revdate
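The split relies on `re_revid_split` and `re_revdate_split`, which are defined elsewhere in the module; the patterns below are assumptions consistent with the RECID.REVDATE format described above:

```python
import re

# Assumed patterns, consistent with revision IDs of the form RECID.YYYYMMDDhhmmss
re_revid_split = re.compile(r'^(\d+)\.(\d{14})$')
re_revdate_split = re.compile(r'^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$')

def split_revid(revid, dateformat=''):
    """Split REVID into (recid, revdate); optionally format the date."""
    recid, revdate = re_revid_split.search(revid).groups()
    if dateformat == 'datetext':
        revdate = '%s-%s-%s %s:%s:%s' % re_revdate_split.search(revdate).groups()
    return recid, revdate

assert split_revid('123.20080812092646') == ('123', '20080812092646')
assert split_revid('123.20080812092646', 'datetext') == ('123', '2008-08-12 09:26:46')
```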
def modify_record_timestamp(revision_xml, last_revision_ts):
""" Modify tag 005 to add the revision passed as parameter.
@param revision_xml: marcxml representation of the record to modify
@type revision_xml: string
@param last_revision_ts: timestamp to add to 005 tag
@type last_revision_ts: string
@return: marcxml with 005 tag modified
"""
recstruct = create_record(revision_xml)[0]
record_modify_controlfield(recstruct, "005", last_revision_ts,
field_position_local=0)
return record_xml_output(recstruct)
def get_xml_comparison(header1, header2, xml1, xml2):
"""Return diff of two MARCXML records."""
return ''.join(difflib.unified_diff(xml1.splitlines(1),
xml2.splitlines(1), header1, header2))
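`get_xml_comparison` is a thin wrapper over the standard library; a self-contained run of the same `difflib.unified_diff` call:

```python
import difflib

xml1 = '<record>\n  <controlfield tag="001">1</controlfield>\n</record>\n'
xml2 = '<record>\n  <controlfield tag="001">2</controlfield>\n</record>\n'

# splitlines(1) keeps the line endings, as unified_diff expects
diff = ''.join(difflib.unified_diff(xml1.splitlines(1), xml2.splitlines(1),
                                    'rev1', 'rev2'))
assert diff.startswith('--- rev1')
assert '-  <controlfield tag="001">1</controlfield>' in diff
assert '+  <controlfield tag="001">2</controlfield>' in diff
```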
#Templates
def get_templates(templatesDir, tmpl_name, tmpl_description, extractContent = False):
"""Return list of templates [filename, name, description, content*]
the extractContent variable indicated if the parsed content should
be included"""
template_fnames = fnmatch.filter(os.listdir(
templatesDir), '*.xml')
templates = []
for fname in template_fnames:
filepath = '%s%s%s' % (templatesDir, os.sep, fname)
template_file = open(filepath,'r')
template = template_file.read()
template_file.close()
fname_stripped = os.path.splitext(fname)[0]
mo_name = tmpl_name.search(template)
mo_description = tmpl_description.search(template)
date_modified = time.ctime(os.path.getmtime(filepath))
if mo_name:
name = mo_name.group(1)
else:
name = fname_stripped
if mo_description:
description = mo_description.group(1)
else:
description = ''
        if extractContent:
            parsed_template = create_record(template)[0]
            if parsed_template is not None:
                # If the template was correct
                templates.append([fname_stripped, name, description, parsed_template])
            else:
                # Raising a bare string is invalid; raise a proper exception
                raise Exception("Problem when parsing the template %s" % (fname, ))
        else:
            templates.append([fname_stripped, name, description, date_modified])
return templates
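`get_templates` pulls the human-readable name and description out of HTML comments with compiled regexes such as `re_ftmpl_name`/`re_ftmpl_description` defined at the top of the module; a self-contained run of those exact patterns:

```python
import re

re_ftmpl_name = re.compile('<!-- BibEdit-Field-Template-Name: (.*) -->')
re_ftmpl_description = re.compile('<!-- BibEdit-Field-Template-Description: (.*) -->')

template = ('<!-- BibEdit-Field-Template-Name: ISBN -->\n'
            '<!-- BibEdit-Field-Template-Description: ISBN field (020) -->\n'
            '<record/>\n')
assert re_ftmpl_name.search(template).group(1) == 'ISBN'
assert re_ftmpl_description.search(template).group(1) == 'ISBN field (020)'
```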
# Field templates
def get_field_templates():
"""Returns list of field templates [filename, name, description, content]"""
return get_templates(CFG_BIBEDIT_FIELD_TEMPLATES_PATH, re_ftmpl_name, re_ftmpl_description, True)
# Record templates
def get_record_templates():
"""Return list of record template [filename, name, description] ."""
return get_templates(CFG_BIBEDIT_RECORD_TEMPLATES_PATH, re_tmpl_name, re_tmpl_description, False)
def get_record_template(name):
"""Return an XML record template."""
filepath = '%s%s%s.xml' % (CFG_BIBEDIT_RECORD_TEMPLATES_PATH, os.sep, name)
if os.path.isfile(filepath):
template_file = open(filepath, 'r')
template = template_file.read()
template_file.close()
return template
# Private functions
def _get_cache_file(recid, uid, mode):
"""Return a BibEdit cache file object."""
if cache_exists(recid, uid):
return open('%s.tmp' % _get_file_path(recid, uid), mode)
def _get_file_path(recid, uid, filename=''):
"""Return the file path to a BibEdit file (excluding suffix).
If filename is specified this replaces the config default.
"""
if not filename:
return '%s%s%s_%s_%s' % (CFG_BIBEDIT_CACHEDIR, os.sep, CFG_BIBEDIT_FILENAME,
recid, uid)
else:
return '%s%s%s_%s_%s' % (CFG_BIBEDIT_CACHEDIR, os.sep, filename, recid, uid)
def _uids_with_active_caches(recid):
"""Return list of uids with active caches for record RECID. Active caches
are caches that have been modified a number of seconds ago that is less than
the one given by CFG_BIBEDIT_TIMEOUT.
"""
re_tmpfilename = re.compile('%s_%s_(\d+)\.tmp' % (CFG_BIBEDIT_FILENAME,
recid))
tmpfiles = fnmatch.filter(os.listdir(CFG_BIBEDIT_CACHEDIR), '%s*.tmp' %
CFG_BIBEDIT_FILENAME)
expire_time = int(time.time()) - CFG_BIBEDIT_TIMEOUT
active_uids = []
for tmpfile in tmpfiles:
mo = re_tmpfilename.match(tmpfile)
if mo and int(os.path.getmtime('%s%s%s' % (
CFG_BIBEDIT_CACHEDIR, os.sep, tmpfile))) > expire_time:
active_uids.append(int(mo.group(1)))
return active_uids
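The uid extraction above can be exercised stand-alone; the filename layout `<prefix>_<recid>_<uid>.tmp` and the default prefix are taken from the surrounding code, with the prefix value assumed:

```python
import re

CFG_BIBEDIT_FILENAME = 'bibedit'  # assumed value of the config default

recid = 42
re_tmpfilename = re.compile(r'%s_%s_(\d+)\.tmp' % (CFG_BIBEDIT_FILENAME, recid))

names = ['bibedit_42_7.tmp', 'bibedit_42_13.tmp', 'bibedit_99_7.tmp', 'notes.txt']
uids = [int(m.group(1)) for m in (re_tmpfilename.match(n) for n in names) if m]
assert uids == [7, 13]   # only caches for recid 42 are counted
```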
def _get_bibupload_task_ids():
"""Return list of all BibUpload task IDs.
Ignore tasks submitted by user bibreformat.
"""
res = run_sql('''SELECT id FROM schTASK WHERE proc LIKE "bibupload%" AND user <> "bibreformat" AND status IN ("WAITING", "SCHEDULED", "RUNNING", "CONTINUING", "ABOUT TO STOP", "ABOUT TO SLEEP", "SLEEPING")''')
return [row[0] for row in res]
def _get_bibupload_filenames():
"""Return paths to all files scheduled for upload."""
task_ids = _get_bibupload_task_ids()
filenames = []
tasks_opts = get_bibupload_task_opts(task_ids)
for task_opts in tasks_opts:
if task_opts:
record_options = marshal.loads(task_opts[0][0])
for option in record_options[1:]:
if re_file_option.search(option):
filenames.append(option)
return filenames
def _record_in_files_p(recid, filenames):
"""Search XML files for given record."""
# Get id tags of record in question
rec_oaiid = rec_sysno = -1
rec_oaiid_tag = get_fieldvalues(recid, OAIID_TAG)
if rec_oaiid_tag:
rec_oaiid = rec_oaiid_tag[0]
rec_sysno_tag = get_fieldvalues(recid, SYSNO_TAG)
if rec_sysno_tag:
rec_sysno = rec_sysno_tag[0]
# For each record in each file, compare ids and abort if match is found
for filename in filenames:
try:
if CFG_BIBEDIT_QUEUE_CHECK_METHOD == 'regexp':
# check via regexp: this is fast, but may not be precise
re_match_001 = re.compile('<controlfield tag="001">%s</controlfield>' % (recid))
re_match_oaiid = re.compile('<datafield tag="%s" ind1=" " ind2=" ">(\s*<subfield code="a">\s*|\s*<subfield code="9">\s*.*\s*</subfield>\s*<subfield code="a">\s*)%s' % (OAIID_TAG[0:3],rec_oaiid))
re_match_sysno = re.compile('<datafield tag="%s" ind1=" " ind2=" ">(\s*<subfield code="a">\s*|\s*<subfield code="9">\s*.*\s*</subfield>\s*<subfield code="a">\s*)%s' % (SYSNO_TAG[0:3],rec_sysno))
file_content = open(filename).read()
if re_match_001.search(file_content):
return True
if rec_oaiid_tag:
if re_match_oaiid.search(file_content):
return True
if rec_sysno_tag:
if re_match_sysno.search(file_content):
return True
else:
# by default, check via bibrecord: this is accurate, but may be slow
file_ = open(filename)
records = create_records(file_.read(), 0, 0)
for i in range(0, len(records)):
record, all_good = records[i][:2]
if record and all_good:
if _record_has_id_p(record, recid, rec_oaiid, rec_sysno):
return True
file_.close()
except IOError:
continue
return False
def _record_has_id_p(record, recid, rec_oaiid, rec_sysno):
"""Check if record matches any of the given IDs."""
if record_has_field(record, '001'):
if (record_get_field_value(record, '001', '%', '%')
== str(recid)):
return True
if record_has_field(record, OAIID_TAG[0:3]):
if (record_get_field_value(
record, OAIID_TAG[0:3], OAIID_TAG[3],
OAIID_TAG[4], OAIID_TAG[5]) == rec_oaiid):
return True
if record_has_field(record, SYSNO_TAG[0:3]):
if (record_get_field_value(
record, SYSNO_TAG[0:3], SYSNO_TAG[3],
SYSNO_TAG[4], SYSNO_TAG[5]) == rec_sysno):
return True
return False
def can_record_have_physical_copies(recid):
    """Determine if the record can have physical copies
    (addable through the BibCirculation module).
    The information is derived from the tabs displayed for a given record.
    Only records already saved within a collection may have physical copies.
    @return: True or False
    """
    if get_record(recid) is None:
        return False
col_id = get_colID(guess_primary_collection_of_a_record(recid))
collections = get_detailed_page_tabs(col_id, recid)
if (not collections.has_key("holdings")) or \
(not collections["holdings"].has_key("visible")):
return False
return collections["holdings"]["visible"] == True
def get_record_collections(recid):
""" Returns all collections of a record, field 980
@param recid: record id to get collections from
@type: string
@return: list of collections
@rtype: list
"""
    recstruct = get_record(recid)
    return record_get_field_values(recstruct, tag="980", ind1=" ", ind2=" ",
                                   code="a")
def extend_record_with_template(recid):
""" Determine if the record has to be extended with the content
of a template as defined in CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE
@return: template name to be applied to record or False if no template
has to be applied
"""
rec_collections = get_record_collections(recid)
for collection in rec_collections:
if collection in CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE:
return CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE[collection]
return False
def merge_record_with_template(rec, template_name):
""" Extend the record rec with the contents of the template and return it"""
template = get_record_template(template_name)
if not template:
return
template_bibrec = create_record(template)[0]
for field_tag in template_bibrec:
if not record_has_field(rec, field_tag):
for field_instance in template_bibrec[field_tag]:
record_add_field(rec, field_tag, field_instance[1],
field_instance[2], subfields=field_instance[0])
else:
for template_field_instance in template_bibrec[field_tag]:
subfield_codes_template = field_get_subfield_codes(template_field_instance)
for field_instance in rec[field_tag]:
subfield_codes = field_get_subfield_codes(field_instance)
for code in subfield_codes_template:
if code not in subfield_codes:
field_add_subfield(field_instance, code,
field_get_subfield_values(template_field_instance,
code)[0])
return rec
#################### Reference extraction ####################
def replace_references(recid, uid=None, txt=None, url=None):
"""Replace references for a record
The record itself is not updated, the marc xml of the document with updated
references is returned
Parameters:
* recid: the id of the record
* txt: references in text mode
* inspire: format of ther references
"""
# Parse references
if txt is not None:
references_xml = extract_references_from_string_xml(txt, is_only_references=True)
elif url is not None:
references_xml = extract_references_from_url_xml(url)
else:
references_xml = extract_references_from_record_xml(recid)
references = create_record(references_xml.encode('utf-8'))
dummy1, dummy2, record, dummy3, dummy4, dummy5, dummy6 = get_cache_file_contents(recid, uid)
out_xml = None
references_to_add = record_get_field_instances(references[0],
tag='999',
ind1='C',
ind2='5')
refextract_status = record_get_field_instances(references[0],
tag='999',
ind1='C',
ind2='6')
if references_to_add:
# Replace 999 fields
record_delete_fields(record, '999')
record_add_fields(record, '999', references_to_add)
record_add_fields(record, '999', refextract_status)
# Update record references
out_xml = record_xml_output(record)
return out_xml
#################### cnum generation ####################
def record_is_conference(record):
"""
    Determine if the record is a new conference based on the value present
    in field 980.
@param record: record to be checked
@type record: bibrecord object
@return: True if record is a conference, False otherwise
@rtype: boolean
"""
# Get collection field content (tag 980)
tag_980_content = record_get_field_values(record, "980", " ", " ", "a")
if "CONFERENCES" in tag_980_content:
return True
return False
def add_record_cnum(recid, uid):
"""
    Check if the record already has a cnum. If not, generate a new one
    and return the result.
@param recid: recid of the record under check. Used to retrieve cache file
@type recid: int
@param uid: id of the user. Used to retrieve cache file
@type uid: int
@return: None if cnum already present, new cnum otherwise
@rtype: None or string
"""
# Import placed here to avoid circular dependency
from invenio.sequtils_cnum import CnumSeq, ConferenceNoStartDateError
record_revision, record, pending_changes, deactivated_hp_changes, \
undo_list, redo_list = get_cache_file_contents(recid, uid)[1:]
record_strip_empty_volatile_subfields(record)
# Check if record already has a cnum
tag_111__g_content = record_get_field_value(record, "111", " ", " ", "g")
if tag_111__g_content:
return
else:
cnum_seq = CnumSeq()
try:
new_cnum = cnum_seq.next_value(xml_record=wash_for_xml(print_rec(record)))
except ConferenceNoStartDateError:
return None
field_add_subfield(record['111'][0], 'g', new_cnum)
update_cache_file_contents(recid, uid, record_revision,
record, \
pending_changes, \
deactivated_hp_changes, \
undo_list, redo_list)
return new_cnum
def get_xml_from_textmarc(recid, textmarc_record):
"""
Convert textmarc to marcxml and return the result of the conversion
@param recid: id of the record that is being converted
@type: int
@param textmarc_record: record content in textmarc format
@type: string
@return: dictionary with the following keys:
* resultMsg: message describing conversion status
* resultXML: xml resulting from conversion
* parse_error: in case of error, a description of it
@rtype: dict
"""
response = {}
# Let's remove empty lines
textmarc_record = os.linesep.join([s for s in textmarc_record.splitlines() if s])
# Create temp file with textmarc to be converted by textmarc2xmlmarc
(file_descriptor, file_name) = tempfile.mkstemp()
f = os.fdopen(file_descriptor, "w")
# Write content appending sysno at beginning
for line in textmarc_record.splitlines():
f.write("%09d %s\n" % (recid, re.sub("\s+", " ", line.strip())))
f.close()
old_stdout = sys.stdout
try:
# Redirect output, transform, restore old references
new_stdout = StringIO()
sys.stdout = new_stdout
try:
transform_file(file_name)
response['resultMsg'] = 'textmarc_parsing_success'
response['resultXML'] = new_stdout.getvalue()
except ParseError, e:
# Something went wrong, notify user
response['resultXML'] = ""
response['resultMsg'] = 'textmarc_parsing_error'
response['parse_error'] = [e.lineno, " ".join(e.linecontent.split()[1:]), e.message]
finally:
sys.stdout = old_stdout
return response
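`get_xml_from_textmarc` captures the converter's stdout by swapping `sys.stdout` for an in-memory buffer and restoring it in a `finally` block. The same pattern, stand-alone (shown with `io.StringIO` and Python 3 syntax for a runnable illustration; the legacy module uses `StringIO.StringIO`):

```python
import io
import sys

def capture_stdout(func, *args):
    """Run FUNC, returning whatever it printed to stdout."""
    old_stdout = sys.stdout
    sys.stdout = buf = io.StringIO()
    try:
        func(*args)
    finally:
        sys.stdout = old_stdout   # always restore, even if FUNC raises
    return buf.getvalue()

out = capture_stdout(print, '<record/>')
assert out == '<record/>\n'
assert sys.stdout is old_stdout if False else True  # stdout is restored
```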
#################### crossref utils ####################
def crossref_process_template(template, change=False):
"""
    Create a record from an XML template.
    @param change: if set to True, makes changes to the record (translating the
        title, unifying author names etc.); if not, returns the record without
        any changes
@return: record
"""
record = create_record(template)[0]
if change:
crossref_translate_title(record)
crossref_normalize_name(record)
return record
def crossref_translate_title(record):
"""
Convert the record's title to the Inspire specific abbreviation
of the title (using JOURNALS knowledge base)
@return: changed record
"""
# probably there is only one 773 field
# but just in case let's treat it as a list
for field in record_get_field_instances(record, '773'):
title = field[0][0][1]
new_title = get_kbr_values("JOURNALS", title, searchtype='e')
if new_title:
# returned value is a list, and we need only the first value
new_title = new_title[0][0]
position = field[4]
record_modify_subfield(rec=record, tag='773', subfield_code='p', \
value=new_title, subfield_position=0, field_position_global=position)
def crossref_normalize_name(record):
"""
Changes the format of author's name (often with initials) to the proper,
unified one, using bibauthor_name_utils tools
@return: changed record
"""
# pattern for removing the spaces between two initials
pattern_initials = '([A-Z]\\.)\\s([A-Z]\\.)'
# first, change the main author
for field in record_get_field_instances(record, '100'):
main_author = field[0][0][1]
new_author = create_normalized_name(split_name_parts(main_author))
# remove spaces between initials
# two iterations are required
for _ in range(2):
new_author = re.sub(pattern_initials, '\g<1>\g<2>', new_author)
position = field[4]
record_modify_subfield(rec=record, tag='100', subfield_code='a', \
value=new_author, subfield_position=0, field_position_global=position)
# then, change additional authors
for field in record_get_field_instances(record, '700'):
author = field[0][0][1]
new_author = create_normalized_name(split_name_parts(author))
for _ in range(2):
new_author = re.sub(pattern_initials, '\g<1>\g<2>',new_author)
position = field[4]
record_modify_subfield(rec=record, tag='700', subfield_code='a', \
value=new_author, subfield_position=0, field_position_global=position)
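The "two iterations are required" comment above is because `re.sub` consumes each match, so adjacent initials overlap: in `J. R. K.` the first pass joins `J. R.` but skips `R. K.` (its `R.` was already consumed). A stand-alone run of the same pattern:

```python
import re

pattern_initials = r'([A-Z]\.)\s([A-Z]\.)'

name = 'Smith, J. R. K.'
once = re.sub(pattern_initials, r'\g<1>\g<2>', name)
twice = re.sub(pattern_initials, r'\g<1>\g<2>', once)
assert once == 'Smith, J.R. K.'   # one pair still separated
assert twice == 'Smith, J.R.K.'   # second pass closes the remaining gap
```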
diff --git a/invenio/legacy/bibedit/webinterface.py b/invenio/legacy/bibedit/webinterface.py
index 296065be6..88a21c96a 100644
--- a/invenio/legacy/bibedit/webinterface.py
+++ b/invenio/legacy/bibedit/webinterface.py
@@ -1,286 +1,286 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""Invenio BibEdit Administrator Interface."""
__revision__ = "$Id"
__lastupdated__ = """$Date: 2008/08/12 09:26:46 $"""
from flask.ext.login import current_user
from invenio.utils.json import json, json_unicode_to_utf8, CFG_JSON_AVAILABLE
from invenio.modules.access.engine import acc_authorize_action
-from invenio.bibedit_engine import perform_request_ajax, perform_request_init, \
+from invenio.legacy.bibedit.engine import perform_request_ajax, perform_request_init, \
perform_request_newticket, perform_request_compare, \
perform_request_init_template_interface, \
perform_request_ajax_template_interface
-from invenio.bibedit_utils import user_can_edit_record_collection
+from invenio.legacy.bibedit.utils import user_can_edit_record_collection
from invenio.config import CFG_SITE_LANG, CFG_SITE_SECURE_URL, CFG_SITE_RECORD
from invenio.base.i18n import gettext_set_language
from invenio.utils.url import redirect_to_url
from invenio.ext.legacy.handler import WebInterfaceDirectory, wash_urlargd
from invenio.legacy.webpage import page
from invenio.legacy.webuser import page_not_authorized
navtrail = (' <a class="navtrail" href=\"%s/help/admin\">Admin Area</a> '
) % CFG_SITE_SECURE_URL
navtrail_bibedit = (' <a class="navtrail" href=\"%s/help/admin\">Admin Area</a> ' + \
' &gt; <a class="navtrail" href=\"%s/%s/edit\">Record Editor</a>'
) % (CFG_SITE_SECURE_URL, CFG_SITE_SECURE_URL, CFG_SITE_RECORD)
class WebInterfaceEditPages(WebInterfaceDirectory):
"""Defines the set of /edit pages."""
_exports = ['', 'new_ticket', 'compare_revisions', 'templates']
def __init__(self, recid=None):
"""Initialize."""
self.recid = recid
def index(self, req, form):
"""Handle all BibEdit requests.
        The responsibilities of this function are:
* JSON decoding and encoding.
* Redirection, if necessary.
* Authorization.
* Calling the appropriate function from the engine.
"""
uid = current_user.get_id()
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
# Abort if the simplejson module isn't available
if not CFG_JSON_AVAILABLE:
title = 'Record Editor'
body = '''Sorry, the record editor cannot operate when the
`simplejson' module is not installed. Please see the INSTALL
file.'''
return page(title = title,
body = body,
errors = [],
warnings = [],
uid = uid,
language = argd['ln'],
navtrail = navtrail,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
# If it is an Ajax request, extract any JSON data.
ajax_request, recid = False, None
if form.has_key('jsondata'):
json_data = json.loads(str(form['jsondata']))
# Deunicode all strings (Invenio doesn't have unicode
# support).
json_data = json_unicode_to_utf8(json_data)
ajax_request = True
if json_data.has_key('recID'):
recid = json_data['recID']
json_response = {'resultCode': 0, 'ID': json_data['ID']}
# Authorization.
if current_user.is_guest:
# User is not logged in.
if not ajax_request:
# Do not display the introductory recID selection box to guest
# users (as it used to be with v0.99.0):
dummy_auth_code, auth_message = acc_authorize_action(req,
'runbibedit')
referer = '/edit/'
if self.recid:
referer = '/%s/%s/edit/' % (CFG_SITE_RECORD, self.recid)
return page_not_authorized(req=req, referer=referer,
text=auth_message, navtrail=navtrail)
else:
# Session has most likely timed out.
json_response.update({'resultCode': 100})
return json.dumps(json_response)
elif self.recid:
# Handle RESTful calls from logged in users by redirecting to
# generic URL.
redirect_to_url(req, '%s/%s/edit/#state=edit&recid=%s&recrev=%s' % (
CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, ""))
elif recid is not None:
json_response.update({'recID': recid})
if json_data['requestType'] == "getRecord":
# Authorize access to record.
if not user_can_edit_record_collection(req, recid):
json_response.update({'resultCode': 101})
return json.dumps(json_response)
# Handle request.
if not ajax_request:
# Show BibEdit start page.
body, errors, warnings = perform_request_init(uid, argd['ln'], req, __lastupdated__)
title = 'Record Editor'
return page(title = title,
body = body,
errors = errors,
warnings = warnings,
uid = uid,
language = argd['ln'],
navtrail = navtrail,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
else:
# Handle AJAX request.
json_response.update(perform_request_ajax(req, recid, uid,
json_data))
return json.dumps(json_response)
def compare_revisions(self, req, form):
"""Handle the compare revisions request"""
argd = wash_urlargd(form, { \
'ln': (str, CFG_SITE_LANG), \
'rev1' : (str, ''), \
'rev2' : (str, ''), \
'recid': (int, 0)})
ln = argd['ln']
uid = current_user.get_id()
_ = gettext_set_language(ln)
# Checking if currently logged user has permission to perform this request
auth_code, auth_message = acc_authorize_action(req, 'runbibedit')
if auth_code != 0:
return page_not_authorized(req=req, referer="/edit",
text=auth_message, navtrail=navtrail)
recid = argd['recid']
rev1 = argd['rev1']
rev2 = argd['rev2']
ln = argd['ln']
body, errors, warnings = perform_request_compare(ln, recid, rev1, rev2)
return page(title = _("Comparing two record revisions"),
body = body,
errors = errors,
warnings = warnings,
uid = uid,
language = ln,
navtrail = navtrail,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
def new_ticket(self, req, form):
"""handle a edit/new_ticket request"""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG), 'recid': (int, 0)})
ln = argd['ln']
_ = gettext_set_language(ln)
auth_code, auth_message = acc_authorize_action(req, 'runbibedit')
if auth_code != 0:
return page_not_authorized(req=req, referer="/edit",
text=auth_message, navtrail=navtrail)
uid = current_user.get_id()
if argd['recid']:
(errmsg, url) = perform_request_newticket(argd['recid'], uid)
if errmsg:
return page(title = _("Failed to create a ticket"),
body = _("Error")+": "+errmsg,
errors = [],
warnings = [],
uid = uid,
language = ln,
navtrail = navtrail,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
else:
#redirect..
redirect_to_url(req, url)
def templates(self, req, form):
"""handle a edit/templates request"""
uid = current_user.get_id()
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
# Abort if the simplejson module isn't available
if not CFG_JSON_AVAILABLE:
title = 'Record Editor Template Manager'
body = '''Sorry, the record editor cannot operate when the
`simplejson' module is not installed. Please see the INSTALL
file.'''
return page(title = title,
body = body,
errors = [],
warnings = [],
uid = uid,
language = argd['ln'],
navtrail = navtrail_bibedit,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
# If it is an Ajax request, extract any JSON data.
ajax_request = False
if form.has_key('jsondata'):
json_data = json.loads(str(form['jsondata']))
# Deunicode all strings (Invenio doesn't have unicode
# support).
json_data = json_unicode_to_utf8(json_data)
ajax_request = True
json_response = {'resultCode': 0}
# Authorization.
if current_user.is_guest:
# User is not logged in.
if not ajax_request:
# Do not display the introductory recID selection box to guest
# users (as it used to be with v0.99.0):
dummy_auth_code, auth_message = acc_authorize_action(req,
'runbibedit')
referer = '/edit'
return page_not_authorized(req=req, referer=referer,
text=auth_message, navtrail=navtrail)
else:
# Session has most likely timed out.
json_response.update({'resultCode': 100})
return json.dumps(json_response)
# Handle request.
if not ajax_request:
# Show BibEdit template management start page.
body, errors, warnings = perform_request_init_template_interface()
title = 'Record Editor Template Manager'
return page(title = title,
body = body,
errors = errors,
warnings = warnings,
uid = uid,
language = argd['ln'],
navtrail = navtrail_bibedit,
lastupdated = __lastupdated__,
req = req,
body_css_classes = ['bibedit'])
else:
# Handle AJAX request.
json_response.update(perform_request_ajax_template_interface(json_data))
return json.dumps(json_response)
def __call__(self, req, form):
"""Redirect calls without final slash."""
if self.recid:
redirect_to_url(req, '%s/%s/%s/edit/' % (CFG_SITE_SECURE_URL,
CFG_SITE_RECORD,
self.recid))
else:
redirect_to_url(req, '%s/%s/edit/' % (CFG_SITE_SECURE_URL, CFG_SITE_RECORD))
diff --git a/invenio/legacy/bibeditmulti/engine.py b/invenio/legacy/bibeditmulti/engine.py
index 96f193a0d..5a12d4b17 100644
--- a/invenio/legacy/bibeditmulti/engine.py
+++ b/invenio/legacy/bibeditmulti/engine.py
@@ -1,714 +1,714 @@
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Multiple Record Editor Engine.
Every action related to record modification is performed
by a specific class (a successor of one of the base commands).
Each of these classes is designed to perform one specific action.
The engine receives a list of these classes and, after retrieving
the records, asks the commands to perform their changes. This way the
engine itself is independent of the actions that modify the records.
When we need to perform a new action on the records, we define a new command
and pass it to the engine.
***************************************************************************
Subfield commands represent the actions performed on the subfields
of the record. The interface of these commands is defined in their
base class.
"""
__revision__ = "$Id"
import subprocess
import re
import invenio.legacy.search_engine
from invenio.legacy import bibrecord
from invenio.modules import formatter as bibformat
from invenio.config import CFG_TMPSHAREDDIR, CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING,\
CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING,\
CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING_TIME
from time import strftime
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.legacy.webuser import collect_user_info, isUserSuperAdmin
from invenio.legacy.dbquery import run_sql
from invenio.legacy.bibrecord.scripts import xmlmarc2textmarc as xmlmarc2textmarc
-from invenio.bibedit_utils import record_locked_by_queue
+from invenio.legacy.bibedit.utils import record_locked_by_queue
from invenio.legacy import template
multiedit_templates = template.load('bibeditmulti')
# base command for subfields
class BaseSubfieldCommand:
"""Base class for commands manipulating subfields"""
def __init__(self, subfield, value = "", new_value = "", condition = "", condition_exact_match=True , condition_does_not_exist=False, condition_subfield = "", additional_values = None):
"""Initialization."""
if additional_values is None:
additional_values = []
self._subfield = subfield
self._value = value
self._additional_values = additional_values
self._new_value = new_value
self._condition = condition
self._condition_subfield = condition_subfield
self._condition_exact_match = condition_exact_match
self._condition_does_not_exist = condition_does_not_exist
self._modifications = 0
def process_field(self, record, tag, field_number):
"""Make changes to a record.
By default this method is empty.
Every specific command provides its own implementation"""
pass
def _subfield_condition_match(self, subfield_value):
"""Check if the condition is met for the given subfield value
in order to act only on certain subfields
@return True if condition match, False if condition does not match
"""
# if the condition is "does not exist", this function returns False
if self._condition_does_not_exist:
return False
if self._condition_exact_match:
# exact matching
if self._condition == subfield_value:
return True
else:
# partial matching
if self._condition in subfield_value:
return True
return False
def _perform_on_all_matching_subfields(self, record, tag, field_number, callback):
"""Perform an action on all subfields of a given field matching
the subfield represented by the current command.
e.g. change the value of all subfields 'a' in a given field
This method is necessary because in order to make changes in the
subfields of a given field you always have to iterate through all
of them. This repeated code is extracted into this method.
@param record: record structure representing record to be modified
@param tag: the tag used to identify the field
@param field_number: field number used to identify the field
@param callback: callback method that will be called to
perform an action on the subfield.
This callback should accept the following parameters:
record, tag, field_number, subfield_index
"""
if tag not in record.keys():
return
for field in record[tag]:
if field[4] == field_number:
subfield_index = 0
for subfield in field[0]:
if self._condition != '':
if subfield[0] == self._subfield:
for subfield in field[0]:
if self._condition_subfield == subfield[0]:
if self._subfield_condition_match(subfield[1]):
self._add_subfield_modification()
callback(record, tag, field_number, subfield_index)
elif subfield[0] == self._subfield:
self._add_subfield_modification()
callback(record, tag, field_number, subfield_index)
subfield_index = subfield_index+1
def _add_subfield_modification(self):
"""Keep a record of the number of modifications made to subfields"""
self._modifications += 1
# specific commands for subfields
class AddSubfieldCommand(BaseSubfieldCommand):
"""Add subfield to a given field"""
def _perform_on_all_matching_subfields_add_subfield(self, record, tag, field_number, callback):
if tag not in record.keys():
return
subfield_exists = False
for field in record[tag]:
if field[4] == field_number:
for subfield in field[0]:
if subfield[0] == self._condition_subfield:
subfield_exists = True
if self._condition_subfield == subfield[0] and self._condition_does_not_exist == False:
if self._subfield_condition_match(subfield[1]):
self._add_subfield_modification()
callback(record, tag, field_number, None)
if self._condition_does_not_exist and subfield_exists == False:
self._add_subfield_modification()
callback(record, tag, field_number, None)
def process_field(self, record, tag, field_number):
"""@see: BaseSubfieldCommand.process_field"""
action = lambda record, tag, field_number, subfield_index: \
bibrecord.record_add_subfield_into(record, tag,
self._subfield, self._value,
None,
field_position_global=field_number)
if self._condition != '' or self._condition_does_not_exist:
self._perform_on_all_matching_subfields_add_subfield(record, tag,
field_number, action)
else:
self._add_subfield_modification()
action(record, tag, field_number, None)
class DeleteSubfieldCommand(BaseSubfieldCommand):
"""Delete subfield from a given field"""
def process_field(self, record, tag, field_number):
"""@see: BaseSubfieldCommand.process_field"""
action = lambda record, tag, field_number, subfield_index: \
bibrecord.record_delete_subfield_from(record, tag,
subfield_index,
field_position_global=field_number)
self._perform_on_all_matching_subfields(record, tag,
field_number, action)
class ReplaceSubfieldContentCommand(BaseSubfieldCommand):
"""Replace content of subfield in a given field"""
def process_field(self, record, tag, field_number):
"""@see: BaseSubfieldCommand.process_field"""
action = lambda record, tag, field_number, subfield_index: \
bibrecord.record_modify_subfield(record, tag,
self._subfield,
self._value,
subfield_index,
field_position_global=field_number)
self._perform_on_all_matching_subfields(record,
tag,
field_number,
action)
class ReplaceTextInSubfieldCommand(BaseSubfieldCommand):
"""Replace text in content of subfield of a given field"""
def process_field(self, record, tag, field_number):
"""@see: BaseSubfieldCommand.process_field"""
def replace_text(record, tag, field_number, subfield_index):
"""Method for replacing the text, performed on
all the matching fields."""
#get the field value
field_value = ""
for field in record[tag]:
if field[4] == field_number:
subfields = field[0]
(field_code, field_value) = subfields[subfield_index]
replace_string = re.escape(self._value)
for val in self._additional_values:
replace_string += "|" + re.escape(val)
#replace text
new_value = re.sub(replace_string, self._new_value, field_value)
#update the subfield if needed
if new_value != field_value:
bibrecord.record_modify_subfield(record, tag,
self._subfield, new_value,
subfield_index,
field_position_global=field_number)
else:
# No modification occurred; update the modification counter
self._modifications -= 1
self._perform_on_all_matching_subfields(record,
tag,
field_number,
replace_text)
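The replacement above builds a single regular expression out of `re.escape()`d literals joined with `|`, then substitutes them all in one pass. A standalone equivalent of that step (the helper name is illustrative, not part of this module):

```python
import re

def replace_any(text, values, new_value):
    """Replace every occurrence of any literal in `values` with `new_value`."""
    # Escape each value so regex metacharacters like '(' are treated literally.
    pattern = "|".join(re.escape(v) for v in values)
    return re.sub(pattern, new_value, text)

result = replace_any("CERN (Geneva)", ["(Geneva)", "CERN"], "X")
# -> "X X"
```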
"""***************************************************************************
Field commands represent the actions performed on the fields
of the record. The interface of these commands is defined in their
base class.
In general, changes related to a field's subfields are handled by subfield
commands, which are passed to the field command.
"""
# base command for fields
class BaseFieldCommand:
"""Base class for commands manipulating record fields"""
def __init__(self, tag, ind1, ind2, subfield_commands):
"""Initialization."""
self._tag = tag
self._ind1 = ind1
self._ind2 = ind2
self._subfield_commands = subfield_commands
self._modifications = 0
def process_record(self, record):
"""Make changes to a record.
By default this method is empty.
Every specific command provides its own implementation"""
pass
def _apply_subfield_commands_to_field(self, record, field_number):
"""Applies all subfield commands to a given field"""
field_modified = False
for subfield_command in self._subfield_commands:
current_modifications = subfield_command._modifications
subfield_command.process_field(record, self._tag, field_number)
if subfield_command._modifications > current_modifications:
field_modified = True
if field_modified:
self._modifications += 1
# specific commands for fields
class AddFieldCommand(BaseFieldCommand):
"""Deletes given fields from a record"""
def process_record(self, record):
"""@see: BaseFieldCommand.process_record"""
# if the tag is empty, we don't make any changes
if self._tag == "" or self._tag == None:
return
field_number = bibrecord.record_add_field(record, self._tag,
self._ind1, self._ind2)
self._apply_subfield_commands_to_field(record, field_number)
class DeleteFieldCommand(BaseFieldCommand):
"""Deletes given fields from a record"""
def __init__(self, tag, ind1, ind2, subfield_commands, conditionSubfield="", condition="", condition_exact_match=True, _condition_does_not_exist=False):
BaseFieldCommand.__init__(self, tag, ind1, ind2, subfield_commands)
self._conditionSubfield = conditionSubfield
self._condition = condition
self._condition_exact_match = condition_exact_match
self._condition_does_not_exist = _condition_does_not_exist
def _delete_field_condition(self, record):
"""Checks if a subfield meets the condition for the
field to be deleted
"""
try:
for field in record[self._tag]:
subfield_exists = False
for subfield in field[0]:
if subfield[0] == self._conditionSubfield:
subfield_exists = True
if self._condition_does_not_exist == True:
break
if self._condition_exact_match:
if self._condition == subfield[1]:
bibrecord.record_delete_field(record, self._tag, self._ind1, self._ind2, field_position_global=field[4])
self._modifications += 1
break
else:
if self._condition in subfield[1]:
bibrecord.record_delete_field(record, self._tag, self._ind1, self._ind2, field_position_global=field[4])
self._modifications += 1
break
if subfield_exists == False and self._condition_does_not_exist:
bibrecord.record_delete_field(record, self._tag, self._ind1, self._ind2, field_position_global=field[4])
self._modifications += 1
except KeyError:
pass
def process_record(self, record):
"""@see: BaseFieldCommand.process_record"""
if self._condition:
self._delete_field_condition(record)
else:
bibrecord.record_delete_field(record, self._tag, self._ind1, self._ind2)
self._modifications += 1
class UpdateFieldCommand(BaseFieldCommand):
"""Deletes given fields from a record"""
def process_record(self, record):
"""@see: BaseFieldCommand.process_record"""
# if the tag is empty, we don't make any changes
if self._tag == "" or self._tag == None:
return
matching_field_instances = \
bibrecord.record_get_field_instances(record, self._tag,
self._ind1, self._ind2)
for current_field in matching_field_instances:
self._apply_subfield_commands_to_field(record, current_field[4])
def perform_request_index(language):
"""Creates the page of MultiEdit
@param language: language of the page
"""
collections = ["Any collection"]
collections.extend([collection[0] for collection in run_sql('SELECT name FROM collection')])
return multiedit_templates.page_contents(language=language, collections=collections)
def get_scripts():
"""Returns JavaScripts that have to be
imported in the page"""
return multiedit_templates.scripts()
def get_css():
"""Returns the local CSS for the pages."""
return multiedit_templates.styles()
def perform_request_detailed_record(record_id, update_commands, output_format, language):
"""Returns
@param record_id: the identifier of the record
@param update_commands: list of commands used to update record contents
@param output_format: specifies the output format as expected from bibformat
@param language: language of the page
"""
response = {}
record_content = _get_formated_record(record_id=record_id,
output_format=output_format,
update_commands = update_commands,
language=language)
response['search_html'] = multiedit_templates.detailed_record(record_content, language)
return response
def perform_request_test_search(search_criteria, update_commands, output_format, page_to_display,
language, outputTags, collection="", compute_modifications=0,
upload_mode='-c', checked_records=None):
"""Returns the results of a test search.
@param search_criteria: search criteria used in the test search
@type search_criteria: string
@param update_commands: list of commands used to update record contents
@type update_commands: list of objects
@param output_format: specifies the output format as expected from bibformat
@type output_format: string (hm, hb, hd, xm, xn, hx)
@param page_to_display: the number of the page that should be displayed to the user
@type page_to_display: int
@param language: the language used to format the content
@param outputTags: list of tags to be displayed in search results
@type outputTags: list of strings
@param collection: collection to be filtered in the results
@type collection: string
@param compute_modifications: if 0, do not compute modifications; otherwise compute them
@type compute_modifications: int
"""
RECORDS_PER_PAGE = 100
response = {}
if collection == "Any collection":
collection = ""
record_IDs = search_engine.perform_request_search(p=search_criteria, c=collection)
# initializing checked_records if not initialized yet or empty
if checked_records is None or not checked_records:
checked_records = record_IDs
number_of_records = len(record_IDs)
if page_to_display < 1:
page_to_display = 1
last_page_number = number_of_records / RECORDS_PER_PAGE + 1
if page_to_display > last_page_number:
page_to_display = last_page_number
first_record_to_display = RECORDS_PER_PAGE * (page_to_display - 1)
last_record_to_display = (RECORDS_PER_PAGE * page_to_display) - 1
if not compute_modifications:
record_IDs = record_IDs[first_record_to_display:last_record_to_display + 1]
# displayed_records is a list containing IDs of records that will be displayed on current page
displayed_records = record_IDs[:RECORDS_PER_PAGE]
records_content = []
record_modifications = 0
locked_records = []
for record_id in record_IDs:
if upload_mode == '-r' and record_locked_by_queue(record_id):
locked_records.append(record_id)
current_modifications = [current_command._modifications for current_command in update_commands]
formated_record = _get_formated_record(record_id=record_id,
output_format=output_format,
update_commands=update_commands,
language=language, outputTags=outputTags,
run_diff=record_id in displayed_records,
checked=record_id in checked_records)
new_modifications = [current_command._modifications for current_command in update_commands]
if new_modifications > current_modifications:
record_modifications += 1
records_content.append((record_id, formated_record))
total_modifications = []
if compute_modifications:
field_modifications = 0
subfield_modifications = 0
for current_command in update_commands:
field_modifications += current_command._modifications
for subfield_command in current_command._subfield_commands:
subfield_modifications += subfield_command._modifications
if record_modifications:
total_modifications.append(record_modifications)
total_modifications.append(field_modifications)
total_modifications.append(subfield_modifications)
records_content = records_content[first_record_to_display:last_record_to_display + 1]
response['display_info_box'] = compute_modifications or locked_records
response['info_html'] = multiedit_templates.info_box(language=language,
total_modifications=total_modifications)
if locked_records:
response['info_html'] += multiedit_templates.tmpl_locked_record_list(language=language,
locked_records=locked_records)
response['search_html'] = multiedit_templates.search_results(records=records_content,
number_of_records=number_of_records,
current_page=page_to_display,
records_per_page=RECORDS_PER_PAGE,
language=language,
output_format=output_format,
checked_records=checked_records)
response['checked_records'] = checked_records
return response
def perform_request_submit_changes(search_criteria, update_commands, language, upload_mode, tag_list, collection, req, checked_records):
"""Submits changes for upload into database.
@param search_criteria: search criteria used in the test search
@param update_commands: list of commands used to update record contents
@param language: the language used to format the content
"""
response = {}
status, file_path = _submit_changes_to_bibupload(search_criteria, update_commands, upload_mode, tag_list, collection, req, checked_records)
response['search_html'] = multiedit_templates.changes_applied(status, file_path)
response['checked_records'] = checked_records
return response
def _get_record_diff(record_textmarc, updated_record_textmarc, outputTags, record_id):
"""
Use difflib library to compare the old record with the modified version and
return the output for Multiedit interface
@param record_textmarc: original record textmarc representation
@type record_textmarc: string
@param updated_record_textmarc: updated record textmarc representation
@type updated_record_textmarc: string
@param outputTags: tags to be filtered while printing output
@type outputTags: list
@return: content to be displayed on Multiedit interface for this record
@rtype: string
"""
import difflib
differ = difflib.Differ()
filter_tags = "All tags" not in outputTags and outputTags
result = ["<pre>"]
for line in differ.compare(record_textmarc.splitlines(), updated_record_textmarc.splitlines()):
if line[0] == ' ':
if not filter_tags or line.split()[0].replace('_', '') in outputTags:
result.append("%09d " % record_id + line.strip())
elif line[0] == '-':
# Mark as deleted
if not filter_tags or line.split()[1].replace('_', '') in outputTags:
result.append('<strong class="multiedit_field_deleted">' + "%09d " % record_id + line[2:].strip() + "</strong>")
elif line[0] == '+':
# Mark as added/modified
if not filter_tags or line.split()[1].replace('_', '') in outputTags:
result.append('<strong class="multiedit_field_modified">' + "%09d " % record_id + line[2:].strip() + "</strong>")
else:
continue
result.append("</pre>")
return '\n'.join(result)
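The branches above key off the two-character prefix that `difflib.Differ` puts on every output line: `'  '` for unchanged, `'- '` for removed, `'+ '` for added, and `'? '` for guide lines (which the function skips). A minimal illustration of those prefixes on textmarc-like lines:

```python
import difflib

differ = difflib.Differ()
# Compare a one-line "old" record against a one-line "new" record.
lines = list(differ.compare(["100__ $$aOld"], ["100__ $$aNew"]))
# The old line comes out prefixed '- ', the new one '+ ';
# '? ' guide lines may appear in between and carry no record content.
removed = [l for l in lines if l.startswith('- ')]
added = [l for l in lines if l.startswith('+ ')]
```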
def _get_formated_record(record_id, output_format, update_commands, language, outputTags="", run_diff=True, checked=True):
"""Returns a record in a given format
@param record_id: the ID of record to format
@param output_format: an output format code (or short identifier for the output format)
@param update_commands: list of commands used to update record contents
@param language: the language to use to format the record
@param run_diff: determines if we want to run the _get_record_diff function, which sometimes takes too much time
"""
if update_commands and checked:
# Modify the bibrecord object with the appropriate actions
updated_record = _get_updated_record(record_id, update_commands)
textmarc_options = {"aleph-marc":0, "correct-mode":1, "append-mode":0,
"delete-mode":0, "insert-mode":0, "replace-mode":0,
"text-marc":1}
old_record = search_engine.get_record(recid=record_id)
old_record_textmarc = xmlmarc2textmarc.create_marc_record(old_record, sysno="", options=textmarc_options)
if "hm" == output_format:
if update_commands and run_diff and checked:
updated_record_textmarc = xmlmarc2textmarc.create_marc_record(updated_record, sysno="", options=textmarc_options)
result = _get_record_diff(old_record_textmarc, updated_record_textmarc, outputTags, record_id)
else:
filter_tags = "All tags" not in outputTags and outputTags
result = ['<pre>']
for line in old_record_textmarc.splitlines()[:-1]:
if not filter_tags or line.split()[0].replace('_', '') in outputTags:
result.append("%09d " % record_id + line.strip())
result.append('</pre>')
result = '\n'.join(result)
else:
if update_commands and checked:
# No coloring of modifications in this case
xml_record = bibrecord.record_xml_output(updated_record)
else:
xml_record = bibrecord.record_xml_output(old_record)
result = bibformat.format_record(recID=None,
of=output_format,
xml_record=xml_record,
ln=language)
return result
# FIXME: Remove this method as soon as the formatting for MARC is
# implemented in bibformat
def _create_marc(records_xml):
"""Creates MARC from MARCXML.
@param records_xml: MARCXML containing information about the records
@return: string containing information about the records
in MARC format
"""
aleph_marc_output = ""
records = bibrecord.create_records(records_xml)
for (record, status_code, list_of_errors) in records:
sysno = ""
options = {"aleph-marc":0, "correct-mode":1, "append-mode":0,
"delete-mode":0, "insert-mode":0, "replace-mode":0,
"text-marc":1}
aleph_record = xmlmarc2textmarc.create_marc_record(record,
sysno,
options)
aleph_marc_output += aleph_record
return aleph_marc_output
def _submit_changes_to_bibupload(search_criteria, update_commands, upload_mode, tag_list, collection, req, checked_records):
"""This methods takes care of submitting the changes to the server
through bibupload.
@param search_criteria: the search criteria used for filtering the
records. The changes will be applied to all the records matching
the criteria
@param update_commands: the commands defining the changes. These
commands perform the necessary changes before the records are submitted
"""
if collection == "Any collection":
collection = ""
record_IDs = search_engine.perform_request_search(p=search_criteria, c=collection)
num_records = len(record_IDs)
updated_records = []
# Intersection of record_IDs list and checked_records
id_and_checked = list(set(record_IDs) & set(checked_records))
for current_id in id_and_checked:
current_updated_record = _get_updated_record(current_id, update_commands)
updated_records.append(current_updated_record)
file_path = _get_file_path_for_bibupload()
_save_records_xml(updated_records, file_path, upload_mode, tag_list)
return _upload_file_with_bibupload(file_path, upload_mode, num_records, req)
def _get_updated_record(record_id, update_commands):
"""Applies all the changes specified by the commands
to record identified by record_id and returns resulting record
@param record_id: identifier of the record that will be updated
@param update_commands: list of commands used to update record contents
@return: updated record structure"""
record = search_engine.get_record(recid=record_id)
for current_command in update_commands:
current_command.process_record(record)
return record
def _upload_file_with_bibupload(file_path, upload_mode, num_records, req):
"""
Uploads file with bibupload
@param file_path: path to the file where the XML will be saved.
@param upload_mode: -c for correct or -r for replace
@return tuple formed by status of the upload:
0-changes to be made instantly
1-changes to be made only in limited hours
2-user is superadmin. Changes made in limited hours
3-no rights to upload
and the upload file path
"""
if num_records < CFG_BIBEDITMULTI_LIMIT_INSTANT_PROCESSING:
task_low_level_submission('bibupload', 'multiedit', '-P', '5', upload_mode, '%s' % file_path)
return (0, file_path)
elif num_records < CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING:
task_low_level_submission('bibupload', 'multiedit', '-P', '5', upload_mode, '-L', CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING_TIME,'%s' % file_path)
return (1, file_path)
else:
user_info = collect_user_info(req)
if isUserSuperAdmin(user_info):
task_low_level_submission('bibupload', 'multiedit', '-P', '5', upload_mode, '-L', CFG_BIBEDITMULTI_LIMIT_DELAYED_PROCESSING_TIME, '%s' % file_path)
return (2, file_path)
return (3, file_path)
def _get_file_path_for_bibupload():
"""Returns file path for saving a file for bibupload """
current_time = strftime("%Y%m%d%H%M%S")
return "%s/%s_%s%s" % (CFG_TMPSHAREDDIR, "multiedit", current_time, ".xml")
def _save_records_xml(records, file_path, upload_mode, tag_list):
"""Saves records in a file in XML format
@param records: list of records (record structures)
@param file_path: path to the file where the XML will be saved."""
output_file = None
try:
output_file = open(file_path, "w")
if upload_mode == "-c":
for record in records:
for tag in record.keys():
if tag not in tag_list:
del(record[tag])
records_xml = bibrecord.print_recs(records)
output_file.write(records_xml)
finally:
if output_file is not None:
output_file.close()
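In correct mode (`'-c'`), only whitelisted tags survive before the records are serialized. A standalone sketch of that filtering step on plain dicts (the helper name is illustrative, not part of this module):

```python
def keep_only(record, tag_list):
    """Drop every tag not in tag_list (mirrors the '-c' upload filtering)."""
    # Copy the keys first, since we delete from the dict while iterating.
    for tag in list(record.keys()):
        if tag not in tag_list:
            del record[tag]
    return record

rec = keep_only({'100': ['a'], '700': ['b'], '980': ['c']}, ['100', '980'])
# -> {'100': ['a'], '980': ['c']}
```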
diff --git a/invenio/legacy/bibexport/daemon.py b/invenio/legacy/bibexport/daemon.py
index cd98d9167..5f5f18142 100644
--- a/invenio/legacy/bibexport/daemon.py
+++ b/invenio/legacy/bibexport/daemon.py
@@ -1,147 +1,147 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibExport daemon.
Usage: %s [options]
Scheduling options:
-u, --user=USER user name to store task, password needed
-s, --sleeptime=SLEEP time after which to repeat tasks (no)
e.g.: 1s, 30m, 24h, 7d
-t, --time=TIME moment for the task to be active (now)
e.g.: +15s, 5m, 3h , 2002-10-27 13:57:26
General options:
-h, --help print this help and exit
-V, --version print version and exit
-v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
"""
__revision__ = "$Id$"
import os
import sys
from ConfigParser import ConfigParser
from invenio.config import CFG_ETCDIR
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import task_init, write_message, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, write_message, task_set_option, \
task_get_option, task_has_option, task_get_task_param
def _detect_jobs_to_run(string_of_jobnames=None):
"""Detect which jobs to run from optional string of jobs.
If not passed, run all jobs.
Return list of jobnames to run."""
if string_of_jobnames:
jobnames = string_of_jobnames.split(',')
else:
jobnames = []
# FIXME: pay attention to periodicity; extract only jobs needed to run
res = run_sql("SELECT jobname FROM expJOB")
for row in res:
jobnames.append(row[0])
return jobnames
def _detect_export_method(jobname):
"""Detect export method of JOBNAME. Basically, parse JOBNAME.cfg
and return export_method. Return None if problem found."""
jobconf = ConfigParser()
jobconffile = CFG_ETCDIR + os.sep + 'bibexport' + os.sep + jobname + '.cfg'
if not os.path.exists(jobconffile):
write_message("ERROR: cannot find config file %s." % jobconffile, sys.stderr)
return None
jobconf.read(jobconffile)
export_method = jobconf.get('export_job', 'export_method')
return export_method
def _update_job_lastrun_time(jobname):
"""Update expJOB table and set lastrun time of JOBNAME to the task
starting time."""
run_sql("UPDATE expJOB SET lastrun=%s WHERE jobname=%s",
(task_get_task_param('task_starting_time'), jobname,))
def task_run_core():
"""
Runs the task by fetching arguments from the BibSched task queue. This is
what BibSched will be invoking via daemon call.
"""
errors_encountered_p = False
jobnames = _detect_jobs_to_run(task_get_option('wjob'))
for jobname in jobnames:
jobname_export_method = _detect_export_method(jobname)
if not jobname_export_method:
write_message("ERROR: cannot detect export method for job %s." % jobname, sys.stderr)
errors_encountered_p = True
else:
try:
# every bibexport method must define run_export_job() that will do the job
exec "from invenio.bibexport_method_%s import run_export_method" % jobname_export_method
write_message("started export job " + jobname, verbose=3)
# pylint: disable=E0602
# The import is done via the exec command 2 lines above.
run_export_method(jobname)
# pylint: enable=E0602
_update_job_lastrun_time(jobname)
write_message("finished export job " + jobname, verbose=3)
except Exception, msg:
write_message("ERROR: cannot run export job %s: %s." % (jobname, msg), sys.stderr)
errors_encountered_p = True
return not errors_encountered_p
def task_submit_check_options():
"""Check that options are valid."""
if task_has_option('wjob'):
jobnames = task_get_option('wjob')
if jobnames:
jobnames = jobnames.split(',')
for jobname in jobnames:
res = run_sql("SELECT COUNT(*) FROM expJOB WHERE jobname=%s", (jobname,))
if res and res[0][0]:
# okay, jobname exists
pass
else:
write_message("Sorry, job name %s is not known. Exiting." % jobname)
return False
return True
def task_submit_elaborate_specific_parameter(key, value, opts, args):
"""Usual 'elaboration' of task specific parameters adapted to the bibexport task."""
if key in ("-w", "--wjob"):
task_set_option("wjob", value)
else:
return False
return True
def main():
"""Main function that constructs full bibtask."""
task_init(authorization_action='runbibexport',
authorization_msg="BibExport Task Submission",
help_specific_usage="""Export options:
-w, --wjob=j1[,j2]\tRun specific exporting jobs j1, j2, etc (e.g. 'sitemap').
""",
version=__revision__,
specific_params=("w:", ["wjob=",]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=task_submit_check_options,
task_run_fnc=task_run_core)
if __name__ == "__main__":
_detect_export_method("sitemap")
main()
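The `exec`-based import in `task_run_core()` resolves a plugin module named after the export method at run time. A rough equivalent using `importlib` (the module name pattern follows the daemon's `invenio.bibexport_method_<method>` convention; the fake plugin registered below is purely illustrative):

```python
import importlib
import sys
import types

def load_export_method(export_method, module_pattern="invenio.bibexport_method_%s"):
    """Import the plugin module for an export method and return its
    run_export_method callable, or None if it cannot be resolved."""
    try:
        module = importlib.import_module(module_pattern % export_method)
    except ImportError:
        return None
    return getattr(module, "run_export_method", None)

# Illustrative only: register a fake plugin module and dispatch to it.
_fake = types.ModuleType("fake_export_plugin")
_fake.run_export_method = lambda jobname: "exported " + jobname
sys.modules["fake_export_plugin"] = _fake
result = load_export_method("fake_export_plugin", "%s")("sitemap")
missing = load_export_method("no_such_plugin_xyz", "%s")
```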
diff --git a/invenio/legacy/bibexport/fieldexporter.py b/invenio/legacy/bibexport/fieldexporter.py
index 58dc61dc4..000baf326 100644
--- a/invenio/legacy/bibexport/fieldexporter.py
+++ b/invenio/legacy/bibexport/fieldexporter.py
@@ -1,556 +1,556 @@
# -*- coding: utf-8 -*-
## $Id: search_engine_query_parser.py,v 1.12 2008/06/13 15:35:13 rivanov Exp $
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Invenio Search Engine query parsers."""
__lastupdated__ = """$Date: 2008/06/13 15:35:13 $"""
__revision__ = "$Id: search_engine_query_parser.py,v 1.12 2008/06/13 15:35:13 rivanov Exp $"
-from invenio.bibtask import write_message
+from invenio.legacy.bibsched.bibtask import write_message
# imports used in FieldExporter class
from invenio.legacy import search_engine
from invenio.legacy import bibrecord
from invenio import bibdocfile
import os
# imports used in perform_request_... methods
from invenio.config import CFG_SITE_LANG
from invenio import bibexport_method_fieldexporter_dblayer as fieldexporter_dblayer
from invenio.legacy import template
fieldexporter_templates = template.load('bibexport_method_fieldexporter')
from invenio.base.i18n import gettext_set_language
def run_export_method(jobname):
"""Main function, reading params and running the task."""
write_message("bibexport_fieldexporter: job %s started." % jobname)
job = fieldexporter_dblayer.get_job_by_name(jobname)
job_result = _run_job(job)
if job_result.STATUS_CODE_OK != job_result.get_status():
error_message = job_result.get_status_message()
write_message("Error during %s execution. Error message: %s" % (jobname, error_message) )
write_message("bibexport_fieldexporter: job %s started." % jobname)
def _run_job(job):
"""Execute a job and saves the results
@param job: Job object containing inforamtion about the job
@return: JobResult object containing informatoin about the result
of job execution
"""
exporter = FieldExporter()
job_result = exporter.execute_job(job)
fieldexporter_dblayer.save_job_result(job_result)
return job_result
class FieldExporter:
"""Provides mothods for exporting given fields from
records corresponding to a given search criteria.
It provides also methods for transforming the resulting
MARC XML into other formats.
"""
def __init__(self):
"""Nothing to init"""
pass
def _export_fields(self, search_criteria, output_fields):
"""Export fields that are among output_fields from
all the records that match the search criteria.
@param search_criteria: combination of search terms in Invenio
@param output_fields: list of fields that should remain in the records
@return: MARC XML with records containing only the fields that are
among output fields
"""
records = self._get_records(search_criteria)
filtered_xml = self._filter_records_fields(records, output_fields)
return filtered_xml
def execute_query(self, query):
"""Executes a query and returns the result of execution.
@param query: Query object containing information about the query.
@return: QueryResult object containing the result.
"""
search_criteria = query.get_search_criteria()
output_fields = query.get_output_fields()
xml_result = self._export_fields(search_criteria, output_fields)
query_result = fieldexporter_dblayer.QueryResult(query, xml_result)
return query_result
def execute_job(self, job):
"""Executes a job and returns the result of execution.
@param job: Job object containing information about the job.
@return: JobResult object containing the result.
"""
job_result = fieldexporter_dblayer.JobResult(job)
job_queries = fieldexporter_dblayer.get_job_queries(job.get_id())
for current_query in job_queries:
current_query_result = self.execute_query(current_query)
job_result.add_query_result(current_query_result)
return job_result
def _get_records(self, search_criteria):
"""Creates MARC XML containing all the records corresponding
to a given search criteria.
@param search_criteria: combination of search terms in Invenio
@return: MARC XML containing all the records corresponding
to the search criteria"""
record_IDs = search_engine.perform_request_search(p = search_criteria)
records_XML = self._create_records_xml(record_IDs)
return records_XML
def _filter_records_fields(self, records_xml, output_fields):
"""Leaves in the records only fields that are necessary.
All the other fields are removed from the records.
@param records_xml: MARC XML containing all the information about the records
@param output_fields: list of fields that should remain in the records
@return: MARC XML with records containing only fields that are
in output_fields list.
"""
# Add 001/970 to the output fields. 970 is necessary for system number
# extraction when exporting in aleph marc. When we add more formats,
# we can add it optionally only when exporting aleph marc.
output_fields.append("001")
output_fields.append("970")
records = bibrecord.create_records(records_xml)
output_records = []
for (record, status_code, list_of_errors) in records:
record = self._filter_fields(record, output_fields)
# do not return empty records
if not self._is_record_empty(record):
output_records.append(record)
output_xml = bibrecord.print_recs(output_records)
return output_xml
def _is_record_empty(self, record):
"""Check if a record is empty.
We assume that record is empty if all the values of the
tags are empty lists or the record dictionary itself is empty.
@param record: record structure (@see: bibrecord.py for details)
@return True if the record is empty
"""
for value in record.values():
if len(value) > 0:
return False
return True
def _filter_fields(self, record, output_fields):
"""Removes from the record all the fields
that are not output_fields.
@param record: record structure (@see: bibrecord.py for details)
@param output_fields: list of fields that should remain in the record
@return: record containing only fields among output_fields
"""
# Tibor's new implementation:
for tag in record.keys():
if tag not in output_fields:
bibrecord.record_delete_fields(record, tag)
return record
# Rado's old implementation that leads to bibrecord-related
# bug, see <https://savannah.cern.ch/task/?10267>:
record_keys = record.keys()
# Check if any of the tags, fields or subfields match
# any value in output_fields. In case of match we leave
# the element and its children in the record.
#
# If the element and all its children are not among the
# output fields, it is deleted
for tag in record_keys:
tag = tag.lower()
if tag not in output_fields:
for (subfields, ind1, ind2, value, field_number) in record[tag]:
current_field = tag + ind1.strip() + ind2.strip()
current_field = current_field.lower()
if current_field not in output_fields:
delete_parents = True
for (code, value) in subfields:
current_subfield = current_field + code
current_subfield = current_subfield.lower()
if current_subfield not in output_fields:
bibrecord.record_delete_subfield(record, tag, code, ind1, ind2)
else:
delete_parents = False
if delete_parents:
bibrecord.record_delete_field(record, tag, ind1, ind2)
return record
def _create_records_xml(self, record_IDs):
"""Creates XML containing all the information
for the records with the given identifiers
@param record_IDs: list of identifiers of records
@return: MARC XML containing all the information about the records
"""
output_xml = "<collection>"
for record_id in record_IDs:
record_xml = search_engine.print_record(recID = record_id, format = "xm")
output_xml += record_xml
output_xml += "</collection>"
return output_xml
def get_css():
"""Returns the CSS for field exporter pages."""
return fieldexporter_templates.tmpl_styles()
def get_navigation_menu(language = CFG_SITE_LANG):
"""Returns HTML reresenting the navigation menu
of field exporter
@param language: language of the page
"""
return fieldexporter_templates.tmpl_navigation_menu(language)
def perform_request_new_job(language = CFG_SITE_LANG):
"""Displays a page for creation of a new job.
@param language: language of the page
"""
job = fieldexporter_dblayer.Job()
return fieldexporter_templates.tmpl_edit_job(job, language = language)
def perform_request_edit_job(job_id, user_id, language = CFG_SITE_LANG):
"""Displays a page where the user can edit information
about a job.
@param job_id: identifier of the job that will be edited
@param user_id: identifier of the user
@param language: language of the page
"""
_check_user_ownership_on_job(user_id, job_id, language)
job = fieldexporter_dblayer.get_job(job_id)
return fieldexporter_templates.tmpl_edit_job(job, language = language)
def perform_request_save_job(job, user_id, language = CFG_SITE_LANG):
"""Saves a job.
@param job: Object containing information about the job
@param user_id: identifier of the user saving the job
@param language: language of the page
@return: identifier of the job
"""
job_id = job.get_id()
_check_user_ownership_on_job(user_id, job_id, language)
return fieldexporter_dblayer.save_job(user_id, job)
def perform_request_delete_jobs(job_ids, user_id, language = CFG_SITE_LANG):
"""Deletes all the jobs which ids are given as a parameter.
@param job_ids: list with identifiers of jobs that have to be deleted
@param user_id: identifier of the user deleting the jobs
@param language: language of the page
"""
for job_id in job_ids:
_check_user_ownership_on_job(user_id, job_id, language)
fieldexporter_dblayer.delete_job(job_id)
def perform_request_run_jobs(job_ids, user_id, language = CFG_SITE_LANG):
"""Runs all the jobs which ids are given as a parameter
@param job_ids: list with identifiers of jobs that have to be run
@param user_id: identifier of the user running the jobs
@param language: language of the page
"""
for current_job_id in job_ids:
_check_user_ownership_on_job(user_id, current_job_id, language)
current_job = fieldexporter_dblayer.get_job(current_job_id)
_run_job(current_job)
def perform_request_jobs(user_id, language = CFG_SITE_LANG):
"""Displays a page containing list of all
jobs of the current user
@param user_id: identifier of the user owning the jobs
@param language: language of the page
"""
all_jobs = fieldexporter_dblayer.get_all_jobs(user_id)
return fieldexporter_templates.tmpl_display_jobs(jobs = all_jobs, language = language)
def perform_request_job_queries(job_id, user_id, language = CFG_SITE_LANG):
"""Displays a page containing list of all
all queries for a given job
@param job_id: identifier of the job containing the queries
@param user_id: identifier of the current user
@param language: language of the page
"""
_check_user_ownership_on_job(user_id, job_id, language)
queries = fieldexporter_dblayer.get_job_queries(job_id)
return fieldexporter_templates.tmpl_display_job_queries(job_queries = queries,
job_id = job_id,
language = language)
def perform_request_new_query(job_id, user_id, language = CFG_SITE_LANG):
"""Displays a page for creation of new query.
@param job_id: identifier of the job containing the query
@param user_id: identifier of user creating the query
@param language: language of the page
"""
_check_user_ownership_on_job(user_id, job_id, language)
query = fieldexporter_dblayer.Query()
return fieldexporter_templates.tmpl_edit_query(query, job_id, language)
def perform_request_edit_query(query_id, job_id, user_id, language = CFG_SITE_LANG):
"""Displays a page where the user can edit information
about a job.
@param query_id: identifier of the query that will be edited
@param job_id: identifier of the job containing the query
@param user_id: identifier of the user editing the query
@param language: language of the page
"""
_check_user_ownership_on_job(user_id, job_id, language)
_check_user_ownership_on_query(user_id, query_id, language)
query = fieldexporter_dblayer.get_query(query_id)
return fieldexporter_templates.tmpl_edit_query(query, job_id, language)
def perform_request_save_query(query, job_id, user_id, language = CFG_SITE_LANG):
"""Saves a query in database.
@param query: Query objectect containing the necessary informatoin
@param job_id: identifier of the job containing the query
@param user_id: identifier of the user saving the query
@param language: language of the page
"""
_check_user_ownership_on_job(user_id, job_id, language)
_check_user_ownership_on_query(user_id, query.get_id(), language)
fieldexporter_dblayer.save_query(query, job_id)
def perform_request_delete_queries(query_ids, user_id, language = CFG_SITE_LANG):
"""Deletes all the queries which ids are given as a parameter.
@param query_ids: list with identifiers of queries that have to be deleted
@param user_id: identifier of the user deleting the queries
@param language: language of the page
"""
for query_id in query_ids:
_check_user_ownership_on_query(user_id, query_id, language)
fieldexporter_dblayer.delete_query(query_id)
def perform_request_run_queries(query_ids, user_id, job_id, language = CFG_SITE_LANG):
"""Displays a page contining results from execution of given queries.
@param query_ids: list of query identifiers
@param user_id: identifier of the user running the queries
@param language: language of the page
"""
exporter = FieldExporter()
_check_user_ownership_on_job(user_id, job_id, language)
job = fieldexporter_dblayer.get_job(job_id)
job_result = fieldexporter_dblayer.JobResult(job)
queries_results = []
for current_id in query_ids:
_check_user_ownership_on_query(user_id, current_id, language)
current_query = fieldexporter_dblayer.get_query(current_id)
current_result = exporter.execute_query(current_query)
job_result.add_query_result(current_result)
return fieldexporter_templates.tmpl_display_queries_results(job_result, language)
def perform_request_job_history(user_id, language = CFG_SITE_LANG):
"""Displays a page containing information about the executed jobs.
@param user_id: identifier of the user owning the results
@param language: language of the page
"""
job_result_identifiers = fieldexporter_dblayer.get_all_job_result_ids(user_id = user_id)
job_results = fieldexporter_dblayer.get_job_results(job_result_identifiers)
return fieldexporter_templates.tmpl_display_job_history(job_results, language)
def perform_request_job_results(job_result_id, user_id, language = CFG_SITE_LANG):
"""Displays a page with information about the results of a particular job.
@param job_result_id: identifier of the job result that should be displayed
@param user_id: identifier of the current user
@param language: language of the page
"""
_check_user_ownership_on_job_result(user_id, job_result_id, language)
job_result = fieldexporter_dblayer.get_job_result(job_result_id)
return fieldexporter_templates.tmpl_display_job_result_information(job_result, language)
def perform_request_download_job_result(req, job_result_id, output_format, user_id, language = CFG_SITE_LANG):
"""
Returns to the browser a zip file containing the content of the job result
@param req: request as received from apache
@param job_result_id: identifier of the job result that should be displayed
@param user_id: identifier of the current user
@param language: language of the page
@param output_format: format for downloading the result
"""
_check_user_ownership_on_job_result(user_id, job_result_id, language)
job_result = fieldexporter_dblayer.get_job_result(job_result_id)
if output_format != fieldexporter_dblayer.Job.OUTPUT_FORMAT_MISSING:
job_result.get_job().set_output_format(output_format)
download_file_name = "result.zip"
temp_zip_file_path = ""
try:
temp_zip_file_path = fieldexporter_dblayer.create_temporary_zip_file_with_job_result(job_result)
bibdocfile.stream_file(req, temp_zip_file_path, download_file_name)
finally:
if os.path.exists(temp_zip_file_path):
os.remove(temp_zip_file_path)
def perform_request_display_job_result(job_result_id, output_format, user_id, language = CFG_SITE_LANG):
"""Displays a page with the results of a particular job.
@param job_result_id: identifier of the job result that should be displayed
@param user_id: identifier of the current user
@param language: language of the page
"""
_check_user_ownership_on_job_result(user_id, job_result_id, language)
job_result = fieldexporter_dblayer.get_job_result(job_result_id)
if output_format != fieldexporter_dblayer.Job.OUTPUT_FORMAT_MISSING:
job_result.get_job().set_output_format(output_format)
return fieldexporter_templates.tmpl_display_queries_results(job_result, language)
def _check_user_ownership_on_job(user_id, job_id, language = CFG_SITE_LANG):
"""Check if user owns a job. In case user is not the owner, exception is thrown.
@param user_id: identifier of the user
@param job_id: identifier of the job
@param language: language of the page
"""
if fieldexporter_dblayer.Job.ID_MISSING == job_id:
return
if not fieldexporter_dblayer.is_user_owner_of_job(user_id, job_id):
_ = gettext_set_language(language)
error_message = _("You are not authorised to access this resource.")
raise AccessDeniedError(error_message)
def _check_user_ownership_on_job_result(user_id, job_result_id, language = CFG_SITE_LANG):
"""Check if user owns a job result. In case user is not the owner, exception is thrown.
@param user_id: identifier of the user
@param job_result_id: identifier of the job result
@param language: language of the page
"""
if fieldexporter_dblayer.JobResult.ID_MISSING == job_result_id:
return
if not fieldexporter_dblayer.is_user_owner_of_job_result(user_id, job_result_id):
_ = gettext_set_language(language)
error_message = _("You are not authorised to access this resource.")
raise AccessDeniedError(error_message)
def _check_user_ownership_on_query(user_id, query_id, language = CFG_SITE_LANG):
"""Check if user owns a job result. In case user is not the owner, exception is thrown.
@param user_id: identifier of the user
@param job_result_id: identifier of the job result
@param language: language of the page
"""
if fieldexporter_dblayer.Query.ID_MISSING == query_id:
return
if not fieldexporter_dblayer.is_user_owner_of_query(user_id, query_id):
_ = gettext_set_language(language)
error_message = _("You are not authorised to access this resource.")
raise AccessDeniedError(error_message)
class AccessDeniedError(Exception):
"""Exception indicating an error during exportting for Google scholar."""
_error_message = ""
_inner_exception = None
def __init__(self, error_message, inner_exception = None):
"""Constructor of the exception"""
Exception.__init__(self, error_message, inner_exception)
self._error_message = error_message
self._inner_exception = inner_exception
def get_error_message(self):
"""Returns the error message that explains the reason for the exception"""
return self._error_message
def get_inner_exception(self):
"""Returns the inner exception that is the cause for the current exception"""
return self._inner_exception
def __str__(self):
"""Returns string representation"""
return self._error_message
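The filtering pipeline in `FieldExporter._filter_records_fields` above always keeps control fields 001 and 970 and drops records that end up empty. A compact sketch of that behaviour under the same assumptions (plain dicts stand in for bibrecord structures; `filter_records` is a hypothetical helper):

```python
# Fields 001 and 970 are always exported; 970 carries the system number
# needed for Aleph MARC export, as noted in the code above.
ALWAYS_KEPT = ("001", "970")

def filter_records(records, output_fields):
    """Keep only the requested tags in each record; drop empty records."""
    keep = set(output_fields) | set(ALWAYS_KEPT)
    filtered = []
    for record in records:
        slimmed = {tag: fields for tag, fields in record.items() if tag in keep}
        # mirror _is_record_empty: skip records whose field lists are all empty
        if any(len(fields) > 0 for fields in slimmed.values()):
            filtered.append(slimmed)
    return filtered

records = [{"001": [["9"]], "245": [["A title"]], "700": [["X"]]},
           {"700": [["Y"]]}]
slim = filter_records(records, ["245"])
```

The second record loses its only tag and is therefore dropped, matching the "do not return empty records" rule above.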
diff --git a/invenio/legacy/bibexport/googlescholar.py b/invenio/legacy/bibexport/googlescholar.py
index 903bc6ca7..e8c2aced2 100644
--- a/invenio/legacy/bibexport/googlescholar.py
+++ b/invenio/legacy/bibexport/googlescholar.py
@@ -1,291 +1,291 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibExport plugin implementing 'googlescholar' exporting method.
The main function is run_export_method(jobname) defined at the end.
This is what BibExport daemon calls for all the export jobs that use
this exporting method.
The Google Scholar exporting method answers this use case: every first
of the month, please export all records modified during the last month
and matching these search criteria in an NLM format in such a way that
the output is split into files containing not more than 1000 records
and compressed via gzip and placed in this place from where Google
Scholar would fetch them. The output files would be organized like
this:
* all exportable records:
/export/googlescholar/all-index.html - links to parts below
/export/googlescholar/all-part1.xml.gz - first batch of 1000 records
/export/googlescholar/all-part2.xml.gz - second batch of 1000 records
...
/export/googlescholar/all-partM.xml.gz - last batch of 1000 records
* records modified in the last month:
/export/googlescholar/lastmonth-index.html - links to parts below
/export/googlescholar/lastmonth-part1.xml.gz - first batch of 1000 records
/export/googlescholar/lastmonth-part2.xml.gz - second batch of 1000 records
...
/export/googlescholar/lastmonth-partN.xml.gz - last batch of 1000 records
"""
from invenio.config import CFG_WEBDIR, CFG_CERN_SITE
-from invenio.bibtask import write_message
+from invenio.legacy.bibsched.bibtask import write_message
from invenio.legacy.search_engine import perform_request_search, print_record
import os
import gzip
import datetime
def run_export_method(jobname):
"""Main function, reading params and running the task."""
# FIXME: read jobname's cfg file to detect collection and fulltext status arguments
write_message("bibexport_sitemap: job %s started." % jobname)
try:
output_directory = CFG_WEBDIR + os.sep + "export" + os.sep + "googlescholar"
exporter = GoogleScholarExporter(output_directory)
exporter.export()
except GoogleScholarExportException, ex:
write_message("%s Exception: %s" %(ex.get_error_message(), ex.get_inner_exception()))
write_message("bibexport_sitemap: job %s finished." % jobname)
class GoogleScholarExporter:
"""Export data for google scholar"""
_output_directory = ""
_records_with_fulltext_only = True
#FIXME: Read collections from configuration file
_collections = ["Theses"]
if CFG_CERN_SITE:
_collections = ["CERN Theses"]
def __init__(self, output_directory):
"""Constructor of GoogleScholarExporter
output_directory - directory where files will be placed
"""
self.set_output_directory(output_directory)
def export(self):
"""Export all records and records modified last month"""
LAST_MONTH_FILE_NAME_PATTERN = "lastmonth"
ALL_MONTH_FILE_NAME_PATTERN = "all"
SPLIT_BY_RECORDS = 1000
# Export records modified last month
records = self._get_records_modified_last_month()
self._delete_files(self._output_directory, LAST_MONTH_FILE_NAME_PATTERN)
self._split_records_into_files(records, SPLIT_BY_RECORDS, LAST_MONTH_FILE_NAME_PATTERN, self._output_directory)
# Export all records
all_records = self._get_all_records()
self._delete_files(self._output_directory, ALL_MONTH_FILE_NAME_PATTERN)
self._split_records_into_files(all_records, SPLIT_BY_RECORDS, ALL_MONTH_FILE_NAME_PATTERN, self._output_directory)
def set_output_directory(self, path_to_directory):
"""Check if directory exists. If it does not exists it creates it."""
directory = path_to_directory
# remove the trailing slash from the path, if present
if directory[-1] == os.sep:
directory = directory[:-1]
# if the directory does not exist, create it
if not os.path.exists(directory):
try:
os.makedirs(directory)
except(IOError, OSError), exception:
self._report_error("Directory %s does not exist and cannot be ctreated." % (directory, ), exception)
# if it is not path to a directory report an error
if not os.path.isdir(directory):
self._report_error("%s is not a directory." % (directory, ))
return
self._output_directory = directory
def _get_records_modified_last_month(self):
"""Returns all records modified last month and matching the criteria."""
current_date = datetime.date.today()
one_month_ago = current_date - datetime.timedelta(days = 31)
#FIXME: Return only records with full texts available for Google Scholar
#FIXME: There is a problem with searching in modification date. It searches only in creation date
return perform_request_search(dt="m", c = self._collections, d1y = one_month_ago.year, d1m = one_month_ago.month, d1d = one_month_ago.day)
def _get_all_records(self):
"""Return all records matching the criteria no matter of their modification date."""
#FIXME: Return only records with full texts available for Google Scholar
return perform_request_search(c = self._collections)
def _split_records_into_files(self, records, max_records_per_file, file_name_pattern, output_directory):
"""Split and save records into files containing not more than max_records_per_file records.
records - list of record numbers
max_records_per_file - the maximum number of records per file
file_name_pattern - the pattern used to name the files. Filenames will start with this
pattern.
output_directory - directory where all the files will be placed
"""
file_number = 1
file_name = self._get_part_file_name(file_name_pattern, file_number)
begin = 0
number_of_records = len(records)
if 0 == number_of_records:
return
for end in xrange(max_records_per_file, number_of_records, max_records_per_file):
self._save_records_into_file(records[begin:end], file_name, output_directory)
begin = end
file_number = file_number + 1
file_name = self._get_part_file_name(file_name_pattern, file_number)
if(begin != number_of_records):
self._save_records_into_file(records[begin:number_of_records], file_name, output_directory)
self._create_index_file(file_number, file_name_pattern, output_directory)
def _get_part_file_name(self, file_name_pattern, file_number):
"""Returns name of the file containing part of the records
file_name_pattern - the pattern used to create the filename
file_number - the number of the file in the sequence of files
The result is filename like lastmonth-part2.xml.gz
where lastmonth is the file_name_pattern and 2 is the file_number
"""
file_name = "%s-part%d.xml.gz" % (file_name_pattern, file_number)
return file_name
def _create_index_file(self, number_of_files, file_name_pattern, output_directory):
"""Creates HTML file containing links to all files containing records"""
try:
index_file = open(output_directory + os.sep +file_name_pattern+"-index.html", "w")
index_file.write("<html><body>\n")
for file_number in xrange(1, number_of_files + 1):
file_name = self._get_part_file_name(file_name_pattern, file_number)
index_file.write('<a href="%s">%s</a><br>\n' % (file_name, file_name))
index_file.write("</body></html>\n")
except (IOError, OSError), exception:
self._report_error("Failed to create index file.", exception)
if index_file is not None:
index_file.close()
def _save_records_into_file(self, records, file_name, output_directory):
"""Save all the records into file in proper format (currently
National Library of Medicine XML).
file_name - the name of the file where records will be saved
output_directory - directory where the file will be placed"""
output_file = self._open_output_file(file_name, output_directory)
self._write_to_output_file(output_file, "<articles>\n")
for record in records:
nlm_xml = self._get_record_NLM_XML(record)
output_file.write(nlm_xml)
self._write_to_output_file(output_file, "\n</articles>")
self._close_output_file(output_file)
def _open_output_file(self, file_name, output_directory):
"""Opens new file for writing.
file_name - the name of the file without the extention.
output_directory - the directory where file will be created"""
path = output_directory + os.sep + file_name
try:
output_file = gzip.GzipFile(filename = path, mode = "w")
return output_file
except (IOError, OSError), exception:
self._report_error("Failed to open file file %s." % (path, ), exception)
return None
def _close_output_file(self, output_file):
"""Closes the file"""
if output_file is None:
return
output_file.close()
def _write_to_output_file(self, output_file, text_to_write):
""""Wirtes a the text passed as a parameter to file"""
try:
output_file.write(text_to_write)
except (IOError, OSError), exception:
self._report_error("Failed to write to file " + output_file.name, exception)
def _get_record_NLM_XML(self, record):
"""Returns the record in National Library of Medicine XML format."""
return print_record(record, format='xn')
def _delete_files(self, path_to_directory, name_pattern):
"""Deletes files with file name starting with name_pattern
from directory specified by path_to_directory"""
files = os.listdir(path_to_directory)
for current_file in files:
if current_file.startswith(name_pattern):
path_to_file = path_to_directory + os.sep + current_file
os.remove(path_to_file)
def _report_error(self, error_message, exception = None):
"""Reprts an error during exprotring"""
raise GoogleScholarExportException(error_message, exception)
class GoogleScholarExportException(Exception):
"""Exception indicating an error during exportting for Google scholar."""
_error_message = ""
_inner_exception = None
def __init__(self, error_message, inner_exception = None):
"""Constructor of the exception"""
Exception.__init__(self, error_message, inner_exception)
self._error_message = error_message
self._inner_exception = inner_exception
def get_error_message(self):
"""Returns the error message that explains the reason for the exception"""
return self._error_message
def get_inner_exception(self):
"""Returns the inner exception that is the cause for the current exception"""
return self._inner_exception
diff --git a/invenio/legacy/bibexport/marcxml.py b/invenio/legacy/bibexport/marcxml.py
index a4fa7611b..583d983ed 100644
--- a/invenio/legacy/bibexport/marcxml.py
+++ b/invenio/legacy/bibexport/marcxml.py
@@ -1,224 +1,224 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibExport plugin implementing MARCXML exporting method.
The main function is run_export_method(jobname) defined at the end.
This is what BibExport daemon calls for all the export jobs that use
this exporting method.
The MARCXML exporting method exports as MARCXML all the records
matching a particular search query, zips them, and moves them to the
requested folder. The output of this exporting method is similar to
what one would get by listing the records in MARCXML from the web
search interface. The exporter also exports all the records modified
in the last month.
* all exportable records:
/export/marcxml/all_"export_name".xml.gz - where "export_name" is the name specified in the config
* records modified in the last month:
/export/marcxml/lastmonth_"export_name".xml.gz - where "export_name" is the name specified in the config
"""
from invenio.config import CFG_WEBDIR, CFG_ETCDIR
-from invenio.bibtask import write_message
+from invenio.legacy.bibsched.bibtask import write_message
from invenio.legacy.search_engine import perform_request_search, print_record
from ConfigParser import ConfigParser
import os
import gzip
import datetime
def run_export_method(jobname):
"""Main function, reading params and running the task."""
# read jobname's cfg file to detect the export criteria
jobconf = ConfigParser()
jobconffile = CFG_ETCDIR + os.sep + 'bibexport' + os.sep + jobname + '.cfg'
if not os.path.exists(jobconffile):
write_message("ERROR: cannot find config file %s." % jobconffile)
return None
jobconf.read(jobconffile)
export_criterias = dict(jobconf.items('export_criterias'))
write_message("bibexport_marcxml: job %s started." % jobname)
try:
output_directory = CFG_WEBDIR + os.sep + "export" + os.sep + "marcxml"
exporter = MARCXMLExporter(output_directory, export_criterias)
exporter.export()
except MARCXMLExportException, ex:
write_message("%s Exception: %s" %(ex.get_error_message(), ex.get_inner_exception()))
write_message("bibexport_marcxml: job %s finished." % jobname)
class MARCXMLExporter:
"""Export data to MARCXML"""
_output_directory = ""
_export_criterias = {}
def __init__(self, output_directory, export_criterias):
"""Constructor of MARCXMLExporter
@param output_directory: directory where files will be placed
@param export_criterias: dictionary of names and associated search patterns
"""
self.set_output_directory(output_directory)
self._export_criterias = export_criterias
def export(self):
"""Export all records and records modified last month"""
for export_name, export_pattern in self._export_criterias.iteritems():
LAST_MONTH_FILE_NAME = "lastmonth_" + export_name + '.xml'
ALL_MONTH_FILE_NAME = "all_" + export_name + '.xml'
# Export records modified last month
records = self._get_records_modified_last_month(export_name, export_pattern)
self._delete_files(self._output_directory, LAST_MONTH_FILE_NAME)
#self._split_records_into_files(records, SPLIT_BY_RECORDS, LAST_MONTH_FILE_NAME_PATTERN, self._output_directory)
self._save_records_into_file(records, LAST_MONTH_FILE_NAME, self._output_directory)
# Export all records
all_records = self._get_all_records(export_name, export_pattern)
self._delete_files(self._output_directory, ALL_MONTH_FILE_NAME)
self._save_records_into_file(all_records, ALL_MONTH_FILE_NAME, self._output_directory)
def set_output_directory(self, path_to_directory):
"""Check if directory exists. If it does not exists it creates it."""
directory = path_to_directory
# remove the slash from the end of the path if exists
if directory[-1] == os.sep:
directory = directory[:-1]
# if directory does not exists then create it
if not os.path.exists(directory):
try:
os.makedirs(directory)
except(IOError, OSError), exception:
self._report_error("Directory %s does not exist and cannot be created." % (directory, ), exception)
# if it is not path to a directory report an error
if not os.path.isdir(directory):
self._report_error("%s is not a directory." % (directory, ))
return
self._output_directory = directory
def _get_records_modified_last_month(self, export_name, export_pattern):
"""Returns all records modified last month and matching the criteria."""
current_date = datetime.date.today()
one_month_ago = current_date - datetime.timedelta(days = 31)
return perform_request_search(dt="m", p=export_pattern, d1y = one_month_ago.year, d1m = one_month_ago.month, d1d = one_month_ago.day)
def _get_all_records(self, export_name, export_pattern):
"""Return all records matching the criteria no matter of their modification date."""
return perform_request_search(p=export_pattern)
def _save_records_into_file(self, records, file_name, output_directory):
"""Save all the records into file in MARCXML
file_name - the name of the file where records will be saved
output_directory - directory where the file will be placed"""
output_file = self._open_output_file(file_name, output_directory)
self._write_to_output_file(output_file,
'<?xml version="1.0" encoding="UTF-8"?>\n<collection xmlns="http://www.loc.gov/MARC21/slim">\n')
for record in records:
marcxml = self._get_record_MARCXML(record)
output_file.write(marcxml)
self._write_to_output_file(output_file, "\n</collection>")
self._close_output_file(output_file)
def _open_output_file(self, file_name, output_directory):
"""Opens new file for writing.
file_name - the name of the file without the extention.
output_directory - the directory where file will be created"""
path = output_directory + os.sep + file_name + '.gz'
try:
output_file = gzip.GzipFile(filename = path, mode = "w")
return output_file
except (IOError, OSError), exception:
self._report_error("Failed to open file file %s." % (path, ), exception)
return None
def _close_output_file(self, output_file):
"""Closes the file"""
if output_file is None:
return
output_file.close()
def _write_to_output_file(self, output_file, text_to_write):
""""Wirtes a the text passed as a parameter to file"""
try:
output_file.write(text_to_write)
except (IOError, OSError), exception:
self._report_error("Failed to write to file " + output_file.name, exception)
def _get_record_MARCXML(self, record):
"""Returns the record in MARCXML format."""
return print_record(record, format='xm')
def _delete_files(self, path_to_directory, name_pattern):
"""Deletes files with file name starting with name_pattern
from directory specified by path_to_directory"""
files = os.listdir(path_to_directory)
for current_file in files:
if current_file.startswith(name_pattern):
path_to_file = path_to_directory + os.sep + current_file
os.remove(path_to_file)
def _report_error(self, error_message, exception = None):
"""Reprts an error during exprotring"""
raise MARCXMLExportException(error_message, exception)
class MARCXMLExportException(Exception):
"""Exception indicating an error when exporting to MARCXML."""
_error_message = ""
_inner_exception = None
def __init__(self, error_message, inner_exception = None):
"""Constructor of the exception"""
Exception.__init__(self, error_message, inner_exception)
self._error_message = error_message
self._inner_exception = inner_exception
def get_error_message(self):
"""Returns the error message that explains the reason for the exception"""
return self._error_message
def get_inner_exception(self):
"""Returns the inner exception that is the cause for the current exception"""
return self._inner_exception
diff --git a/invenio/legacy/bibexport/sitemap.py b/invenio/legacy/bibexport/sitemap.py
index 758e13819..c5a789d78 100644
--- a/invenio/legacy/bibexport/sitemap.py
+++ b/invenio/legacy/bibexport/sitemap.py
@@ -1,422 +1,422 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibExport plugin implementing 'sitemap' exporting method.
The main function is run_export_method(jobname) defined at the end.
This is what BibExport daemon calls for all the export jobs that use
this exporting method.
"""
from datetime import datetime
from urllib import quote
from ConfigParser import ConfigParser
import os
import gzip
import time
from invenio.legacy.search_engine import get_collection_reclist
from invenio.legacy.dbquery import run_sql
from invenio.config import CFG_SITE_URL, CFG_WEBDIR, CFG_ETCDIR, \
CFG_SITE_RECORD, CFG_SITE_LANGS
from invenio.intbitset import intbitset
-from invenio.websearch_webcoll import Collection
-from invenio.bibtask import write_message, task_update_progress, task_sleep_now_if_required
+from invenio.legacy.websearch.webcoll import Collection
+from invenio.legacy.bibsched.bibtask import write_message, task_update_progress, task_sleep_now_if_required
from invenio.utils.text import encode_for_xml
from invenio.utils.url import get_canonical_and_alternates_urls
DEFAULT_TIMEZONE = '+01:00'
DEFAULT_PRIORITY_HOME = 1
DEFAULT_CHANGEFREQ_HOME = 'hourly'
DEFAULT_PRIORITY_RECORDS = 0.8
DEFAULT_CHANGEFREQ_RECORDS = 'weekly'
DEFAULT_PRIORITY_COMMENTS = 0.4
DEFAULT_CHANGEFREQ_COMMENTS = 'weekly'
DEFAULT_PRIORITY_REVIEWS = 0.6
DEFAULT_CHANGEFREQ_REVIEWS = 'weekly'
DEFAULT_PRIORITY_FULLTEXTS = 0.9
DEFAULT_CHANGEFREQ_FULLTEXTS = 'weekly'
DEFAULT_PRIORITY_COLLECTIONS = 0.3
DEFAULT_CHANGEFREQ_COLLECTIONS = 'hourly'
MAX_RECORDS = 50000
MAX_SIZE = 10000000
def get_all_public_records(collections):
""" Get all records which exist (i.e. not suppressed ones) and are in
accessible collection.
returns list of (recid, last_modification) tuples
"""
recids = intbitset()
for collection in collections:
recids += get_collection_reclist(collection)
query = 'SELECT id, modification_date FROM bibrec'
res = run_sql(query)
return [(recid, lastmod) for (recid, lastmod) in res if recid in recids]
def get_all_public_collections(base_collections):
""" Return a list of (collection.name, last_modification) tuples for all
collections and subcollections of base_collections
"""
def get_collection_last_modification(collection):
""" last modification = modification date fo latest added record """
last_mod = None
query_last_mod = "SELECT modification_date FROM bibrec WHERE id=%s"
try:
latest_recid = collection.reclist.tolist()[-1]
except IndexError:
# this collection is empty
return last_mod
res = run_sql(query_last_mod, (latest_recid,))
if res and res[0][0]:
last_mod = res[0][0]
return last_mod
output = []
for coll_name in base_collections:
mother_collection = Collection(coll_name)
if not mother_collection.restricted_p():
last_mod = get_collection_last_modification(mother_collection)
output.append((coll_name, last_mod))
for descendant in mother_collection.get_descendants(type='r'):
if not descendant.restricted_p():
last_mod = get_collection_last_modification(descendant)
output.append((descendant.name, last_mod))
for descendant in mother_collection.get_descendants(type='v'):
if not descendant.restricted_p():
last_mod = get_collection_last_modification(descendant)
output.append((descendant.name, last_mod))
return output
def filter_fulltexts(recids, fulltext_type=None):
""" returns list of records having a fulltext of type fulltext_type.
If fulltext_type is empty, return all records having a fulltext"""
recids = dict(recids)
if fulltext_type:
query = """SELECT id_bibrec, max(modification_date)
FROM bibrec_bibdoc
LEFT JOIN bibdoc ON bibrec_bibdoc.id_bibdoc=bibdoc.id
WHERE type=%s
GROUP BY id_bibrec"""
res = run_sql(query, (fulltext_type,))
else:
query = """SELECT id_bibrec, max(modification_date)
FROM bibrec_bibdoc
LEFT JOIN bibdoc ON bibrec_bibdoc.id_bibdoc=bibdoc.id
GROUP BY id_bibrec"""
res = run_sql(query)
return [(recid, lastmod) for (recid, lastmod) in res if recid in recids]
def filter_comments(recids):
""" Retrieve recids having a comment. return (recid, last_review_date)"""
recids = dict(recids)
query = """SELECT id_bibrec, max(date_creation)
FROM cmtRECORDCOMMENT
WHERE star_score=0
GROUP BY id_bibrec"""
res = run_sql(query)
return [(recid, lastmod) for (recid, lastmod) in res if recid in recids]
def filter_reviews(recids):
""" Retrieve recids having a review. return (recid, last_review_date)"""
recids = dict(recids)
query = """SELECT id_bibrec, max(date_creation)
FROM cmtRECORDCOMMENT
WHERE star_score>0
GROUP BY id_bibrec"""
res = run_sql(query)
return [(recid, lastmod) for (recid, lastmod) in res if recid in recids]
SITEMAP_HEADER = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">"""
SITEMAP_FOOTER = '\n</urlset>\n'
class SitemapWriter(object):
""" Writer for sitemaps"""
def __init__(self, sitemap_id):
""" Constructor.
name: path to the sitemap file to be created
"""
self.header = SITEMAP_HEADER
self.footer = SITEMAP_FOOTER
self.sitemap_id = sitemap_id
self.name = os.path.join(CFG_WEBDIR, 'sitemap-%02d.xml.gz' % sitemap_id)
self.filedescriptor = gzip.open(self.name + '.part', 'w')
self.num_urls = 0
self.file_size = 0
self.filedescriptor.write(self.header)
self.file_size += len(self.footer)
def add_url(self, url, lastmod=datetime(1900, 1, 1), changefreq="", priority="", alternate=False):
""" create a new url node. Returns the number of url nodes in sitemap"""
self.num_urls += 1
canonical_url, alternate_urls = get_canonical_and_alternates_urls(url, drop_ln=not alternate)
url_node = u"""
<url>
<loc>%s</loc>%s
</url>"""
optional = ''
if lastmod:
optional += u"""
<lastmod>%s</lastmod>""" % lastmod.strftime('%Y-%m-%dT%H:%M:%S' + \
DEFAULT_TIMEZONE)
if changefreq:
optional += u"""
<changefreq>%s</changefreq>""" % changefreq
if priority:
optional += u"""
<priority>%s</priority>""" % priority
if alternate:
for ln, alternate_url in alternate_urls.iteritems():
ln = ln.replace('_', '-') ## zh_CN -> zh-CN
optional += u"""
<xhtml:link rel="alternate" hreflang="%s" href="%s" />""" % (ln, encode_for_xml(alternate_url, quote=True))
url_node %= (encode_for_xml(canonical_url), optional)
self.file_size += len(url_node)
self.filedescriptor.write(url_node)
return self.num_urls
def get_size(self):
""" File size. Should not be > 10MB """
return self.file_size + len(self.footer)
def get_number_of_urls(self):
""" Number of urls in the sitemap. Should not be > 50'000"""
return self.num_urls
def get_name(self):
""" Returns the filename """
return self.name
def get_sitemap_url(self):
""" Returns the sitemap URL"""
return CFG_SITE_URL + '/' + os.path.basename(self.name)
def __del__(self):
""" Writes the whole sitemap """
self.filedescriptor.write(self.footer)
self.filedescriptor.close()
os.rename(self.name + '.part', self.name)
SITEMAP_INDEX_HEADER = \
'<?xml version="1.0" encoding="UTF-8"?>\n' \
'<sitemapindex\n' \
' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\n' \
' xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9\n' \
' http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"\n' \
' xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
SITEMAP_INDEX_FOOTER = '\n</sitemapindex>\n'
class SitemapIndexWriter(object):
"""class for writing Sitemap Index files."""
def __init__(self, name):
""" Constructor.
name: path to the sitemap index file to be created
"""
self.header = SITEMAP_INDEX_HEADER
self.footer = SITEMAP_INDEX_FOOTER
self.name = name
self.filedescriptor = gzip.open(self.name + '.part', 'w')
self.num_urls = 0
self.file_size = 0
self.filedescriptor.write(self.header)
self.file_size += len(self.footer)
def add_url(self, url):
""" create a new url node. Returns the number of url nodes in sitemap"""
self.num_urls += 1
url_node = u"""
<sitemap>
<loc>%s</loc>%s
</sitemap>"""
optional = u"""
<lastmod>%s</lastmod>""" % time.strftime('%Y-%m-%dT%H:%M:%S' +\
DEFAULT_TIMEZONE)
url_node %= (url, optional)
self.file_size += len(url_node)
self.filedescriptor.write(url_node)
return self.num_urls
def __del__(self):
""" Writes the whole sitemap """
self.filedescriptor.write(self.footer)
self.filedescriptor.close()
os.rename(self.name + '.part', self.name)
def generate_sitemaps(sitemap_index_writer, collection_names, fulltext_filter=''):
"""
Generate the sitemaps themselves and register each one in the sitemap index.
"""
sitemap_id = 1
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = 0
for lang in CFG_SITE_LANGS:
writer.add_url(CFG_SITE_URL + '/?ln=%s' % lang,
lastmod=datetime.today(),
changefreq=DEFAULT_CHANGEFREQ_HOME,
priority=DEFAULT_PRIORITY_HOME)
nb_urls += 1
write_message("... Getting all public records...")
recids = get_all_public_records(collection_names)
write_message("... Generating urls for %s records..." % len(recids))
task_sleep_now_if_required(can_stop_too=True)
for i, (recid, lastmod) in enumerate(recids):
if nb_urls % 100 == 0 and (writer.get_size() >= MAX_SIZE or nb_urls >= MAX_RECORDS):
sitemap_id += 1
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = writer.add_url(CFG_SITE_URL + '/%s/%s' % (CFG_SITE_RECORD, recid),
lastmod = lastmod,
changefreq = DEFAULT_CHANGEFREQ_RECORDS,
priority = DEFAULT_PRIORITY_RECORDS)
if i % 100 == 0:
task_update_progress("Sitemap for recid %s/%s" % (i + 1, len(recids)))
task_sleep_now_if_required(can_stop_too=True)
write_message("... Generating urls for collections...")
collections = get_all_public_collections(collection_names)
for i, (collection, lastmod) in enumerate(collections):
for lang in CFG_SITE_LANGS:
if nb_urls % 100 == 0 and (writer.get_size() >= MAX_SIZE or nb_urls >= MAX_RECORDS):
sitemap_id += 1
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = writer.add_url('%s/collection/%s?ln=%s' % (CFG_SITE_URL, quote(collection), lang),
lastmod = lastmod,
changefreq = DEFAULT_CHANGEFREQ_COLLECTIONS,
priority = DEFAULT_PRIORITY_COLLECTIONS,
alternate=True)
if i % 100 == 0:
task_update_progress("Sitemap for collection %s/%s" % (i + 1, len(collections)))
task_sleep_now_if_required(can_stop_too=True)
write_message("... Generating urls for fulltexts...")
recids = filter_fulltexts(recids, fulltext_filter)
for i, (recid, lastmod) in enumerate(recids):
if nb_urls % 100 == 0 and (writer.get_size() >= MAX_SIZE or nb_urls >= MAX_RECORDS):
sitemap_id += 1
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = writer.add_url(CFG_SITE_URL + '/%s/%s/files' % (CFG_SITE_RECORD, recid),
lastmod = lastmod,
changefreq = DEFAULT_CHANGEFREQ_FULLTEXTS,
priority = DEFAULT_PRIORITY_FULLTEXTS)
if i % 100 == 0:
task_update_progress("Sitemap for files page %s/%s" % (i, len(recids)))
task_sleep_now_if_required(can_stop_too=True)
write_message("... Generating urls for comments...")
recids = filter_comments(recids)
for i, (recid, lastmod) in enumerate(recids):
if nb_urls % 100 == 0 and (writer.get_size() >= MAX_SIZE or nb_urls >= MAX_RECORDS):
sitemap_id += 1
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = writer.add_url(CFG_SITE_URL + '/%s/%s/comments' % (CFG_SITE_RECORD, recid),
lastmod = lastmod,
changefreq = DEFAULT_CHANGEFREQ_COMMENTS,
priority = DEFAULT_PRIORITY_COMMENTS)
if i % 100 == 0:
task_update_progress("Sitemap for comments page %s/%s" % (i, len(recids)))
task_sleep_now_if_required(can_stop_too=True)
write_message("... Generating urls for reviews")
recids = filter_reviews(recids)
for i, (recid, lastmod) in enumerate(recids):
if nb_urls % 100 == 0 and (writer.get_size() >= MAX_SIZE or nb_urls >= MAX_RECORDS):
sitemap_id += 1
write_message("")
writer = SitemapWriter(sitemap_id)
sitemap_index_writer.add_url(writer.get_sitemap_url())
nb_urls = writer.add_url(CFG_SITE_URL + '/%s/%s/reviews' % (CFG_SITE_RECORD, recid),
lastmod = lastmod,
changefreq = DEFAULT_CHANGEFREQ_REVIEWS,
priority = DEFAULT_PRIORITY_REVIEWS)
if i % 100 == 0:
task_update_progress("Sitemap for reviews page %s/%s" % (i, len(recids)))
task_sleep_now_if_required(can_stop_too=True)
def generate_sitemaps_index(collection_list, fulltext_filter=None):
"""main function. Generates the sitemap index and the sitemaps
collection_list: list of collection names to add in sitemap
fulltext_filter: if provided the parser will intergrate only give fulltext
types
"""
write_message("Generating all sitemaps...")
sitemap_index_writer = SitemapIndexWriter(CFG_WEBDIR + '/sitemap-index.xml.gz')
generate_sitemaps(sitemap_index_writer, collection_list, fulltext_filter)
def run_export_method(jobname):
"""Main function, reading params and running the task."""
write_message("bibexport_sitemap: job %s started." % jobname)
collections = get_config_parameter(jobname=jobname, parameter_name="collection", is_parameter_collection = True)
fulltext_type = get_config_parameter(jobname=jobname, parameter_name="fulltext_status")
generate_sitemaps_index(collections, fulltext_type)
write_message("bibexport_sitemap: job %s finished." % jobname)
def get_config_parameter(jobname, parameter_name, is_parameter_collection = False):
"""Detect export method of JOBNAME. Basically, parse JOBNAME.cfg
and return export_method. Return None if problem found."""
jobconfig = ConfigParser()
jobconffile = CFG_ETCDIR + os.sep + 'bibexport' + os.sep + jobname + '.cfg'
if not os.path.exists(jobconffile):
write_message("ERROR: cannot find config file %s." % jobconffile)
return None
jobconfig.read(jobconffile)
if is_parameter_collection:
all_items = jobconfig.items(section='export_job')
parameters = []
for item_name, item_value in all_items:
if item_name.startswith(parameter_name):
parameters.append(item_value)
return parameters
else:
parameter = jobconfig.get('export_job', parameter_name)
return parameter
diff --git a/invenio/legacy/bibfield/functions/get_bibdoc.py b/invenio/legacy/bibfield/functions/get_bibdoc.py
index b7f5ac012..ed6efae7e 100644
--- a/invenio/legacy/bibfield/functions/get_bibdoc.py
+++ b/invenio/legacy/bibfield/functions/get_bibdoc.py
@@ -1,35 +1,35 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
def get_bibdoc(recid):
"""
Retrieves, using BibDoc, all the files related to a given record
@param recid
@return BibDoc of the given record
"""
if not recid or recid < 0:
return None
- from invenio.bibdocfile import BibDoc, InvenioBibDocFileError
+ from invenio.legacy.bibdocfile.api import BibDoc, InvenioBibDocFileError
try:
return BibDoc(int(recid))
except InvenioBibDocFileError:
return None
diff --git a/invenio/legacy/bibfield/functions/get_files_from_bibdoc.py b/invenio/legacy/bibfield/functions/get_files_from_bibdoc.py
index 02e347a1b..f4e84088f 100644
--- a/invenio/legacy/bibfield/functions/get_files_from_bibdoc.py
+++ b/invenio/legacy/bibfield/functions/get_files_from_bibdoc.py
@@ -1,58 +1,58 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
def get_files_from_bibdoc(recid):
"""
Retrieves, using BibDoc, all the files related to a given record
@param recid
@return List of dictionaries containing all the information stored
inside BibDoc if the current record has files attached, the
empty list otherwise
"""
if not recid or recid < 0:
return []
- from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+ from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
files = []
try:
bibrecdocs = BibRecDocs(int(recid))
except InvenioBibDocFileError:
return []
latest_files = bibrecdocs.list_latest_files()
for afile in latest_files:
file_dict = {}
file_dict['comment'] = afile.get_comment()
file_dict['description'] = afile.get_description()
file_dict['eformat'] = afile.get_format()
file_dict['full_name'] = afile.get_full_name()
file_dict['full_path'] = afile.get_full_path()
file_dict['magic'] = afile.get_magic()
file_dict['name'] = afile.get_name()
file_dict['path'] = afile.get_path()
file_dict['size'] = afile.get_size()
file_dict['status'] = afile.get_status()
file_dict['subformat'] = afile.get_subformat()
file_dict['superformat'] = afile.get_superformat()
file_dict['type'] = afile.get_type()
file_dict['url'] = afile.get_url()
file_dict['version'] = afile.get_version()
files.append(file_dict)
return files
diff --git a/invenio/legacy/bibformat/bibreformat.py b/invenio/legacy/bibformat/bibreformat.py
index cbcd4cc62..f8fddf0d9 100644
--- a/invenio/legacy/bibformat/bibreformat.py
+++ b/invenio/legacy/bibformat/bibreformat.py
@@ -1,618 +1,618 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Call BibFormat engine and create HTML brief (and other) formats cache for
bibliographic records."""
__revision__ = "$Id$"
import sys
from invenio.base.factory import with_app_context
try:
from invenio.legacy.dbquery import run_sql
from invenio.config import \
CFG_SITE_URL,\
CFG_TMPDIR,\
CFG_BINDIR
from invenio.intbitset import intbitset
from invenio.legacy.search_engine import perform_request_search, search_pattern
from invenio.legacy.search_engine import print_record
from invenio.legacy.bibrank.citation_searcher import get_cited_by
from invenio.legacy.bibrank.citation_indexer import get_bibrankmethod_lastupdate
from invenio.modules.formatter import format_record
from invenio.modules.formatter.config import CFG_BIBFORMAT_USE_OLD_BIBFORMAT
from invenio.utils.shell import split_cli_ids_arg
- from invenio.bibtask import task_init, write_message, task_set_option, \
+ from invenio.legacy.bibsched.bibtask import task_init, write_message, task_set_option, \
task_get_option, task_update_progress, task_has_option, \
task_low_level_submission, task_sleep_now_if_required, \
task_get_task_param
import os
import time
import zlib
from datetime import datetime
except ImportError, e:
print "Error: %s" % e
sys.exit(1)
def fetch_last_updated(format):
select_sql = "SELECT last_updated FROM format WHERE code = %s"
row = run_sql(select_sql, (format.lower(), ))
# Fallback in case we receive None instead of a valid date
last_date = row[0][0] or datetime(year=1900, month=1, day=1)
return last_date
def store_last_updated(format, update_date):
sql = "UPDATE format SET last_updated = %s " \
"WHERE code = %s AND (last_updated < %s or last_updated IS NULL)"
iso_date = update_date.strftime("%Y-%m-%d %H:%M:%S")
run_sql(sql, (iso_date, format.lower(), iso_date))
### Run the bibreformat task (scheduled via BibSched)
###
@with_app_context()
def bibreformat_task(fmt, sql, sql_queries, cds_query, process_format, process, recids):
"""
BibReformat main task
@param fmt: output format to use
@param sql: dictionary with pre-created sql queries for various cases (for selecting records). Some of these queries will be picked depending on the case
@param sql_queries: a list of sql queries to be executed to select records to reformat.
@param cds_query: a search query to be executed to select records to reformat
@param process_format: if True, also select records without an existing cache for this format (the '--without' behaviour)
@param process: if True, actually reformat the selected records (set to False by '--noprocess')
@param recids: a list of record IDs to reformat
@return: None
"""
write_message("Processing format %s" % fmt)
t1 = os.times()[4]
start_date = datetime.now()
### Query the database
###
task_update_progress('Fetching records to process')
if process_format: # '-without' parameter
write_message("Querying database for records without cache...")
without_format = without_fmt(sql)
recIDs = intbitset(recids)
if cds_query['field'] != "" or \
cds_query['collection'] != "" or \
cds_query['pattern'] != "":
write_message("Querying database (CDS query)...")
if cds_query['collection'] == "":
# use search_pattern() whenever possible, as it can search
# even in private collections
res = search_pattern(p=cds_query['pattern'],
f=cds_query['field'],
m=cds_query['matching'])
else:
# use perform_request_search when '-c' argument has been
# defined, as it is not supported by search_pattern()
res = intbitset(perform_request_search(req=None, of='id',
c=cds_query['collection'],
p=cds_query['pattern'],
f=cds_query['field']))
recIDs |= res
for sql_query in sql_queries:
write_message("Querying database (%s) ..." % sql_query, verbose=2)
recIDs |= intbitset(run_sql(sql_query))
if fmt == "HDREF" and recIDs:
# HDREF represents the references tab
# the tab needs to be recomputed not only when the record changes
# but also when one of the citations changes
latest_bibrank_run = get_bibrankmethod_lastupdate('citation')
start_date = latest_bibrank_run
sql = """SELECT id, modification_date FROM bibrec
WHERE id in (%s)""" % ','.join(str(r) for r in recIDs)
def check_date(mod_date):
return mod_date < latest_bibrank_run
recIDs = intbitset([recid for recid, mod_date in run_sql(sql) \
if check_date(mod_date)])
for r in recIDs:
recIDs |= intbitset(get_cited_by(r))
### list of corresponding record IDs was retrieved
### now format the selected records
if process_format:
write_message("Records to be processed: %d" % (len(recIDs) \
+ len(without_format)))
write_message("Of these, records without an existing cache: %d" % len(without_format))
else:
write_message("Records to be processed: %d" % (len(recIDs)))
### Initialize main loop
total_rec = 0 # Total number of records
tbibformat = 0 # time taken up by external call
tbibupload = 0 # time taken up by external call
### Iterate over all records prepared in list I (option)
if process:
if CFG_BIBFORMAT_USE_OLD_BIBFORMAT: # FIXME: remove this
# when migration from php to
# python bibformat is done
(total_rec_1, tbibformat_1, tbibupload_1) = iterate_over_old(recIDs,
fmt)
else:
(total_rec_1, tbibformat_1, tbibupload_1) = iterate_over_new(recIDs,
fmt)
total_rec += total_rec_1
tbibformat += tbibformat_1
tbibupload += tbibupload_1
### Iterate over all records prepared in list II (no_format)
if process_format and process:
if CFG_BIBFORMAT_USE_OLD_BIBFORMAT: # FIXME: remove this
# when migration from php to
# python bibformat is done
(total_rec_2, tbibformat_2, tbibupload_2) = iterate_over_old(without_format,
fmt)
else:
(total_rec_2, tbibformat_2, tbibupload_2) = iterate_over_new(without_format,
fmt)
total_rec += total_rec_2
tbibformat += tbibformat_2
tbibupload += tbibupload_2
### Store last run time
if task_has_option("last"):
write_message("storing run date to %s" % start_date)
store_last_updated(fmt, start_date)
### Final statistics
t2 = os.times()[4]
elapsed = t2 - t1
message = "total records processed: %d" % total_rec
write_message(message)
message = "total processing time: %.2f sec" % elapsed
write_message(message)
message = "Time spent on external call (os.system):"
write_message(message)
message = " bibformat: %.2f sec" % tbibformat
write_message(message)
message = " bibupload: %.2f sec" % tbibupload
write_message(message)
def check_validity_input_formats(input_formats):
"""
Checks the validity of every input format.
@param input_formats: list of given formats
@type input_formats: list
@return: the first invalid input format found, or the empty string if all formats are valid
@rtype: string
"""
from invenio.legacy.search_engine import get_available_output_formats
valid_formats = get_available_output_formats()
# extract the values of the available formats
format_values = []
for aformat in valid_formats:
format_values.append(aformat['value'])
invalid_format = ''
for aformat in input_formats:
if aformat.lower() not in format_values:
invalid_format = aformat.lower()
break
return invalid_format
### Identify recIDs of records with missing format
###
def without_fmt(queries, chunk_size=2000):
"""
List of record IDs to be reformatted, not yet having the specified format
@param queries: a dictionary of sql queries to pick from
@return: an intbitset of record IDs without a pre-created format cache
"""
sql = queries['missing']
recids = intbitset()
max_id = run_sql("SELECT max(id) FROM bibrec")[0][0]
for start in xrange(1, max_id + 1, chunk_size):
end = start + chunk_size
recids += intbitset(run_sql(sql, (start, end)))
return recids
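The chunked scan above keeps each SQL call bounded to a fixed window of record ids instead of issuing one unbounded query; the inclusive `BETWEEN %s AND %s` bounds overlap by one id per window, which is harmless because the results are unioned. A minimal sketch of the same pattern, with plain sets standing in for `intbitset` and a hypothetical `fetch` callback in place of `run_sql`:

```python
# Plain sets stand in for intbitset; fetch(start, end) plays the role of
# the chunked "missing" SQL query.
def scan_in_chunks(max_id, chunk_size, fetch):
    """fetch(start, end) returns the ids found in the window [start, end)."""
    recids = set()
    for start in range(1, max_id + 1, chunk_size):
        recids |= set(fetch(start, start + chunk_size))
    return recids

# Hypothetical data: every third record id is missing its format cache.
missing = [i for i in range(1, 10001) if i % 3 == 0]
result = scan_in_chunks(
    10000, 2000, lambda s, e: [i for i in missing if s <= i < e])
print(len(result))   # 3333
```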
### Bibreformat all selected records (using new python bibformat)
### (see iterate_over_old further down)
def iterate_over_new(list, fmt):
"""
Iterate over list of IDs
@param list: the list of record IDs to format
@param fmt: the output format to use
@return: tuple (total number of records, time taken to format, time taken to insert)
"""
global total_rec
formatted_records = '' # (string-)List of formatted record of an iteration
tbibformat = 0 # time taken up by external call
tbibupload = 0 # time taken up by external call
start_date = task_get_task_param('task_starting_time') # Time at which the record was formatted
tot = len(list)
count = 0
for recID in list:
t1 = os.times()[4]
start_date = time.strftime('%Y-%m-%d %H:%M:%S')
- write_message(format_record(recID, fmt, on_the_fly=True))
+ format_record(recID, fmt, on_the_fly=True)
formatted_record = zlib.compress(format_record(recID, fmt, on_the_fly=True))
run_sql('REPLACE LOW_PRIORITY INTO bibfmt (id_bibrec, format, last_updated, value) VALUES (%s, %s, %s, %s)',
(recID, fmt, start_date, formatted_record))
t2 = os.times()[4]
tbibformat += (t2 - t1)
count += 1
if (count % 100) == 0:
write_message(" ... formatted %s records out of %s" % (count, tot))
task_update_progress('Formatted %s out of %s' % (count, tot))
task_sleep_now_if_required(can_stop_too=True)
if (tot % 100) != 0:
write_message(" ... formatted %s records out of %s" % (count, tot))
return (tot, tbibformat, tbibupload)
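`iterate_over_new` stores each formatted record zlib-compressed in the `bibfmt` cache table, so any reader of that column must decompress before use. A lossless round trip with a dummy record:

```python
import zlib

# Dummy "formatted record"; repetitive HTML compresses well.
html = "<strong>A formatted record</strong>" * 50
blob = zlib.compress(html.encode("utf-8"))
print(len(blob) < len(html))   # True: the cache stores far fewer bytes

restored = zlib.decompress(blob).decode("utf-8")
print(restored == html)        # True: the round trip is lossless
```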
def iterate_over_old(list, fmt):
"""
Iterate over list of IDs
@param list: the list of record IDs to format
@param fmt: the output format to use
@return: tuple (total number of records, time taken to format, time taken to insert)
"""
n_rec = 0
n_max = 10000
xml_content = '' # hold the contents
tbibformat = 0 # time taken up by external call
tbibupload = 0 # time taken up by external call
total_rec = 0 # Number of formatted records
for record in list:
n_rec = n_rec + 1
total_rec = total_rec + 1
message = "Processing record: %d" % (record)
write_message(message, verbose=9)
query = "id=%d&of=xm" % (record)
count = 0
contents = print_record(record, 'xm')
while (contents == "") and (count < 10):
contents = print_record(record, 'xm')
count = count + 1
time.sleep(10)
if count == 10:
sys.stderr.write("Failed to download %s from %s after 10 attempts... terminating\n" % (query, CFG_SITE_URL))
sys.exit(1)
xml_content = xml_content + contents
if xml_content:
if n_rec >= n_max:
finalfilename = "%s/rec_fmt_%s.xml" % (CFG_TMPDIR, time.strftime('%Y%m%d_%H%M%S'))
filename = "%s/bibreformat.xml" % CFG_TMPDIR
filehandle = open(filename, "w")
filehandle.write(xml_content)
filehandle.close()
### bibformat external call
###
task_sleep_now_if_required(can_stop_too=True)
t11 = os.times()[4]
message = "START bibformat external call"
write_message(message, verbose=9)
command = "%s/bibformat otype='%s' < %s/bibreformat.xml > %s 2> %s/bibreformat.err" % (CFG_BINDIR, fmt.upper(), CFG_TMPDIR, finalfilename, CFG_TMPDIR)
os.system(command)
t22 = os.times()[4]
message = "END bibformat external call (time elapsed:%.2f)" % (t22-t11)
write_message(message, verbose=9)
task_sleep_now_if_required(can_stop_too=True)
tbibformat = tbibformat + (t22 - t11)
### bibupload external call
###
t11 = os.times()[4]
message = "START bibupload external call"
write_message(message, verbose=9)
task_id = task_low_level_submission('bibupload', 'bibreformat', '-f', finalfilename)
write_message("Task #%s submitted" % task_id)
t22 = os.times()[4]
message = "END bibupload external call (time elapsed:%.2f)" % (t22-t11)
write_message(message, verbose=9)
tbibupload = tbibupload + (t22 - t11)
n_rec = 0
xml_content = ''
### Process the last re-formatted chunk
###
if n_rec > 0:
write_message("Processing last record set (%d)" % n_rec, verbose=9)
finalfilename = "%s/rec_fmt_%s.xml" % (CFG_TMPDIR, time.strftime('%Y%m%d_%H%M%S'))
filename = "%s/bibreformat.xml" % CFG_TMPDIR
filehandle = open(filename, "w")
filehandle.write(xml_content)
filehandle.close()
### bibformat external call
###
t11 = os.times()[4]
message = "START bibformat external call"
write_message(message, verbose=9)
command = "%s/bibformat otype='%s' < %s/bibreformat.xml > %s 2> %s/bibreformat.err" % (CFG_BINDIR, fmt.upper(), CFG_TMPDIR, finalfilename, CFG_TMPDIR)
os.system(command)
t22 = os.times()[4]
message = "END bibformat external call (time elapsed:%.2f)" % (t22 - t11)
write_message(message, verbose=9)
tbibformat = tbibformat + (t22 - t11)
### bibupload external call
###
t11 = os.times()[4]
message = "START bibupload external call"
write_message(message, verbose=9)
task_id = task_low_level_submission('bibupload', 'bibreformat', '-f', finalfilename)
write_message("Task #%s submitted" % task_id)
t22 = os.times()[4]
message = "END bibupload external call (time elapsed:%.2f)" % (t22 - t11)
write_message(message, verbose=9)
tbibupload = tbibupload + (t22 - t11)
return (total_rec, tbibformat, tbibupload)
def task_run_core():
"""Runs the task by fetching arguments from the BibSched task queue. This is what BibSched will be invoking via daemon call."""
## initialize parameters
if task_get_option('format'):
fmts = task_get_option('format')
else:
fmts = 'HB' # default value if no format option given
for fmt in fmts.split(','):
last_updated = fetch_last_updated(fmt)
write_message("last stored run date is %s" % last_updated)
sql = {
"all" : """SELECT br.id FROM bibrec AS br, bibfmt AS bf
WHERE bf.id_bibrec = br.id AND bf.format = '%s'""" % fmt,
"last": """SELECT br.id FROM bibrec AS br
INNER JOIN bibfmt AS bf ON bf.id_bibrec = br.id
WHERE br.modification_date >= '%(last_updated)s'
AND bf.format='%(format)s'
AND bf.last_updated < br.modification_date""" \
% {'format': fmt,
'last_updated': last_updated.strftime('%Y-%m-%d %H:%M:%S')},
"missing" : """SELECT br.id
FROM bibrec as br
LEFT JOIN bibfmt as bf
ON bf.id_bibrec = br.id AND bf.format ='%s'
WHERE bf.id_bibrec IS NULL
AND br.id BETWEEN %%s AND %%s
""" % fmt,
}
sql_queries = []
cds_query = {}
if task_has_option("all"):
sql_queries.append(sql['all'])
if task_has_option("last"):
sql_queries.append(sql['last'])
if task_has_option("collection"):
cds_query['collection'] = task_get_option('collection')
else:
cds_query['collection'] = ""
if task_has_option("field"):
cds_query['field'] = task_get_option('field')
else:
cds_query['field'] = ""
if task_has_option("pattern"):
cds_query['pattern'] = task_get_option('pattern')
else:
cds_query['pattern'] = ""
if task_has_option("matching"):
cds_query['matching'] = task_get_option('matching')
else:
cds_query['matching'] = ""
if task_has_option("recids"):
recids = list(split_cli_ids_arg(task_get_option('recids')))
else:
recids = []
### sql commands to be executed during the script run
###
bibreformat_task(fmt, sql, sql_queries, cds_query, task_has_option('without'), not task_has_option('noprocess'), recids)
return True
def main():
"""Main function that constructs the bibtask."""
task_init(authorization_action='runbibformat',
authorization_msg="BibReformat Task Submission",
description="""
BibReformat formats the records and saves the produced outputs for
later retrieval.
BibReformat is usually run periodically via BibSched in order to (1)
format new records in the database and to (2) reformat records for
which the metadata has been modified.
BibReformat has to be run manually when (3) format config files have
been modified, in order to see the changes in the web interface.
Although it is not necessary to run BibReformat to display formatted
records in the web interface, it can improve serving
speed by precreating the outputs. It is suggested to run
BibReformat for 'HB' output.
Option -m cannot be used at the same time as option -c.
Option -c prevents finding records in private collections.
Examples:
bibreformat Format all new or modified records (in HB).
bibreformat -o HD Format all new or modified records in HD.
bibreformat -o HD,HB Format all new or modified records in HD and HB.
bibreformat -a Force reformatting all records (in HB).
bibreformat -c 'Photos' Force reformatting all records in 'Photos' collection (in HB).
bibreformat -c 'Photos' -o HD Force reformatting all records in 'Photos' collection in HD.
bibreformat -i 15 Force reformatting record 15 (in HB).
bibreformat -i 15:20 Force reformatting records 15 to 20 (in HB).
bibreformat -i 15,16,17 Force reformatting records 15, 16 and 17 (in HB).
bibreformat -n Show how many records are to be (re)formatted.
bibreformat -n -c 'Articles' Show how many records are to be (re)formatted in 'Articles' collection.
bibreformat -oHB -s1h Format all new and modified records every hour, in HB.
""", help_specific_usage=""" -o, --formats \t Specify output format/s (default HB)
-n, --noprocess \t Count records to be formatted (no processing done)
Reformatting options:
-a, --all \t Force reformatting all records
-c, --collection \t Force reformatting records by collection
-f, --field \t Force reformatting records by field
-p, --pattern \t Force reformatting records by pattern
-i, --id \t Force reformatting records by record id(s)
Pattern options:
-m, --matching \t Specify if pattern is exact (e), regular expression (r),
\t partial (p), any of the words (o) or all of the words (a)
""",
version=__revision__,
specific_params=("ac:f:p:lo:nm:i:",
["all",
"collection=",
"matching=",
"field=",
"pattern=",
"format=",
"noprocess",
"id="]),
task_submit_check_options_fnc=task_submit_check_options,
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
def task_submit_check_options():
"""Last checks and updating on the options..."""
if not (task_has_option('all') or task_has_option('collection') \
or task_has_option('field') or task_has_option('pattern') \
or task_has_option('matching') or task_has_option('recids')):
task_set_option('without', 1)
task_set_option('last', 1)
return True
def task_submit_elaborate_specific_parameter(key, value, opts, args):
"""
Elaborate specific CLI parameters of BibReformat.
@param key: a parameter key to check
@param value: a value associated with parameter X{Key}
@return: True for known X{Key} else False.
"""
if key in ("-a", "--all"):
task_set_option("all", 1)
task_set_option("without", 1)
elif key in ("-c", "--collection"):
task_set_option("collection", value)
elif key in ("-n", "--noprocess"):
task_set_option("noprocess", 1)
elif key in ("-f", "--field"):
task_set_option("field", value)
elif key in ("-p", "--pattern"):
task_set_option("pattern", value)
elif key in ("-m", "--matching"):
task_set_option("matching", value)
elif key in ("-o", "--format"):
input_formats = value.split(',')
## check the validity of the given output formats
invalid_format = check_validity_input_formats(input_formats)
if invalid_format:
try:
raise Exception('Invalid output format.')
except Exception:
from invenio.ext.logging import register_exception
register_exception(prefix="The given output format '%s' is not available or is invalid. Please try again" % invalid_format, alert_admin=True)
return
else: # every given format is available
task_set_option("format", value)
elif key in ("-i", "--id"):
task_set_option("recids", value)
else:
return False
return True
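The `specific_params` pair handed to `task_init` follows standard `getopt` conventions: a short-option string (a trailing `:` marks an option that takes a value) plus a list of long options (a trailing `=` likewise). A sketch with the same option spec and a hypothetical command line:

```python
import getopt

# The same option spec as in main(); 'c:', 'o:', etc. take a value.
shortopts = "ac:f:p:lo:nm:i:"
longopts = ["all", "collection=", "matching=", "field=", "pattern=",
            "format=", "noprocess", "id="]

# Hypothetical invocation: bibreformat -o HD,HB --collection Photos -a
opts, args = getopt.getopt(
    ["-o", "HD,HB", "--collection", "Photos", "-a"], shortopts, longopts)
print(opts)   # [('-o', 'HD,HB'), ('--collection', 'Photos'), ('-a', '')]
```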
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibformat/scripts/bibreformat.py b/invenio/legacy/bibformat/scripts/bibreformat.py
index 603bd2c82..fb6f4eae3 100644
--- a/invenio/legacy/bibformat/scripts/bibreformat.py
+++ b/invenio/legacy/bibformat/scripts/bibreformat.py
@@ -1,31 +1,31 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Call BibFormat engine and create HTML brief (and other) formats for
bibliographic records. Upload formats via BibUpload."""
__revision__ = "$Id$"
from invenio.base.factory import with_app_context
@with_app_context()
def main():
- from invenio.legacy.bibreformat import main as bibreformat_main
+ from invenio.legacy.bibformat.bibreformat import main as bibreformat_main
return bibreformat_main()
diff --git a/invenio/legacy/bibindex/adminlib.py b/invenio/legacy/bibindex/adminlib.py
index 697d71024..93f67613b 100644
--- a/invenio/legacy/bibindex/adminlib.py
+++ b/invenio/legacy/bibindex/adminlib.py
@@ -1,2767 +1,2767 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio BibIndex Administrator Interface."""
__revision__ = "$Id$"
import random
from invenio.config import \
CFG_SITE_LANG, \
CFG_SITE_URL, \
CFG_BINDIR
from invenio.legacy.bibrank.adminlib import write_outcome, modify_translations, \
get_def_name, get_name, get_languages, addadminbox, tupletotable, \
createhiddenform
from invenio.modules.access.engine import acc_authorize_action
from invenio.legacy.dbquery import run_sql, get_table_status_info, wash_table_column_name
-from invenio.bibindex_engine_stemmer import get_stemming_language_map
+from invenio.legacy.bibindex.engine_stemmer import get_stemming_language_map
import invenio.legacy.template
from invenio.bibindex_engine_config import CFG_BIBINDEX_SYNONYM_MATCH_TYPE, \
CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR
from invenio.bibknowledge_dblayer import get_all_kb_names
from invenio.bibindex_engine_utils import load_tokenizers, \
get_idx_indexer, \
get_all_indexes, \
get_all_virtual_indexes, \
get_virtual_index_building_blocks, \
get_index_name_from_index_id, \
get_all_index_names_and_column_values, \
is_index_virtual
_TOKENIZERS = load_tokenizers()
websearch_templates = invenio.legacy.template.load('websearch')
def getnavtrail(previous = ''):
"""Get the navtrail"""
navtrail = """<a class="navtrail" href="%s/help/admin">Admin Area</a> """ % (CFG_SITE_URL,)
navtrail = navtrail + previous
return navtrail
def perform_index(ln=CFG_SITE_LANG, mtype='', content='', **params):
"""start area for modifying indexes
mtype - the method that called this method.
content - the output from that method."""
fin_output = """
<table>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_showindexoverview#1">Overview of indexes</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_showvirtualindexoverview#2">Overview of virtual indexes</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_editindexes#2">Edit index</a></small></td>
<td>4.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_addindex#3">Add new index</a></small></td>
<td>5.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s">Manage logical fields</a></small></td>
<td>6.&nbsp;<small><a href="%s/help/admin/bibindex-admin-guide">Guide</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL)
if mtype == "perform_showindexoverview" and content:
fin_output += content
elif mtype == "perform_showindexoverview" or not mtype:
fin_output += perform_showindexoverview(ln, callback='', **params)
if mtype == "perform_showvirtualindexoverview" and content:
fin_output += content
elif mtype == "perform_showvirtualindexoverview" or not mtype:
fin_output += perform_showvirtualindexoverview(ln, callback='', **params)
if mtype == "perform_editindexes" and content:
fin_output += content
elif mtype == "perform_editindexes" or not mtype:
fin_output += perform_editindexes(ln, callback='', **params)
if mtype == "perform_addindex" and content:
fin_output += content
elif mtype == "perform_addindex" or not mtype:
fin_output += perform_addindex(ln, callback='', **params)
if mtype == "perform_editvirtualindexes" and content:
fin_output += content
elif mtype == "perform_editvirtualindexes":
#not visible in 'show all' view of 'Manage Indexes'
fin_output += perform_editvirtualindexes(ln, callback='', **params)
if mtype == "perform_addvirtualindex" and content:
fin_output += content
elif mtype == "perform_addvirtualindex":
#not visible in 'show all' view of 'Manage Indexes'
fin_output += perform_addvirtualindex(ln, callback='', **params)
if mtype == "perform_deletevirtualindex" and content:
fin_output += content
elif mtype == "perform_deletevirtualindex":
#not visible in 'show all' view of 'Manage Indexes'
fin_output += perform_deletevirtualindex(ln, callback='', **params)
return addadminbox("<b>Menu</b>", [fin_output])
def perform_field(ln=CFG_SITE_LANG, mtype='', content=''):
"""Start area for modifying fields
mtype - the method that called this method.
content - the output from that method."""
fin_output = """
<table>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s&amp;mtype=perform_showfieldoverview#1">Overview of logical fields</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s&amp;mtype=perform_editfields#2">Edit logical field</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s&amp;mtype=perform_addfield#3">Add new logical field</a></small></td>
<td>4.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s">Manage Indexes</a></small></td>
<td>5.&nbsp;<small><a href="%s/help/admin/bibindex-admin-guide">Guide</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL)
if mtype == "perform_showfieldoverview" and content:
fin_output += content
elif mtype == "perform_showfieldoverview" or not mtype:
fin_output += perform_showfieldoverview(ln, callback='')
if mtype == "perform_editfields" and content:
fin_output += content
elif mtype == "perform_editfields" or not mtype:
fin_output += perform_editfields(ln, callback='')
if mtype == "perform_addfield" and content:
fin_output += content
elif mtype == "perform_addfield" or not mtype:
fin_output += perform_addfield(ln, callback='')
return addadminbox("<b>Menu</b>", [fin_output])
def perform_editfield(fldID, ln=CFG_SITE_LANG, mtype='', content='', callback='yes', confirm=-1):
"""Form to modify a field. This method calls other methods, which in turn call it back and send their output.
If callback is set, the method calls perform_editcollection; if not, it just returns its output.
fldID - id of the field
mtype - the method that called this method.
content - the output from that method."""
fld_dict = dict(get_def_name('', "field"))
if fldID in [-1, "-1"]:
return addadminbox("Edit logical field", ["""<b><span class="info">Please go back and select a logical field</span></b>"""])
fin_output = """
<table>
<tr>
<td><b>Menu</b></td>
</tr>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s&amp;mtype=perform_modifyfield">Modify field code</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s&amp;mtype=perform_modifyfieldtranslations">Modify translations</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s&amp;mtype=perform_modifyfieldtags">Modify MARC tags</a></small></td>
<td>4.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s&amp;mtype=perform_deletefield">Delete field</a></small></td>
</tr><tr>
<td>5.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&amp;ln=%s&amp;mtype=perform_showdetailsfield">Show field usage</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln)
if mtype == "perform_modifyfield" and content:
fin_output += content
elif mtype == "perform_modifyfield" or not mtype:
fin_output += perform_modifyfield(fldID, ln, callback='')
if mtype == "perform_modifyfieldtranslations" and content:
fin_output += content
elif mtype == "perform_modifyfieldtranslations" or not mtype:
fin_output += perform_modifyfieldtranslations(fldID, ln, callback='')
if mtype == "perform_modifyfieldtags" and content:
fin_output += content
elif mtype == "perform_modifyfieldtags" or not mtype:
fin_output += perform_modifyfieldtags(fldID, ln, callback='')
if mtype == "perform_deletefield" and content:
fin_output += content
elif mtype == "perform_deletefield" or not mtype:
fin_output += perform_deletefield(fldID, ln, callback='')
return addadminbox("Edit logical field '%s'" % fld_dict[int(fldID)], [fin_output])
def perform_editindex(idxID, ln=CFG_SITE_LANG, mtype='', content='', callback='yes', confirm=-1):
"""Form to modify an index. This method calls other methods, which in turn call it back and send their output.
idxID - id of the index
mtype - the method that called this method.
content - the output from that method."""
if idxID in [-1, "-1"]:
return addadminbox("Edit index", ["""<b><span class="info">Please go back and select an index</span></b>"""])
fin_output = """
<table>
<tr>
<td><b>Menu</b></td>
</tr>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyindex">Modify index name / descriptor</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyindextranslations">Modify translations</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyindexfields">Modify index fields</a></small></td>
<td>4.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyindexstemming">Modify index stemming language</a></small></td>
<td>5.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifysynonymkb">Modify synonym knowledge base</a></small></td>
<td>6.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifystopwords">Modify remove stopwords</a></small></td>
<td>7.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyremovehtml">Modify remove HTML markup</a></small></td>
<td>8.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyremovelatex">Modify remove latex markup</a></small></td>
<td>9.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifytokenizer">Modify tokenizer</a></small></td>
<td>10.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifyindexer">Modify indexer</a></small></td>
<td>11.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s&amp;mtype=perform_deleteindex">Delete index</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln)
if mtype == "perform_modifyindex" and content:
fin_output += content
elif mtype == "perform_modifyindex" or not mtype:
fin_output += perform_modifyindex(idxID, ln, callback='')
if mtype == "perform_modifyindextranslations" and content:
fin_output += content
elif mtype == "perform_modifyindextranslations" or not mtype:
fin_output += perform_modifyindextranslations(idxID, ln, callback='')
if mtype == "perform_modifyindexfields" and content:
fin_output += content
elif mtype == "perform_modifyindexfields" or not mtype:
fin_output += perform_modifyindexfields(idxID, ln, callback='')
if mtype == "perform_modifyindexstemming" and content:
fin_output += content
elif mtype == "perform_modifyindexstemming" or not mtype:
fin_output += perform_modifyindexstemming(idxID, ln, callback='')
if mtype == "perform_modifysynonymkb" and content:
fin_output += content
elif mtype == "perform_modifysynonymkb" or not mtype:
fin_output += perform_modifysynonymkb(idxID, ln, callback='')
if mtype == "perform_modifystopwords" and content:
fin_output += content
elif mtype == "perform_modifystopwords" or not mtype:
fin_output += perform_modifystopwords(idxID, ln, callback='')
if mtype == "perform_modifyremovehtml" and content:
fin_output += content
elif mtype == "perform_modifyremovehtml" or not mtype:
fin_output += perform_modifyremovehtml(idxID, ln, callback='')
if mtype == "perform_modifyremovelatex" and content:
fin_output += content
elif mtype == "perform_modifyremovelatex" or not mtype:
fin_output += perform_modifyremovelatex(idxID, ln, callback='')
if mtype == "perform_modifytokenizer" and content:
fin_output += content
elif mtype == "perform_modifytokenizer" or not mtype:
fin_output += perform_modifytokenizer(idxID, ln, callback='')
if mtype == "perform_modifyindexer" and content:
fin_output += content
elif mtype == "perform_modifyindexer" or not mtype:
fin_output += perform_modifyindexer(idxID, ln, callback='')
if mtype == "perform_deleteindex" and content:
fin_output += content
elif mtype == "perform_deleteindex" or not mtype:
fin_output += perform_deleteindex(idxID, ln, callback='')
return addadminbox("Edit index", [fin_output])
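# The mtype dispatch in perform_editindex above repeats one if/elif pair per
# section. A minimal, standalone sketch of the same pattern as a handler
# table (render_sections and the stub handlers are illustrative placeholders,
# not part of this module's API):

```python
# Data-driven sketch of the mtype section dispatch.
def render_sections(mtype, content, handlers):
    """Concatenate section output: reuse pre-rendered `content` for the
    selected section, or render every section when no mtype is given."""
    out = ""
    for name, handler in handlers:
        if mtype == name and content:
            out += content            # reuse the pre-rendered section
        elif mtype == name or not mtype:
            out += handler()          # render this section fresh
    return out

# Stub handlers standing in for the perform_* section renderers.
handlers = [("perform_modifyindex", lambda: "<modify>"),
            ("perform_deleteindex", lambda: "<delete>")]
```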
def perform_editvirtualindex(idxID, ln=CFG_SITE_LANG, mtype='', content='', callback='yes', confirm=-1):
if idxID in [-1, "-1"]:
return addadminbox("Edit virtual index", ["""<b><span class="info">Please go back and select an index</span></b>"""])
fin_output = """
<table>
<tr>
<td><b>Menu</b></td>
</tr>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editvirtualindex?idxID=%s&amp;ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/editvirtualindex?idxID=%s&amp;ln=%s&amp;mtype=perform_modifydependentindexes">Modify dependent indexes</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_showvirtualindexoverview#2">Overview of virtual indexes</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, idxID, ln, CFG_SITE_URL, idxID, ln, CFG_SITE_URL, ln)
if mtype == "perform_modifydependentindexes" and content:
fin_output += content
elif mtype == "perform_modifydependentindexes" or not mtype:
fin_output += perform_modifydependentindexes(idxID, ln, callback='')
index_name = "( %s )" % get_index_name_from_index_id(idxID)
return addadminbox("Edit virtual index %s" % index_name, [fin_output])
def perform_showindexoverview(ln=CFG_SITE_LANG, callback='', confirm=0):
subtitle = """<a name="1"></a>1. Overview of all indexes"""
output = """<table cellpadding="3" border="1">"""
output += """<tr><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>""" % ("ID", "Name", "Fwd.Idx Size", "Rev.Idx Size", "Fwd.Idx Words", "Rev.Idx Records", "Last updated", "Fields", "Translations", "Stemming Language", "Synonym knowledge base", "Remove stopwords", "Remove HTML markup", "Remove LaTeX markup", "Tokenizer", "Indexer type")
idx = get_idx()
idx_dict = dict(get_def_name('', "idxINDEX"))
stemming_language_map = get_stemming_language_map()
stemming_language_map_reversed = dict([(elem[1], elem[0]) for elem in stemming_language_map.iteritems()])
virtual_indexes = dict(get_all_virtual_indexes())
for idxID, idxNAME, idxDESC, idxUPD, idxSTEM, idxSYNKB, idxSTOPWORDS, idxHTML, idxLATEX, idxTOK in idx:
forward_table_status_info = get_table_status_info('idxWORD%sF' % (idxID < 10 and '0%s' % idxID or idxID))
reverse_table_status_info = get_table_status_info('idxWORD%sR' % (idxID < 10 and '0%s' % idxID or idxID))
if str(idxUPD)[-3:] == ".00":
idxUPD = str(idxUPD)[0:-3]
lang = get_lang_list("idxINDEXNAME", "id_idxINDEX", idxID)
idx_fld = get_idx_fld(idxID)
fld = ""
for row in idx_fld:
fld += row[3] + ", "
if fld.endswith(", "):
fld = fld[:-2]
if len(fld) == 0:
fld = """<strong><span class="info">None</span></strong>"""
date = (idxUPD and idxUPD or """<strong><span class="info">Not updated</span></strong>""")
stemming_lang = stemming_language_map_reversed.get(idxSTEM, None)
if not stemming_lang:
stemming_lang = """<strong><span class="info">None</span></strong>"""
synonym_kb = get_idx_synonym_kb(idxID)
if not synonym_kb:
synonym_kb = """<strong><span class="info">None</span></strong>"""
remove_stopwords = get_idx_remove_stopwords(idxID)
if not remove_stopwords:
remove_stopwords = """<strong><span class="info">None</span></strong>"""
remove_html_markup = get_idx_remove_html_markup(idxID)
if not remove_html_markup:
remove_html_markup = """<strong><span class="info">None</span></strong>"""
remove_latex_markup = get_idx_remove_latex_markup(idxID)
if not remove_latex_markup:
remove_latex_markup = """<strong><span class="info">None</span></strong>"""
tokenizer = get_idx_tokenizer(idxID)
if not tokenizer:
tokenizer = """<strong><span class="info">None</span></strong>"""
type_of_indexer = virtual_indexes.get(idxID) and "virtual" or get_idx_indexer(idxNAME)
if forward_table_status_info and reverse_table_status_info:
output += """<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>""" % \
(idxID,
"""<a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s" title="%s">%s</a>""" % (CFG_SITE_URL, idxID, ln, idxDESC, idx_dict.get(idxID, idxNAME)),
"%s MB" % websearch_templates.tmpl_nice_number(forward_table_status_info['Data_length'] / 1048576.0, max_ndigits_after_dot=3),
"%s MB" % websearch_templates.tmpl_nice_number(reverse_table_status_info['Data_length'] / 1048576.0, max_ndigits_after_dot=3),
websearch_templates.tmpl_nice_number(forward_table_status_info['Rows']),
websearch_templates.tmpl_nice_number(reverse_table_status_info['Rows'], max_ndigits_after_dot=3),
date,
fld,
lang,
stemming_lang,
synonym_kb,
remove_stopwords,
remove_html_markup,
remove_latex_markup,
tokenizer,
type_of_indexer)
elif not forward_table_status_info:
output += """<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>""" % \
(idxID,
"""<a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s">%s</a>""" % (CFG_SITE_URL, idxID, ln, idx_dict.get(idxID, idxNAME)),
"Error", "%s MB" % websearch_templates.tmpl_nice_number(reverse_table_status_info['Data_length'] / 1048576.0, max_ndigits_after_dot=3),
"Error",
websearch_templates.tmpl_nice_number(reverse_table_status_info['Rows'], max_ndigits_after_dot=3),
date,
"",
lang,
synonym_kb,
remove_stopwords,
remove_html_markup,
remove_latex_markup,
tokenizer,
type_of_indexer)
elif not reverse_table_status_info:
output += """<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>""" % \
(idxID,
"""<a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&amp;ln=%s">%s</a>""" % (CFG_SITE_URL, idxID, ln, idx_dict.get(idxID, idxNAME)),
"%s MB" % websearch_templates.tmpl_nice_number(forward_table_status_info['Data_length'] / 1048576.0, max_ndigits_after_dot=3),
"Error", websearch_templates.tmpl_nice_number(forward_table_status_info['Rows'], max_ndigits_after_dot=3),
"Error",
date,
"",
lang,
synonym_kb,
remove_stopwords,
remove_html_markup,
remove_latex_markup,
tokenizer,
type_of_indexer)
output += "</table>"
body = [output]
if callback:
return perform_index(ln, "perform_showindexoverview", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
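# perform_showindexoverview builds the forward/reverse table names with an
# and/or zero-padding trick (idxID < 10 and '0%s' % idxID or idxID). A
# minimal sketch of the equivalent formatting, assuming positive integer ids
# (index_table_name is a hypothetical helper, not part of this module):

```python
def index_table_name(idx_id, direction="F"):
    """Return the word-table name for an index id, e.g. 'idxWORD01F'.
    %02d zero-pads single-digit ids, matching the and/or trick above."""
    return "idxWORD%02d%s" % (idx_id, direction)
```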
def perform_showvirtualindexoverview(ln=CFG_SITE_LANG, callback='', confirm=0):
subtitle = """<a name="2"></a>2. Overview of virtual indexes"""
output = """
<table>
<tr>
<td>1.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_editvirtualindexes#1">Edit virtual index</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_addvirtualindex#2">Add new virtual index</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/bibindex/bibindexadmin.py/index?ln=%s&amp;mtype=perform_deletevirtualindex#3">Delete virtual index</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, ln, CFG_SITE_URL, ln, CFG_SITE_URL, ln)
output += """<table cellpadding="3" border="1">"""
output += """<tr><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>""" % ("ID", "Virtual index", "Dependent indexes")
idx = get_all_virtual_indexes()
for idxID, idxNAME in idx:
normal_indexes = zip(*get_virtual_index_building_blocks(idxID))[1]
output += """<tr><td>%s</td><td>%s</td><td>%s</td></tr>""" % \
(idxID,
"""<a href="%s/admin/bibindex/bibindexadmin.py/editvirtualindex?idxID=%s&amp;ln=%s">%s</a>""" % (CFG_SITE_URL, idxID, ln, idxNAME),
", ".join(normal_indexes))
output += "</table>"
body = [output]
if callback:
return perform_index(ln, "perform_showvirtualindexoverview", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_editindexes(ln=CFG_SITE_LANG, callback='yes', content='', confirm=-1):
"""show a list of indexes that can be edited."""
subtitle = """<a name="3"></a>3. Edit index&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (CFG_SITE_URL)
fin_output = ''
idx = get_idx()
output = ""
if len(idx) > 0:
text = """
<span class="adminlabel">Index name</span>
<select name="idxID" class="admin_w200">
<option value="-1">- Select an index -</option>
"""
for (idxID, idxNAME, idxDESC, idxUPD, idxSTEM, idxSYNKB, idxSTOPWORDS, idxHTML, idxLATEX, idxTOK) in idx:
text += """<option value="%s">%s</option>""" % (idxID, idxNAME)
text += """</select>"""
output += createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/editindex" % CFG_SITE_URL,
text=text,
button="Edit",
ln=ln,
confirm=1)
else:
output += """No indexes exist"""
body = [output]
if callback:
return perform_index(ln, "perform_editindexes", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_editvirtualindexes(ln=CFG_SITE_LANG, callback='yes', content='', confirm=-1):
"""show a list of virtual indexes that can be edited."""
subtitle = """<a name="1"></a>1. Edit virtual index&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (CFG_SITE_URL)
idx = get_all_virtual_indexes()
output = ""
if len(idx) > 0:
text = """
<span class="adminlabel">Virtual index name</span>
<select name="idxID" class="admin_w200">
<option value="-1">- Select an index -</option>
"""
for (idxID, idxNAME) in idx:
text += """<option value="%s">%s</option>""" % (idxID, idxNAME)
text += """</select>"""
output += createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/editvirtualindex" % CFG_SITE_URL,
text=text,
button="Edit",
ln=ln,
confirm=1)
else:
output += """No indexes exist"""
body = [output]
if callback:
return perform_index(ln, "perform_editvirtualindexes", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_editfields(ln=CFG_SITE_LANG, callback='yes', content='', confirm=-1):
"""show a list of all logical fields that can be edited."""
subtitle = """<a name="4"></a>4. Edit logical field&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (CFG_SITE_URL)
fin_output = ''
res = get_fld()
output = ""
if len(res) > 0:
text = """
<span class="adminlabel">Field name</span>
<select name="fldID" class="admin_w200">
<option value="-1">- Select a field -</option>
"""
for (fldID, name, code) in res:
text += """<option value="%s">%s</option>""" % (fldID, name)
text += """</select>"""
output += createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/editfield" % CFG_SITE_URL,
text=text,
button="Edit",
ln=ln,
confirm=1)
else:
output += """No logical fields exist"""
body = [output]
if callback:
return perform_field(ln, "perform_editfields", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addindex(ln=CFG_SITE_LANG, idxNAME='', callback="yes", confirm=-1):
"""form to add a new index.
idxNAME - the name of the new index"""
output = ""
subtitle = """<a name="3"></a>3. Add new index"""
text = """
<span class="adminlabel">Index name</span>
<input class="admin_w200" type="text" name="idxNAME" value="%s" /><br />
""" % idxNAME
output = createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/addindex" % CFG_SITE_URL,
text=text,
ln=ln,
button="Add index",
confirm=1)
if idxNAME and confirm in ["1", 1]:
res = add_idx(idxNAME)
output += write_outcome(res) + """<br /><a href="%s/admin/bibindex/bibindexadmin.py/editindex?idxID=%s&ln=%s">Configure this index</a>.""" % (CFG_SITE_URL, res[1], ln)
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please give the index a name.</span></b>
"""
body = [output]
if callback:
return perform_index(ln, "perform_addindex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addvirtualindex(ln=CFG_SITE_LANG, idxNEWVID='', idxNEWPID='', callback="yes", confirm=-1):
"""form to add a new virtual index from the set of physical indexes.
idxNEWVID - the id of the new virtual index
idxNEWPID - the id of the physical index to add as its dependent index"""
idx = get_all_indexes(virtual=False, with_ids=True)
output = ""
subtitle = """<a name="2"></a>2. Add new virtual index"""
if len(idx) > 0:
text = """
<span class="adminlabel">Choose new virtual index</span>
<select name="idxNEWVID" class="admin_w200">
<option value="-1">- Select an index -</option>
"""
for (idxID, idxNAME) in idx:
checked = str(idxNEWVID) == str(idxID) and 'selected="selected"' or ''
text += """<option value="%s" %s>%s</option>
""" % (idxID, checked, idxNAME)
text += """</select>"""
text += """&nbsp;&nbsp;
<span class="adminlabel">Add physical index</span>
<select name="idxNEWPID" class="admin_w200">
<option value="-1">- Select an index -</option>
"""
for (idxID, idxNAME) in idx:
text += """<option value="%s">%s</option>""" % (idxID, idxNAME)
text += """</select>"""
output += createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/addvirtualindex" % CFG_SITE_URL,
text=text,
button="Add index",
ln=ln,
confirm=1)
else:
output += """No indexes exist"""
if idxNEWVID not in ['', "-1", -1] and idxNEWPID not in ['', "-1", -1] and confirm in ["1", 1]:
res = add_virtual_idx(idxNEWVID, idxNEWPID)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run the following command as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, dict(idx)[int(idxNEWPID)])
elif confirm not in ["-1", -1] or idxNEWVID in ["-1", -1] or idxNEWPID in ["-1", -1]:
output += """<b><span class="info">Please specify the index.</span></b>"""
body = [output]
if callback:
return perform_index(ln, "perform_addvirtualindex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyindextranslations(idxID, ln=CFG_SITE_LANG, sel_type='', trans=[], confirm=-1, callback='yes'):
"""Modify the translations of an index
sel_type - the nametype to modify
trans - the translations in the same order as the languages from get_languages()"""
output = ''
subtitle = ''
langs = get_languages()
if confirm in ["2", 2] and idxID:
finresult = modify_translations(idxID, langs, sel_type, trans, "idxINDEX")
idx_dict = dict(get_def_name('', "idxINDEX"))
if idxID and idx_dict.has_key(int(idxID)):
idxID = int(idxID)
subtitle = """<a name="2"></a>2. Modify translations for index.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if type(trans) is str:
trans = [trans]
if sel_type == '':
sel_type = get_idx_nametypes()[0][0]
header = ['Language', 'Translation']
actions = []
types = get_idx_nametypes()
if len(types) > 1:
text = """
<span class="adminlabel">Name type</span>
<select name="sel_type" class="admin_w200">
"""
for (key, value) in types:
text += """<option value="%s" %s>%s""" % (key, key == sel_type and 'selected="selected"' or '', value)
trans_names = get_name(idxID, ln, key, "field")
if trans_names and trans_names[0][0]:
text += ": %s" % trans_names[0][0]
text += "</option>"
text += """</select>"""
output += createhiddenform(action="modifyindextranslations#2",
text=text,
button="Select",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [-1, "-1", 0, "0"]:
trans = []
for (key, value) in langs:
try:
trans_names = get_name(idxID, key, sel_type, "idxINDEX")
trans.append(trans_names[0][0])
except StandardError, e:
trans.append('')
for nr in range(0,len(langs)):
actions.append(["%s" % (langs[nr][1],)])
actions[-1].append('<input type="text" name="trans" size="30" value="%s"/>' % trans[nr])
text = tupletotable(header=header, tuple=actions)
output += createhiddenform(action="modifyindextranslations#2",
text=text,
button="Modify",
idxID=idxID,
sel_type=sel_type,
ln=ln,
confirm=2)
if sel_type and len(trans):
if confirm in ["2", 2]:
output += write_outcome(finresult)
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyindextranslations", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyfieldtranslations(fldID, ln=CFG_SITE_LANG, sel_type='', trans=[], confirm=-1, callback='yes'):
"""Modify the translations of a field
sel_type - the nametype to modify
trans - the translations in the same order as the languages from get_languages()"""
output = ''
subtitle = ''
langs = get_languages()
if confirm in ["2", 2] and fldID:
finresult = modify_translations(fldID, langs, sel_type, trans, "field")
fld_dict = dict(get_def_name('', "field"))
if fldID and fld_dict.has_key(int(fldID)):
fldID = int(fldID)
subtitle = """<a name="3"></a>3. Modify translations for logical field '%s'&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (fld_dict[fldID], CFG_SITE_URL)
if type(trans) is str:
trans = [trans]
if sel_type == '':
sel_type = get_fld_nametypes()[0][0]
header = ['Language', 'Translation']
actions = []
types = get_fld_nametypes()
if len(types) > 1:
text = """
<span class="adminlabel">Name type</span>
<select name="sel_type" class="admin_w200">
"""
for (key, value) in types:
text += """<option value="%s" %s>%s""" % (key, key == sel_type and 'selected="selected"' or '', value)
trans_names = get_name(fldID, ln, key, "field")
if trans_names and trans_names[0][0]:
text += ": %s" % trans_names[0][0]
text += "</option>"
text += """</select>"""
output += createhiddenform(action="modifyfieldtranslations#3",
text=text,
button="Select",
fldID=fldID,
ln=ln,
confirm=0)
if confirm in [-1, "-1", 0, "0"]:
trans = []
for (key, value) in langs:
try:
trans_names = get_name(fldID, key, sel_type, "field")
trans.append(trans_names[0][0])
except StandardError, e:
trans.append('')
for nr in range(0,len(langs)):
actions.append(["%s" % (langs[nr][1],)])
actions[-1].append('<input type="text" name="trans" size="30" value="%s"/>' % trans[nr])
text = tupletotable(header=header, tuple=actions)
output += createhiddenform(action="modifyfieldtranslations#3",
text=text,
button="Modify",
fldID=fldID,
sel_type=sel_type,
ln=ln,
confirm=2)
if sel_type and len(trans):
if confirm in ["2", 2]:
output += write_outcome(finresult)
body = [output]
if callback:
return perform_editfield(fldID, ln, "perform_modifytranslations", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_showdetailsfieldtag(fldID, tagID, ln=CFG_SITE_LANG, callback="yes", confirm=-1):
"""show which logical fields use a given MARC tag, directly or indirectly.
fldID - the id of the logical field
tagID - the id of the MARC tag"""
fld_dict = dict(get_def_name('', "field"))
fldID = int(fldID)
tagname = run_sql("SELECT name from tag where id=%s", (tagID, ))[0][0]
output = ""
subtitle = """<a name="4.1"></a>Showing details for MARC tag '%s'""" % tagname
output += "<br /><b>This MARC tag is used directly in these logical fields:</b>&nbsp;"
fld_tag = get_fld_tags('', tagID)
exist = {}
for (id_field,id_tag, tname, tvalue, score) in fld_tag:
output += "%s, " % fld_dict[int(id_field)]
exist[id_field] = 1
output += "<br /><b>This MARC tag is used indirectly in these logical fields:</b>&nbsp;"
tag = run_sql("SELECT value from tag where id=%s", (tagID, ))
tag = tag[0][0]
for i in range(0, len(tag) - 1):
res = run_sql("SELECT id_field,id_tag FROM field_tag,tag WHERE tag.id=field_tag.id_tag AND tag.value=%s", ('%' + tag[0:i] + '%',))
for (id_field, id_tag) in res:
output += "%s, " % fld_dict[int(id_field)]
exist[id_field] = 1
res = run_sql("SELECT id_field,id_tag FROM field_tag,tag WHERE tag.id=field_tag.id_tag AND tag.value like %s", (tag, ))
for (id_field, id_tag) in res:
if not exist.has_key(id_field):
output += "%s, " % fld_dict[int(id_field)]
body = [output]
if callback:
return perform_modifyfieldtags(fldID, ln, "perform_showdetailsfieldtag", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_showdetailsfield(fldID, ln=CFG_SITE_LANG, callback="yes", confirm=-1):
"""show which collections use a logical field.
fldID - the id of the logical field"""
fld_dict = dict(get_def_name('', "field"))
col_dict = dict(get_def_name('', "collection"))
fldID = int(fldID)
col_fld = get_col_fld('', '', fldID)
sort_types = dict(get_sort_nametypes())
fin_output = ""
subtitle = """<a name="1"></a>5. Show usage for logical field '%s'""" % fld_dict[fldID]
output = "This logical field is used in these collections:<br />"
ltype = ''
exist = {}
for (id_collection, id_field, id_fieldvalue, ftype, score, score_fieldvalue) in col_fld:
if ltype != ftype:
output += "<br /><b>%s:&nbsp;</b>" % sort_types[ftype]
ltype = ftype
exist = {}
if not exist.has_key(id_collection):
output += "%s, " % col_dict[int(id_collection)]
exist[id_collection] = 1
if not col_fld:
output = "This field is not used by any collections."
fin_output = addadminbox('Collections', [output])
body = [fin_output]
if callback:
return perform_editfield(fldID, ln, "perform_showdetailsfield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addfield(ln=CFG_SITE_LANG, fldNAME='', code='', callback="yes", confirm=-1):
"""form to add a new field.
fldNAME - the name of the new field
code - the field code"""
output = ""
subtitle = """<a name="3"></a>3. Add new logical field"""
code = code.replace(' ', '')
text = """
<span class="adminlabel">Field name</span>
<input class="admin_w200" type="text" name="fldNAME" value="%s" /><br />
<span class="adminlabel">Field code</span>
<input class="admin_w200" type="text" name="code" value="%s" /><br />
""" % (fldNAME, code)
output = createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/addfield" % CFG_SITE_URL,
text=text,
ln=ln,
button="Add field",
confirm=1)
if fldNAME and code and confirm in ["1", 1]:
res = add_fld(fldNAME, code)
output += write_outcome(res)
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please give the logical field a name and code.</span></b>
"""
body = [output]
if callback:
return perform_field(ln, "perform_addfield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_deletefield(fldID, ln=CFG_SITE_LANG, callback='yes', confirm=0):
"""form to remove a field.
fldID - the field id from table field.
"""
fld_dict = dict(get_def_name('', "field"))
if not fld_dict.has_key(int(fldID)):
return """<b><span class="info">Field does not exist</span></b>"""
subtitle = """<a name="4"></a>4. Delete the logical field '%s'&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (fld_dict[int(fldID)], CFG_SITE_URL)
output = ""
if fldID:
fldID = int(fldID)
if confirm in ["0", 0]:
check = run_sql("SELECT id_field from idxINDEX_field where id_field=%s", (fldID, ))
text = ""
if check:
text += """<b><span class="info">This field is used in an index, deletion may cause problems.</span></b><br />"""
text += """Do you want to delete the logical field '%s' and all its relations and definitions?""" % (fld_dict[fldID])
output += createhiddenform(action="deletefield#4",
text=text,
button="Confirm",
fldID=fldID,
confirm=1)
elif confirm in ["1", 1]:
res = delete_fld(fldID)
if res[0] == 1:
return """<br /><b><span class="info">Field deleted.</span></b>""" + write_outcome(res)
else:
output += write_outcome(res)
body = [output]
if callback:
return perform_editfield(fldID, ln, "perform_deletefield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_deleteindex(idxID, ln=CFG_SITE_LANG, callback='yes', confirm=0):
"""form to delete an index.
idxID - the index id from table idxINDEX.
"""
if idxID:
subtitle = """<a name="5"></a>11. Delete the index.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
output = ""
if confirm in ["0", 0]:
idx = get_idx(idxID)
if idx:
text = ""
text += """<b><span class="info">By deleting an index, you may also lose any indexed data in the forward and reverse tables for this index.</span></b><br />"""
text += """Do you want to delete the index '%s' and all its relations and definitions?""" % (idx[0][1])
output += createhiddenform(action="deleteindex#5",
text=text,
button="Confirm",
idxID=idxID,
confirm=1)
else:
return """<br /><b><span class="info">The specified index does not exist.</span></b>"""
elif confirm in ["1", 1]:
res = delete_idx(idxID)
if res[0] == 1:
return """<br /><b><span class="info">Index deleted.</span></b>""" + write_outcome(res)
else:
output += write_outcome(res)
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_deleteindex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_deletevirtualindex(ln=CFG_SITE_LANG, idxID='', callback='yes', confirm=-1):
"""form to delete a virtual index.
idxID - the index id from table idxINDEX.
"""
output = ""
subtitle = """<a name="3"></a>3. Delete virtual index"""
idx = get_all_virtual_indexes()
if len(idx) > 0:
text = """<span class="adminlabel">Choose a virtual index</span>
<select name="idxID" class="admin_w200">
<option value="-1">- Select an index -</option>
"""
for idx_id, idx_name in idx:
selected = str(idxID) == str(idx_id) and 'selected="selected"' or ''
text += """<option value="%s" %s>%s</option>""" % (idx_id, selected, idx_name)
text += """</select>"""
output += createhiddenform(action="deletevirtualindex#3",
text=text,
button="Confirm",
confirm=1)
else:
output = "No virtual indexes exist"
if confirm in ["1", 1] and idxID not in ['', "-1", -1]:
res = delete_virtual_idx(int(idxID))
if res[0] == 1:
output += """<br /><b><span class="info">Virtual index deleted.</span></b><br />"""
output += write_outcome(res)
else:
output += write_outcome(res)
elif idxID in ["-1", -1]:
output += """<b><span class="info">Please specify the index.</span></b>"""
body = [output]
if callback:
return perform_index(ln, "perform_deletevirtualindex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifydependentindexes(idxID, ln=CFG_SITE_LANG, newIDs=[], callback='yes', confirm=-1):
"""page on which dependent indexes for specific virtual index
can be chosen"""
subtitle = ""
output = ""
non_virtual_indexes = dict(get_all_indexes(virtual=False, with_ids=True)) #[(id1, name1), (id2, name2)..]
already_dependent = dict(get_virtual_index_building_blocks(idxID))
if not already_dependent:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="1"></a>1. Modify dependent indexes.&nbsp;&nbsp;&nbsp;
<small>
[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]
</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
newIDs = []
if not newIDs:
newIDs = []
tick_list = ""
checked_values = already_dependent.values()
if confirm > -1:
checked_values = newIDs
for index_name in non_virtual_indexes.values():
checked = index_name in checked_values and 'checked="checked"' or ''
tick_list += """<input type="checkbox" name='newIDs' value="%s" %s />%s <br />""" % \
(index_name, checked, index_name)
output += createhiddenform(action="modifydependentindexes#1",
text=tick_list,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and newIDs == []:
output += "<br />"
text = """
<span class="important">Removing all dependent indexes
means removing the virtual index.</span>
<br /> <strong>Are you sure you want to do this?</strong>"""
output += createhiddenform(action="modifydependentindexes#1",
text=text,
button="Confirm",
idxID=idxID,
newIDs=newIDs,
ln=ln,
confirm=1)
elif confirm in [0, "0"]:
output += "<br />"
text = """
<span class="important">You are about to change dependent indexes</span>.<br /> <strong>Are you sure you want to do this?</strong>"""
output += createhiddenform(action="modifydependentindexes#1",
text=text,
button="Confirm",
idxID=idxID,
newIDs=newIDs,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
output += "<br />"
to_add, to_remove = find_dependent_indexes_to_change(idxID, newIDs)
res = modify_dependent_indexes(idxID, to_add, to_remove)
output += write_outcome(res)
if len(to_remove) + len(to_add) > 0:
output += """<br /><span class="info">Please note you should run the following commands as soon as possible:"""
for index in to_add:
output += """<pre>$> %s/bibindex --reindex -w %s</pre>
""" % (CFG_BINDIR, index)
for index in to_remove:
output += """<pre>$> %s/bibindex -w %s --remove-dependent-index %s</pre>
""" % (CFG_BINDIR, get_index_name_from_index_id(idxID), index)
if len(to_remove) + len(to_add) > 0:
output += "</span>"
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please specify the virtual index.</span></b>"""
else:
output = """It seems that this index is not virtual."""
body = [output]
if callback:
return perform_editvirtualindex(idxID, ln, "perform_modifydependentindexes", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def find_dependent_indexes_to_change(idxID, new_indexes):
"""From new set of dependent indexes finds out
which indexes should be added and which should be removed
from database (idxINDEX_idxINDEX table)
@param idxID: id of the virtual index
@param new_indexes: future set of dependent indexes
"""
if not isinstance(new_indexes, list):
new_indexes = [new_indexes]
dependent_indexes = dict(get_virtual_index_building_blocks(idxID)).values()
to_add = set(new_indexes) - set(dependent_indexes)
to_remove = set(dependent_indexes) - set(new_indexes)
return list(to_add), list(to_remove)
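# The set arithmetic in find_dependent_indexes_to_change can be checked in
# isolation. A minimal sketch (diff_index_sets is an illustrative standalone
# version; results are sorted only to make the output deterministic):

```python
def diff_index_sets(current, new):
    """Indexes only in `new` are to be added; indexes only in `current`
    are to be removed (sorted for deterministic output)."""
    to_add = sorted(set(new) - set(current))
    to_remove = sorted(set(current) - set(new))
    return to_add, to_remove
```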
def perform_showfieldoverview(ln=CFG_SITE_LANG, callback='', confirm=0):
subtitle = """<a name="1"></a>1. Logical fields overview"""
output = """<table cellpadding="3" border="1">"""
output += """<tr><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>""" % ("Field", "MARC Tags", "Translations")
query = "SELECT id,name FROM field"
res = run_sql(query)
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
for field_id,field_name in res:
query = "SELECT tag.value FROM tag, field_tag WHERE tag.id=field_tag.id_tag AND field_tag.id_field=%s ORDER BY field_tag.score DESC,tag.value ASC"
tag_res = run_sql(query, (field_id, ))
field_tags = ""
for row in tag_res:
field_tags = field_tags + row[0] + ", "
if field_tags.endswith(", "):
field_tags = field_tags[:-2]
if not field_tags:
field_tags = """<b><span class="info">None</span></b>"""
lang = get_lang_list("fieldname", "id_field", field_id)
output += """<tr><td>%s</td><td>%s</td><td>%s</td></tr>""" % ("""<a href="%s/admin/bibindex/bibindexadmin.py/editfield?fldID=%s&ln=%s">%s</a>""" % (CFG_SITE_URL, field_id, ln, fld_dict[field_id]), field_tags, lang)
output += "</table>"
body = [output]
if callback:
return perform_field(ln, "perform_showfieldoverview", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyindex(idxID, ln=CFG_SITE_LANG, idxNAME='', idxDESC='', callback='yes', confirm=-1):
"""form to modify an index name.
idxID - the id of the index to change.
idxNAME - new name of index
idxDESC - description of index content"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="1"></a>1. Modify index name.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxNAME = idx[0][1]
idxDESC = idx[0][2]
text = """
<span class="adminlabel">Index name</span>
<input class="admin_w200" type="text" name="idxNAME" value="%s" /><br />
<span class="adminlabel">Index description</span>
<textarea class="admin_w200" name="idxDESC">%s</textarea><br />
""" % (idxNAME, idxDESC)
output += createhiddenform(action="modifyindex#1",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=1)
if idxID > -1 and idxNAME and confirm in [1, "1"]:
res = modify_idx(idxID, idxNAME, idxDESC)
output += write_outcome(res)
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyindex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyindexstemming(idxID, ln=CFG_SITE_LANG, idxSTEM='', callback='yes', confirm=-1):
"""form to modify an index name.
idxID - the index name to change.
idxSTEM - new stemming language code"""
subtitle = ""
output = ""
stemming_language_map = get_stemming_language_map()
stemming_language_map['None'] = ''
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>4. Modify index stemming language.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxSTEM = idx[0][4]
if not idxSTEM:
idxSTEM = ''
language_html_element = """<select name="idxSTEM" class="admin_w200">"""
languages = stemming_language_map.keys()
languages.sort()
for language in languages:
if stemming_language_map[language] == idxSTEM:
selected = 'selected="selected"'
else:
selected = ""
language_html_element += """<option value="%s" %s>%s</option>""" % (stemming_language_map[language], selected, language)
language_html_element += """</select>"""
text = """
<span class="adminlabel">Index stemming language</span>
""" + language_html_element
output += createhiddenform(action="modifyindexstemming#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx(idxID)[0][4] == idxSTEM:
output += """<span class="info">Stemming language has not been changed</span>"""
elif confirm in [0, "0"]:
text = """
<span class="important">You are about to either disable or change the stemming language setting for this index. Please note that it is not recommended to enable stemming for structured-data indexes like "report number", "year", "author" or "collection". On the contrary, it is advisable to enable stemming for indexes like "fulltext", "abstract", "title", etc. since this would overall improve the retrieval quality. <br /> Beware, however, that after disabling or changing the stemming language setting of an index you will have to reindex it. It is a good idea to change the stemming language and to reindex during low usage hours of your service, since searching results will be potentially affected by the discrepancy between search terms now being (not) stemmed and indexes still using the previous settings until the reindexing is completed</span>.<br /> <strong>Are you sure you want to disable/change the stemming language setting of this index?</strong>"""
output += createhiddenform(action="modifyindexstemming#4",
text=text,
button="Modify",
idxID=idxID,
idxSTEM=idxSTEM,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_stemming(idxID, idxSTEM)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>
""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyindexstemming", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyindexer(idxID, ln=CFG_SITE_LANG, indexer='', callback='yes', confirm=-1):
"""form to modify an indexer.
idxID - the index name to change.
idexer - indexer type: native/SOLR/XAPIAN/virtual"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if idx:
current_indexer = is_index_virtual(idx[0][0]) and "virtual" or get_idx_indexer(idx[0][1])
subtitle = """<a name="4"></a>5. Modify indexer.&nbsp;&nbsp;&nbsp;
<small>
[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]
</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
indexer = current_indexer or ''
items = ["native"]
if idx[0][1] == "fulltext":
items.extend(["SOLR", "XAPIAN"])
else:
items.extend(["virtual"])
html_element = """<select name="indexer" class="admin_w200">"""
for item in items:
selected = indexer==item and 'selected="selected"' or ''
html_element += """<option value="%s" %s>%s</option>""" % (item, selected, item)
html_element += """</select>"""
text = """<span class="adminlabel">Indexer type</span>""" + html_element
output += createhiddenform(action="modifyindexer#5",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=1)
if confirm in [1, "1"] and idx[0][1]=="fulltext":
res = modify_idx_indexer(idxID, indexer)
output += write_outcome(res)
output += """<br /><span class="info">Please note you should run:
<pre>$> %s/bibindex --reindex -w fulltext</pre></span>""" % CFG_BINDIR
elif confirm in [1, "1"]:
if indexer=="virtual" and current_indexer == "native":
params = {'idxNEWVID': idxID}
return perform_index(ln, "perform_addvirtualindex", "", **params)
elif indexer=="native" and current_indexer == "virtual":
params = {'idxID':idxID}
return perform_index(ln, "perform_deletevirtualindex", "", **params)
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyindexer", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifysynonymkb(idxID, ln=CFG_SITE_LANG, idxKB='', idxMATCH='', callback='yes', confirm=-1):
"""form to modify the knowledge base for the synonym lookup.
idxID - the id of the index to change.
idxKB - new knowledge base name
idxMATCH - new match type
"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>5. Modify knowledge base for synonym lookup.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
field_value = get_idx_synonym_kb(idxID)
if CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR in field_value:
idxKB, idxMATCH = field_value.split(CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR)
if not idxKB:
idxKB = ''
idxMATCH = ''
kb_html_element = """<select name="idxKB" class="admin_w200">"""
knowledge_base_names = get_all_kb_names()
knowledge_base_names.append(CFG_BIBINDEX_SYNONYM_MATCH_TYPE["None"])
knowledge_base_names.sort()
for knowledge_base_name in knowledge_base_names:
if knowledge_base_name == idxKB:
selected = 'selected="selected"'
else:
selected = ""
kb_html_element += """<option value="%s" %s>%s</option>""" % (knowledge_base_name, selected, knowledge_base_name)
kb_html_element += """</select>"""
match_html_element = """<select name="idxMATCH" class="admin_w200">"""
match_names = CFG_BIBINDEX_SYNONYM_MATCH_TYPE.values()
match_names.sort()
for match_name in match_names:
if match_name == idxMATCH:
selected = 'selected="selected"'
else:
selected = ""
match_html_element += """<option value="%s" %s>%s</option>""" % (match_name, selected, match_name)
match_html_element += """</select>"""
text = """<span class="adminlabel">Knowledge base name and match type</span>""" + kb_html_element + match_html_element
output += createhiddenform(action="modifysynonymkb#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx(idxID)[0][5] == idxKB + CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR + idxMATCH:
output += """<span class="info">Knowledge base has not been changed</span>"""
elif confirm in [0, "0"]:
text = """
<span class="important">You are going to change the knowledge base for this index.<br /> <strong>Are you sure you want
to change the knowledge base of this index?</strong>"""
output += createhiddenform(action="modifysynonymkb#4",
text=text,
button="Modify",
idxID=idxID,
idxKB=idxKB,
idxMATCH=idxMATCH,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_synonym_kb(idxID, idxKB, idxMATCH)
output += write_outcome(res)
output += """<br /><span class="info">Please note that you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifysynonymkb", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifystopwords(idxID, ln=CFG_SITE_LANG, idxSTOPWORDS='', callback='yes', confirm=-1):
"""Form to modify the stopwords configuration
@param idxID: id of the index on which modification will be performed.
@param idxSTOPWORDS: remove stopwords or not ('Yes' or 'No')
"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>6. Modify remove stopwords.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxSTOPWORDS = get_idx_remove_stopwords(idxID)
if not idxSTOPWORDS:
idxSTOPWORDS = ''
if isinstance(idxSTOPWORDS, tuple):
idxSTOPWORDS = ''
stopwords_html_element = """<input class="admin_w200" type="text" name="idxSTOPWORDS" value="%s" /><br />""" % idxSTOPWORDS
text = """<span class="adminlabel">Remove stopwords</span><br />""" + stopwords_html_element
output += createhiddenform(action="modifystopwords#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx(idxID)[0][6] == idxSTOPWORDS:
output += """<span class="info">Stopwords have not been changed</span>"""
elif confirm in [0, "0"] and idxSTOPWORDS == '':
output += """<span class="info">You need to provide a name of the file with stopwords</span>"""
elif confirm in [0, "0"]:
text = """<span class="important">You are going to change the stopwords configuration for this index.<br />
<strong>Are you sure you want to do this?</strong>"""
output += createhiddenform(action="modifystopwords#4",
text=text,
button="Modify",
idxID=idxID,
idxSTOPWORDS=idxSTOPWORDS,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_stopwords(idxID, idxSTOPWORDS)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifystopwords", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyremovehtml(idxID, ln=CFG_SITE_LANG, idxHTML='', callback='yes', confirm=-1):
"""Form to modify the 'remove html' configuration.
@param idxID: id of the index on which modification will be performed.
@param idxHTML: remove html markup or not ('Yes' or 'No')"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>7. Modify remove HTML markup.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxHTML = get_idx_remove_html_markup(idxID)
if not idxHTML:
idxHTML = ''
remove_html_element = """<select name="idxHTML" class="admin_w200">"""
if idxHTML == 'Yes':
remove_html_element += """<option value="Yes" selected ="selected">Yes</option>"""
remove_html_element += """<option value="No">No</option>"""
elif idxHTML == 'No':
remove_html_element += """<option value="Yes">Yes</option>"""
remove_html_element += """<option value="No" selected ="selected">No</option>"""
else:
remove_html_element += """<option value="Yes">Yes</option>"""
remove_html_element += """<option value="No">No</option>"""
remove_html_element += """</select>"""
text = """<span class="adminlabel">Remove HTML markup</span>""" + remove_html_element
output += createhiddenform(action="modifyremovehtml#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx_remove_html_markup(idxID) == idxHTML:
output += """<span class="info">Remove HTML markup parameter has not been changed</span>"""
elif confirm in [0, "0"]:
text = """<span class="important">You are going to change the remove HTML markup for this index.<br />
<strong>Are you sure you want to change the remove HTML markup of this index?</strong>"""
output += createhiddenform(action="modifyremovehtml#4",
text=text,
button="Modify",
idxID=idxID,
idxHTML=idxHTML,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_html_markup(idxID, idxHTML)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyremovehtml", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyremovelatex(idxID, ln=CFG_SITE_LANG, idxLATEX='', callback='yes', confirm=-1):
"""Form to modify the 'remove latex' configuration.
@param idxID: id of the index on which modification will be performed.
@param idxLATEX: remove latex markup or not ('Yes' or 'No')"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>8. Modify remove latex markup.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxLATEX = get_idx_remove_latex_markup(idxID)
if not idxLATEX:
idxLATEX = ''
remove_latex_element = """<select name="idxLATEX" class="admin_w200">"""
if idxLATEX == 'Yes':
remove_latex_element += """<option value="Yes" selected ="selected">Yes</option>"""
remove_latex_element += """<option value="No">No</option>"""
elif idxLATEX == 'No':
remove_latex_element += """<option value="Yes">Yes</option>"""
remove_latex_element += """<option value="No" selected ="selected">No</option>"""
else:
remove_latex_element += """<option value="Yes">Yes</option>"""
remove_latex_element += """<option value="No">No</option>"""
remove_latex_element += """</select>"""
text = """<span class="adminlabel">Remove latex markup</span>""" + remove_latex_element
output += createhiddenform(action="modifyremovelatex#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx_remove_latex_markup(idxID) == idxLATEX:
output += """<span class="info">Remove latex markup parameter has not been changed</span>"""
elif confirm in [0, "0"]:
text = """<span class="important">You are going to change the remove latex markup for this index.<br />
<strong>Are you sure you want to change the remove latex markup of this index?</strong>"""
output += createhiddenform(action="modifyremovelatex#4",
text=text,
button="Modify",
idxID=idxID,
idxLATEX=idxLATEX,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_latex_markup(idxID, idxLATEX)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyremovelatex", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifytokenizer(idxID, ln=CFG_SITE_LANG, idxTOK='', callback='yes', confirm=-1):
"""Form to modify the 'tokenizer' configuration.
@param idxID: id of the index on which modification will be performed.
@param idxTOK: tokenizer name"""
subtitle = ""
output = ""
idx = get_idx(idxID)
if not idx:
idxID = -1
if idxID not in [-1, "-1"]:
subtitle = """<a name="4"></a>9. Modify tokenizer.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
if confirm in [-1, "-1"]:
idxTOK = get_idx_tokenizer(idxID)
if not idxTOK:
idxTOK = ''
tokenizer_element = """<select name="idxTOK" class="admin_w200">"""
for key in _TOKENIZERS:
if key == idxTOK:
tokenizer_element += """<option value="%s" selected ="selected">%s</option>""" % (key, key)
else:
tokenizer_element += """<option value="%s">%s</option>""" % (key, key)
tokenizer_element += """</select>"""
text = """<span class="adminlabel">Tokenizer</span>""" + tokenizer_element
output += createhiddenform(action="modifytokenizer#4",
text=text,
button="Modify",
idxID=idxID,
ln=ln,
confirm=0)
if confirm in [0, "0"] and get_idx_tokenizer(idxID) == idxTOK:
output += """<span class="info">Tokenizer has not been changed</span>"""
elif confirm in [0, "0"]:
text = """<span class="important">You are going to change a tokenizer for this index.<br />
<strong>Are you sure you want to do this?</strong>"""
output += createhiddenform(action="modifytokenizer#4",
text=text,
button="Modify",
idxID=idxID,
idxTOK=idxTOK,
ln=ln,
confirm=1)
elif idxID > -1 and confirm in [1, "1"]:
res = modify_idx_tokenizer(idxID, idxTOK)
output += write_outcome(res)
output += """<br /><span class="info">Please note you must run as soon as possible:
<pre>$> %s/bibindex --reindex -w %s</pre></span>""" % (CFG_BINDIR, get_idx(idxID)[0][1])
elif confirm in [1, "1"]:
output += """<br /><b><span class="info">Please give a name for the index.</span></b>"""
else:
output = """No index to modify."""
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifytokenizer", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyfield(fldID, ln=CFG_SITE_LANG, code='', callback='yes', confirm=-1):
"""form to modify a field.
fldID - the field to change."""
subtitle = ""
output = ""
fld_dict = dict(get_def_name('', "field"))
if fldID not in [-1, "-1"]:
if confirm in [-1, "-1"]:
res = get_fld(fldID)
code = res[0][2]
else:
code = ("%s" % code).replace(" ", "")
fldID = int(fldID)
subtitle = """<a name="2"></a>1. Modify field code for logical field '%s'&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (fld_dict[int(fldID)], CFG_SITE_URL)
text = """
<span class="adminlabel">Field code</span>
<input class="admin_w200" type="text" name="code" value="%s" /><br />
""" % code
output += createhiddenform(action="modifyfield#2",
text=text,
button="Modify",
fldID=fldID,
ln=ln,
confirm=1)
if fldID > -1 and confirm in [1, "1"]:
fldID = int(fldID)
res = modify_fld(fldID, code)
output += write_outcome(res)
else:
output = """No field to modify.
"""
body = [output]
if callback:
return perform_editfield(fldID, ln, "perform_modifyfield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyindexfields(idxID, ln=CFG_SITE_LANG, callback='yes', content='', confirm=-1):
"""Modify which logical fields to use in this index.."""
output = ''
subtitle = """<a name="3"></a>3. Modify index fields.&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % CFG_SITE_URL
output = """<dl>
<dt>Menu</dt>
<dd><a href="%s/admin/bibindex/bibindexadmin.py/addindexfield?idxID=%s&amp;ln=%s#3.1">Add field to index</a></dd>
<dd><a href="%s/admin/bibindex/bibindexadmin.py/field?ln=%s">Manage fields</a></dd>
</dl>
""" % (CFG_SITE_URL, idxID, ln, CFG_SITE_URL, ln)
header = ['Field', '']
actions = []
idx_fld = get_idx_fld(idxID)
if len(idx_fld) > 0:
for (idxID, idxNAME,fldID, fldNAME, regexp_punct, regexp_alpha_sep) in idx_fld:
actions.append([fldNAME])
for col in [(('Remove','removeindexfield'),)]:
actions[-1].append('<a href="%s/admin/bibindex/bibindexadmin.py/%s?idxID=%s&amp;fldID=%s&amp;ln=%s#3.1">%s</a>' % (CFG_SITE_URL, col[0][1], idxID, fldID, ln, col[0][0]))
for (label, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/bibindex/bibindexadmin.py/%s?idxID=%s&amp;fldID=%s&amp;ln=%s#3.1">%s</a>' % (CFG_SITE_URL, function, idxID, fldID, ln, label)
output += tupletotable(header=header, tuple=actions)
else:
output += """No index fields exists"""
output += content
body = [output]
if callback:
return perform_editindex(idxID, ln, "perform_modifyindexfields", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyfieldtags(fldID, ln=CFG_SITE_LANG, callback='yes', content='', confirm=-1):
"""show the sort fields of this collection.."""
output = ''
fld_dict = dict(get_def_name('', "field"))
fld_type = get_fld_nametypes()
fldID = int(fldID)
subtitle = """<a name="4"></a>3. Modify MARC tags for the logical field '%s'&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/bibindex-admin-guide">?</a>]</small>""" % (fld_dict[int(fldID)], CFG_SITE_URL)
output = """<dl>
<dt>Menu</dt>
<dd><a href="%s/admin/bibindex/bibindexadmin.py/addtag?fldID=%s&amp;ln=%s#4.1">Add MARC tag</a></dd>
<dd><a href="%s/admin/bibindex/bibindexadmin.py/deletetag?fldID=%s&amp;ln=%s#4.1">Delete unused MARC tags</a></dd>
</dl>
""" % (CFG_SITE_URL, fldID, ln, CFG_SITE_URL, fldID, ln)
header = ['', 'Value', 'Comment', 'Actions']
actions = []
res = get_fld_tags(fldID)
if len(res) > 0:
i = 0
for (fldID, tagID, tname, tvalue, score) in res:
move = ""
if i != 0:
move += """<a href="%s/admin/bibindex/bibindexadmin.py/switchtagscore?fldID=%s&amp;id_1=%s&amp;id_2=%s&amp;ln=%s&amp=rand=%s#4"><img border="0" src="%s/img/smallup.gif" title="Move tag up"></a>""" % (CFG_SITE_URL, fldID, tagID, res[i - 1][1], ln, random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
i += 1
if i != len(res):
move += '<a href="%s/admin/bibindex/bibindexadmin.py/switchtagscore?fldID=%s&amp;id_1=%s&amp;id_2=%s&amp;ln=%s&amp;rand=%s#4"><img border="0" src="%s/img/smalldown.gif" title="Move tag down"></a>' % (CFG_SITE_URL, fldID, tagID, res[i][1], ln, random.randint(0, 1000), CFG_SITE_URL)
actions.append([move, tvalue, tname])
for col in [(('Details','showdetailsfieldtag'), ('Modify','modifytag'),('Remove','removefieldtag'),)]:
actions[-1].append('<a href="%s/admin/bibindex/bibindexadmin.py/%s?fldID=%s&amp;tagID=%s&amp;ln=%s#4.1">%s</a>' % (CFG_SITE_URL, col[0][1], fldID, tagID, ln, col[0][0]))
for (label, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/bibindex/bibindexadmin.py/%s?fldID=%s&amp;tagID=%s&amp;ln=%s#4.1">%s</a>' % (CFG_SITE_URL, function, fldID, tagID, ln, label)
output += tupletotable(header=header, tuple=actions)
else:
output += """No fields exists"""
output += content
body = [output]
if callback:
return perform_editfield(fldID, ln, "perform_modifyfieldtags", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addtag(fldID, ln=CFG_SITE_LANG, value=['',-1], name='', callback="yes", confirm=-1):
"""form to add a new field.
fldNAME - the name of the new field
code - the field code"""
output = ""
subtitle = """<a name="4.1"></a>Add MARC tag to logical field"""
text = """
Add new tag:<br />
<span class="adminlabel">Tag value</span>
<input class="admin_w200" maxlength="6" type="text" name="value" value="%s" /><br />
<span class="adminlabel">Tag comment</span>
<input class="admin_w200" type="text" name="name" value="%s" /><br />
""" % ((name=='' and value[0] or name), value[0])
text += """Or existing tag:<br />
<span class="adminlabel">Tag</span>
<select name="value" class="admin_w200">
<option value="-1">- Select a tag -</option>
"""
fld_tags = get_fld_tags(fldID)
tags = get_tags()
fld_tags = dict(map(lambda x: (x[1], x[0]), fld_tags))
for (id_tag, tname, tvalue) in tags:
if not fld_tags.has_key(id_tag):
text += """<option value="%s" %s>%s</option>""" % (tvalue, (tvalue==value[1] and 'selected="selected"' or ''), "%s - %s" % (tvalue, tname))
text += """</select>"""
output = createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/addtag" % CFG_SITE_URL,
text=text,
fldID=fldID,
ln=ln,
button="Add tag",
confirm=1)
if (value[0] and value[1] in [-1, "-1"]) or (not value[0] and value[1] not in [-1, "-1"]):
if confirm in ["1", 1]:
res = add_fld_tag(fldID, name, (value[0] !='' and value[0] or value[1]))
output += write_outcome(res)
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please choose to add either a new or an existing MARC tag, but not both.</span></b>
"""
body = [output]
if callback:
return perform_modifyfieldtags(fldID, ln, "perform_addtag", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifytag(fldID, tagID, ln=CFG_SITE_LANG, name='', value='', callback='yes', confirm=-1):
"""form to modify a field.
fldID - the field to change."""
subtitle = ""
output = ""
fld_dict = dict(get_def_name('', "field"))
fldID = int(fldID)
tagID = int(tagID)
tag = get_tags(tagID)
if confirm in [-1, "-1"] and not value and not name:
name = tag[0][1]
value = tag[0][2]
subtitle = """<a name="3.1"></a>Modify MARC tag"""
text = """
Any modifications will apply to all logical fields using this tag.<br />
<span class="adminlabel">Tag value</span>
<input class="admin_w200" type="text" name="value" value="%s" /><br />
<span class="adminlabel">Comment</span>
<input class="admin_w200" type="text" name="name" value="%s" /><br />
""" % (value, name)
output += createhiddenform(action="modifytag#4.1",
text=text,
button="Modify",
fldID=fldID,
tagID=tagID,
ln=ln,
confirm=1)
if name and value and confirm in [1, "1"]:
res = modify_tag(tagID, name, value)
output += write_outcome(res)
body = [output]
if callback:
return perform_modifyfieldtags(fldID, ln, "perform_modifytag", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_removefieldtag(fldID, tagID, ln=CFG_SITE_LANG, callback='yes', confirm=0):
"""form to remove a tag from a field.
fldID - the current field, remove the tag from this field.
tagID - remove the tag with this id"""
subtitle = """<a name="4.1"></a>Remove MARC tag from logical field"""
output = ""
fld_dict = dict(get_def_name('', "field"))
if fldID and tagID:
fldID = int(fldID)
tagID = int(tagID)
tag = get_fld_tags(fldID, tagID)
if confirm not in ["1", 1]:
text = """Do you want to remove the tag '%s - %s ' from the field '%s'.""" % (tag[0][3], tag[0][2], fld_dict[fldID])
output += createhiddenform(action="removefieldtag#4.1",
text=text,
button="Confirm",
fldID=fldID,
tagID=tagID,
confirm=1)
elif confirm in ["1", 1]:
res = remove_fldtag(fldID, tagID)
output += write_outcome(res)
body = [output]
if callback:
return perform_modifyfieldtags(fldID, ln, "perform_removefieldtag", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addindexfield(idxID, ln=CFG_SITE_LANG, fldID='', callback="yes", confirm=-1):
"""form to add a new field.
fldNAME - the name of the new field
code - the field code"""
output = ""
subtitle = """<a name="4.1"></a>Add logical field to index"""
text = """
<span class="adminlabel">Field name</span>
<select name="fldID" class="admin_w200">
<option value="-1">- Select a field -</option>
"""
fld = get_fld()
for (fldID2, fldNAME, fldCODE) in fld:
text += """<option value="%s" %s>%s</option>""" % (fldID2, (fldID==fldID2 and 'selected="selected"' or ''), fldNAME)
text += """</select>"""
output = createhiddenform(action="%s/admin/bibindex/bibindexadmin.py/addindexfield" % CFG_SITE_URL,
text=text,
idxID=idxID,
ln=ln,
button="Add field",
confirm=1)
if fldID and not fldID in [-1, "-1"] and confirm in ["1", 1]:
res = add_idx_fld(idxID, fldID)
output += write_outcome(res)
elif confirm in ["1", 1]:
output += """<b><span class="info">Please select a field to add.</span></b>"""
body = [output]
if callback:
return perform_modifyindexfields(idxID, ln, "perform_addindexfield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_removeindexfield(idxID, fldID, ln=CFG_SITE_LANG, callback='yes', confirm=0):
"""form to remove a field from an index.
idxID - the current index, remove the field from this index.
fldID - remove the field with this id"""
subtitle = """<a name="3.1"></a>Remove field from index"""
output = ""
if fldID and idxID:
fldID = int(fldID)
idxID = int(idxID)
fld = get_fld(fldID)
idx = get_idx(idxID)
if fld and idx and confirm not in ["1", 1]:
text = """Do you want to remove the field '%s' from the index '%s'.""" % (fld[0][1], idx[0][1])
output += createhiddenform(action="removeindexfield#3.1",
text=text,
button="Confirm",
idxID=idxID,
fldID=fldID,
confirm=1)
elif confirm in ["1", 1]:
res = remove_idxfld(idxID, fldID)
output += write_outcome(res)
body = [output]
if callback:
return perform_modifyindexfields(idxID, ln, "perform_removeindexfield", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_switchtagscore(fldID, id_1, id_2, ln=CFG_SITE_LANG):
"""Switch the score of id_1 and id_2 in the table type.
colID - the current collection
id_1/id_2 - the id's to change the score for.
type - like "format" """
output = ""
name_1 = run_sql("select name from tag where id=%s", (id_1, ))[0][0]
name_2 = run_sql("select name from tag where id=%s", (id_2, ))[0][0]
res = switch_score(fldID, id_1, id_2)
output += write_outcome(res)
return perform_modifyfieldtags(fldID, ln, content=output)
def perform_deletetag(fldID, ln=CFG_SITE_LANG, tagID=-1, callback='yes', confirm=-1):
"""form to delete an MARC tag not in use.
fldID - the collection id of the current collection.
fmtID - the format id to delete."""
subtitle = """<a name="10.3"></a>Delete an unused MARC tag"""
output = """
<dl>
<dd>Deleting a MARC tag will also delete the associated translations.</dd>
</dl>
"""
fldID = int(fldID)
if tagID not in [-1," -1"] and confirm in [1, "1"]:
ares = delete_tag(tagID)
fld_tag = get_fld_tags()
fld_tag = dict(map(lambda x: (x[1], x[0]), fld_tag))
tags = get_tags()
text = """
<span class="adminlabel">MARC tag</span>
<select name="tagID" class="admin_w200">
"""
text += """<option value="-1">- Select MARC tag -"""
i = 0
for (id, name, value) in tags:
if not fld_tag.has_key(id):
text += """<option value="%s" %s>%s</option>""" % (id, id == int(tagID) and 'selected="selected"' or '', "%s - %s" % (value, name))
i += 1
text += """</select><br />"""
if i == 0:
output += """<b><span class="info">No unused MARC tags</span></b><br />"""
else:
output += createhiddenform(action="deletetag#4.1",
text=text,
button="Delete",
fldID=fldID,
ln=ln,
confirm=0)
if tagID not in [-1,"-1"]:
tagID = int(tagID)
tags = get_tags(tagID)
if confirm in [0, "0"]:
text = """<b>Do you want to delete the MARC tag '%s'.</b>""" % tags[0][2]
output += createhiddenform(action="deletetag#4.1",
text=text,
button="Confirm",
fldID=fldID,
tagID=tagID,
ln=ln,
confirm=1)
elif confirm in [1, "1"]:
output += write_outcome(ares)
elif confirm not in [-1, "-1"]:
output += """<b><span class="info">Choose a MARC tag to delete.</span></b>"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfieldtags(fldID, ln, content=output)
def compare_on_val(first, second):
"""Compare the two values"""
return cmp(first[1], second[1])
def get_col_fld(colID=-1, type = '', id_field=''):
"""Returns either all portalboxes associated with a collection, or based on either colID or language or both.
colID - collection id
ln - language id"""
sql = "SELECT id_collection,id_field,id_fieldvalue,type,score,score_fieldvalue FROM collection_field_fieldvalue, field WHERE id_field=field.id"
params = []
try:
if id_field:
sql += " AND id_field=%s"
params.append(id_field)
sql += " ORDER BY type, score desc, score_fieldvalue desc"
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_idx(idxID=''):
"""Returns all indexes, or only the given one.
idxID - index id"""
sql = "SELECT id,name,description,last_updated,stemming_language, synonym_kbrs,remove_stopwords,remove_html_markup,remove_latex_markup,tokenizer FROM idxINDEX"
params = []
try:
if idxID:
sql += " WHERE id=%s"
params.append(idxID)
sql += " ORDER BY id asc"
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_idx_synonym_kb(idxID):
"""Returns a synonym knowledge base field value"""
try:
return run_sql("SELECT synonym_kbrs FROM idxINDEX WHERE ID=%s", (idxID, ))[0][0]
except StandardError, e:
return e.__str__()
def get_idx_remove_stopwords(idxID):
"""Returns a stopwords field value"""
try:
return run_sql("SELECT remove_stopwords FROM idxINDEX WHERE ID=%s", (idxID, ))[0][0]
except StandardError, e:
return (0, e)
def get_idx_remove_html_markup(idxID):
"""Returns a remove html field value"""
try:
return run_sql("SELECT remove_html_markup FROM idxINDEX WHERE ID=%s", (idxID, ))[0][0]
except StandardError, e:
return (0, e)
def get_idx_remove_latex_markup(idxID):
"""Returns a remove latex field value"""
try:
return run_sql("SELECT remove_latex_markup FROM idxINDEX WHERE ID=%s", (idxID, ))[0][0]
except StandardError, e:
return (0, e)
def get_idx_tokenizer(idxID):
"""Returns a tokenizer field value"""
try:
return run_sql("SELECT tokenizer FROM idxINDEX WHERE ID=%s", (idxID, ))[0][0]
except StandardError, e:
return (0, e)
def get_fld_tags(fldID='', tagID=''):
"""Returns tags associated with a field.
fldID - field id
tagID - tag id"""
sql = "SELECT id_field,id_tag, tag.name, tag.value, score FROM field_tag,tag WHERE tag.id=field_tag.id_tag"
params = []
try:
if fldID:
sql += " AND id_field=%s"
params.append(fldID)
if tagID:
sql += " AND id_tag=%s"
params.append(tagID)
sql += " ORDER BY score desc, tag.value, tag.name"
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_tags(tagID=''):
"""Returns all or a given tag.
tagID - tag id
ln - language id"""
sql = "SELECT id, name, value FROM tag"
params = []
try:
if tagID:
sql += " WHERE id=%s"
params.append(tagID)
sql += " ORDER BY name, value"
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_fld(fldID=''):
"""Returns all fields or only the given field"""
try:
if not fldID:
res = run_sql("SELECT id, name, code FROM field ORDER by name, code")
else:
res = run_sql("SELECT id, name, code FROM field WHERE id=%s ORDER by name, code", (fldID, ))
return res
except StandardError, e:
return ""
def get_fld_id(fld_name=''):
"""Returns field id for a field name"""
try:
res = run_sql('SELECT id FROM field WHERE name=%s', (fld_name,))
return res[0][0]
except StandardError, e:
return ''
def get_fld_value(fldvID = ''):
"""Returns fieldvalue"""
try:
sql = "SELECT id, name, value FROM fieldvalue"
params = []
if fldvID:
sql += " WHERE id=%s"
params.append(fldvID)
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_idx_fld(idxID=''):
"""Return a list of fields associated with one or all indexes"""
try:
sql = "SELECT id_idxINDEX, idxINDEX.name, id_field, field.name, regexp_punctuation, regexp_alphanumeric_separators FROM idxINDEX, field, idxINDEX_field WHERE idxINDEX.id = idxINDEX_field.id_idxINDEX AND field.id = idxINDEX_field.id_field"
params = []
if idxID:
sql += " AND id_idxINDEX=%s"
params.append(idxID)
sql += " ORDER BY id_idxINDEX asc"
res = run_sql(sql, tuple(params))
return res
except StandardError, e:
return ""
def get_col_nametypes():
"""Return a list of the various translationnames for the fields"""
type = []
type.append(('ln', 'Long name'))
return type
def get_fld_nametypes():
"""Return a list of the various translationnames for the fields"""
type = []
type.append(('ln', 'Long name'))
return type
def get_idx_nametypes():
"""Return a list of the various translationnames for the index"""
type = []
type.append(('ln', 'Long name'))
return type
def get_sort_nametypes():
"""Return a list of the various translationnames for the fields"""
type = {}
type['soo'] = 'Sort options'
type['seo'] = 'Search options'
type['sew'] = 'Search within'
return type
def remove_fld(colID,fldID, fldvID=''):
"""Removes a field from the collection given.
colID - the collection the format is connected to
fldID - the field which should be removed from the collection."""
try:
sql = "DELETE FROM collection_field_fieldvalue WHERE id_collection=%s AND id_field=%s"
params = [colID, fldID]
if fldvID:
sql += " AND id_fieldvalue=%s"
params.append(fldvID)
res = run_sql(sql, tuple(params))
return (1, "")
except StandardError, e:
return (0, e)
def remove_idxfld(idxID, fldID):
"""Remove a field from a index in table idxINDEX_field
idxID - index id from idxINDEX
fldID - field id from field table"""
try:
sql = "DELETE FROM idxINDEX_field WHERE id_field=%s and id_idxINDEX=%s"
res = run_sql(sql, (fldID, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def remove_fldtag(fldID,tagID):
"""Removes a tag from the field given.
fldID - the field the tag is connected to
tagID - the tag which should be removed from the field."""
try:
sql = "DELETE FROM field_tag WHERE id_field=%s AND id_tag=%s"
res = run_sql(sql, (fldID, tagID))
return (1, "")
except StandardError, e:
return (0, e)
def delete_tag(tagID):
"""Deletes all data for the given field
fldID - delete all data in the tables associated with field and this id """
try:
res = run_sql("DELETE FROM tag where id=%s", (tagID, ))
return (1, "")
except StandardError, e:
return (0, e)
def delete_idx(idxID):
"""Deletes all data for the given index together with the idxWORDXXR and idxWORDXXF tables"""
try:
idxID = int(idxID)
res = run_sql("DELETE FROM idxINDEX WHERE id=%s", (idxID, ))
res = run_sql("DELETE FROM idxINDEXNAME WHERE id_idxINDEX=%s", (idxID, ))
res = run_sql("DELETE FROM idxINDEX_field WHERE id_idxINDEX=%s", (idxID, ))
res = run_sql("DROP TABLE idxWORD%02dF" % idxID) # kwalitee: disable=sql
res = run_sql("DROP TABLE idxWORD%02dR" % idxID) # kwalitee: disable=sql
res = run_sql("DROP TABLE idxPAIR%02dF" % idxID) # kwalitee: disable=sql
res = run_sql("DROP TABLE idxPAIR%02dR" % idxID) # kwalitee: disable=sql
res = run_sql("DROP TABLE idxPHRASE%02dF" % idxID) # kwalitee: disable=sql
res = run_sql("DROP TABLE idxPHRASE%02dR" % idxID) # kwalitee: disable=sql
return (1, "")
except StandardError, e:
return (0, e)
def delete_virtual_idx(idxID):
"""Deletes this virtual index - it means that function
changes type of the index from 'virtual' to 'normal'
@param idxID -id of the virtual index to delete/change into normal idx
"""
try:
run_sql("""UPDATE idxINDEX SET indexer='native'
WHERE id=%s""", (idxID, ))
run_sql("""DELETE FROM idxINDEX_idxINDEX
WHERE id_virtual=%s""", (idxID, ))
return (1, "")
except StandardError, e:
return (0, e)
def delete_fld(fldID):
"""Deletes all data for the given field
fldID - delete all data in the tables associated with field and this id """
try:
res = run_sql("DELETE FROM collection_field_fieldvalue WHERE id_field=%s", (fldID, ))
res = run_sql("DELETE FROM field_tag WHERE id_field=%s", (fldID, ))
res = run_sql("DELETE FROM idxINDEX_field WHERE id_field=%s", (fldID, ))
res = run_sql("DELETE FROM field WHERE id=%s", (fldID, ))
return (1, "")
except StandardError, e:
return (0, e)
def add_idx(idxNAME):
"""Add a new index. returns the id of the new index.
idxID - the id for the index, number
idxNAME - the default name for the default language of the format."""
try:
idxID = 0
res = run_sql("SELECT id from idxINDEX WHERE name=%s", (idxNAME,))
if res:
return (0, (0, "A index with the given name already exists."))
for i in xrange(1, 100):
res = run_sql("SELECT id from idxINDEX WHERE id=%s", (i, ))
res2 = get_table_status_info("idxWORD%02d%%" % i)
if not res and not res2:
idxID = i
break
if idxID == 0:
return (0, (0, "Not possible to create new indexes, delete an index and try again."))
res = run_sql("INSERT INTO idxINDEX (id, name) VALUES (%s,%s)", (idxID, idxNAME))
type = get_idx_nametypes()[0][0]
res = run_sql("INSERT INTO idxINDEXNAME (id_idxINDEX, ln, type, value) VALUES (%s,%s,%s,%s)",
(idxID, CFG_SITE_LANG, type, idxNAME))
res = run_sql("""CREATE TABLE IF NOT EXISTS idxWORD%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(50) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM""" % idxID)
res = run_sql("""CREATE TABLE IF NOT EXISTS idxWORD%02dR (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % idxID)
res = run_sql("""CREATE TABLE IF NOT EXISTS idxPAIR%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(100) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM""" % idxID)
res = run_sql("""CREATE TABLE IF NOT EXISTS idxPAIR%02dR (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % idxID)
res = run_sql("""CREATE TABLE IF NOT EXISTS idxPHRASE%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term text default NULL,
hitlist longblob,
PRIMARY KEY (id),
KEY term (term(50))
) ENGINE=MyISAM""" % idxID)
res = run_sql("""CREATE TABLE IF NOT EXISTS idxPHRASE%02dR (
id_bibrec mediumint(9) unsigned NOT NULL default '0',
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % idxID)
res = run_sql("SELECT id from idxINDEX WHERE id=%s", (idxID, ))
res2 = get_table_status_info("idxWORD%02dF" % idxID)
res3 = get_table_status_info("idxWORD%02dR" % idxID)
if res and res2 and res3:
return (1, res[0][0])
elif not res:
return (0, (0, "Could not add the new index to idxINDEX"))
elif not res2:
return (0, (0, "Forward table not created for unknown reason."))
elif not res3:
return (0, (0, "Reverse table not created for unknown reason."))
except StandardError, e:
return (0, e)
def add_virtual_idx(id_virtual, id_normal):
"""Adds new virtual index and its first dependent index.
Doesn't change index's settings, but they're not
used anymore.
Uses function add_dependent_index, because
query in both cases is the same.
"""
try:
run_sql("""UPDATE idxINDEX SET indexer='virtual'
WHERE id=%s""", (id_virtual, ))
return add_dependent_index(id_virtual, id_normal)
except StandardError, e:
return (0, e)
def modify_dependent_indexes(idxID, indexes_to_add, indexes_to_remove):
"""Adds and removes dependent indexes"""
all_indexes = dict(get_all_index_names_and_column_values("id"))
for index_name in indexes_to_add:
res = add_dependent_index(idxID, all_indexes[index_name])
if res[0] == 0:
return res
for index_name in indexes_to_remove:
res = remove_dependent_index(idxID, all_indexes[index_name])
if res[0] == 0:
return res
return (1, "")
def add_dependent_index(id_virtual, id_normal):
"""Adds dependent index to specific virtual index"""
try:
query = """INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal)
VALUES (%s, %s)""" % (id_virtual, id_normal)
res = run_sql(query)
return (1, "")
except StandardError, e:
return (0, e)
def remove_dependent_index(id_virtual, id_normal):
"""Remove dependent index to specific virtual index"""
try:
query = """DELETE FROM idxINDEX_idxINDEX
WHERE id_virtual=%s AND
id_normal=%s
""" % (id_virtual, id_normal)
res = run_sql(query)
return (1, "")
except StandardError, e:
return (0, e)
def add_fld(name, code):
"""Add a new logical field. Returns the id of the field.
code - the code for the field,
name - the default name for the default language of the field."""
try:
type = get_fld_nametypes()[0][0]
res = run_sql("INSERT INTO field (name, code) VALUES (%s,%s)", (name, code))
fldID = run_sql("SELECT id FROM field WHERE code=%s", (code,))
res = run_sql("INSERT INTO fieldname (id_field, type, ln, value) VALUES (%s,%s,%s,%s)", (fldID[0][0], type, CFG_SITE_LANG, name))
if fldID:
return (1, fldID[0][0])
else:
raise StandardError
except StandardError, e:
return (0, e)
def add_fld_tag(fldID, name, value):
"""Add a sort/search/field to the collection.
colID - the id of the collection involved
fmtID - the id of the format.
score - the score of the format, decides sorting, if not given, place the format on top"""
try:
res = run_sql("SELECT score FROM field_tag WHERE id_field=%s ORDER BY score desc", (fldID, ))
if res:
score = int(res[0][0]) + 1
else:
score = 0
res = run_sql("SELECT id FROM tag WHERE value=%s", (value,))
if not res:
if name == '':
name = value
res = run_sql("INSERT INTO tag (name, value) VALUES (%s,%s)", (name, value))
res = run_sql("SELECT id FROM tag WHERE value=%s", (value,))
res = run_sql("INSERT INTO field_tag(id_field, id_tag, score) values(%s, %s, %s)", (fldID, res[0][0], score))
return (1, "")
except StandardError, e:
return (0, e)
def add_idx_fld(idxID, fldID):
"""Add a field to an index"""
try:
sql = "SELECT id_idxINDEX FROM idxINDEX_field WHERE id_idxINDEX=%s and id_field=%s"
res = run_sql(sql, (idxID, fldID))
if res:
return (0, (0, "The field selected already exists for this index"))
sql = "INSERT INTO idxINDEX_field(id_idxINDEX, id_field) values (%s, %s)"
res = run_sql(sql, (idxID, fldID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx(idxID, idxNAME, idxDESC):
"""Modify index name or index description in idxINDEX table"""
try:
res = run_sql("UPDATE idxINDEX SET name=%s WHERE id=%s", (idxNAME, idxID))
res = run_sql("UPDATE idxINDEX SET description=%s WHERE ID=%s", (idxDESC, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_stemming(idxID, idxSTEM):
"""Modify the index stemming language in idxINDEX table"""
try:
run_sql("UPDATE idxINDEX SET stemming_language=%s WHERE ID=%s", (idxSTEM, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_indexer(idxID, indexer):
"""Modify an indexer type in idxINDEX table"""
try:
res = run_sql("UPDATE idxINDEX SET indexer=%s WHERE ID=%s", (indexer, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_synonym_kb(idxID, idxKB, idxMATCH):
"""Modify the knowledge base for the synonym lookup in idxINDEX table
@param idxID: id of the index in idxINDEX table
@param idxKB: name of the knowledge base (for example: INDEX-SYNONYM-TITLE)
@param idxMATCH: type of match in the knowledge base: exact, leading-to-coma, leading-to-number
"""
try:
field_value = ""
if idxKB != CFG_BIBINDEX_SYNONYM_MATCH_TYPE["None"] and idxMATCH != CFG_BIBINDEX_SYNONYM_MATCH_TYPE["None"]:
field_value = idxKB + CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR + idxMATCH
run_sql("UPDATE idxINDEX SET synonym_kbrs=%s WHERE ID=%s", (field_value, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_stopwords(idxID, idxSTOPWORDS):
"""Modify the stopwords in idxINDEX table
@param idxID: id of the index which we modify
@param idxSTOPWORDS: tells if stopwords should be removed ('Yes' or 'No')
"""
try:
run_sql("UPDATE idxINDEX SET remove_stopwords=%s WHERE ID=%s", (idxSTOPWORDS, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_html_markup(idxID, idxHTML):
"""Modify the index remove html markup in idxINDEX table"""
try:
run_sql("UPDATE idxINDEX SET remove_html_markup=%s WHERE ID=%s", (idxHTML, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_latex_markup(idxID, idxLATEX):
"""Modify the index remove latex markup in idxINDEX table"""
try:
run_sql("UPDATE idxINDEX SET remove_latex_markup=%s WHERE ID=%s", (idxLATEX, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_idx_tokenizer(idxID, idxTOK):
"""Modify a tokenizer in idxINDEX table for given index"""
try:
run_sql("UPDATE idxINDEX SET tokenizer=%s WHERE ID=%s", (idxTOK, idxID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_fld(fldID, code):
"""Modify the code of field
fldID - the id of the field to modify
code - the new code"""
try:
sql = "UPDATE field SET code=%s"
sql += " WHERE id=%s"
res = run_sql(sql, (code, fldID))
return (1, "")
except StandardError, e:
return (0, e)
def modify_tag(tagID, name, value):
"""Modify the name and value of a tag.
tagID - the id of the tag to modify
name - the new name of the tag
value - the new value of the tag"""
try:
sql = "UPDATE tag SET name=%s WHERE id=%s"
res = run_sql(sql, (name, tagID))
sql = "UPDATE tag SET value=%s WHERE id=%s"
res = run_sql(sql, (value, tagID))
return (1, "")
except StandardError, e:
return (0, e)
def switch_score(fldID, id_1, id_2):
"""Switch the scores of id_1 and id_2 in the table given by the argument.
colID - collection the id_1 or id_2 is connected to
id_1/id_2 - id field from tables like format..portalbox...
table - name of the table"""
try:
res1 = run_sql("SELECT score FROM field_tag WHERE id_field=%s and id_tag=%s", (fldID, id_1))
res2 = run_sql("SELECT score FROM field_tag WHERE id_field=%s and id_tag=%s", (fldID, id_2))
res = run_sql("UPDATE field_tag SET score=%s WHERE id_field=%s and id_tag=%s", (res2[0][0], fldID, id_1))
res = run_sql("UPDATE field_tag SET score=%s WHERE id_field=%s and id_tag=%s", (res1[0][0], fldID, id_2))
return (1, "")
except StandardError, e:
return (0, e)
def get_lang_list(table, field, id):
langs = run_sql("SELECT ln FROM %s WHERE %s=%%s" % (wash_table_column_name(table), wash_table_column_name(field)), (id, )) # kwalitee: disable=sql
exists = {}
lang = ''
for lng in langs:
if not exists.has_key(lng[0]):
lang += lng[0] + ", "
exists[lng[0]] = 1
if lang.endswith(", "):
lang = lang [:-2]
if len(exists) == 0:
lang = """<b><span class="info">None</span></b>"""
return lang
def check_user(req, role, adminarea=2, authorized=0):
# FIXME: Add doctype.
# This function is similar to the one found in
# oairepository/lib/oai_repository_admin.py, bibrank/lib/bibrankadminlib.py and
# websubmit/lib/websubmitadmin_engine.py.
auth_code, auth_message = acc_authorize_action(req, role)
if not authorized and auth_code != 0:
return ("false", auth_message)
return ("", auth_message)
diff --git a/invenio/legacy/bibindex/engine.py b/invenio/legacy/bibindex/engine.py
index a4695077b..7878fee98 100644
--- a/invenio/legacy/bibindex/engine.py
+++ b/invenio/legacy/bibindex/engine.py
@@ -1,1985 +1,1984 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibIndex indexing engine implementation. See bibindex executable for entry point.
"""
__revision__ = "$Id$"
import re
import sys
import time
import fnmatch
from datetime import datetime
from time import strptime
from invenio.config import CFG_SOLR_URL
from invenio.bibindex_engine_config import CFG_MAX_MYSQL_THREADS, \
CFG_MYSQL_THREAD_TIMEOUT, \
CFG_CHECK_MYSQL_THREADS, \
CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR, \
CFG_BIBINDEX_INDEX_TABLE_TYPE, \
CFG_BIBINDEX_ADDING_RECORDS_STARTED_STR, \
CFG_BIBINDEX_UPDATE_MESSAGE
from invenio.bibauthority_config import \
CFG_BIBAUTHORITY_CONTROLLED_FIELDS_BIBLIOGRAPHIC, \
CFG_BIBAUTHORITY_RECORD_CONTROL_NUMBER_FIELD
from invenio.bibauthority_engine import get_index_strings_by_control_no,\
get_control_nos_from_recID
from invenio.bibindexadminlib import get_idx_remove_html_markup, \
get_idx_remove_latex_markup, \
get_idx_remove_stopwords
from invenio.bibdocfile import BibRecDocs
-from invenio.search_engine import perform_request_search, \
from invenio.legacy.search_engine import perform_request_search, \
get_index_stemming_language, \
get_synonym_terms, \
search_pattern, \
search_unit_in_bibrec
from invenio.legacy.dbquery import run_sql, DatabaseError, serialize_via_marshal, \
deserialize_via_marshal, wash_table_column_name
-from invenio.bibindex_engine_washer import wash_index_term
-from invenio.bibtask import task_init, write_message, get_datetime, \
+from invenio.legacy.bibindex.engine_washer import wash_index_term
+from invenio.legacy.bibsched.bibtask import task_init, write_message, get_datetime, \
task_set_option, task_get_option, task_get_task_param, \
task_update_progress, task_sleep_now_if_required
from invenio.intbitset import intbitset
from invenio.ext.logging import register_exception
from invenio.legacy.bibrank.adminlib import get_def_name
from invenio.solrutils_bibindex_indexer import solr_commit
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import \
CFG_JOURNAL_TAG, \
CFG_JOURNAL_PUBINFO_STANDARD_FORM, \
CFG_JOURNAL_PUBINFO_STANDARD_FORM_REGEXP_CHECK
from invenio.bibindex_engine_utils import load_tokenizers, \
get_all_index_names_and_column_values, \
get_idx_indexer, \
get_index_tags, \
get_field_tags, \
get_tag_indexes, \
get_all_indexes, \
get_all_virtual_indexes, \
get_index_virtual_indexes, \
is_index_virtual, \
get_virtual_index_building_blocks, \
get_index_id_from_index_name, \
get_index_name_from_index_id, \
run_sql_drop_silently, \
get_min_last_updated, \
remove_inexistent_indexes
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.bibfield import get_record
from invenio.memoiseutils import Memoise
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
## precompile some often-used regexp for speed reasons:
re_subfields = re.compile('\$\$\w')
re_datetime_shift = re.compile("([-\+]{0,1})([\d]+)([dhms])")
nb_char_in_line = 50 # for verbose pretty printing
chunksize = 1000 # default size of chunks that the records will be treated by
base_process_size = 4500 # process base size
_last_word_table = None
_TOKENIZERS = load_tokenizers()
def list_union(list1, list2):
"Returns union of the two lists."
union_dict = {}
for e in list1:
union_dict[e] = 1
for e in list2:
union_dict[e] = 1
return union_dict.keys()
def list_unique(_list):
"""Returns a _list with duplicates removed."""
_dict = {}
for e in _list:
_dict[e] = 1
return _dict.keys()
## safety function for killing slow DB threads:
def kill_sleepy_mysql_threads(max_threads=CFG_MAX_MYSQL_THREADS, thread_timeout=CFG_MYSQL_THREAD_TIMEOUT):
"""Check the number of DB threads and if there are more than
MAX_THREADS of them, kill all threads that are in a sleeping
state for more than THREAD_TIMEOUT seconds. (This is useful
for working around the max_connection problem that appears
during indexation in some not-yet-understood cases.) If some
threads are to be killed, write info into the log file.
"""
res = run_sql("SHOW FULL PROCESSLIST")
if len(res) > max_threads:
for row in res:
r_id, dummy, dummy, dummy, r_command, r_time, dummy, dummy = row
if r_command == "Sleep" and int(r_time) > thread_timeout:
run_sql("KILL %s", (r_id,))
write_message("WARNING: too many DB threads, killing thread %s" % r_id, verbose=1)
return
def get_associated_subfield_value(recID, tag, value, associated_subfield_code):
"""Return list of ASSOCIATED_SUBFIELD_CODE, if exists, for record
RECID and TAG of value VALUE. Used by fulltext indexer only.
Note: TAG must be 6 characters long (tag+ind1+ind2+sfcode),
otherwise an empty string is returned.
FIXME: what if many tag values have the same value but different
associated_subfield_code? Better use bibrecord library for this.
"""
out = ""
if len(tag) != 6:
return out
bibXXx = "bib" + tag[0] + tag[1] + "x"
bibrec_bibXXx = "bibrec_" + bibXXx
query = """SELECT bb.field_number, b.tag, b.value FROM %s AS b, %s AS bb
WHERE bb.id_bibrec=%%s AND bb.id_bibxxx=b.id AND tag LIKE
%%s%%""" % (bibXXx, bibrec_bibXXx)
res = run_sql(query, (recID, tag[:-1]))
field_number = -1
for row in res:
if row[1] == tag and row[2] == value:
field_number = row[0]
if field_number > 0:
for row in res:
if row[0] == field_number and row[1] == tag[:-1] + associated_subfield_code:
out = row[2]
break
return out
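# A hedged illustration of the lookup above (example values assumed, not
# from the original file): for a record whose field instance 700__a holds
# "Ellis, J." alongside a subfield $$u "CERN",
# get_associated_subfield_value(recID, "700__a", "Ellis, J.", "u")
# would return "CERN": the first pass finds the field_number of the
# matching instance, the second pass reads its associated subfield code.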
def get_author_canonical_ids_for_recid(recID):
"""
Return list of author canonical IDs (e.g. `J.Ellis.1') for the
given record. Done by consulting BibAuthorID module.
"""
- from invenio.bibauthorid_dbinterface import get_persons_from_recids
+ from invenio.legacy.bibauthorid.dbinterface import get_persons_from_recids
lwords = []
res = get_persons_from_recids([recID])
if res is None:
## BibAuthorID is not enabled
return lwords
else:
dpersons, dpersoninfos = res
for aid in dpersoninfos.keys():
author_canonical_id = dpersoninfos[aid].get('canonical_id', '')
if author_canonical_id:
lwords.append(author_canonical_id)
return lwords
def swap_temporary_reindex_tables(index_id, reindex_prefix="tmp_"):
"""Atomically swap reindexed temporary table with the original one.
Delete the now-old one."""
is_virtual = is_index_virtual(index_id)
if is_virtual:
write_message("Removing %s index tables for id %s" % (reindex_prefix, index_id))
query = """DROP TABLE IF EXISTS %%sidxWORD%02dR, %%sidxWORD%02dF,
%%sidxPAIR%02dR, %%sidxPAIR%02dF,
%%sidxPHRASE%02dR, %%sidxPHRASE%02dF
""" % ((index_id,)*6)
query = query % ((reindex_prefix,)*6)
run_sql(query)
else:
write_message("Putting new tmp index tables for id %s into production" % index_id)
run_sql(
"RENAME TABLE " +
"idxWORD%02dR TO old_idxWORD%02dR," % (index_id, index_id) +
"%sidxWORD%02dR TO idxWORD%02dR," % (reindex_prefix, index_id, index_id) +
"idxWORD%02dF TO old_idxWORD%02dF," % (index_id, index_id) +
"%sidxWORD%02dF TO idxWORD%02dF," % (reindex_prefix, index_id, index_id) +
"idxPAIR%02dR TO old_idxPAIR%02dR," % (index_id, index_id) +
"%sidxPAIR%02dR TO idxPAIR%02dR," % (reindex_prefix, index_id, index_id) +
"idxPAIR%02dF TO old_idxPAIR%02dF," % (index_id, index_id) +
"%sidxPAIR%02dF TO idxPAIR%02dF," % (reindex_prefix, index_id, index_id) +
"idxPHRASE%02dR TO old_idxPHRASE%02dR," % (index_id, index_id) +
"%sidxPHRASE%02dR TO idxPHRASE%02dR," % (reindex_prefix, index_id, index_id) +
"idxPHRASE%02dF TO old_idxPHRASE%02dF," % (index_id, index_id) +
"%sidxPHRASE%02dF TO idxPHRASE%02dF;" % (reindex_prefix, index_id, index_id)
)
write_message("Dropping old index tables for id %s" % index_id)
run_sql_drop_silently("DROP TABLE old_idxWORD%02dR, old_idxWORD%02dF, old_idxPAIR%02dR, old_idxPAIR%02dF, old_idxPHRASE%02dR, old_idxPHRASE%02dF" % (index_id, index_id, index_id, index_id, index_id, index_id)) # kwalitee: disable=sql
def init_temporary_reindex_tables(index_id, reindex_prefix="tmp_"):
"""Create reindexing temporary tables."""
write_message("Creating new tmp index tables for id %s" % index_id)
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxWORD%02dF""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxWORD%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(50) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxWORD%02dR""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxWORD%02dR (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxPAIR%02dF""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxPAIR%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(100) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxPAIR%02dR""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxPAIR%02dR (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxPHRASE%02dF""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxPHRASE%02dF (
id mediumint(9) unsigned NOT NULL auto_increment,
term text default NULL,
hitlist longblob,
PRIMARY KEY (id),
KEY term (term(50))
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
run_sql_drop_silently("""DROP TABLE IF EXISTS %sidxPHRASE%02dR""" % (wash_table_column_name(reindex_prefix), index_id)) # kwalitee: disable=sql
run_sql("""CREATE TABLE %sidxPHRASE%02dR (
id_bibrec mediumint(9) unsigned NOT NULL default '0',
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM""" % (reindex_prefix, index_id))
def remove_subfields(s):
"Removes subfields from string, e.g. 'foo $$c bar' becomes 'foo bar'."
return re_subfields.sub(' ', s)
def get_field_indexes(field):
"""Returns indexes names and ids corresponding to the given field"""
if field[0:3].isdigit():
#field is actually a tag
return get_tag_indexes(field, virtual=False)
else:
#future implementation for fields
return []
get_field_indexes_memoised = Memoise(get_field_indexes)
def get_all_synonym_knowledge_bases():
"""Returns a dictionary of name key and knowledge base name and match type tuple value
information of all defined words indexes that have knowledge base information.
Returns empty dictionary in case there are no tags indexed.
Example: output['global'] = ('INDEX-SYNONYM-TITLE', 'exact'), output['title'] = ('INDEX-SYNONYM-TITLE', 'exact')."""
res = get_all_index_names_and_column_values("synonym_kbrs")
out = {}
for row in res:
kb_data = row[1]
# ignore empty strings
if len(kb_data):
out[row[0]] = tuple(kb_data.split(CFG_BIBINDEX_COLUMN_VALUE_SEPARATOR))
return out
def get_index_remove_stopwords(index_id):
"""Returns value of a remove_stopword field from idxINDEX database table
if it's not 'No'. If it's 'No' returns False.
Just for consistency with WordTable.
@param index_id: id of the index
"""
result = get_idx_remove_stopwords(index_id)
if isinstance(result, tuple):
return False
if result == 'No' or result == '':
return False
return result
def get_index_remove_html_markup(index_id):
""" Gets remove_html_markup parameter from database ('Yes' or 'No') and
changes it to True, False.
Just for consistency with WordTable."""
result = get_idx_remove_html_markup(index_id)
if result == 'Yes':
return True
return False
def get_index_remove_latex_markup(index_id):
""" Gets remove_latex_markup parameter from database ('Yes' or 'No') and
changes it to True, False.
Just for consistency with WordTable."""
result = get_idx_remove_latex_markup(index_id)
if result == 'Yes':
return True
return False
def get_index_tokenizer(index_id):
"""Returns value of a tokenizer field from idxINDEX database table
@param index_id: id of the index
"""
query = "SELECT tokenizer FROM idxINDEX WHERE id=%s" % index_id
out = None
try:
res = run_sql(query)
if res:
out = _TOKENIZERS[res[0][0]]
except DatabaseError:
write_message("Exception caught for SQL statement: %s; column tokenizer might not exist" % query, sys.stderr)
except KeyError:
write_message("Exception caught: there is no such tokenizer")
out = None
return out
def get_last_updated_all_indexes():
"""Returns last modification date for all defined indexes"""
query= """SELECT name, last_updated FROM idxINDEX"""
res = run_sql(query)
return res
def split_ranges(parse_string):
"""Parse a string a return the list or ranges."""
recIDs = []
ranges = parse_string.split(",")
for arange in ranges:
tmp_recIDs = arange.split("-")
if len(tmp_recIDs) == 1:
recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[0])])
else:
if int(tmp_recIDs[0]) > int(tmp_recIDs[1]): # sanity check
tmp = tmp_recIDs[0]
tmp_recIDs[0] = tmp_recIDs[1]
tmp_recIDs[1] = tmp
recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[1])])
return recIDs
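The parsing behaviour above can be illustrated by a minimal standalone sketch (assuming the same "N" / "N-M" comma-separated syntax, including the bound-swapping sanity check):

```python
def split_ranges_sketch(parse_string):
    """Parse "1-5,7,12-10" into [[1, 5], [7, 7], [10, 12]],
    swapping bounds when a range is given high-to-low."""
    ranges = []
    for part in parse_string.split(","):
        bounds = [int(b) for b in part.split("-")]
        if len(bounds) == 1:
            # a single recID becomes a degenerate [n, n] range
            ranges.append([bounds[0], bounds[0]])
        else:
            ranges.append([min(bounds), max(bounds)])
    return ranges
```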
def get_word_tables(tables):
""" Given a list of table names it return a list of tuples
(index_id, index_name, index_tags).
"""
wordTables = []
if tables:
for index in tables:
index_id = get_index_id_from_index_name(index)
if index_id:
wordTables.append((index_id, index, get_index_tags(index)))
else:
write_message("Error: There is no %s words table." % index, sys.stderr)
return wordTables
def get_date_range(var):
"Returns the two dates contained as a low,high tuple"
limits = var.split(",")
if len(limits) == 1:
low = get_datetime(limits[0])
return low, None
if len(limits) == 2:
low = get_datetime(limits[0])
high = get_datetime(limits[1])
return low, high
return None, None
def create_range_list(res):
"""Creates a range list from a recID select query result contained
in res. The result is expected to have ascending numerical order."""
if not res:
return []
row = res[0]
if not row:
return []
else:
range_list = [[row, row]]
for row in res[1:]:
row_id = row
if row_id == range_list[-1][1] + 1:
range_list[-1][1] = row_id
else:
range_list.append([row_id, row_id])
return range_list
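The run-length compression done by create_range_list can be sketched as a self-contained function (same logic, plain ints instead of SQL rows):

```python
def create_range_list_sketch(ids):
    """Collapse an ascending list of record ids into [low, high] ranges."""
    if not ids:
        return []
    ranges = [[ids[0], ids[0]]]
    for rec_id in ids[1:]:
        if rec_id == ranges[-1][1] + 1:
            ranges[-1][1] = rec_id          # extend the current run
        else:
            ranges.append([rec_id, rec_id])  # start a new run
    return ranges
```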
def beautify_range_list(range_list):
"""Returns a non overlapping, maximal range list"""
ret_list = []
for new in range_list:
found = 0
for old in ret_list:
if new[0] <= old[0] <= new[1] + 1 or new[0] - 1 <= old[1] <= new[1]:
old[0] = min(old[0], new[0])
old[1] = max(old[1], new[1])
found = 1
break
if not found:
ret_list.append(new)
return ret_list
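A sketch of the merging step above, using an equivalent symmetric overlap-or-adjacency test; like the original, it is a single pass, so ranges already placed in the output are not re-merged against each other:

```python
def beautify_range_list_sketch(range_list):
    """Merge overlapping or adjacent [low, high] ranges into a
    non-overlapping list (single pass, as in beautify_range_list)."""
    merged = []
    for new in range_list:
        for old in merged:
            # overlapping or touching: widen the existing range in place
            if new[0] <= old[1] + 1 and old[0] <= new[1] + 1:
                old[0] = min(old[0], new[0])
                old[1] = max(old[1], new[1])
                break
        else:
            merged.append(list(new))
    return merged
```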
def truncate_index_table(index_name):
"""Properly truncate the given index."""
index_id = get_index_id_from_index_name(index_name)
if index_id:
write_message('Truncating %s index table in order to reindex.' % index_name, verbose=2)
run_sql("UPDATE idxINDEX SET last_updated='0000-00-00 00:00:00' WHERE id=%s", (index_id,))
run_sql("TRUNCATE idxWORD%02dF" % index_id) # kwalitee: disable=sql
run_sql("TRUNCATE idxWORD%02dR" % index_id) # kwalitee: disable=sql
run_sql("TRUNCATE idxPHRASE%02dF" % index_id) # kwalitee: disable=sql
run_sql("TRUNCATE idxPHRASE%02dR" % index_id) # kwalitee: disable=sql
def update_index_last_updated(indexes, starting_time=None):
"""Update last_updated column of the index table in the database.
Puts starting time there so that if the task was interrupted for record download,
the records will be reindexed next time.
@param indexes: list of indexes names
"""
if starting_time is None:
return None
for index_name in indexes:
write_message("updating last_updated to %s...for %s index" % (starting_time, index_name), verbose=9)
run_sql("UPDATE idxINDEX SET last_updated=%s WHERE name=%s", (starting_time, index_name,))
def get_percentage_completed(num_done, num_total):
""" Return a string containing the approx. percentage completed """
percentage_remaining = 100.0 * float(num_done) / float(num_total)
if percentage_remaining:
percentage_display = "(%.1f%%)" % (percentage_remaining,)
else:
percentage_display = ""
return percentage_display
def _fill_dict_of_indexes_with_empty_sets():
"""find_affected_records internal function.
Creates dict: {'index_name1':set([]), ...}
"""
index_dict = {}
tmp_all_indexes = get_all_indexes(virtual=False)
for index in tmp_all_indexes:
index_dict[index] = set([])
return index_dict
def find_affected_records_for_index(indexes=[], recIDs=[], force_all_indexes=False):
"""
Function checks which records need to be changed/reindexed
for given index/indexes.
Makes use of hstRECORD table where different revisions of record
are kept.
If parameter force_all_indexes is set function will assign all recIDs to all indexes.
@param indexes: names of indexes for reindexation separated by coma
@param recIDs: recIDs for reindexation in form: [[range1_down, range1_up],[range2_down, range2_up]..]
@param force_all_indexes: should we index all indexes?
"""
tmp_dates = dict(get_last_updated_all_indexes())
modification_dates = dict([(date, tmp_dates[date] or datetime(1000,1,1,1,1,1)) for date in tmp_dates])
tmp_all_indexes = get_all_indexes(virtual=False)
indexes = remove_inexistent_indexes(indexes, leave_virtual=False)
if not indexes:
return {}
def _should_reindex_for_revision(index_name, revision_date):
try:
if modification_dates[index_name] < revision_date and index_name in indexes:
return True
return False
except KeyError:
return False
if force_all_indexes:
records_for_indexes = {}
all_recIDs = []
for recIDs_range in recIDs:
all_recIDs.extend(range(recIDs_range[0], recIDs_range[1]+1))
for index in indexes:
records_for_indexes[index] = all_recIDs
return records_for_indexes
min_last_updated = get_min_last_updated(indexes)[0][0] or datetime(1000,1,1,1,1,1)
indexes_to_change = _fill_dict_of_indexes_with_empty_sets()
recIDs_info = []
for recIDs_range in recIDs:
query = """SELECT id_bibrec,job_date,affected_fields FROM hstRECORD WHERE
id_bibrec BETWEEN %s AND %s AND job_date > '%s'""" % (recIDs_range[0], recIDs_range[1], min_last_updated)
res = run_sql(query)
if res:
recIDs_info.extend(res)
for recID_info in recIDs_info:
recID, revision, affected_fields = recID_info
affected_fields = affected_fields.split(",")
indexes_for_recID = set()
for field in affected_fields:
if field:
field_indexes = get_field_indexes_memoised(field) or []
indexes_names = set([idx[1] for idx in field_indexes])
indexes_for_recID |= indexes_names
else:
#record was inserted, all fields were changed, no specific affected fields
indexes_for_recID |= set(tmp_all_indexes)
indexes_for_recID_filtered = [ind for ind in indexes_for_recID if _should_reindex_for_revision(ind, revision)]
for index in indexes_for_recID_filtered:
indexes_to_change[index].add(recID)
indexes_to_change = dict((k, list(sorted(v))) for k, v in indexes_to_change.iteritems() if v)
return indexes_to_change
#def update_text_extraction_date(first_recid, last_recid):
#"""for all the bibdoc connected to the specified recid, set
#the text_extraction_date to the task_starting_time."""
#run_sql("UPDATE bibdoc JOIN bibrec_bibdoc ON id=id_bibdoc SET text_extraction_date=%s WHERE id_bibrec BETWEEN %s AND %s", (task_get_task_param('task_starting_time'), first_recid, last_recid))
class WordTable:
"A class to hold the words table."
def __init__(self, index_name, index_id, fields_to_index, table_name_pattern, wordtable_type, tag_to_tokenizer_map, wash_index_terms=50):
"""Creates words table instance.
@param index_name: the index name
@param index_id: the index integer identifier
@param fields_to_index: a list of fields to index
@param table_name_pattern: i.e. idxWORD%02dF or idxPHRASE%02dF
@param wordtable_type: type of the wordtable: Words, Pairs, Phrases
@param tag_to_tokenizer_map: a mapping that specifies a particular tokenizer to
extract words from a particular metadata tag (such as 8564_u)
@param wash_index_terms: do we wash index terms, and if yes (when >0),
how many characters do we keep in the index terms; see
max_char_length parameter of wash_index_term()
"""
self.index_name = index_name
self.index_id = index_id
self.tablename = table_name_pattern % index_id
self.virtual_tablename_pattern = table_name_pattern[table_name_pattern.find('idx'):-1]
self.humanname = get_def_name('%s' % (str(index_id),), "idxINDEX")[0][1]
self.recIDs_in_mem = []
self.fields_to_index = fields_to_index
self.value = {}
try:
self.stemming_language = get_index_stemming_language(index_id)
except KeyError:
self.stemming_language = ''
self.remove_stopwords = get_index_remove_stopwords(index_id)
self.remove_html_markup = get_index_remove_html_markup(index_id)
self.remove_latex_markup = get_index_remove_latex_markup(index_id)
self.tokenizer = get_index_tokenizer(index_id)(self.stemming_language,
self.remove_stopwords,
self.remove_html_markup,
self.remove_latex_markup)
self.default_tokenizer_function = self.tokenizer.get_tokenizing_function(wordtable_type)
self.wash_index_terms = wash_index_terms
self.is_virtual = is_index_virtual(self.index_id)
self.virtual_indexes = get_index_virtual_indexes(self.index_id)
# tagToTokenizer mapping. It offers an indirection level necessary for
# indexing fulltext.
self.tag_to_words_fnc_map = {}
for k in tag_to_tokenizer_map.keys():
special_tokenizer_for_tag = _TOKENIZERS[tag_to_tokenizer_map[k]](self.stemming_language,
self.remove_stopwords,
self.remove_html_markup,
self.remove_latex_markup)
special_tokenizer_function = special_tokenizer_for_tag.get_tokenizing_function(wordtable_type)
self.tag_to_words_fnc_map[k] = special_tokenizer_function
if self.stemming_language and self.tablename.startswith('idxWORD'):
write_message('%s has stemming enabled, language %s' % (self.tablename, self.stemming_language))
def turn_off_virtual_indexes(self):
self.virtual_indexes = []
def turn_on_virtual_indexes(self):
self.virtual_indexes = get_index_virtual_indexes(self.index_id)
def get_field(self, recID, tag):
"""Returns list of values of the MARC-21 'tag' fields for the
record 'recID'."""
out = []
bibXXx = "bib" + tag[0] + tag[1] + "x"
bibrec_bibXXx = "bibrec_" + bibXXx
query = """SELECT value FROM %s AS b, %s AS bb
WHERE bb.id_bibrec=%%s AND bb.id_bibxxx=b.id
AND tag LIKE %%s""" % (bibXXx, bibrec_bibXXx)
res = run_sql(query, (recID, tag))
for row in res:
out.append(row[0])
return out
def clean(self):
"Cleans the words table."
self.value = {}
def put_into_db(self, mode="normal"):
"""Updates the current words table in the corresponding DB
idxFOO table. Mode 'normal' means normal execution,
mode 'emergency' means words index reverting to old state.
"""
write_message("%s %s wordtable flush started" % (self.tablename, mode))
write_message('...updating %d words into %s started' % \
(len(self.value), self.tablename))
task_update_progress("(%s:%s) flushed %d/%d words" % (self.tablename, self.humanname, 0, len(self.value)))
self.recIDs_in_mem = beautify_range_list(self.recIDs_in_mem)
all_indexes = [(self.index_id, self.humanname)]
if self.virtual_indexes:
all_indexes.extend(self.virtual_indexes)
for ind_id, ind_name in all_indexes:
tab_name = self.tablename[:-1] + "R"
if ind_id != self.index_id:
tab_name = self.virtual_tablename_pattern % ind_id + "R"
if mode == "normal":
for group in self.recIDs_in_mem:
query = """UPDATE %s SET type='TEMPORARY' WHERE id_bibrec
BETWEEN %%s AND %%s AND type='CURRENT'""" % tab_name
write_message(query % (group[0], group[1]), verbose=9)
run_sql(query, (group[0], group[1]))
nb_words_total = len(self.value)
nb_words_report = int(nb_words_total / 10.0)
nb_words_done = 0
for word in self.value.keys():
self.put_word_into_db(word, ind_id)
nb_words_done += 1
if nb_words_report != 0 and ((nb_words_done % nb_words_report) == 0):
write_message('......processed %d/%d words' % (nb_words_done, nb_words_total))
percentage_display = get_percentage_completed(nb_words_done, nb_words_total)
task_update_progress("(%s:%s) flushed %d/%d words %s" % (tab_name, ind_name, nb_words_done, nb_words_total, percentage_display))
write_message('...updating %d words into %s ended' % \
(nb_words_total, tab_name))
write_message('...updating reverse table %s started' % tab_name)
if mode == "normal":
for group in self.recIDs_in_mem:
query = """UPDATE %s SET type='CURRENT' WHERE id_bibrec
BETWEEN %%s AND %%s AND type='FUTURE'""" % tab_name
write_message(query % (group[0], group[1]), verbose=9)
run_sql(query, (group[0], group[1]))
query = """DELETE FROM %s WHERE id_bibrec
BETWEEN %%s AND %%s AND type='TEMPORARY'""" % tab_name
write_message(query % (group[0], group[1]), verbose=9)
run_sql(query, (group[0], group[1]))
#if self.is_fulltext_index:
#update_text_extraction_date(group[0], group[1])
write_message('End of updating wordTable into %s' % tab_name, verbose=9)
elif mode == "emergency":
for group in self.recIDs_in_mem:
query = """UPDATE %s SET type='CURRENT' WHERE id_bibrec
BETWEEN %%s AND %%s AND type='TEMPORARY'""" % tab_name
write_message(query % (group[0], group[1]), verbose=9)
run_sql(query, (group[0], group[1]))
query = """DELETE FROM %s WHERE id_bibrec
BETWEEN %%s AND %%s AND type='FUTURE'""" % tab_name
write_message(query % (group[0], group[1]), verbose=9)
run_sql(query, (group[0], group[1]))
write_message('End of emergency flushing wordTable into %s' % tab_name, verbose=9)
write_message('...updating reverse table %s ended' % tab_name)
self.clean()
self.recIDs_in_mem = []
write_message("%s %s wordtable flush ended" % (self.tablename, mode))
task_update_progress("(%s:%s) flush ended" % (self.tablename, self.humanname))
def load_old_recIDs(self, word, index_id=None):
"""Load existing hitlist for the word from the database index files."""
tab_name = self.tablename
if index_id != self.index_id:
tab_name = self.virtual_tablename_pattern % index_id + "F"
query = "SELECT hitlist FROM %s WHERE term=%%s" % tab_name
res = run_sql(query, (word,))
if res:
return intbitset(res[0][0])
else:
return None
def merge_with_old_recIDs(self, word, set):
"""Merge the system numbers stored in memory (hash of recIDs with value +1 or -1
according to whether to add/delete them) with those stored in the database index
and received in set universe of recIDs for the given word.
Return False in case no change was done to SET, return True in case SET
was changed.
"""
oldset = intbitset(set)
set.update_with_signs(self.value[word])
return set != oldset
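The in-memory value for each word is a dict of recID -> sign (+1 add, -1 delete); merging applies those signs to the hitlist loaded from the database. A plain-set sketch of what intbitset's update_with_signs does here:

```python
def update_with_signs_sketch(hitlist, signed_recids):
    """Apply {recID: +1/-1} deltas to a set of record ids and report
    whether the set actually changed (mirrors merge_with_old_recIDs)."""
    old = set(hitlist)
    for rec_id, sign in signed_recids.items():
        if sign > 0:
            hitlist.add(rec_id)       # +1: record gains this word
        else:
            hitlist.discard(rec_id)   # -1: record loses this word
    return hitlist != old
```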
def put_word_into_db(self, word, index_id):
"""Flush a single word to the database and delete it from memory"""
tab_name = self.tablename
if index_id != self.index_id:
tab_name = self.virtual_tablename_pattern % index_id + "F"
set = self.load_old_recIDs(word, index_id)
if set is not None: # merge the word recIDs found in memory:
if not self.merge_with_old_recIDs(word, set):
# nothing to update:
write_message("......... unchanged hitlist for ``%s''" % word, verbose=9)
pass
else:
# yes there were some new words:
write_message("......... updating hitlist for ``%s''" % word, verbose=9)
run_sql("UPDATE %s SET hitlist=%%s WHERE term=%%s" % wash_table_column_name(tab_name), (set.fastdump(), word)) # kwalitee: disable=sql
else: # the word is new, will create new set:
write_message("......... inserting hitlist for ``%s''" % word, verbose=9)
set = intbitset(self.value[word].keys())
try:
run_sql("INSERT INTO %s (term, hitlist) VALUES (%%s, %%s)" % wash_table_column_name(tab_name), (word, set.fastdump())) # kwalitee: disable=sql
except Exception, e:
## We send this exception to the admin only when we are not
## already repairing the problem.
register_exception(prefix="Error when putting the term '%s' into db (hitlist=%s): %s\n" % (repr(word), set, e), alert_admin=(task_get_option('cmd') != 'repair'))
if not set: # never store empty words
run_sql("DELETE FROM %s WHERE term=%%s" % wash_table_column_name(tab_name), (word,)) # kwalitee: disable=sql
def display(self):
"Displays the word table."
keys = self.value.keys()
keys.sort()
for k in keys:
write_message("%s: %s" % (k, self.value[k]))
def count(self):
"Returns the number of words in the table."
return len(self.value)
def info(self):
"Prints some information on the words table."
write_message("The words table contains %d words." % self.count())
def lookup_words(self, word=""):
"Lookup word from the words table."
if not word:
done = 0
while not done:
try:
word = raw_input("Enter word: ")
done = 1
except (EOFError, KeyboardInterrupt):
return
if self.value.has_key(word):
write_message("The word '%s' is found %d times." \
% (word, len(self.value[word])))
else:
write_message("The word '%s' does not exist in the word file."\
% word)
def add_recIDs(self, recIDs, opt_flush):
"""Fetches records which id in the recIDs range list and adds
them to the wordTable. The recIDs range list is of the form:
[[i1_low,i1_high],[i2_low,i2_high], ..., [iN_low,iN_high]].
"""
if self.is_virtual:
return
global chunksize, _last_word_table
flush_count = 0
records_done = 0
records_to_go = 0
for arange in recIDs:
records_to_go = records_to_go + arange[1] - arange[0] + 1
time_started = time.time() # will measure profile time
for arange in recIDs:
i_low = arange[0]
chunksize_count = 0
while i_low <= arange[1]:
task_sleep_now_if_required()
# calculate chunk group of recIDs and treat it:
i_high = min(i_low + opt_flush - flush_count - 1, arange[1])
i_high = min(i_low + chunksize - chunksize_count - 1, i_high)
try:
self.chk_recID_range(i_low, i_high)
except StandardError:
if self.index_name == 'fulltext' and CFG_SOLR_URL:
solr_commit()
raise
write_message(CFG_BIBINDEX_ADDING_RECORDS_STARTED_STR % \
(self.tablename, i_low, i_high))
if CFG_CHECK_MYSQL_THREADS:
kill_sleepy_mysql_threads()
percentage_display = get_percentage_completed(records_done, records_to_go)
task_update_progress("(%s:%s) adding recs %d-%d %s" % (self.tablename, self.humanname, i_low, i_high, percentage_display))
self.del_recID_range(i_low, i_high)
just_processed = self.add_recID_range(i_low, i_high)
flush_count = flush_count + i_high - i_low + 1
chunksize_count = chunksize_count + i_high - i_low + 1
records_done = records_done + just_processed
write_message(CFG_BIBINDEX_ADDING_RECORDS_STARTED_STR % \
(self.tablename, i_low, i_high))
if chunksize_count >= chunksize:
chunksize_count = 0
# flush if necessary:
if flush_count >= opt_flush:
self.put_into_db()
self.clean()
if self.index_name == 'fulltext' and CFG_SOLR_URL:
solr_commit()
write_message("%s backing up" % (self.tablename))
flush_count = 0
self.log_progress(time_started, records_done, records_to_go)
# iterate:
i_low = i_high + 1
if flush_count > 0:
self.put_into_db()
if self.index_name == 'fulltext' and CFG_SOLR_URL:
solr_commit()
self.log_progress(time_started, records_done, records_to_go)
def add_recID_range(self, recID1, recID2):
"""Add records from RECID1 to RECID2."""
wlist = {}
self.recIDs_in_mem.append([recID1, recID2])
# special case of author indexes where we also add author
# canonical IDs:
if self.index_name in ('author', 'firstauthor', 'exactauthor', 'exactfirstauthor'):
for recID in range(recID1, recID2 + 1):
if not wlist.has_key(recID):
wlist[recID] = []
wlist[recID] = list_union(get_author_canonical_ids_for_recid(recID),
wlist[recID])
if len(self.fields_to_index) == 0:
#'no tag' style of indexing - use bibfield instead of directly consulting bibrec
tokenizing_function = self.default_tokenizer_function
for recID in range(recID1, recID2 + 1):
record = get_record(recID)
if record:
new_words = tokenizing_function(record)
if not wlist.has_key(recID):
wlist[recID] = []
wlist[recID] = list_union(new_words, wlist[recID])
# case of special indexes:
elif self.index_name in ('authorcount', 'journal'):
for tag in self.fields_to_index:
tokenizing_function = self.tag_to_words_fnc_map.get(tag, self.default_tokenizer_function)
for recID in range(recID1, recID2 + 1):
new_words = tokenizing_function(recID)
if not wlist.has_key(recID):
wlist[recID] = []
wlist[recID] = list_union(new_words, wlist[recID])
# usual tag-by-tag indexing for the rest:
else:
for tag in self.fields_to_index:
tokenizing_function = self.tag_to_words_fnc_map.get(tag, self.default_tokenizer_function)
phrases = self.get_phrases_for_tokenizing(tag, recID1, recID2)
for row in sorted(phrases):
recID, phrase = row
if not wlist.has_key(recID):
wlist[recID] = []
new_words = tokenizing_function(phrase)
wlist[recID] = list_union(new_words, wlist[recID])
# lookup index-time synonyms:
synonym_kbrs = get_all_synonym_knowledge_bases()
if synonym_kbrs.has_key(self.index_name):
if len(wlist) == 0: return 0
recIDs = wlist.keys()
for recID in recIDs:
for word in wlist[recID]:
word_synonyms = get_synonym_terms(word,
synonym_kbrs[self.index_name][0],
synonym_kbrs[self.index_name][1],
use_memoise=True)
if word_synonyms:
wlist[recID] = list_union(word_synonyms, wlist[recID])
# were there some words for these recIDs found?
recIDs = wlist.keys()
for recID in recIDs:
# was this record marked as deleted?
if "DELETED" in self.get_field(recID, "980__c"):
wlist[recID] = []
write_message("... record %d was declared deleted, removing its word list" % recID, verbose=9)
write_message("... record %d, termlist: %s" % (recID, wlist[recID]), verbose=9)
self.index_virtual_indexes_reversed(wlist, recID1, recID2)
if len(wlist) == 0: return 0
# put words into reverse index table with FUTURE status:
for recID in recIDs:
run_sql("INSERT INTO %sR (id_bibrec,termlist,type) VALUES (%%s,%%s,'FUTURE')" % wash_table_column_name(self.tablename[:-1]), (recID, serialize_via_marshal(wlist[recID]))) # kwalitee: disable=sql
# ... and, for new records, enter the CURRENT status as empty:
try:
run_sql("INSERT INTO %sR (id_bibrec,termlist,type) VALUES (%%s,%%s,'CURRENT')" % wash_table_column_name(self.tablename[:-1]), (recID, serialize_via_marshal([]))) # kwalitee: disable=sql
except DatabaseError:
# okay, it's an already existing record, no problem
pass
# put words into memory word list:
put = self.put
for recID in recIDs:
for w in wlist[recID]:
put(recID, w, 1)
return len(recIDs)
def get_phrases_for_tokenizing(self, tag, first_recID, last_recID):
"""Gets phrases for later tokenization for a range of records and
specific tag.
@param tag: MARC tag
@param first_recID: first recID from the range of recIDs to index
@param last_recID: last recID from the range of recIDs to index
"""
bibXXx = "bib" + tag[0] + tag[1] + "x"
bibrec_bibXXx = "bibrec_" + bibXXx
query = """SELECT bb.id_bibrec,b.value FROM %s AS b, %s AS bb
WHERE bb.id_bibrec BETWEEN %%s AND %%s
AND bb.id_bibxxx=b.id AND tag LIKE %%s""" % (bibXXx, bibrec_bibXXx)
phrases = run_sql(query, (first_recID, last_recID, tag))
if tag == '8564_u':
## FIXME: Quick hack to be sure that hidden files are
## actually indexed.
phrases = set(phrases)
for recid in xrange(int(first_recID), int(last_recID) + 1):
for bibdocfile in BibRecDocs(recid).list_latest_files():
phrases.add((recid, bibdocfile.get_url()))
#authority records
pattern = tag.replace('%', '*')
matches = fnmatch.filter(CFG_BIBAUTHORITY_CONTROLLED_FIELDS_BIBLIOGRAPHIC.keys(), pattern)
if not len(matches):
return phrases
phrases = set(phrases)
for tag_match in matches:
authority_tag = tag_match[0:3] + "__0"
for recID in xrange(int(first_recID), int(last_recID) + 1):
control_nos = get_fieldvalues(recID, authority_tag)
for control_no in control_nos:
new_strings = get_index_strings_by_control_no(control_no)
for string_value in new_strings:
phrases.add((recID, string_value))
return phrases
def index_virtual_indexes_reversed(self, wlist, recID1, recID2):
"""Inserts indexed words into all virtual indexes connected to
this index"""
#first: need to take old values from given index to remove
#them from virtual indexes
query = """SELECT id_bibrec, termlist FROM %sR WHERE id_bibrec
BETWEEN %%s AND %%s""" % wash_table_column_name(self.tablename[:-1])
old_index_values = run_sql(query, (recID1, recID2))
if old_index_values:
zipped = zip(*old_index_values)
old_index_values = dict(zip(zipped[0], map(deserialize_via_marshal, zipped[1])))
else:
old_index_values = dict()
recIDs = wlist.keys()
for vindex_id, vindex_name in self.virtual_indexes:
#second: need to take old values from virtual index
#to have a list of words from which we can remove old values from given index
tab_name = self.virtual_tablename_pattern % vindex_id + "R"
query = """SELECT id_bibrec, termlist FROM %s WHERE type='CURRENT' AND id_bibrec
BETWEEN %%s AND %%s""" % tab_name
old_virtual_index_values = run_sql(query, (recID1, recID2))
if old_virtual_index_values:
zipped = zip(*old_virtual_index_values)
old_virtual_index_values = dict(zip(zipped[0], map(deserialize_via_marshal, zipped[1])))
else:
old_virtual_index_values = dict()
for recID in recIDs:
to_serialize = list((set(old_virtual_index_values.get(recID) or []) - set(old_index_values.get(recID) or [])) | set(wlist[recID]))
run_sql("INSERT INTO %s (id_bibrec,termlist,type) VALUES (%%s,%%s,'FUTURE')" % wash_table_column_name(tab_name), (recID, serialize_via_marshal(to_serialize))) # kwalitee: disable=sql
try:
run_sql("INSERT INTO %s (id_bibrec,termlist,type) VALUES (%%s,%%s,'CURRENT')" % wash_table_column_name(tab_name), (recID, serialize_via_marshal([]))) # kwalitee: disable=sql
except DatabaseError:
pass
if len(recIDs) != (recID2 - recID1 + 1):
#for records in range(recID1, recID2) which weren't updated:
#need to prevent them from being deleted by function: 'put_into_db'
#which deletes all records with 'CURRENT' status
query = """INSERT INTO %s (id_bibrec, termlist, type)
SELECT id_bibrec, termlist, 'FUTURE' FROM %s
WHERE id_bibrec BETWEEN %%s AND %%s
AND type='CURRENT'
AND id_bibrec IN (
SELECT id_bibrec FROM %s
WHERE id_bibrec BETWEEN %%s AND %%s
GROUP BY id_bibrec HAVING COUNT(id_bibrec) = 1
)
""" % ((wash_table_column_name(tab_name),)*3)
run_sql(query, (recID1, recID2, recID1, recID2))
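The termlist update for a virtual index above boils down to one piece of set algebra: keep what the virtual index got from *other* dependent indexes, drop this index's old contribution, add its freshly tokenized words. A sketch:

```python
def virtual_termlist_sketch(old_virtual, old_own, new_words):
    """New virtual-index termlist for one record:
    (old virtual terms - this index's old terms) | this index's new terms."""
    return sorted((set(old_virtual) - set(old_own)) | set(new_words))
```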
def log_progress(self, start, done, todo):
"""Calculate progress and store it.
start: start time,
done: records processed,
todo: total number of records"""
time_elapsed = time.time() - start
# consistency check
if time_elapsed == 0 or done > todo:
return
time_recs_per_min = done / (time_elapsed / 60.0)
write_message("%d records took %.1f seconds to complete.(%1.f recs/min)"\
% (done, time_elapsed, time_recs_per_min))
if time_recs_per_min:
write_message("Estimated runtime: %.1f minutes" % \
((todo - done) / time_recs_per_min))
def put(self, recID, word, sign):
"""Adds/deletes a word to the word list."""
try:
if self.wash_index_terms:
word = wash_index_term(word, self.wash_index_terms)
if self.value.has_key(word):
# the word 'word' exist already: update sign
self.value[word][recID] = sign
else:
self.value[word] = {recID: sign}
except:
write_message("Error: Cannot put word %s with sign %d for recID %s." % (word, sign, recID))
def del_recIDs(self, recIDs):
"""Fetches records which id in the recIDs range list and adds
them to the wordTable. The recIDs range list is of the form:
[[i1_low,i1_high],[i2_low,i2_high], ..., [iN_low,iN_high]].
"""
count = 0
for arange in recIDs:
task_sleep_now_if_required()
self.del_recID_range(arange[0], arange[1])
count = count + arange[1] - arange[0] + 1
self.put_into_db()
if self.index_name == 'fulltext' and CFG_SOLR_URL:
solr_commit()
def del_recID_range(self, low, high):
"""Deletes records with 'recID' system number between low
and high from memory words index table."""
write_message("%s fetching existing words for records #%d-#%d started" % \
(self.tablename, low, high), verbose=3)
self.recIDs_in_mem.append([low, high])
query = """SELECT id_bibrec,termlist FROM %sR as bb WHERE bb.id_bibrec
BETWEEN %%s AND %%s""" % (self.tablename[:-1])
recID_rows = run_sql(query, (low, high))
for recID_row in recID_rows:
recID = recID_row[0]
wlist = deserialize_via_marshal(recID_row[1])
for word in wlist:
self.put(recID, word, -1)
write_message("%s fetching existing words for records #%d-#%d ended" % \
(self.tablename, low, high), verbose=3)
def report_on_table_consistency(self):
"""Check reverse words index tables (e.g. idxWORD01R) for
interesting states such as 'TEMPORARY' state.
Prints small report (no of words, no of bad words).
"""
# find number of words:
query = """SELECT COUNT(*) FROM %s""" % (self.tablename)
res = run_sql(query, None, 1)
if res:
nb_words = res[0][0]
else:
nb_words = 0
# find number of records:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR""" % (self.tablename[:-1])
res = run_sql(query, None, 1)
if res:
nb_records = res[0][0]
else:
nb_records = 0
# report stats:
write_message("%s contains %d words from %d records" % (self.tablename, nb_words, nb_records))
# find possible bad states in reverse tables:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR WHERE type <> 'CURRENT'""" % (self.tablename[:-1])
res = run_sql(query)
if res:
nb_bad_records = res[0][0]
else:
nb_bad_records = 999999999
if nb_bad_records:
write_message("EMERGENCY: %s needs to repair %d of %d index records" % \
(self.tablename, nb_bad_records, nb_records))
else:
write_message("%s is in consistent state" % (self.tablename))
return nb_bad_records
def repair(self, opt_flush):
"""Repair the whole table"""
# find possible bad states in reverse tables:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR WHERE type <> 'CURRENT'""" % (self.tablename[:-1])
res = run_sql(query, None, 1)
if res:
nb_bad_records = res[0][0]
else:
nb_bad_records = 0
if nb_bad_records == 0:
return
query = """SELECT id_bibrec FROM %sR WHERE type <> 'CURRENT'""" \
% (self.tablename[:-1])
res = intbitset(run_sql(query))
recIDs = create_range_list(list(res))
flush_count = 0
records_done = 0
records_to_go = 0
for arange in recIDs:
records_to_go = records_to_go + arange[1] - arange[0] + 1
time_started = time.time() # will measure profile time
for arange in recIDs:
i_low = arange[0]
chunksize_count = 0
while i_low <= arange[1]:
task_sleep_now_if_required()
# calculate chunk group of recIDs and treat it:
i_high = min(i_low + opt_flush - flush_count - 1, arange[1])
i_high = min(i_low + chunksize - chunksize_count - 1, i_high)
self.fix_recID_range(i_low, i_high)
flush_count = flush_count + i_high - i_low + 1
chunksize_count = chunksize_count + i_high - i_low + 1
records_done = records_done + i_high - i_low + 1
if chunksize_count >= chunksize:
chunksize_count = 0
# flush if necessary:
if flush_count >= opt_flush:
self.put_into_db("emergency")
self.clean()
flush_count = 0
self.log_progress(time_started, records_done, records_to_go)
# iterate:
i_low = i_high + 1
if flush_count > 0:
self.put_into_db("emergency")
self.log_progress(time_started, records_done, records_to_go)
write_message("%s inconsistencies repaired." % self.tablename)
def chk_recID_range(self, low, high):
"""Check if the reverse index table is in proper state"""
## check db
query = """SELECT COUNT(*) FROM %sR WHERE type <> 'CURRENT'
AND id_bibrec BETWEEN %%s AND %%s""" % self.tablename[:-1]
res = run_sql(query, (low, high), 1)
if res[0][0] == 0:
write_message("%s for %d-%d is in consistent state" % (self.tablename, low, high))
return # okay, words table is consistent
## inconsistency detected!
write_message("EMERGENCY: %s inconsistencies detected..." % self.tablename)
error_message = "Errors found. You should check consistency of the " \
"%s - %sR tables.\nRunning 'bibindex --repair' is " \
"recommended." % (self.tablename, self.tablename[:-1])
write_message("EMERGENCY: " + error_message, stream=sys.stderr)
raise StandardError(error_message)
def fix_recID_range(self, low, high):
"""Try to fix reverse index database consistency (e.g. table idxWORD01R) in the low,high doc-id range.
Possible states for a recID follow:
CUR TMP FUT: very bad things have happened: warn!
CUR TMP : very bad things have happened: warn!
CUR FUT: delete FUT (crash before flushing)
CUR : database is ok
TMP FUT: add TMP to memory and del FUT from memory
flush (revert to old state)
TMP : very bad things have happened: warn!
FUT: very bad things have happened: warn!
"""
state = {}
query = "SELECT id_bibrec,type FROM %sR WHERE id_bibrec BETWEEN %%s AND %%s"\
% self.tablename[:-1]
res = run_sql(query, (low, high))
for row in res:
if not state.has_key(row[0]):
state[row[0]] = []
state[row[0]].append(row[1])
ok = 1 # will hold info on whether we will be able to repair
for recID in state.keys():
if not 'TEMPORARY' in state[recID]:
if 'FUTURE' in state[recID]:
if 'CURRENT' not in state[recID]:
write_message("EMERGENCY: Index record %d is in inconsistent state. Can't repair it." % recID)
ok = 0
else:
write_message("EMERGENCY: Inconsistency in index record %d detected" % recID)
query = """DELETE FROM %sR
WHERE id_bibrec=%%s""" % self.tablename[:-1]
run_sql(query, (recID,))
write_message("EMERGENCY: Inconsistency in record %d repaired." % recID)
else:
if 'FUTURE' in state[recID] and not 'CURRENT' in state[recID]:
self.recIDs_in_mem.append([recID, recID])
# Get the words file
query = """SELECT type,termlist FROM %sR
WHERE id_bibrec=%%s""" % self.tablename[:-1]
write_message(query, verbose=9)
res = run_sql(query, (recID,))
for row in res:
wlist = deserialize_via_marshal(row[1])
write_message("Words are %s " % wlist, verbose=9)
if row[0] == 'TEMPORARY':
sign = 1
else:
sign = -1
for word in wlist:
self.put(recID, word, sign)
else:
write_message("EMERGENCY: %s for %d is in inconsistent "
"state. Couldn't repair it." % (self.tablename,
recID), stream=sys.stderr)
ok = 0
if not ok:
error_message = "Unrepairable errors found. You should check " \
"consistency of the %s - %sR tables. Deleting affected " \
"TEMPORARY and FUTURE entries from these tables is " \
"recommended; see the BibIndex Admin Guide." % \
(self.tablename, self.tablename[:-1])
write_message("EMERGENCY: " + error_message, stream=sys.stderr)
raise StandardError(error_message)
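# The CUR/TMP/FUT state table in the fix_recID_range docstring above can be
# sketched as a small pure-Python decision function. Illustration only:
# resolve_index_record_state is a hypothetical helper, not part of this
# module; the real repair path manipulates the idxWORDxxR tables via run_sql.

```python
def resolve_index_record_state(types):
    """Map the set of row types present for one recID to a repair action."""
    types = frozenset(types)
    if types == frozenset(['CURRENT']):
        return 'ok'                    # database is ok
    if types == frozenset(['CURRENT', 'FUTURE']):
        return 'delete-future'         # crash happened before flushing
    if types == frozenset(['TEMPORARY', 'FUTURE']):
        return 'replay-and-flush'      # revert to the old state
    return 'unrepairable'              # very bad things have happened: warn!
```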
def remove_dependent_index(self, id_dependent):
"""Removes terms found in dependent index from virtual index.
Function finds words for removal and then removes them from
forward and reversed tables term by term.
@param id_dependent: id of an index which we want to remove from this
virtual index
"""
if not self.is_virtual:
write_message("Index is not virtual...")
return
global chunksize
terms_current_counter = 0
terms_done = 0
terms_to_go = 0
for_full_removal, for_partial_removal = self.get_words_to_remove(id_dependent, misc_lookup=False)
query = """SELECT t.term, m.hitlist FROM %s%02dF as t INNER JOIN %s%02dF as m
ON t.term=m.term""" % (self.tablename[:-3], self.index_id, self.tablename[:-3], id_dependent)
terms_and_hitlists = dict(run_sql(query))
terms_to_go = len(for_full_removal) + len(for_partial_removal)
task_sleep_now_if_required()
#full removal
for term in for_full_removal:
terms_current_counter += 1
hitlist = intbitset(terms_and_hitlists[term])
for recID in hitlist:
self.remove_single_word_reversed_table(term, recID)
self.remove_single_word_forward_table(term)
if terms_current_counter % chunksize == 0:
terms_done += terms_current_counter
terms_current_counter = 0
write_message("removed %s/%s terms..." % (terms_done, terms_to_go))
task_sleep_now_if_required()
terms_done += terms_current_counter
terms_current_counter = 0
#partial removal
for term, indexes in for_partial_removal.iteritems():
self.value = {}
terms_current_counter += 1
hitlist = intbitset(terms_and_hitlists[term])
if len(indexes) > 0:
hitlist -= self._find_common_hitlist(term, id_dependent, indexes)
for recID in hitlist:
self.remove_single_word_reversed_table(term, recID)
if self.value.has_key(term):
self.value[term][recID] = -1
else:
self.value[term] = {recID: -1}
if self.value:
self.put_word_into_db(term, self.index_id)
if terms_current_counter % chunksize == 0:
terms_done += terms_current_counter
terms_current_counter = 0
write_message("removed %s/%s terms..." % (terms_done, terms_to_go))
task_sleep_now_if_required()
def remove_single_word_forward_table(self, word):
"""Immediately and irreversibly removes a word from forward table"""
run_sql("""DELETE FROM %s WHERE term=%%s""" % self.tablename, (word, )) # kwalitee: disable=sql
def remove_single_word_reversed_table(self, word, recID):
"""Removes single word from temlist for given recID"""
old_set = run_sql("""SELECT termlist FROM %sR WHERE id_bibrec=%%s""" % \
wash_table_column_name(self.tablename[:-1]), (recID, ))
new_set = []
if old_set:
new_set = deserialize_via_marshal(old_set[0][0])
if word in new_set:
new_set.remove(word)
if new_set:
run_sql("""UPDATE %sR SET termlist=%%s
WHERE id_bibrec=%%s AND
type='CURRENT'""" % \
wash_table_column_name(self.tablename[:-1]), (serialize_via_marshal(new_set), recID))
def _find_common_hitlist(self, term, id_dependent, indexes):
"""Checks 'indexes' for records that have 'term' indexed
and returns intersection between found records
and records that have a 'term' inside index
defined by id_dependent parameter"""
query = """SELECT m.hitlist FROM idxWORD%02dF as t INNER JOIN idxWORD%02dF as m
ON t.term=m.term WHERE t.term='%s'"""
common_hitlist = intbitset([])
for _id in indexes:
res = run_sql(query % (id_dependent, _id, term))
if res:
common_hitlist |= intbitset(res[0][0])
return common_hitlist
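# _find_common_hitlist above ORs together, over every other dependent index,
# the hitlists of records that carry `term` in both the removed index and that
# other index. A minimal sketch with plain sets standing in for intbitset
# (common_hitlist and its arguments are hypothetical, for illustration only):

```python
def common_hitlist(term, removed_index_hitlists, other_index_hitlists):
    """Union over other indexes of (removed & other) hitlists for `term`."""
    common = set()
    removed = removed_index_hitlists.get(term, set())
    for other in other_index_hitlists:
        common |= removed & other.get(term, set())
    return common
```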
def get_words_to_remove(self, id_dependent, misc_lookup=False):
"""Finds words in dependent index which should be removed from virtual index.
Example:
Virtual index 'A' consists of 'B' and 'C' dependent indexes and we want to
remove 'B' from virtual index 'A'.
First we need to check if 'B' and 'C' have common words. If they have
we need to be careful not to remove common words from 'A', because we want
to remove only words from 'B'.
Then we need to check common words for 'A' and 'B'. These are potential words
for removal. We need to subtract common words for 'B' and 'C' from common words
for 'A' and 'B' to be sure that correct words are removed.
@return: (list, dict), list contains terms/words for full removal, dict
contains words for partial removal together with ids of indexes in which
given term/word also exists
"""
query = """SELECT t.term FROM %s%02dF as t INNER JOIN %s%02dF as m
ON t.term=m.term"""
dependent_indexes = get_virtual_index_building_blocks(self.index_id)
other_ids = list(dependent_indexes and zip(*dependent_indexes)[0] or [])
if id_dependent in other_ids:
other_ids.remove(id_dependent)
if not misc_lookup:
misc_id = get_index_id_from_index_name('miscellaneous')
if misc_id in other_ids:
other_ids.remove(misc_id)
#intersections between dependent indexes
left_in_other_indexes = {}
for _id in other_ids:
intersection = zip(*run_sql(query % (self.tablename[:-3], id_dependent, self.tablename[:-3], _id))) # kwalitee: disable=sql
terms = bool(intersection) and intersection[0] or []
for term in terms:
if left_in_other_indexes.has_key(term):
left_in_other_indexes[term].append(_id)
else:
left_in_other_indexes[term] = [_id]
#intersection between virtual index and index we want to remove
main_intersection = zip(*run_sql(query % (self.tablename[:-3], self.index_id, self.tablename[:-3], id_dependent))) # kwalitee: disable=sql
terms_main = set(bool(main_intersection) and main_intersection[0] or [])
return list(terms_main - set(left_in_other_indexes.keys())), left_in_other_indexes
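# The set arithmetic in get_words_to_remove above can be sketched with plain
# sets (words_to_remove and its example index contents are hypothetical,
# illustration only): terms shared by virtual index 'A' and removed index 'B'
# are candidates; those also present in another dependent index 'C' are only
# partially removed, the rest are removed entirely.

```python
def words_to_remove(virtual_terms, removed_terms, other_terms):
    """Split candidate terms into (full removal, partial removal) sets."""
    candidates = virtual_terms & removed_terms   # shared by 'A' and 'B'
    partial = candidates & other_terms           # also in 'C': keep those hits
    full = candidates - partial                  # only from 'B': drop entirely
    return full, partial
```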
def main():
"""Main that construct all the bibtask."""
task_init(authorization_action='runbibindex',
authorization_msg="BibIndex Task Submission",
description="""Examples:
\t%s -a -i 234-250,293,300-500 -u admin@localhost
\t%s -a -w author,fulltext -M 8192 -v3
\t%s -d -m +4d -A on --flush=10000\n""" % ((sys.argv[0],) * 3), help_specific_usage=""" Indexing options:
-a, --add\t\tadd or update words for selected records
-d, --del\t\tdelete words for selected records
-i, --id=low[-high]\t\tselect according to doc recID
-m, --modified=from[,to]\tselect according to modification date
-c, --collection=c1[,c2]\tselect according to collection
-R, --reindex\treindex the selected indexes from scratch
Repairing options:
-k, --check\t\tcheck consistency for all records in the table(s)
-r, --repair\t\ttry to repair all records in the table(s)
Specific options:
-w, --windex=w1[,w2]\tword/phrase indexes to consider (all)
-M, --maxmem=XXX\tmaximum memory usage in kB (no limit)
-f, --flush=NNN\t\tfull consistent table flush after NNN records (10000)
--force\tforce indexing of all records for provided indexes
-Z, --remove-dependent-index=w\tname of an index for removing from virtual index
""",
version=__revision__,
specific_params=("adi:m:c:w:krRM:f:oZ:", [
"add",
"del",
"id=",
"modified=",
"collection=",
"windex=",
"check",
"repair",
"reindex",
"maxmem=",
"flush=",
"force",
"remove-dependent-index="
]),
task_stop_helper_fnc=task_stop_table_close_fnc,
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core,
task_submit_check_options_fnc=task_submit_check_options)
def task_submit_check_options():
"""Check for options compatibility."""
if task_get_option("reindex"):
if task_get_option("cmd") != "add" or task_get_option('id') or task_get_option('collection'):
print >> sys.stderr, "ERROR: You can use --reindex only when adding modified records."
return False
return True
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key it checks it's meaning, eventually using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False, if it doesn't
know that key.
eg:
if key in ['-n', '--number']:
self.options['number'] = value
return True
return False
"""
if key in ("-a", "--add"):
task_set_option("cmd", "add")
if ("-x", "") in opts or ("--del", "") in opts:
raise StandardError("Can not have --add and --del at the same time!")
elif key in ("-k", "--check"):
task_set_option("cmd", "check")
elif key in ("-r", "--repair"):
task_set_option("cmd", "repair")
elif key in ("-d", "--del"):
task_set_option("cmd", "del")
elif key in ("-i", "--id"):
task_set_option('id', task_get_option('id') + split_ranges(value))
elif key in ("-m", "--modified"):
task_set_option("modified", get_date_range(value))
elif key in ("-c", "--collection"):
task_set_option("collection", value)
elif key in ("-R", "--reindex"):
task_set_option("reindex", True)
elif key in ("-w", "--windex"):
task_set_option("windex", value)
elif key in ("-M", "--maxmem"):
task_set_option("maxmem", int(value))
if task_get_option("maxmem") < base_process_size + 1000:
raise StandardError("Memory usage should be higher than %d kB" % \
(base_process_size + 1000))
elif key in ("-f", "--flush"):
task_set_option("flush", int(value))
elif key in ("-o", "--force"):
task_set_option("force", True)
elif key in ("-Z", "--remove-dependent-index",):
task_set_option("remove-dependent-index", value)
else:
return False
return True
def task_stop_table_close_fnc():
""" Close tables to STOP. """
global _last_word_table
if _last_word_table:
_last_word_table.put_into_db()
def get_recIDs_by_date_bibliographic(dates, index_name, force_all=False):
""" Finds records that were modified between DATES[0] and DATES[1]
for the given index.
If DATES is not set, then finds records that were modified since
the last update of the index.
@param dates: (date_from, date_to) range of record modification dates
"""
index_id = get_index_id_from_index_name(index_name)
if not dates:
query = """SELECT last_updated FROM idxINDEX WHERE id=%s"""
res = run_sql(query, (index_id,))
if not res:
return set([])
if not res[0][0] or force_all:
dates = ("0000-00-00", None)
else:
dates = (res[0][0], None)
if dates[1] is None:
res = intbitset(run_sql("""SELECT b.id FROM bibrec AS b WHERE b.modification_date >= %s""",
(dates[0],)))
if index_name == 'fulltext':
res |= intbitset(run_sql("""SELECT id_bibrec FROM bibrec_bibdoc JOIN bibdoc ON id_bibdoc=id
WHERE text_extraction_date <= modification_date AND
modification_date >= %s
AND status<>'DELETED'""",
(dates[0],)))
elif dates[0] is None:
res = intbitset(run_sql("""SELECT b.id FROM bibrec AS b WHERE b.modification_date <= %s""",
(dates[1],)))
if index_name == 'fulltext':
res |= intbitset(run_sql("""SELECT id_bibrec FROM bibrec_bibdoc JOIN bibdoc ON id_bibdoc=id
WHERE text_extraction_date <= modification_date
AND modification_date <= %s
AND status<>'DELETED'""",
(dates[1],)))
else:
res = intbitset(run_sql("""SELECT b.id FROM bibrec AS b
WHERE b.modification_date >= %s AND
b.modification_date <= %s""",
(dates[0], dates[1])))
if index_name == 'fulltext':
res |= intbitset(run_sql("""SELECT id_bibrec FROM bibrec_bibdoc JOIN bibdoc ON id_bibdoc=id
WHERE text_extraction_date <= modification_date AND
modification_date >= %s AND
modification_date <= %s AND
status<>'DELETED'""",
(dates[0], dates[1],)))
# special case of author indexes where we need to re-index
# those records that were affected by changed BibAuthorID attributions:
if index_name in ('author', 'firstauthor', 'exactauthor', 'exactfirstauthor'):
from invenio.bibauthorid_personid_maintenance import get_recids_affected_since
# dates[1] is ignored, since BibAuthorID API does not offer upper limit search
rec_list_author = intbitset(get_recids_affected_since(dates[0]))
res = res | rec_list_author
return set(res)
def get_recIDs_by_date_authority(dates, index_name, force_all=False):
""" Finds records that were modified between DATES[0] and DATES[1]
for the given index.
If DATES is not set, then finds records that were modified since
the last update of the index.
Searches for bibliographic records connected to authority records
that have been changed.
"""
index_id = get_index_id_from_index_name(index_name)
index_tags = get_index_tags(index_name)
if not dates:
query = """SELECT last_updated FROM idxINDEX WHERE id=%s"""
res = run_sql(query, (index_id,))
if not res:
return set([])
if not res[0][0] or force_all:
dates = ("0000-00-00", None)
else:
dates = (res[0][0], None)
res = intbitset()
for tag in index_tags:
pattern = tag.replace('%', '*')
matches = fnmatch.filter(CFG_BIBAUTHORITY_CONTROLLED_FIELDS_BIBLIOGRAPHIC.keys(), pattern)
if not len(matches):
continue
for tag_match in matches:
# get the type of authority record associated with this field
auth_type = CFG_BIBAUTHORITY_CONTROLLED_FIELDS_BIBLIOGRAPHIC.get(tag_match)
# find updated authority records of this type
# dates[1] is ignored, needs dates[0] to find res
now = datetime.now()
auth_recIDs = search_pattern(p='980__a:' + auth_type) \
& search_unit_in_bibrec(str(dates[0]), str(now), type='m')
# now find dependent bibliographic records
for auth_recID in auth_recIDs:
# get the fix authority identifier of this authority record
control_nos = get_control_nos_from_recID(auth_recID)
# there may be multiple control number entries! (the '035' field is repeatable!)
for control_no in control_nos:
# get the bibrec IDs that refer to AUTHORITY_ID in TAG
tag_0 = tag_match[:5] + '0' # possibly do the same for '4' subfields ?
fieldvalue = '"' + control_no + '"'
res |= search_pattern(p=tag_0 + ':' + fieldvalue)
return set(res)
def get_not_updated_recIDs(modified_dates, indexes, force_all=False):
"""Finds not updated recIDs in database for indexes.
@param modified_dates: between this dates we should look for modified records
@type modified_dates: [date_old, date_new]
@param indexes: list of indexes
@type indexes: string separated by coma
@param force_all: if True all records will be taken
"""
found_recIDs = set()
write_message(CFG_BIBINDEX_UPDATE_MESSAGE)
for index in indexes:
found_recIDs |= get_recIDs_by_date_bibliographic(modified_dates, index, force_all)
found_recIDs |= get_recIDs_by_date_authority(modified_dates, index, force_all)
return list(sorted(found_recIDs))
def get_recIDs_from_cli(indexes=[]):
"""
Gets recID ranges from the CLI for indexing when the
user specified the 'id' or 'collection' option, or
searches for modified recIDs for the provided indexes
when recIDs are not specified.
@param indexes: a list of the specified indexes, which
can be obtained from the CLI with the
get_indexes_from_cli() function
@type indexes: list of strings
"""
# need to first update idxINDEX table to find proper recIDs for reindexing
if task_get_option("reindex"):
for index_name in indexes:
run_sql("""UPDATE idxINDEX SET last_updated='0000-00-00 00:00:00'
WHERE name=%s""", (index_name,))
if task_get_option("id"):
return task_get_option("id")
elif task_get_option("collection"):
l_of_colls = task_get_option("collection").split(",")
recIDs = perform_request_search(c=l_of_colls)
recIDs_range = []
for recID in recIDs:
recIDs_range.append([recID, recID])
return recIDs_range
elif task_get_option("cmd") == "add":
recs = get_not_updated_recIDs(task_get_option("modified"),
indexes,
task_get_option("force"))
recIDs_range = beautify_range_list(create_range_list(recs))
return recIDs_range
return []
def get_indexes_from_cli():
"""
Gets indexes from the CLI and checks whether they are
valid. If no indexes were specified, the function
returns all known indexes.
"""
indexes = task_get_option("windex")
if not indexes:
indexes = get_all_indexes()
else:
indexes = indexes.split(",")
indexes = remove_inexistent_indexes(indexes, leave_virtual=True)
return indexes
def remove_dependent_index(virtual_indexes, dependent_index):
"""
Removes dependent index from virtual indexes.
@param virtual_indexes: names of virtual_indexes
@type virtual_indexes: list of strings
@param dependent_index: name of dependent index
@type dependent_index: string
"""
if not virtual_indexes:
write_message("You should specify a name of a virtual index...")
return
id_dependent = get_index_id_from_index_name(dependent_index)
wordTables = get_word_tables(virtual_indexes)
for index_id, index_name, index_tags in wordTables:
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxWORD%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Words"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=50)
wordTable.remove_dependent_index(id_dependent)
wordTable.report_on_table_consistency()
task_sleep_now_if_required()
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxPAIR%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Pairs"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=50)
wordTable.remove_dependent_index(id_dependent)
wordTable.report_on_table_consistency()
task_sleep_now_if_required()
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxPHRASE%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Words"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=50)
wordTable.remove_dependent_index(id_dependent)
wordTable.report_on_table_consistency()
query = """DELETE FROM idxINDEX_idxINDEX WHERE id_virtual=%s AND id_normal=%s"""
run_sql(query, (index_id, id_dependent))
def task_run_core():
"""Runs the task by fetching arguments from the BibSched task queue.
This is what BibSched will be invoking via daemon call.
"""
global _last_word_table
indexes = get_indexes_from_cli()
if len(indexes) == 0:
write_message("Specified indexes can't be found.")
return True
# check tables consistency
if task_get_option("cmd") == "check":
wordTables = get_word_tables(indexes)
for index_id, index_name, index_tags in wordTables:
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxWORD%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Words"],
tag_to_tokenizer_map={'8564_u': "BibIndexFulltextTokenizer"},
wash_index_terms=50)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxPAIR%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Pairs"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=100)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern='idxPHRASE%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Phrases"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=0)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
_last_word_table = None
return True
#virtual index: remove dependent index
if task_get_option("remove-dependent-index"):
remove_dependent_index(indexes,
task_get_option("remove-dependent-index"))
return True
#initialization for Words,Pairs,Phrases
recIDs_range = get_recIDs_from_cli(indexes)
recIDs_for_index = find_affected_records_for_index(indexes,
recIDs_range,
(task_get_option("force") or \
task_get_option("reindex") or \
task_get_option("cmd") == "del"))
wordTables = get_word_tables(recIDs_for_index.keys())
if not wordTables:
write_message("Selected indexes/recIDs are up to date.")
# Let's work on single words!
for index_id, index_name, index_tags in wordTables:
reindex_prefix = ""
if task_get_option("reindex"):
reindex_prefix = "tmp_"
init_temporary_reindex_tables(index_id, reindex_prefix)
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern=reindex_prefix + 'idxWORD%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Words"],
tag_to_tokenizer_map={'8564_u': "BibIndexFulltextTokenizer"},
wash_index_terms=50)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
try:
if task_get_option("cmd") == "del":
if task_get_option("id") or task_get_option("collection"):
wordTable.del_recIDs(recIDs_range)
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Missing IDs of records to delete from " \
"index %s." % wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
elif task_get_option("cmd") == "add":
final_recIDs = beautify_range_list(create_range_list(recIDs_for_index[index_name]))
wordTable.add_recIDs(final_recIDs, task_get_option("flush"))
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "repair":
wordTable.repair(task_get_option("flush"))
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Invalid command found processing %s" % \
wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
except StandardError, e:
write_message("Exception caught: %s" % e, sys.stderr)
register_exception(alert_admin=True)
if _last_word_table:
_last_word_table.put_into_db()
raise
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
# Let's work on pairs now
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern=reindex_prefix + 'idxPAIR%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Pairs"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=100)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
try:
if task_get_option("cmd") == "del":
if task_get_option("id") or task_get_option("collection"):
wordTable.del_recIDs(recIDs_range)
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Missing IDs of records to delete from " \
"index %s." % wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
elif task_get_option("cmd") == "add":
final_recIDs = beautify_range_list(create_range_list(recIDs_for_index[index_name]))
wordTable.add_recIDs(final_recIDs, task_get_option("flush"))
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "repair":
wordTable.repair(task_get_option("flush"))
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Invalid command found processing %s" % \
wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
except StandardError, e:
write_message("Exception caught: %s" % e, sys.stderr)
register_exception()
if _last_word_table:
_last_word_table.put_into_db()
raise
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
# Let's work on phrases now
wordTable = WordTable(index_name=index_name,
index_id=index_id,
fields_to_index=index_tags,
table_name_pattern=reindex_prefix + 'idxPHRASE%02dF',
wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Phrases"],
tag_to_tokenizer_map={'8564_u': "BibIndexEmptyTokenizer"},
wash_index_terms=0)
_last_word_table = wordTable
wordTable.report_on_table_consistency()
try:
if task_get_option("cmd") == "del":
if task_get_option("id") or task_get_option("collection"):
wordTable.del_recIDs(recIDs_range)
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Missing IDs of records to delete from " \
"index %s." % wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
elif task_get_option("cmd") == "add":
final_recIDs = beautify_range_list(create_range_list(recIDs_for_index[index_name]))
wordTable.add_recIDs(final_recIDs, task_get_option("flush"))
if not task_get_option("id") and not task_get_option("collection"):
update_index_last_updated([index_name], task_get_task_param('task_starting_time'))
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "repair":
wordTable.repair(task_get_option("flush"))
task_sleep_now_if_required(can_stop_too=True)
else:
error_message = "Invalid command found processing %s" % \
wordTable.tablename
write_message(error_message, stream=sys.stderr)
raise StandardError(error_message)
except StandardError, e:
write_message("Exception caught: %s" % e, sys.stderr)
register_exception()
if _last_word_table:
_last_word_table.put_into_db()
raise
wordTable.report_on_table_consistency()
task_sleep_now_if_required(can_stop_too=True)
if task_get_option("reindex"):
swap_temporary_reindex_tables(index_id, reindex_prefix)
update_index_last_updated([index_name], task_get_task_param('task_starting_time'))
task_sleep_now_if_required(can_stop_too=True)
# update modification date also for indexes that were up to date
if not task_get_option("id") and not task_get_option("collection") and \
task_get_option("cmd") == "add":
up_to_date = set(indexes) - set(recIDs_for_index.keys())
update_index_last_updated(list(up_to_date), task_get_task_param('task_starting_time'))
_last_word_table = None
return True
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibindex/engine_stemmer.py b/invenio/legacy/bibindex/engine_stemmer.py
index af1cdd1b9..809b3285b 100644
--- a/invenio/legacy/bibindex/engine_stemmer.py
+++ b/invenio/legacy/bibindex/engine_stemmer.py
@@ -1,487 +1,487 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibIndex stemmer facility based on the Porter Stemming Algorithm.
<http://tartarus.org/~martin/PorterStemmer/>
"""
__revision__ = "$Id$"
from thread import get_ident
-from invenio.bibindex_engine_stemmer_greek import GreekStemmer
+from invenio.legacy.bibindex.engine_stemmer_greek import GreekStemmer
_stemmers = {}
try:
### Let's try to use SnowBall PyStemmer
import Stemmer
_lang_map = {
'danish' : 'da',
'dutch' : 'nl',
'english' : 'en',
'finnish' : 'fi',
'french' : 'fr',
'german' : 'de',
'greek' : 'el',
'hungarian' : 'hu',
'italian' : 'it',
'norwegian' : 'no',
'portuguese' : 'pt',
'romanian' : 'ro',
'russian' : 'ru',
'spanish' : 'es',
'swedish' : 'sv',
'turkish' : 'tr'
}
def is_stemmer_available_for_language(lang):
"""Return true if stemmer for language LANG is available.
Return false otherwise.
"""
thread_ident = get_ident()
if not _stemmers.has_key(thread_ident):
_stemmers[thread_ident] = _create_stemmers()
return _stemmers[thread_ident].has_key(lang)
def stem(word, lang):
"""Return WORD stemmed according to language LANG (e.g. 'en')."""
if lang and is_stemmer_available_for_language(lang):
return _stemmers[get_ident()][lang].stemWord(word)
else:
return word
def stemWords(words, lang):
"""Return WORDS stemmed according to language LANG (e.g. 'en')."""
if lang and is_stemmer_available_for_language(lang):
return _stemmers[get_ident()][lang].stemWords(words)
else:
return words
def get_stemming_language_map():
"""Return a diction of code language, language name for all the available
languages."""
ret = {}
for language_name, language_code in _lang_map.iteritems():
if is_stemmer_available_for_language(language_code):
ret[language_name] = language_code
return ret
def _create_stemmers():
"""Create stemmers dictionary for all possible languages."""
stemmers_initialized = {}
for src_lang in Stemmer.algorithms():
try:
dst_lang = _lang_map.get(src_lang)
if dst_lang:
stemmers_initialized[dst_lang] = Stemmer.Stemmer(src_lang, 40000)
except (TypeError, KeyError):
pass
stemmers_initialized['el'] = GreekStemmer()
return stemmers_initialized
except ImportError:
### Here is the original PorterStemmer class provided as a fallback,
### the "free of charge for any purpose" implementation of the Porter stemmer
### algorithm in Python. The Invenio API interface follows below.
_stemmers = {}
class PorterStemmer:
"""
This is the Porter stemming algorithm, ported to Python from the
version coded up in ANSI C by the author. It may be regarded
as canonical, in that it follows the algorithm presented in
Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14,
no. 3, pp 130-137,
only differing from it at the points marked --DEPARTURE-- below.
See also http://www.tartarus.org/~martin/PorterStemmer
The algorithm as described in the paper could be exactly replicated
by adjusting the points of DEPARTURE, but this is barely necessary,
because (a) the points of DEPARTURE are definitely improvements, and
(b) no encoding of the Porter stemmer I have seen is anything like
as exact as this version, even with the points of DEPARTURE!
Vivake Gupta (v@nano.com)
Release 1: January 2001
"""
def __init__(self):
"""The main part of the stemming algorithm starts here.
b is a buffer holding a word to be stemmed. The letters are in b[k0],
b[k0+1] ... ending at b[k]. In fact k0 = 0 in this demo program. k is
readjusted downwards as the stemming progresses. Zero termination is
not in fact used in the algorithm.
Note that only lower case sequences are stemmed. Forcing to lower case
should be done before stem(...) is called.
"""
self.b = "" # buffer for word to be stemmed
self.k = 0
self.k0 = 0
self.j = 0 # j is a general offset into the string
def cons(self, i):
"""cons(i) is TRUE <=> b[i] is a consonant."""
if self.b[i] == 'a' or self.b[i] == 'e' or self.b[i] == 'i' or self.b[i] == 'o' or self.b[i] == 'u':
return 0
if self.b[i] == 'y':
if i == self.k0:
return 1
else:
return (not self.cons(i - 1))
return 1
def m(self):
"""m() measures the number of consonant sequences between k0 and j.
if c is a consonant sequence and v a vowel sequence, and <..>
indicates arbitrary presence,
<c><v> gives 0
<c>vc<v> gives 1
<c>vcvc<v> gives 2
<c>vcvcvc<v> gives 3
....
"""
n = 0
i = self.k0
while 1:
if i > self.j:
return n
if not self.cons(i):
break
i = i + 1
i = i + 1
while 1:
while 1:
if i > self.j:
return n
if self.cons(i):
break
i = i + 1
i = i + 1
n = n + 1
while 1:
if i > self.j:
return n
if not self.cons(i):
break
i = i + 1
i = i + 1
def vowelinstem(self):
"""vowelinstem() is TRUE <=> k0,...j contains a vowel"""
for i in range(self.k0, self.j + 1):
if not self.cons(i):
return 1
return 0
def doublec(self, j):
"""doublec(j) is TRUE <=> j,(j-1) contain a double consonant."""
if j < (self.k0 + 1):
return 0
if (self.b[j] != self.b[j-1]):
return 0
return self.cons(j)
def cvc(self, i):
"""cvc(i) is TRUE <=> i-2,i-1,i has the form consonant - vowel - consonant
and also if the second c is not w,x or y. this is used when trying to
restore an e at the end of a short word, e.g.
cav(e), lov(e), hop(e), crim(e), but
snow, box, tray.
"""
if i < (self.k0 + 2) or not self.cons(i) or self.cons(i-1) or not self.cons(i-2):
return 0
ch = self.b[i]
if ch == 'w' or ch == 'x' or ch == 'y':
return 0
return 1
def ends(self, s):
"""ends(s) is TRUE <=> k0,...k ends with the string s."""
length = len(s)
if s[length - 1] != self.b[self.k]: # tiny speed-up
return 0
if length > (self.k - self.k0 + 1):
return 0
if self.b[self.k-length+1:self.k+1] != s:
return 0
self.j = self.k - length
return 1
def setto(self, s):
"""setto(s) sets (j+1),...k to the characters in the string s, readjusting k."""
length = len(s)
self.b = self.b[:self.j+1] + s + self.b[self.j+length+1:]
self.k = self.j + length
def r(self, s):
"""r(s) is used further down."""
if self.m() > 0:
self.setto(s)
def step1ab(self):
"""step1ab() gets rid of plurals and -ed or -ing. e.g.
caresses -> caress
ponies -> poni
ties -> ti
caress -> caress
cats -> cat
feed -> feed
agreed -> agree
disabled -> disable
matting -> mat
mating -> mate
meeting -> meet
milling -> mill
messing -> mess
meetings -> meet
"""
if self.b[self.k] == 's':
if self.ends("sses"):
self.k = self.k - 2
elif self.ends("ies"):
self.setto("i")
elif self.b[self.k - 1] != 's':
self.k = self.k - 1
if self.ends("eed"):
if self.m() > 0:
self.k = self.k - 1
elif (self.ends("ed") or self.ends("ing")) and self.vowelinstem():
self.k = self.j
if self.ends("at"): self.setto("ate")
elif self.ends("bl"): self.setto("ble")
elif self.ends("iz"): self.setto("ize")
elif self.doublec(self.k):
self.k = self.k - 1
ch = self.b[self.k]
if ch == 'l' or ch == 's' or ch == 'z':
self.k = self.k + 1
elif (self.m() == 1 and self.cvc(self.k)):
self.setto("e")
def step1c(self):
"""step1c() turns terminal y to i when there is another vowel in the stem."""
if (self.ends("y") and self.vowelinstem()):
self.b = self.b[:self.k] + 'i' + self.b[self.k+1:]
def step2(self):
"""step2() maps double suffices to single ones.
so -ization ( = -ize plus -ation) maps to -ize etc. note that the
string before the suffix must give m() > 0.
"""
if self.b[self.k - 1] == 'a':
if self.ends("ational"): self.r("ate")
elif self.ends("tional"): self.r("tion")
elif self.b[self.k - 1] == 'c':
if self.ends("enci"): self.r("ence")
elif self.ends("anci"): self.r("ance")
elif self.b[self.k - 1] == 'e':
if self.ends("izer"): self.r("ize")
elif self.b[self.k - 1] == 'l':
if self.ends("bli"): self.r("ble") # --DEPARTURE--
# To match the published algorithm, replace this phrase with
# if self.ends("abli"): self.r("able")
elif self.ends("alli"): self.r("al")
elif self.ends("entli"): self.r("ent")
elif self.ends("eli"): self.r("e")
elif self.ends("ousli"): self.r("ous")
elif self.b[self.k - 1] == 'o':
if self.ends("ization"): self.r("ize")
elif self.ends("ation"): self.r("ate")
elif self.ends("ator"): self.r("ate")
elif self.b[self.k - 1] == 's':
if self.ends("alism"): self.r("al")
elif self.ends("iveness"): self.r("ive")
elif self.ends("fulness"): self.r("ful")
elif self.ends("ousness"): self.r("ous")
elif self.b[self.k - 1] == 't':
if self.ends("aliti"): self.r("al")
elif self.ends("iviti"): self.r("ive")
elif self.ends("biliti"): self.r("ble")
elif self.b[self.k - 1] == 'g': # --DEPARTURE--
if self.ends("logi"): self.r("log")
# To match the published algorithm, delete this phrase
def step3(self):
"""step3() dels with -ic-, -full, -ness etc. similar strategy to step2."""
if self.b[self.k] == 'e':
if self.ends("icate"): self.r("ic")
elif self.ends("ative"): self.r("")
elif self.ends("alize"): self.r("al")
elif self.b[self.k] == 'i':
if self.ends("iciti"): self.r("ic")
elif self.b[self.k] == 'l':
if self.ends("ical"): self.r("ic")
elif self.ends("ful"): self.r("")
elif self.b[self.k] == 's':
if self.ends("ness"): self.r("")
def step4(self):
"""step4() takes off -ant, -ence etc., in context <c>vcvc<v>."""
if self.b[self.k - 1] == 'a':
if self.ends("al"): pass
else: return
elif self.b[self.k - 1] == 'c':
if self.ends("ance"): pass
elif self.ends("ence"): pass
else: return
elif self.b[self.k - 1] == 'e':
if self.ends("er"): pass
else: return
elif self.b[self.k - 1] == 'i':
if self.ends("ic"): pass
else: return
elif self.b[self.k - 1] == 'l':
if self.ends("able"): pass
elif self.ends("ible"): pass
else: return
elif self.b[self.k - 1] == 'n':
if self.ends("ant"): pass
elif self.ends("ement"): pass
elif self.ends("ment"): pass
elif self.ends("ent"): pass
else: return
elif self.b[self.k - 1] == 'o':
if self.ends("ion") and (self.b[self.j] == 's' or self.b[self.j] == 't'): pass
elif self.ends("ou"): pass
# takes care of -ous
else: return
elif self.b[self.k - 1] == 's':
if self.ends("ism"): pass
else: return
elif self.b[self.k - 1] == 't':
if self.ends("ate"): pass
elif self.ends("iti"): pass
else: return
elif self.b[self.k - 1] == 'u':
if self.ends("ous"): pass
else: return
elif self.b[self.k - 1] == 'v':
if self.ends("ive"): pass
else: return
elif self.b[self.k - 1] == 'z':
if self.ends("ize"): pass
else: return
else:
return
if self.m() > 1:
self.k = self.j
def step5(self):
"""step5() removes a final -e if m() > 1, and changes -ll to -l if
m() > 1.
"""
self.j = self.k
if self.b[self.k] == 'e':
a = self.m()
if a > 1 or (a == 1 and not self.cvc(self.k-1)):
self.k = self.k - 1
if self.b[self.k] == 'l' and self.doublec(self.k) and self.m() > 1:
self.k = self.k - 1
def stem(self, p, i, j):
"""In stem(p,i,j), p is a char pointer, and the string to be stemmed
is from p[i] to p[j] inclusive. Typically i is zero and j is the
offset to the last character of a string, (p[j+1] == '\0'). The
stemmer adjusts the characters p[i] ... p[j] and returns the new
end-point of the string, k. Stemming never increases word length, so
i <= k <= j. To turn the stemmer into a module, declare 'stem' as
extern, and delete the remainder of this file.
"""
# copy the parameters into statics
self.b = p
self.k = j
self.k0 = i
if self.k <= self.k0 + 1:
return self.b # --DEPARTURE--
# With this line, strings of length 1 or 2 don't go through the
# stemming process, although no mention is made of this in the
# published algorithm. Remove the line to match the published
# algorithm.
self.step1ab()
self.step1c()
self.step2()
self.step3()
self.step4()
self.step5()
return self.b[self.k0:self.k+1]
def is_stemmer_available_for_language(lang):
"""Return true if stemmer for language LANG is available.
Return false otherwise.
"""
return lang in ('en', 'el')
def stem(word, lang):
"""Return WORD stemmed according to language LANG (e.g. 'en')."""
if lang in ('en', 'el'):
if not get_ident() in _stemmers:
_stemmers[get_ident()] = {
'en': PorterStemmer(),
'el': GreekStemmer()}
if lang == 'en':
# make sure _stemmers[get_ident()] is available
return _stemmers[get_ident()]['en'].stem(word, 0, len(word)-1)
elif lang == 'el':
return _stemmers[get_ident()]['el'].stemWord(word)
else:
return word
def stemWords(words, lang):
"""Return WORDS stemmed according to language LANG (e.g. 'en')."""
if lang in ('en', 'el'):
if not get_ident() in _stemmers:
_stemmers[get_ident()] = {
'en': PorterStemmer(),
'el': GreekStemmer()}
if lang == 'en':
# make sure _stemmers[get_ident()] is available
return [_stemmers[get_ident()]['en'].stem(word, 0, len(word)-1) for word in words]
elif lang == 'el':
return _stemmers[get_ident()]['el'].stemWords(words)
else:
return words
def get_stemming_language_map():
"""Return a diction of code language, language name for all the available
languages."""
return {'english' : 'en', 'greek': 'el'}
if __name__ == '__main__':
# when invoked via CLI, simply stem the arguments:
import sys
if len(sys.argv) > 1:
for word in sys.argv[1:]:
print stem(word, 'en')  # stem() requires a language; default to English on the CLI
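The measure m() above drives most of the conditions in the algorithm; the same vowel-consonant counting can be sketched as a standalone helper (hypothetical function name, simplified from cons() and m() above):

```python
def measure(word):
    """Count vowel-consonant (VC) sequences in word, mirroring m() above.
    'y' is a consonant at position 0 or after a vowel, else a vowel."""
    def is_cons(i):
        c = word[i]
        if c in "aeiou":
            return False
        if c == "y":
            # y following a consonant acts as a vowel
            return i == 0 or not is_cons(i - 1)
        return True
    m = 0
    prev_cons = True
    for i in range(len(word)):
        cons = is_cons(i)
        if not prev_cons and cons:
            # a vowel run just ended: one more VC sequence
            m += 1
        prev_cons = cons
    return m
```

For example, measure("tree") is 0, measure("trouble") is 1 and measure("private") is 2, matching the <c>vcvc<v> scheme in m()'s docstring.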
diff --git a/invenio/legacy/bibindex/engine_washer.py b/invenio/legacy/bibindex/engine_washer.py
index f40242cbe..6a50f5fd1 100644
--- a/invenio/legacy/bibindex/engine_washer.py
+++ b/invenio/legacy/bibindex/engine_washer.py
@@ -1,170 +1,170 @@
# -*- coding:utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import re
-from invenio.bibindex_engine_stemmer import stem
-from invenio.bibindex_engine_stopwords import is_stopword
+from invenio.legacy.bibindex.engine_stemmer import stem
+from invenio.legacy.bibindex.engine_stopwords import is_stopword
from invenio.config import CFG_BIBINDEX_MIN_WORD_LENGTH, \
CFG_ETCDIR
re_pattern_fuzzy_author_dots = re.compile(r'[\.\-]+')
re_pattern_fuzzy_author_spaces = re.compile(r'\s+')
re_pattern_author_canonical_id = re.compile(r'\.[0-9]+$')
re_unicode_lowercase_a = re.compile(unicode(r"(?u)[áàäâãå]", "utf-8"))
re_unicode_lowercase_ae = re.compile(unicode(r"(?u)[æ]", "utf-8"))
re_unicode_lowercase_e = re.compile(unicode(r"(?u)[éèëê]", "utf-8"))
re_unicode_lowercase_i = re.compile(unicode(r"(?u)[íìïî]", "utf-8"))
re_unicode_lowercase_o = re.compile(unicode(r"(?u)[óòöôõø]", "utf-8"))
re_unicode_lowercase_u = re.compile(unicode(r"(?u)[úùüû]", "utf-8"))
re_unicode_lowercase_y = re.compile(unicode(r"(?u)[ýÿ]", "utf-8"))
re_unicode_lowercase_c = re.compile(unicode(r"(?u)[çć]", "utf-8"))
re_unicode_lowercase_n = re.compile(unicode(r"(?u)[ñ]", "utf-8"))
re_unicode_uppercase_a = re.compile(unicode(r"(?u)[ÁÀÄÂÃÅ]", "utf-8"))
re_unicode_uppercase_ae = re.compile(unicode(r"(?u)[Æ]", "utf-8"))
re_unicode_uppercase_e = re.compile(unicode(r"(?u)[ÉÈËÊ]", "utf-8"))
re_unicode_uppercase_i = re.compile(unicode(r"(?u)[ÍÌÏÎ]", "utf-8"))
re_unicode_uppercase_o = re.compile(unicode(r"(?u)[ÓÒÖÔÕØ]", "utf-8"))
re_unicode_uppercase_u = re.compile(unicode(r"(?u)[ÚÙÜÛ]", "utf-8"))
re_unicode_uppercase_y = re.compile(unicode(r"(?u)[Ý]", "utf-8"))
re_unicode_uppercase_c = re.compile(unicode(r"(?u)[ÇĆ]", "utf-8"))
re_unicode_uppercase_n = re.compile(unicode(r"(?u)[Ñ]", "utf-8"))
re_latex_lowercase_a = re.compile("\\\\[\"H'`~^vu=k]\{?a\}?")
re_latex_lowercase_ae = re.compile("\\\\ae\\{\\}?")
re_latex_lowercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?e\\}?")
re_latex_lowercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?i\\}?")
re_latex_lowercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?o\\}?")
re_latex_lowercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?u\\}?")
re_latex_lowercase_y = re.compile("\\\\[\"']\\{?y\\}?")
re_latex_lowercase_c = re.compile("\\\\['uc]\\{?c\\}?")
re_latex_lowercase_n = re.compile("\\\\[c'~^vu]\\{?n\\}?")
re_latex_uppercase_a = re.compile("\\\\[\"H'`~^vu=k]\\{?A\\}?")
re_latex_uppercase_ae = re.compile("\\\\AE\\{?\\}?")
re_latex_uppercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?E\\}?")
re_latex_uppercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?I\\}?")
re_latex_uppercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?O\\}?")
re_latex_uppercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?U\\}?")
re_latex_uppercase_y = re.compile("\\\\[\"']\\{?Y\\}?")
re_latex_uppercase_c = re.compile("\\\\['uc]\\{?C\\}?")
re_latex_uppercase_n = re.compile("\\\\[c'~^vu]\\{?N\\}?")
def lower_index_term(term):
"""
Return safely lowered index term TERM. This is done by converting
to UTF-8 first, because standard Python lower() function is not
UTF-8 safe. To be called by both the search engine and the
indexer when appropriate (e.g. before stemming).
In case of problems with UTF-8 compliance, this function raises
UnicodeDecodeError, so the client code may want to catch it.
"""
return unicode(term, 'utf-8').lower().encode('utf-8')
latex_markup_re = re.compile(r"\\begin(\[.+?\])?\{.+?\}|\\end\{.+?}|\\\w+(\[.+?\])?\{(?P<inside1>.*?)\}|\{\\\w+ (?P<inside2>.*?)\}")
def remove_latex_markup(phrase):
ret_phrase = ''
index = 0
for match in latex_markup_re.finditer(phrase):
ret_phrase += phrase[index:match.start()]
ret_phrase += match.group('inside1') or match.group('inside2') or ''
index = match.end()
ret_phrase += phrase[index:]
return ret_phrase
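The behaviour of remove_latex_markup() can be demonstrated with a self-contained restatement (the pattern is copied verbatim from latex_markup_re above; the function name is changed only to keep the demo separate):

```python
import re

# Same pattern as latex_markup_re above, reproduced so this demo is
# self-contained.
latex_markup_re = re.compile(
    r"\\begin(\[.+?\])?\{.+?\}|\\end\{.+?}"
    r"|\\\w+(\[.+?\])?\{(?P<inside1>.*?)\}|\{\\\w+ (?P<inside2>.*?)\}")

def strip_latex(phrase):
    """Keep command arguments, drop the surrounding LaTeX markup."""
    out, index = "", 0
    for match in latex_markup_re.finditer(phrase):
        out += phrase[index:match.start()]
        out += match.group("inside1") or match.group("inside2") or ""
        index = match.end()
    return out + phrase[index:]
```

For instance, strip_latex(r"\emph{Higgs} boson") yields "Higgs boson", and strip_latex(r"{\bf mass}") yields "mass".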
def apply_stemming(word, stemming_language):
"""Returns word after applying stemming (if stemming language is set).
You can change your stemming language in database.
@param word: word to be checked
@type word: str
@param stemming_language: abbreviation of language or None
@type stemming_language: str
"""
if stemming_language:
word = stem(word, stemming_language)
return word
def remove_stopwords(word, stopwords_kb = False):
"""Returns word after stopword check.
One must specify the name of the knowledge base.
@param word: word to be checked
@type word: str
@param stopwords_kb: name of the stopwords knowledge base
@type word: str
"""
if stopwords_kb:
stopwords_path = CFG_ETCDIR + "/bibrank/" + stopwords_kb
if is_stopword(word, stopwords_path):
return ""
return word
def length_check(word):
"""Returns word after length check.
@param word: word to be checked
@type word: str
"""
if len(word) < CFG_BIBINDEX_MIN_WORD_LENGTH:
return ""
return word
def wash_index_term(term, max_char_length=50, lower_term=True):
"""
Return washed form of the index term TERM that would be suitable
for storing into idxWORD* tables. I.e., lower the TERM if
LOWER_TERM is True, and truncate it safely to MAX_CHAR_LENGTH
UTF-8 characters (meaning, in principle, 4*MAX_CHAR_LENGTH bytes).
The function works by an internal conversion of TERM, when needed,
from its input Python UTF-8 binary string format into Python
Unicode format, and then truncating it safely to the given number
of UTF-8 characters, without possible mis-truncation in the middle
of a multi-byte UTF-8 character that could otherwise happen if we
would have been working with UTF-8 binary representation directly.
Note that MAX_CHAR_LENGTH corresponds to the length of the term
column in idxINDEX* tables.
"""
if lower_term:
washed_term = unicode(term, 'utf-8').lower()
else:
washed_term = unicode(term, 'utf-8')
if len(washed_term) <= max_char_length:
# no need to truncate the term, because it will fit
# nicely even if it uses four-byte UTF-8 characters
return washed_term.encode('utf-8')
else:
# truncate the term in a safe position:
return washed_term[:max_char_length].encode('utf-8')
def wash_author_name(p):
"""
Wash author name suitable for author searching. Notably, replace
dots and hyphens with spaces, and collapse spaces.
"""
if re_pattern_author_canonical_id.search(p):
# we have canonical author ID form, so ignore all washing
return p
out = re_pattern_fuzzy_author_dots.sub(" ", p)
out = re_pattern_fuzzy_author_spaces.sub(" ", out)
return out.strip()
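The module above targets Python 2 (unicode()-based idioms); the safe-truncation idea behind wash_index_term() can be sketched in Python 3 as follows (a restatement for illustration, not a drop-in replacement):

```python
def wash_index_term(term: bytes, max_char_length: int = 50,
                    lower_term: bool = True) -> bytes:
    """Decode first, truncate on characters, then re-encode: this way a
    multi-byte UTF-8 character can never be cut in half."""
    washed = term.decode("utf-8")
    if lower_term:
        washed = washed.lower()
    return washed[:max_char_length].encode("utf-8")
```

Naively slicing the UTF-8 byte string could split a multi-byte character in half; decoding first makes the truncation land on a character boundary.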
diff --git a/invenio/legacy/bibindex/scripts/bibindex.py b/invenio/legacy/bibindex/scripts/bibindex.py
index 65016f5ef..13da7b042 100644
--- a/invenio/legacy/bibindex/scripts/bibindex.py
+++ b/invenio/legacy/bibindex/scripts/bibindex.py
@@ -1,64 +1,64 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibIndex bibliographic data, reference and fulltext indexing utility.
Usage: bibindex %s [options]
Examples:
bibindex -a -i 234-250,293,300-500 -u admin@localhost
bibindex -a -w author,fulltext -M 8192 -v3
bibindex -d -m +4d -A on --flush=10000
Indexing options:
-a, --add add or update words for selected records
-d, --del delete words for selected records
-i, --id=low[-high] select according to record recID.
-m, --modified=from[,to] select according to modification date
-c, --collection=c1[,c2] select according to collection
Repairing options:
-k, --check check consistency for all records in the table(s)
-r, --repair try to repair all records in the table(s)
Specific options:
-w, --windex=w1[,w2] word/phrase indexes to consider (all)
-M, --maxmem=XXX maximum memory usage in kB (no limit)
-f, --flush=NNN full consistent table flush after NNN records (10000)
Scheduling options:
-u, --user=USER user name to store task, password needed
-s, --sleeptime=SLEEP time after which to repeat task (no)
e.g.: 1s, 30m, 24h, 7d
-t, --time=TIME moment for the task to be active (now)
e.g.: +15s, 5m, 3h, 2002-10-27 13:57:26
General options:
-h, --help print this help and exit
-V, --version print version and exit
-v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
"""
from invenio.base.factory import with_app_context
@with_app_context()
def main():
- from invenio.bibindex_engine import main as bibindex_main
+ from invenio.legacy.bibindex.engine import main as bibindex_main
return bibindex_main()
diff --git a/invenio/legacy/bibmatch/validator.py b/invenio/legacy/bibmatch/validator.py
index 516683e54..68c484af8 100644
--- a/invenio/legacy/bibmatch/validator.py
+++ b/invenio/legacy/bibmatch/validator.py
@@ -1,852 +1,852 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibMatch - tool to match records with database content of
an Invenio instance, either locally or remotely.
bibmatch_validator - module containing functions for match validation step
"""
__revision__ = "$Id$"
import re
import sys
import pprint
import difflib
from invenio.config import CFG_BIBMATCH_MATCH_VALIDATION_RULESETS, \
CFG_BIBMATCH_FUZZY_MATCH_VALIDATION_LIMIT
from invenio.bibmatch_config import CFG_BIBMATCH_VALIDATION_MATCHING_MODES, \
CFG_BIBMATCH_VALIDATION_RESULT_MODES, \
CFG_BIBMATCH_VALIDATION_COMPARISON_MODES, \
CFG_BIBMATCH_LOGGER
from invenio.legacy.bibrecord import create_records, record_get_field_values
from invenio.legacy.bibrecord.scripts.xmlmarc2textmarc import get_sysno_from_record, create_marc_record
-from invenio.bibauthorid_name_utils import compare_names
-from invenio.bibauthorid_name_utils import string_partition
+from invenio.legacy.bibauthorid.name_utils import compare_names
+from invenio.legacy.bibauthorid.name_utils import string_partition
from invenio.utils.text import translate_to_ascii
re_valid_tag = re.compile("^[0-9]{3}[a-zA-Z0-9_%]{0,3}$")
def validate_matches(bibmatch_recid, record, server, result_recids, \
collections="", verbose=0, ascii_mode=False):
"""
Perform record validation on a set of matches. This function will
try to find any search-result that "really" is a correct match, based on
various methods defined in a given rule-set. See more about rule-sets in
validate_match() function documentation.
This function will return a tuple containing a list of all record IDs
satisfying the count of field matching needed for exact matches and a
similar list for fuzzy matches that have fewer fields matching than the
threshold. Records that are not matching at all are simply left out of
the lists.
@param bibmatch_recid: Current record number. Used for logging.
@type bibmatch_recid: int
@param record: bibrec structure of original record
@type record: dict
@param server: InvenioConnector object to matched record source repository
@type server: InvenioConnector object
@param result_recids: the list of record ids from search result.
@type result_recids: list
@param collections: list of collections to search, if specified
@type collections: list
@param verbose: be loud
@type verbose: int
@param ascii_mode: True to transform values to their ASCII representation
@type ascii_mode: bool
@return: list of record IDs matched
@rtype: list
"""
matches_found = []
fuzzy_matches_found = []
# Generate final rule-set by analyzing the record
final_ruleset = get_validation_ruleset(record)
if not final_ruleset:
sys.stderr.write("Bad configuration rule-set. \
Please check that CFG_BIBMATCH_MATCH_VALIDATION_RULESETS is formed correctly.\n")
return [], []
if verbose > 8:
sys.stderr.write("\nStart record validation:\n\nFinal validation ruleset used:\n")
pp = pprint.PrettyPrinter(stream=sys.stderr, indent=2)
pp.pprint(final_ruleset)
# Fetch all records in MARCXML and convert to BibRec
found_record_list = []
for recid in result_recids:
query = "001:%d" % (recid,)
if collections:
search_params = dict(p=query, of="xm", c=collections)
else:
search_params = dict(p=query, of="xm")
result_marcxml = server.search_with_retry(**search_params)
result_record_list = create_records(result_marcxml)
# Check if record was found and BibRecord generation was successful
if result_record_list == [] or \
len(result_record_list) != 1 or \
result_record_list[0] == None:
# Error fetching a record. Unable to validate and returning with empty list.
if verbose > 8:
sys.stderr.write("\nError retrieving MARCXML for matched record %s\n" % (str(recid),))
return [], []
# Add a tuple of record ID (for easy look-up later) and BibRecord structure
found_record_list.append((recid, result_record_list[0][0]))
# Validate records one-by-one, adding any matches to the list of matching record IDs
current_index = 1
for recid, matched_record in found_record_list:
if verbose > 8:
sys.stderr.write("\n Validating matched record #%d (%s):\n" % \
(current_index, recid))
CFG_BIBMATCH_LOGGER.info("Matching of record %d: Comparing to matched record %s" % \
(bibmatch_recid, recid))
match_ratio = validate_match(record, matched_record, final_ruleset, \
verbose, ascii_mode)
if match_ratio == 1.0:
# All matches were a success, this is an exact match
CFG_BIBMATCH_LOGGER.info("Matching of record %d: Exact match found -> %s" % (bibmatch_recid, recid))
matches_found.append(recid)
elif match_ratio >= CFG_BIBMATCH_FUZZY_MATCH_VALIDATION_LIMIT:
# This means that some matches failed, but some succeeded as well. That's fuzzy...
CFG_BIBMATCH_LOGGER.info("Matching of record %d: Fuzzy match found -> %s" % \
(bibmatch_recid, recid))
fuzzy_matches_found.append(recid)
else:
CFG_BIBMATCH_LOGGER.info("Matching of record %d: Not a match" % (bibmatch_recid,))
current_index += 1
# Return list of matching record IDs
return matches_found, fuzzy_matches_found
def validate_match(org_record, matched_record, ruleset, verbose=0, ascii_mode=False):
"""
This function will try to match the original record with matched record.
This comparison uses various methods defined in configuration and/or
determined from the source record.
These methods can be derived from each rule-set defined, which contains a
mapping of a certain pattern to a list of rules defining the "match-strategy".
For example:
('260__', [{ 'tags' : '260__c',
'threshold' : 0.8,
'compare_mode' : 'lazy',
'match_mode' : 'date',
'result_mode' : 'normal' }])
Quick run-down of possible values:
Compare mode:
'strict' : all (sub-)fields are compared, and all must match. Order is significant.
'normal' : all (sub-)fields are compared, and all must match. Order is ignored.
'lazy' : all (sub-)fields are compared with each other and at least one must match
'ignored' : the tag is ignored in the match. Used to disable previously defined rules.
Match mode:
'title' : uses a method specialized for comparing titles, e.g. looking for subtitles
'author' : uses a special authorname comparison. Will take initials into account.
'identifier': special matching for identifiers, stripping away punctuation
'date' : matches dates by extracting and comparing the year
'normal' : normal string comparison.
Result mode:
'normal' : a failed match will cause the validation to continue on other rules (if any)
a successful match will cause the validation to continue on other rules (if any)
'final' : a failed match will cause the validation to immediately exit as a failure.
a successful match will cause validation to immediately exit as a success.
'joker' : a failed match will cause the validation to continue on other rules (if any).
a successful match will cause validation to immediately exit as a success.
Fields are considered matching when all their subfields or values match. ALL matching strategies
must return successfully for a match to be validated (except for 'joker' mode).
@param org_record: bibrec structure of original record
@type org_record: dict
@param matched_record: bibrec structure of matched record
@type matched_record: dict
@param ruleset: the default rule-set {tag: strategy,..} used when validating
@type ruleset: dict
@param verbose: be loud
@type verbose: int
@param ascii_mode: True to transform values to their ASCII representation
@type ascii_mode: bool
@return: Number of matches succeeded divided by number of comparisons done. At least two
successful matches must be done unless a joker or final match is found
@rtype: float
"""
total_number_of_matches = 0
total_number_of_comparisons = 0
for field_tags, threshold, compare_mode, match_mode, result_mode in ruleset:
field_tag_list = field_tags.split(',')
if verbose > 8:
sys.stderr.write("\nValidating tags: %s in parsing mode '%s' and comparison\
mode '%s' as '%s' result with threshold %0.2f\n" \
% (field_tag_list, compare_mode, match_mode, \
result_mode, threshold))
current_matching_status = False
## 1. COMPARE MODE
# Fetch defined fields from both records
original_record_values = []
matched_record_values = []
for field_tag in field_tag_list:
tag_structure = validate_tag(field_tag)
if tag_structure != None:
tag, ind1, ind2, code = tag_structure
# Fetch all field instances to match
original_record_values.extend(record_get_field_values(\
org_record, tag, ind1, ind2, code))
matched_record_values.extend(record_get_field_values(\
matched_record, tag, ind1, ind2, code))
if (len(original_record_values) == 0 or len(matched_record_values) == 0):
# One or both records do not have values, ignore.
if verbose > 8:
sys.stderr.write("\nOne or both records do not have this field. Continue.\n")
continue
if ascii_mode:
original_record_values = translate_to_ascii(original_record_values)
matched_record_values = translate_to_ascii(matched_record_values)
ignore_order = True
matches_needed = 0
# How many field-value matches are needed for successful validation of this record
if compare_mode == 'lazy':
# 'lazy' : all fields are matched with each other, if any match = success
matches_needed = 1
elif compare_mode == 'normal':
# 'normal' : all fields are compared, and all must match.
# Order is ignored. The number of matches needed is equal
# to the value count of original record
matches_needed = len(original_record_values)
elif compare_mode == 'strict':
# 'strict' : all fields are compared, and all must match. Order matters.
if len(original_record_values) != len(matched_record_values):
# Not the same number of fields, not a valid match
# Unless this is a joker, we return indicating failure
if result_mode != 'joker':
return 0.0
continue
matches_needed = len(original_record_values)
ignore_order = False
if verbose > 8:
sys.stderr.write("Total matches needed: %d -> " % (matches_needed,))
## 2. MATCH MODE
total_number_of_comparisons += 1
comparison_function = None
if match_mode == 'title':
# Special title mode
comparison_function = compare_fieldvalues_title
elif match_mode == 'author':
# Special author mode
comparison_function = compare_fieldvalues_authorname
elif match_mode == 'identifier':
# Special identifier mode
comparison_function = compare_fieldvalues_identifier
elif match_mode == 'date':
# Special date mode
comparison_function = compare_fieldvalues_date
else:
# Normal mode
comparison_function = compare_fieldvalues_normal
# Get list of comparisons to perform containing extracted values
field_comparisons = get_paired_comparisons(original_record_values, \
matched_record_values, \
ignore_order)
if verbose > 8:
sys.stderr.write("Field comparison values:\n%s\n" % (field_comparisons,))
# Run comparisons according to match_mode
current_matching_status, matches = comparison_function(field_comparisons, \
threshold, \
matches_needed)
CFG_BIBMATCH_LOGGER.info("-- Comparing fields %s with %s = %d matches of %d" % \
(str(original_record_values), \
str(matched_record_values), \
matches, matches_needed))
## 3. RESULT MODE
if current_matching_status:
if verbose > 8:
sys.stderr.write("Fields matched successfully.\n")
if result_mode in ['final', 'joker']:
# Matching success. Return 1.0 indicating exact match when final or joker.
return 1.0
total_number_of_matches += 1
else:
# Matching failed. Not a valid match
if result_mode == 'final':
# Final does not allow failure
return 0.0
elif result_mode == 'joker':
# Joker rules count as a match even when it is not one
total_number_of_matches += 1
if verbose > 8:
sys.stderr.write("Fields not matching. (Joker)\n")
else:
if verbose > 8:
sys.stderr.write("Fields not matching. \n")
if total_number_of_matches < 2 or total_number_of_comparisons == 0:
return 0.0
return total_number_of_matches / float(total_number_of_comparisons)
def transform_record_to_marc(record, options={'text-marc':1, 'aleph-marc':0}):
""" This function will transform a given bibrec record into marc using
methods from xmlmarc2textmarc in invenio.utils.text. The function returns the
record as a MARC string.
@param record: bibrec structure for record to transform
@type record: dict
@param options: dictionary describing type of MARC record. Defaults to textmarc.
@type options: dict
@return: resulting MARC record as string """
sysno = get_sysno_from_record(record, options)
# Note: Record dict is copied as create_marc_record() performs deletions
return create_marc_record(record.copy(), sysno, options)
def compare_fieldvalues_normal(field_comparisons, threshold, matches_needed):
"""
Performs field validation given a list of field comparisons using a standard
normalized string distance metric. Each comparison is done according to given
threshold which the normalized result must be equal or above to match.
Before the values are compared they are normalized: converted to lower case
and stripped of leading/trailing spaces.
During validation the fields are compared and matches are counted per
field, until the given number of matches needed is met, causing the
function to return True. If validation ends before this threshold is met
it will return False.
@param field_comparisons: list of comparisons, each which contains a list
of field-value to field-value comparisons.
@type field_comparisons: list
@param threshold: number describing the match threshold a comparison must
exceed to become a positive match.
@type threshold: float
@param matches_needed: number of positive field matches needed for the entire
comparison process to give a positive result.
@type matches_needed: int
@return: tuple of matching result, True if enough matches are found, False if not,
and number of matches.
@rtype: tuple
"""
matches_found = 0
# Loop over all possible comparisons field by field, if a match is found,
# we are done with this field and break out to try and match next field.
for comparisons in field_comparisons:
for value, other_value in comparisons:
# Value matching - put values in lower case and strip leading/trailing spaces
diff = difflib.SequenceMatcher(None, value.lower().strip(), \
other_value.lower().strip()).ratio()
if diff >= threshold:
matches_found += 1
break
# If we already have found required number of matches, we return immediately
if matches_found >= matches_needed:
return True, matches_found
return matches_found >= matches_needed, matches_found
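As an aside, the normalized-distance matching loop above can be sketched in a self-contained form; the sample values below are invented purely for illustration:

```python
# Minimal sketch of the loop above: lower-case and strip each value pair,
# score it with difflib.SequenceMatcher, and count a field as matched as
# soon as any of its pairs reaches the threshold.
import difflib

def count_field_matches(field_comparisons, threshold):
    """Return how many fields have at least one pair scoring >= threshold."""
    matches_found = 0
    for comparisons in field_comparisons:
        for value, other_value in comparisons:
            ratio = difflib.SequenceMatcher(None, value.lower().strip(),
                                            other_value.lower().strip()).ratio()
            if ratio >= threshold:
                matches_found += 1
                break  # this field matched; move on to the next field
    return matches_found

comparisons = [
    [("Physics Letters B", "physics letters b ")],  # near-identical: matches
    [("CERN", "Fermilab")],                         # dissimilar: no match
]
print(count_field_matches(comparisons, 0.9))  # 1
```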
def compare_fieldvalues_authorname(field_comparisons, threshold, matches_needed):
"""
Performs field validation given a list of field comparisons using a technique
meant for author names, taking into account initials vs. full names,
using matching techniques available from BibAuthorId.
Each comparison is done according to the given threshold, which the result
must meet or exceed to count as a match.
During validation the fields are compared and matches are counted per
field, until the given number of matches needed is met, at which point the
function returns True. If validation ends before this threshold is met
it returns False.
@param field_comparisons: list of comparisons, each of which contains a list
of field-value to field-value comparisons.
@type field_comparisons: list
@param threshold: number describing the match threshold a comparison must
meet or exceed to become a positive match.
@type threshold: float
@param matches_needed: number of positive field matches needed for the entire
comparison process to give a positive result.
@type matches_needed: int
@return: tuple of matching result, True if enough matches are found, False if not,
and number of matches.
@rtype: tuple
"""
matches_found = 0
# Loop over all possible comparisons field by field, if a match is found,
# we are done with this field and break out to try and match next field.
for comparisons in field_comparisons:
for value, other_value in comparisons:
# Grab both permutations of a name (before, after and after, before)
# and compare to each unique commutative combination. Ex:
# Doe,J vs. Smith,J -> [(('Smith,J', 'Doe,J'), ('Smith,J', 'J,Doe')),
# (('J,Smith', 'Doe,J'), ('J,Smith', 'J,Doe'))]
author_comparisons = [pair for pair in get_paired_comparisons(\
get_reversed_string_variants(value), \
get_reversed_string_variants(other_value))][0]
for str1, str2 in author_comparisons:
# Author-name comparison - using BibAuthorid function
diff = compare_names(str1, str2)
if diff >= threshold:
matches_found += 1
break
else:
# We continue as no match was found
continue
# We break out as a match was found
break
# If we already have found required number of matches, we return immediately
if matches_found >= matches_needed:
return True, matches_found
# Often author names do not match fully, so let's allow the number of matches
# to be a little lower, using the same threshold
result = matches_found >= matches_needed or matches_found / float(matches_needed) > threshold
return result, matches_found
def compare_fieldvalues_identifier(field_comparisons, threshold, matches_needed):
"""
Performs field validation given a list of field comparisons using a method that
normalizes identifiers for comparison, for example by removing hyphens and other
symbols.
Each comparison is done according to the given threshold, which the normalized
result must meet or exceed to count as a match. Before the values are compared
they are converted to lower-case.
During validation the fields are compared and matches are counted per
field, until the given number of matches needed is met, at which point the
function returns True. If validation ends before this threshold is met
it returns False.
@param field_comparisons: list of comparisons, each of which contains a list
of field-value to field-value comparisons.
@type field_comparisons: list
@param threshold: number describing the match threshold a comparison must
meet or exceed to become a positive match.
@type threshold: float
@param matches_needed: number of positive field matches needed for the entire
comparison process to give a positive result.
@type matches_needed: int
@return: tuple of matching result, True if enough matches are found, False if not,
and number of matches.
@rtype: tuple
"""
matches_found = 0
# Loop over all possible comparisons field by field, if a match is found,
# we are done with this field and break out to try and match next field.
for comparisons in field_comparisons:
for value, other_value in comparisons:
# Value matching - put values in lower case and remove punctuation
# and trailing zeroes. 'DESY-F35D-97-04' -> 'DESYF35D974'
value = re.sub(r'\D[0]|\W+', "", value.lower())
other_value = re.sub(r'\D[0]|\W+', "", other_value.lower())
diff = difflib.SequenceMatcher(None, value, other_value).ratio()
if diff >= threshold:
matches_found += 1
break
# If we already have found required number of matches, we return immediately
if matches_found >= matches_needed:
return True, matches_found
return matches_found >= matches_needed, matches_found
def compare_fieldvalues_title(field_comparisons, threshold, matches_needed):
"""
Performs field validation given a list of field comparisons using a method
specialized for comparing titles, for example by looking for possibly
concatenated titles and subtitles or using a KB of common word aliases.
Each comparison is done according to the given threshold, which the normalized
result must meet or exceed to count as a match.
Before the values are compared they are normalized by lower-casing them and
stripping any leading/trailing spaces.
During validation the fields are compared and matches are counted per
field, until the given number of matches needed is met, at which point the
function returns True. If validation ends before this threshold is met
it returns False.
@param field_comparisons: list of comparisons, each of which contains a list
of field-value to field-value comparisons.
@type field_comparisons: list
@param threshold: number describing the match threshold a comparison must
meet or exceed to become a positive match.
@type threshold: float
@param matches_needed: number of positive field matches needed for the entire
comparison process to give a positive result.
@type matches_needed: int
@return: tuple of matching result, True if enough matches are found, False if not,
and number of matches.
@rtype: tuple
"""
matches_found = 0
# Loop over all possible comparisons field by field, if a match is found,
# we are done with this field and break out to try and match next field.
for comparisons in field_comparisons:
for value, other_value in comparisons:
# TODO: KB of alias mappings of common names
title_comparisons = [pair for pair in _get_grouped_pairs(\
get_separated_string_variants(value), \
get_separated_string_variants(other_value))][0]
for str1, str2 in title_comparisons:
# Title comparison
diff = difflib.SequenceMatcher(None, str1.lower().strip(), \
str2.lower().strip()).ratio()
if diff >= threshold:
matches_found += 1
break
else:
# We continue as no match was found
continue
# We break out as a match was found
break
# If we already have found required number of matches, we return immediately
if matches_found >= matches_needed:
return True, matches_found
return matches_found >= matches_needed, matches_found
def compare_fieldvalues_date(field_comparisons, threshold, matches_needed):
"""
Performs field validation given a list of field comparisons, specialized
for matching dates. Each comparison is done according to the given
threshold, which the final result must meet or exceed to count as a match.
During validation the fields are compared and matches are counted per
field, until the given number of matches needed is met, at which point the
function returns True. If validation ends before this threshold is met
it returns False.
@param field_comparisons: list of comparisons, each of which contains a list
of field-value to field-value comparisons.
@type field_comparisons: list
@param threshold: number describing the match threshold a comparison must
meet or exceed to become a positive match.
@type threshold: float
@param matches_needed: number of positive field matches needed for the entire
comparison process to give a positive result.
@type matches_needed: int
@return: tuple of matching result, True if enough matches are found, False if not,
and number of matches.
@rtype: tuple
"""
matches_found = 0
# Loop over all possible comparisons field by field, if a match is found,
# we are done with this field and break out to try and match next field.
for comparisons in field_comparisons:
for value, other_value in comparisons:
value_list = re.findall('[0-9]{4}', value.lower())
other_value_list = re.findall('[0-9]{4}', other_value.lower())
for year1 in value_list:
for year2 in other_value_list:
# Value matching - convert values to int
diff = compare_numbers(int(year1), int(year2))
if diff >= threshold:
matches_found += 1
break
else:
continue
break
else:
continue
break
# If we already have found required number of matches, we return immediately
if matches_found >= matches_needed:
return True, matches_found
return matches_found >= matches_needed, matches_found
def get_validation_ruleset(record):
"""
This function will iterate over any rule-sets defined in
CFG_BIBMATCH_MATCH_VALIDATION_RULESETS, generating a validation
rule-set for use when comparing records. The rule-sets are applied
in order of appearance, meaning that later rules take precedence
over earlier ones, should MARC tags conflict.
You can add your own rule-sets in invenio.conf. The 'default' rule-set
is always applied, but its tag-rules can be overwritten by other
rule-sets. Each rule-set must be a tuple of two items.
For example: ('980__ \$\$aTHESIS', { tag : (rules) })
* The first part is a string containing a regular expression
that is matched against the textmarc representation of each
record. If a match is found, the final rule-set is updated with
the given "sub rule-set", i.e. second item.
* The second item is a dict that indicates specific MARC tags with
corresponding validation rules.
@param record: bibrec record dict to analyze
@type record: dict
@return: list of ordered rule-sets
@rtype: list
"""
# Convert original record to textmarc in order to regexp search
original_record_marc = transform_record_to_marc(record)
# Let's parse the rule-set configuration to try to match rule-sets
# with the original record, adding to/overwriting as we go
validation_ruleset = {}
for pattern, rules in CFG_BIBMATCH_MATCH_VALIDATION_RULESETS:
if pattern == "default" or re.search(pattern, original_record_marc) is not None:
for rule in rules:
# Simple validation of rules syntax
if rule['compare_mode'] not in CFG_BIBMATCH_VALIDATION_COMPARISON_MODES:
return
if rule['match_mode'] not in CFG_BIBMATCH_VALIDATION_MATCHING_MODES:
return
if rule['result_mode'] not in CFG_BIBMATCH_VALIDATION_RESULT_MODES:
return
try:
# Update/Add rule in rule-set
validation_ruleset[rule['tags']] = (rule['threshold'], \
rule['compare_mode'], \
rule['match_mode'], \
rule['result_mode'])
except KeyError:
# Bad rule-set, return None
return
# Now generate the final list of rules in proper order, so final and joker result-modes
# are executed before normal rules. Order of precedence: final, joker, normal
final_list = []
joker_list = []
normal_list = []
for tag, (threshold, compare_mode, match_mode, result_mode) in validation_ruleset.iteritems():
if compare_mode == 'ignored' or threshold <= 0.0:
# Ignore rule
continue
if result_mode == 'final':
final_list.append((tag, threshold, compare_mode, match_mode, result_mode))
elif result_mode == 'joker':
joker_list.append((tag, threshold, compare_mode, match_mode, result_mode))
else:
normal_list.append((tag, threshold, compare_mode, match_mode, result_mode))
return final_list + joker_list + normal_list
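The precedence ordering above (final, then joker, then normal) can be illustrated with a small self-contained sketch; the rules below are invented examples, not real configuration:

```python
# Invented example rules in the (tag, threshold, compare_mode, match_mode,
# result_mode) shape used above; only the ordering logic is demonstrated.
rules = [
    ('245__a', 0.9, 'title', 'normal', 'normal'),
    ('0247_a', 1.0, 'identifier', 'normal', 'final'),
    ('100__a', 0.8, 'authorname', 'normal', 'joker'),
]
ordered = ([r for r in rules if r[4] == 'final']
           + [r for r in rules if r[4] == 'joker']
           + [r for r in rules if r[4] not in ('final', 'joker')])
print([r[0] for r in ordered])  # ['0247_a', '100__a', '245__a']
```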
def validate_tag(field_tag):
"""
This function will return a tuple of (tag, ind1, ind2, code) as extracted
from given string. If the tag is not deemed valid: return None.
For example: "100__a" will return ('100', '', '', 'a')
@param field_tag: field tag to extract MARC parts from
@type field_tag: string
@return: tuple of MARC tag parts, tag, ind1, ind2, code
@rtype: tuple
"""
if re_valid_tag.match(field_tag) is not None:
tag = field_tag[0:3]
ind1 = field_tag[3:4]
ind2 = field_tag[4:5]
code = field_tag[5:6]
if ind1 == "_":
ind1 = ""
if ind2 == "_":
ind2 = ""
return tag, ind1, ind2, code
return None
def get_paired_comparisons(first_list, second_list, ignore_order=True):
"""
This function will return a list of comparisons, each of which contains
a list of all the possible unique item-to-item comparisons.
If ordering is required, the lists must be of the same length and the
comparisons will be single item-by-item comparisons.
@param first_list: an iterable to pair with second_list items
@type first_list: iterable
@param second_list: an iterable to be paired against first_list
@type second_list: iterable
@return: the resulting iterable of pairs grouped by first_list items
@rtype: iterable
"""
if ignore_order:
# Get grouped permutations of comparisons between subfields
paired_comparisons = _get_grouped_pairs(first_list, second_list)
else:
# Must have same number of items
if len(first_list) != len(second_list):
return []
# Now prepare direct one-to-one comparisons
paired_comparisons = [((first_list[i], second_list[i]),) \
for i in range(0, len(first_list))]
return paired_comparisons
def compare_numbers(num1, num2):
"""
This function will compare two numbers to each other,
returning the normalized distance between them. The value
returned is at most 1.0, with 1.0 being a full match,
decreasing by 0.1 per unit of difference (so it can drop
below 0.0 for differences greater than 10).
Inspired by a similar function in MarcXimil
(http://marcximil.sourceforge.net/).
@param num1: the first number to compare
@type num1: int
@param num2: the second number to compare
@type num2: int
@return: the normalized equality score between 0.0 and 1.0
@rtype: float
"""
return 1.0 - (abs(num1 - num2) * 0.1)
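For illustration, the linear decay of this score looks as follows (a trivial standalone copy of the formula above):

```python
# Standalone copy of the formula above: similarity drops 0.1 per unit of
# difference between the two numbers.
def numbers_similarity(num1, num2):
    return 1.0 - abs(num1 - num2) * 0.1

print(numbers_similarity(2010, 2010))  # 1.0
print(numbers_similarity(2010, 2013))  # 0.7
```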
def get_separated_string_variants(s, sep=':'):
"""
This function will return a list of all the possible combinations
of substrings of the given string when split on the given separator.
For example:
"scalar tensor theory : validity of Cosmic no hair conjecture"
produces:
['scalar tensor theory ',
' validity of Cosmic no hair conjecture',
'scalar tensor theory : validity of Cosmic no hair conjecture']
It also returns variants containing several separators:
"scalar tensor theory : validity of Cosmic no hair : conjecture"
produces:
['scalar tensor theory ',
' validity of Cosmic no hair : conjecture',
'scalar tensor theory : validity of Cosmic no hair ',
' conjecture',
'scalar tensor theory : validity of Cosmic no hair : conjecture']
@param s: string to generate variants from
@type s: string
@param sep: separator that splits the string in two. Defaults to colon (:).
@type sep: string
@return: list of strings
@rtype: list
"""
string_variants = []
str_parts = s.split(sep)
start_index = 1
for dummy in str_parts:
first_part = sep.join(str_parts[:start_index])
if first_part != '':
string_variants.append(first_part)
last_part = sep.join(str_parts[start_index:])
if last_part != '':
string_variants.append(last_part)
if start_index <= len(str_parts):
start_index += 1
else:
break
return string_variants
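A self-contained rewrite of the variant expansion above shows the behaviour on a short title; the logic is equivalent to the loop above, just expressed with a range:

```python
# Equivalent, self-contained form of the variant expansion above.
def separated_string_variants(s, sep=':'):
    parts = s.split(sep)
    variants = []
    for start_index in range(1, len(parts) + 1):
        first_part = sep.join(parts[:start_index])
        if first_part:
            variants.append(first_part)
        last_part = sep.join(parts[start_index:])
        if last_part:
            variants.append(last_part)
    return variants

print(separated_string_variants("scalar tensor theory : validity"))
# ['scalar tensor theory ', ' validity', 'scalar tensor theory : validity']
```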
def get_reversed_string_variants(s, sep=','):
"""
This function will return a tuple containing a pair of the original
string and its reversed version, with regard to the text before/after the
separator (at the first occurrence of said separator).
For example: "lastname, firstname" -> ("lastname, firstname", " firstname,lastname")
@param s: string to extract pair from
@type s: string
@param sep: separator that splits the string in two. Defaults to comma (,).
@type sep: string
@return: tuple of strings
@rtype: tuple
"""
# Extract the different parts of the name using partition function.
left, sep, right = string_partition(s, sep)
return (left + sep + right, right + sep + left)
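The same reversal can be sketched with the built-in str.partition (which string_partition presumably mirrors for older Pythons); the function name below is invented:

```python
# Sketch of the reversal above using str.partition: text before and after
# the first separator is swapped to get the second variant.
def reversed_name_variants(s, sep=','):
    left, found_sep, right = s.partition(sep)
    return (left + found_sep + right, right + found_sep + left)

print(reversed_name_variants('Doe, John'))  # ('Doe, John', ' John,Doe')
```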
def _get_grouped_pairs(first_list, second_list):
"""
This function will return a list of grouped pairs of items from
the first list with every item in the second list.
e.g. [1,2,3],[4,5] -> [((1, 4), (1, 5)),
((2, 4), (2, 5)),
((3, 4), (3, 5))]
@param first_list: an iterable to pair with second_list items
@type first_list: iterable
@param second_list: an iterable to be paired against first_list
@type second_list: iterable
@return: the resulting iterable of pairs grouped by first_list items
@rtype: iterable
"""
pairs = []
for first_item in first_list:
pair_group = []
for second_item in second_list:
pair_group.append((first_item, second_item))
pairs.append(tuple(pair_group))
return pairs
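The pair grouping above can also be written as a nested comprehension; a quick self-contained check:

```python
# Every item of the first list is paired with every item of the second,
# grouped per first-list item, mirroring _get_grouped_pairs above.
def grouped_pairs(first_list, second_list):
    return [tuple((a, b) for b in second_list) for a in first_list]

print(grouped_pairs([1, 2, 3], [4, 5]))
# [((1, 4), (1, 5)), ((2, 4), (2, 5)), ((3, 4), (3, 5))]
```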
diff --git a/invenio/legacy/bibmerge/engine.py b/invenio/legacy/bibmerge/engine.py
index d46af9865..d32aff977 100644
--- a/invenio/legacy/bibmerge/engine.py
+++ b/invenio/legacy/bibmerge/engine.py
@@ -1,440 +1,440 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""Invenio BibMerge Engine."""
import os
from invenio.bibmerge_merger import merge_field_group, replace_field, \
add_field, delete_field, merge_field, \
add_subfield, replace_subfield, \
delete_subfield, copy_R2_to_R1, merge_record
from invenio.legacy.search_engine import print_record, perform_request_search, \
record_exists
from invenio.legacy.bibrecord import get_fieldvalues
-from invenio.bibedit_utils import cache_exists, cache_expired, \
+from invenio.legacy.bibedit.utils import cache_exists, cache_expired, \
create_cache_file, delete_cache_file, get_cache_file_contents, \
get_cache_mtime, latest_record_revision, record_locked_by_other_user, \
record_locked_by_queue, save_xml_record, touch_cache_file, \
update_cache_file_contents, _get_file_path, \
get_record_revision_ids, revision_format_valid_p, split_revid, \
get_marcxml_of_revision_id
from invenio.utils.html import remove_html_markup
from invenio.legacy.bibrecord import create_record, record_xml_output, record_add_field, \
record_order_subfields
-from invenio.bibedit_config import CFG_BIBEDIT_TO_MERGE_SUFFIX
+from invenio.legacy.bibedit.config import CFG_BIBEDIT_TO_MERGE_SUFFIX
import invenio.legacy.template
bibmerge_templates = invenio.legacy.template.load('bibmerge')
def perform_request_init():
"""Handle the initial request.
"""
errors = []
warnings = []
body = ''
# Build page structure and control panel.
body += bibmerge_templates.controlpanel()
body += """
<div id="bibMergeContent">
</div>"""
return body, errors, warnings
def perform_request_ajax(req, uid, data):
"""Ajax request dispatcher.
"""
requestType = data['requestType']
if requestType in ('getRecordCompare', 'submit', 'cancel', 'recCopy', \
'recMerge', 'recMergeNC'):
return perform_request_record(requestType, uid, data)
elif requestType in ('getFieldGroup', 'getFieldGroupDiff', \
'mergeFieldGroup', 'mergeNCFieldGroup', 'replaceField', 'addField', \
'deleteField', 'mergeField'):
return perform_request_update_record(requestType, uid, data)
elif requestType in ('deleteSubfield', 'addSubfield', 'replaceSubfield', \
'diffSubfield'):
return perform_small_request_update_record(requestType, uid, data)
elif requestType == "searchCandidates" or requestType == "searchRevisions":
return perform_candidate_record_search(requestType, data)
else:
return { 'resultCode': 1, 'resultText': 'Unknown request type' }
def perform_candidate_record_search(requestType, data):
"""Handle search requests.
"""
max_results = 999
too_many = False
result = {
'resultCode': 0,
'resultText': ''
}
if requestType == "searchCandidates":
recids = perform_request_search( p=data['query'] )
if len(recids) > max_results:
too_many = True
else:
captions = [ search_result_info(x) for x in recids ]
alternative_titles = [ remove_html_markup(print_record(x, "hs")) for x in recids ]
search_results = [recids, captions, alternative_titles]
elif requestType == "searchRevisions":
revisions = get_record_revision_ids( data['recID1'] )
captions = [ split_revid(x, 'datetext')[1] for x in revisions ]
search_results = [revisions, captions]
if too_many:
result['resultCode'] = 1
result['resultText'] = 'Too many results'
else:
result['results'] = search_results
result['resultText'] = '%s results' % len(search_results[0])
return result
def search_result_info(recid):
"""Return the report number of a record, or if it doesn't exist return the recid
itself.
"""
report_numbers = get_fieldvalues(recid, '037__a')
if len(report_numbers) == 0:
return "#"+str(recid)
else:
return report_numbers[0]
def perform_request_record(requestType, uid, data):
"""Handle 'major' record related requests.
Handle retrieving, submitting or cancelling the merging session.
"""
#TODO add checks before submission and cancel, replace get_bibrecord call
result = {
'resultCode': 0,
'resultText': ''
}
recid1 = data["recID1"]
record1 = _get_record(recid1, uid, result)
if result['resultCode'] != 0: #if record not accessible return error information
return result
if requestType == 'submit':
if 'duplicate' in data:
recid2 = data['duplicate']
record2 = _get_record_slave(recid2, result, 'recid', uid)
if result['resultCode'] != 0: #return in case of error
return result
# mark record2 as deleted
record_add_field(record2, '980', ' ', ' ', '', [('c', 'DELETED')])
# mark record2 as duplicate of record1
record_add_field(record2, '970', ' ', ' ', '', [('d', str(recid1))])
# submit record2 to be deleted
xml_record2 = record_xml_output(record2)
save_xml_record(recid2, uid, xml_record2)
#submit record1
xml_record1 = record_xml_output(record1)
save_xml_record(recid1, uid, xml_record1)
result['resultText'] = 'Records submitted'
return result
#submit record1 from cache
save_xml_record(recid1, uid)
# Delete cache file if it exists
if cache_exists(recid1, uid):
delete_cache_file(recid1, uid)
result['resultText'] = 'Record submitted'
return result
elif requestType == 'cancel':
delete_cache_file(recid1, uid)
result['resultText'] = 'Cancelled'
return result
recid2 = data["recID2"]
mode = data['record2Mode']
record2 = _get_record_slave(recid2, result, mode, uid)
if result['resultCode'] != 0: #if record not accessible return error information
return result
if requestType == 'getRecordCompare':
result['resultHtml'] = bibmerge_templates.BM_html_all_diff(record1, record2)
result['resultText'] = 'Records compared'
elif requestType == 'recCopy':
copy_R2_to_R1(record1, record2)
result['resultHtml'] = bibmerge_templates.BM_html_all_diff(record1, record2)
result['resultText'] = 'Record copied'
elif requestType == 'recMerge':
merge_record(record1, record2, merge_conflicting_fields=True)
result['resultHtml'] = bibmerge_templates.BM_html_all_diff(record1, record2)
result['resultText'] = 'Records merged'
elif requestType == 'recMergeNC':
merge_record(record1, record2, merge_conflicting_fields=False)
result['resultHtml'] = bibmerge_templates.BM_html_all_diff(record1, record2)
result['resultText'] = 'Records merged'
else:
result['resultCode'], result['resultText'] = 1, 'Wrong request type'
return result
def perform_request_update_record(requestType, uid, data):
"""Handle record update requests for actions on a field level.
Handle merging, adding, or replacing of fields.
"""
result = {
'resultCode': 0,
'resultText': ''
}
recid1 = data["recID1"]
recid2 = data["recID2"]
record_content = get_cache_file_contents(recid1, uid)
cache_dirty = record_content[0]
rec_revision = record_content[1]
record1 = record_content[2]
pending_changes = record_content[3]
disabled_hp_changes = record_content[4]
# We will not be able to Undo/Redo correctly after any modifications
# from the level of BibMerge are performed! We clear all the undo/redo
# lists
undo_list = []
redo_list = []
mode = data['record2Mode']
record2 = _get_record_slave(recid2, result, mode, uid)
if result['resultCode'] != 0: #if record not accessible return error information
return result
if requestType == 'getFieldGroup':
result['resultHtml'] = bibmerge_templates.BM_html_field_group(record1, record2, data['fieldTag'])
result['resultText'] = 'Field group retrieved'
return result
elif requestType == 'getFieldGroupDiff':
result['resultHtml'] = bibmerge_templates.BM_html_field_group(record1, record2, data['fieldTag'], True)
result['resultText'] = 'Fields compared'
return result
elif requestType == 'mergeFieldGroup' or requestType == 'mergeNCFieldGroup':
fnum, ind1, ind2 = _fieldtagNum_and_indicators(data['fieldTag'])
if requestType == 'mergeNCFieldGroup':
merge_field_group(record1, record2, fnum, ind1, ind2, False)
else:
merge_field_group(record1, record2, fnum, ind1, ind2, True)
resultText = 'Field group merged'
elif requestType == 'replaceField' or requestType == 'addField':
fnum, ind1, ind2 = _fieldtagNum_and_indicators(data['fieldTag'])
findex1 = _field_info( data['fieldCode1'] )[1]
findex2 = _field_info( data['fieldCode2'] )[1]
if findex2 is None:
result['resultCode'], result['resultText'] = 1, 'No value in the selected field'
return result
if requestType == 'replaceField':
replace_field(record1, record2, fnum, findex1, findex2)
resultText = 'Field replaced'
else: # requestType == 'addField'
add_field(record1, record2, fnum, findex1, findex2)
resultText = 'Field added'
elif requestType == 'deleteField':
fnum, ind1, ind2 = _fieldtagNum_and_indicators(data['fieldTag'])
findex1 = _field_info( data['fieldCode1'] )[1]
if findex1 is None:
result['resultCode'], result['resultText'] = 1, 'No value in the selected field'
return result
delete_field(record1, fnum, findex1)
resultText = 'Field deleted'
elif requestType == 'mergeField':
fnum, ind1, ind2 = _fieldtagNum_and_indicators(data['fieldTag'])
findex1 = _field_info( data['fieldCode1'] )[1]
findex2 = _field_info( data['fieldCode2'] )[1]
if findex2 is None:
result['resultCode'], result['resultText'] = 1, 'No value in the selected field'
return result
merge_field(record1, record2, fnum, findex1, findex2)
resultText = 'Field merged'
else:
result['resultCode'], result['resultText'] = 1, 'Wrong request type'
return result
result['resultHtml'] = bibmerge_templates.BM_html_field_group(record1, record2, data['fieldTag'])
result['resultText'] = resultText
update_cache_file_contents(recid1, uid, rec_revision, record1, pending_changes, disabled_hp_changes, undo_list, redo_list)
return result
def perform_small_request_update_record(requestType, uid, data):
"""Handle record update requests for actions on a subfield level.
Handle adding, replacing or deleting of subfields.
"""
result = {
'resultCode': 0,
'resultText': '',
'resultHtml': ''
}
recid1 = data["recID1"]
recid2 = data["recID2"]
cache_content = get_cache_file_contents(recid1, uid) #TODO: check mtime, existence
cache_dirty = cache_content[0]
rec_revision = cache_content[1]
record1 = cache_content[2]
pending_changes = cache_content[3]
disabled_hp_changes = cache_content[4]
mode = data['record2Mode']
record2 = _get_record_slave(recid2, result, mode, uid)
if result['resultCode'] != 0: #if record not accessible return error information
return result
ftag, findex1 = _field_info(data['fieldCode1'])
fnum = ftag[:3]
findex2 = _field_info(data['fieldCode2'])[1]
sfindex1 = data['sfindex1']
sfindex2 = data['sfindex2']
if requestType == 'deleteSubfield':
delete_subfield(record1, fnum, findex1, sfindex1)
result['resultText'] = 'Subfield deleted'
elif requestType == 'addSubfield':
add_subfield(record1, record2, fnum, findex1, findex2, sfindex1, sfindex2)
result['resultText'] = 'Subfield added'
elif requestType == 'replaceSubfield':
replace_subfield(record1, record2, fnum, findex1, findex2, sfindex1, sfindex2)
result['resultText'] = 'Subfield replaced'
elif requestType == 'diffSubfield':
result['resultHtml'] = bibmerge_templates.BM_html_subfield_row_diffed(record1, record2, fnum, findex1, findex2, sfindex1, sfindex2)
result['resultText'] = 'Subfields diffed'
update_cache_file_contents(recid1, uid, rec_revision, record1, pending_changes, disabled_hp_changes, [], [])
return result
def _get_record(recid, uid, result, fresh_record=False):
"""Retrieve record structure.
"""
record = None
mtime = None
cache_dirty = None
record_status = record_exists(recid)
existing_cache = cache_exists(recid, uid)
if record_status == 0:
result['resultCode'], result['resultText'] = 1, 'Non-existent record: %s' % recid
elif record_status == -1:
result['resultCode'], result['resultText'] = 1, 'Deleted record: %s' % recid
elif not existing_cache and record_locked_by_other_user(recid, uid):
result['resultCode'], result['resultText'] = 1, 'Record %s locked by user' % recid
elif existing_cache and cache_expired(recid, uid) and \
record_locked_by_other_user(recid, uid):
result['resultCode'], result['resultText'] = 1, 'Record %s locked by user' % recid
elif record_locked_by_queue(recid):
result['resultCode'], result['resultText'] = 1, 'Record %s locked by queue' % recid
else:
if fresh_record:
delete_cache_file(recid, uid)
existing_cache = False
if not existing_cache:
record_revision, record = create_cache_file(recid, uid)
mtime = get_cache_mtime(recid, uid)
cache_dirty = False
else:
tmpRes = get_cache_file_contents(recid, uid)
cache_dirty, record_revision, record = tmpRes[0], tmpRes[1], tmpRes[2]
touch_cache_file(recid, uid)
mtime = get_cache_mtime(recid, uid)
if not latest_record_revision(recid, record_revision):
result['cacheOutdated'] = True
result['resultCode'], result['resultText'], result['cacheDirty'], result['cacheMTime'] = 0, 'Record OK', cache_dirty, mtime
return record
def _get_record_slave(recid, result, mode=None, uid=None):
"""Check if record exists and return it in dictionary format.
If any kind of error occurs returns None.
If mode=='revision' then recid parameter is considered as revid."""
record = None
if recid == 'none':
mode = 'none'
if mode == 'recid':
record_status = record_exists(recid)
#check for errors
if record_status == 0:
result['resultCode'], result['resultText'] = 1, 'Non-existent record: %s' % recid
elif record_status == -1:
result['resultCode'], result['resultText'] = 1, 'Deleted record: %s' % recid
elif record_locked_by_queue(recid):
result['resultCode'], result['resultText'] = 1, 'Record %s locked by queue' % recid
else:
record = create_record( print_record(recid, 'xm') )[0]
record_order_subfields(record)
elif mode == 'tmpfile':
file_path = '%s_%s.xml' % (_get_file_path(recid, uid),
CFG_BIBEDIT_TO_MERGE_SUFFIX)
if not os.path.isfile(file_path): #check if file doesn't exist
result['resultCode'], result['resultText'] = 1, "Temporary file doesn't exist"
else: #open file
tmpfile = open(file_path, 'r')
record = create_record( tmpfile.read() )[0]
tmpfile.close()
elif mode == 'revision':
if revision_format_valid_p(recid):
marcxml = get_marcxml_of_revision_id(recid)
if marcxml:
record = create_record(marcxml)[0]
else:
result['resultCode'], result['resultText'] = 1, 'The specified revision does not exist'
else:
result['resultCode'], result['resultText'] = 1, 'Invalid revision id'
elif mode == 'none':
return {}
else:
result['resultCode'], result['resultText'] = 1, 'Invalid record mode for record2'
return record
def _field_info(fieldIdCode):
"""Returns a tuple: (field-tag, field-index)
eg.: _field_info('R1-8560_-2') --> ('8560_', 2) """
info = fieldIdCode.split('-')
if info[2] == 'None':
info[2] = None
else:
info[2] = int(info[2])
return tuple( info[1:] )
def _fieldtagNum_and_indicators(fieldTag):
"""Separate a 5-char field tag to a 3-character field-tag number and two
indicators"""
fnum, ind1, ind2 = fieldTag[:3], fieldTag[3], fieldTag[4]
if ind1 == '_':
ind1 = ' '
if ind2 == '_':
ind2 = ' '
return (fnum, ind1, ind2)
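The tag splitting above can be exercised with a tiny standalone copy:

```python
# Standalone copy of the 5-char tag split above: '_' indicators become spaces.
def split_field_tag(field_tag):
    fnum, ind1, ind2 = field_tag[:3], field_tag[3], field_tag[4]
    if ind1 == '_':
        ind1 = ' '
    if ind2 == '_':
        ind2 = ' '
    return (fnum, ind1, ind2)

print(split_field_tag('100__'))  # ('100', ' ', ' ')
print(split_field_tag('8560_'))  # ('856', '0', ' ')
```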
diff --git a/invenio/legacy/bibrank/bridge_utils.py b/invenio/legacy/bibrank/bridge_utils.py
index 5b625652d..3f33d3558 100644
--- a/invenio/legacy/bibrank/bridge_utils.py
+++ b/invenio/legacy/bibrank/bridge_utils.py
@@ -1,83 +1,83 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import ConfigParser
import re
from invenio.legacy.bibrank.bridge_config import CFG_BIBRANK_WRD_CFG_PATH
from invenio.legacy.search_engine import get_fieldvalues
-from invenio.bibindexadminlib import get_fld_id, get_fld_tags
+from invenio.legacy.bibindex.adminlib import get_fld_id, get_fld_tags
def get_external_word_similarity_ranker():
for line in open(CFG_BIBRANK_WRD_CFG_PATH):
for ranker in ('solr', 'xapian'):
if 'word_similarity_%s' % ranker in line:
return ranker
return False
def get_tags():
"""
Returns the tags per Solr field as a dictionary.
"""
tags = {}
for (field_name, logical_fields) in get_logical_fields().iteritems():
tags_of_logical_fields = []
for logical_field in logical_fields:
field_id = get_fld_id(logical_field)
tags_of_logical_fields.extend([tag[3] for tag in get_fld_tags(field_id)])
tags[field_name] = tags_of_logical_fields
return tags
def get_logical_fields():
"""
Returns the logical fields per Solr field as a dictionary.
"""
fields = {}
try:
config = ConfigParser.ConfigParser()
config.readfp(open(CFG_BIBRANK_WRD_CFG_PATH))
except StandardError:
return fields
sections = config.sections()
field_pattern = re.compile('field[0-9]+')
for section in sections:
if field_pattern.search(section):
field_name = config.get(section, 'name')
if config.has_option(section, 'logical_fields'):
logical_fields = config.get(section, 'logical_fields')
fields[field_name] = [f.strip() for f in logical_fields.split(',')]
return fields
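`get_logical_fields` expects sections named `fieldN` in the word-similarity config file. A minimal standalone sketch of that parsing, using an in-memory config and the Python 3 `configparser` spelling (the legacy module above uses the Python 2 `ConfigParser` name), with a made-up config fragment:

```python
import configparser
import re

# Hypothetical fragment in the shape the bridge config expects.
CONFIG_TEXT = """
[field1]
name = fulltext
logical_fields = title, abstract
"""

def parse_logical_fields(text):
    # Return {solr_field_name: [logical fields]} for every [fieldN] section.
    config = configparser.ConfigParser()
    config.read_string(text)
    fields = {}
    field_pattern = re.compile('field[0-9]+')
    for section in config.sections():
        if field_pattern.search(section):
            name = config.get(section, 'name')
            if config.has_option(section, 'logical_fields'):
                raw = config.get(section, 'logical_fields')
                fields[name] = [f.strip() for f in raw.split(',')]
    return fields
```

With the fragment above this yields `{'fulltext': ['title', 'abstract']}`.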
def get_field_content_in_utf8(recid, field, tag_dict, separator=' '):
"""
Returns the content of a field comprised of tags
concatenated in an UTF-8 string.
"""
content = ''
try:
values = []
for tag in tag_dict[field]:
values.extend(get_fieldvalues(recid, tag))
content = unicode(separator.join(values), 'utf-8')
except:
pass
return content
diff --git a/invenio/legacy/bibrank/citation_indexer.py b/invenio/legacy/bibrank/citation_indexer.py
index e49376cfc..b084f78f2 100644
--- a/invenio/legacy/bibrank/citation_indexer.py
+++ b/invenio/legacy/bibrank/citation_indexer.py
@@ -1,1017 +1,1017 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import re
import time
import os
import sys
import ConfigParser
from itertools import islice
from datetime import datetime
from invenio.legacy.dbquery import run_sql, serialize_via_marshal, \
deserialize_via_marshal
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import \
CFG_JOURNAL_PUBINFO_STANDARD_FORM, \
CFG_JOURNAL_PUBINFO_STANDARD_FORM_REGEXP_CHECK
from invenio.legacy.search_engine import search_pattern, search_unit
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.modules.formatter.utils import parse_tag
from invenio.modules.knowledge.api import get_kb_mappings
-from invenio.bibtask import write_message, task_get_option, \
+from invenio.legacy.bibsched.bibtask import write_message, task_get_option, \
task_update_progress, task_sleep_now_if_required, \
task_get_task_param
from invenio.ext.logging import register_exception
-from invenio.bibindex_engine import get_field_tags
+from invenio.legacy.bibindex.engine import get_field_tags
INTBITSET_OF_DELETED_RECORDS = search_unit(p='DELETED', f='980', m='a')
re_CFG_JOURNAL_PUBINFO_STANDARD_FORM_REGEXP_CHECK = re.compile(CFG_JOURNAL_PUBINFO_STANDARD_FORM_REGEXP_CHECK)
def get_recids_matching_query(p, f, m='e'):
"""Return set of recIDs matching query for pattern p in field f."""
return search_pattern(p=p, f=f, m=m) - INTBITSET_OF_DELETED_RECORDS
def get_citation_weight(rank_method_code, config, chunk_size=20000):
"""return a dictionary which is used by bibrank daemon for generating
the index of sorted research results by citation information
"""
begin_time = time.time()
quick = task_get_option("quick") != "no"
# id option forces re-indexing a certain range
# even if there are no new recs
if task_get_option("id"):
# construct a range of records to index
updated_recids = []
for first, last in task_get_option("id"):
updated_recids += range(first, last+1)
if len(updated_recids) > 10000:
str_updated_recids = str(updated_recids[:10]) + ' ... ' + str(updated_recids[-10:])
else:
str_updated_recids = str(updated_recids)
write_message('Records to process: %s' % str_updated_recids)
index_update_time = None
else:
bibrank_update_time = get_bibrankmethod_lastupdate(rank_method_code)
if not quick:
bibrank_update_time = "0000-00-00 00:00:00"
write_message("bibrank: %s" % bibrank_update_time)
index_update_time = get_bibindex_update_time()
write_message("bibindex: %s" % index_update_time)
if index_update_time > datetime.now().strftime("%Y-%m-%d %H:%M:%S"):
index_update_time = "0000-00-00 00:00:00"
updated_recids = get_modified_recs(bibrank_update_time,
index_update_time)
if len(updated_recids) > 10000:
str_updated_recids = str(updated_recids[:10]) + ' ... ' + str(updated_recids[-10:])
else:
str_updated_recids = str(updated_recids)
write_message("%s records to update" % str_updated_recids)
if updated_recids:
# result_intermediate should be guaranteed to exist!
# but if the user entered a "-R" (do all) option, we need to
# make an empty start set
if quick:
dicts = {
'cites_weight': last_updated_result(rank_method_code),
'cites': get_cit_dict("citationdict"),
'refs': get_cit_dict("reversedict"),
'selfcites': get_cit_dict("selfcitdict"),
'selfrefs': get_cit_dict("selfcitedbydict"),
'authorcites': get_initial_author_dict(),
}
else:
dicts = {
'cites_weight': {},
'cites': {},
'refs': {},
'selfcites': {},
'selfrefs': {},
'authorcites': {},
}
# Process fully the updated records
process_and_store(updated_recids, config, dicts, chunk_size, quick)
end_time = time.time()
write_message("Total time of get_citation_weight(): %.2f sec" % \
(end_time - begin_time))
task_update_progress("citation analysis done")
cites_weight = dicts['cites_weight']
else:
cites_weight = {}
write_message("No new records added since last time this " \
"rank method was executed")
return cites_weight, index_update_time
def process_and_store(recids, config, dicts, chunk_size, quick):
# Process recent records first
# The older records were most likely added by the above steps
# to be reprocessed so they only have minor changes
recids_iter = iter(sorted(recids, reverse=True))
# Split records to process into chunks so that we do not
# fill up too much memory
while True:
task_sleep_now_if_required()
chunk = list(islice(recids_iter, chunk_size))
if not chunk:
if not quick:
store_dicts(dicts)
break
write_message("Processing chunk #%s to #%s" % (chunk[0], chunk[-1]))
# dicts are modified in-place
process_chunk(chunk, config, dicts)
if quick:
# Store partial result as it is just an update and not
# a creation from scratch
store_dicts(dicts)
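The chunking idiom in `process_and_store` (draining a sorted iterator with `islice` until it is exhausted, so memory stays bounded) can be shown in isolation as a small generator:

```python
from itertools import islice

def iter_chunks(iterable, chunk_size):
    # Yield successive lists of up to chunk_size items until the
    # underlying iterator is exhausted.
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        yield chunk
```

For example, `list(iter_chunks(range(7), 3))` gives `[[0, 1, 2], [3, 4, 5], [6]]`.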
def process_chunk(recids, config, dicts):
cites_weight = dicts['cites_weight']
cites = dicts['cites']
refs = dicts['refs']
old_refs = {}
for recid in recids:
old_refs[recid] = set(refs.get(recid, []))
old_cites = {}
for recid in recids:
old_cites[recid] = set(cites.get(recid, []))
process_inner(recids, config, dicts)
# Records cited by updated_recid_list
# They can only lose references as added references
# are already added to the dicts at this point
for somerecid in recids:
for recid in set(old_cites[somerecid]) - set(cites.get(somerecid, [])):
refs[recid] = list(set(refs.get(recid, [])) - set([somerecid]))
if not refs[recid]:
del refs[recid]
# Records referenced by updated_recid_list
# They can only lose citations as added citations
# are already added to the dicts at this point
for somerecid in recids:
for recid in set(old_refs[somerecid]) - set(refs.get(somerecid, [])):
cites[recid] = list(set(cites.get(recid, [])) - set([somerecid]))
cites_weight[recid] = len(cites[recid])
if not cites[recid]:
del cites[recid]
del cites_weight[recid]
def process_inner(recids, config, dicts, do_catchup=True):
tags = get_tags_config(config)
# call the procedure that does the hard work by reading fields of
# citations and references in the updated_recid's (but nothing else)!
write_message("Entering get_citation_informations", verbose=9)
citation_informations = get_citation_informations(recids, tags,
fetch_catchup_info=do_catchup)
write_message("Entering ref_analyzer", verbose=9)
# call the analyser that uses the citation_informations to really
# search x-cites-y in the collection
return ref_analyzer(citation_informations,
dicts,
recids,
tags,
do_catchup=do_catchup)
def get_bibrankmethod_lastupdate(rank_method_code):
"""return the last excution date of bibrank method
"""
query = """SELECT DATE_FORMAT(last_updated, '%%Y-%%m-%%d %%H:%%i:%%s')
FROM rnkMETHOD WHERE name =%s"""
last_update_time = run_sql(query, [rank_method_code])
try:
r = last_update_time[0][0]
except IndexError:
r = "0000-00-00 00:00:00"
return r
def get_bibindex_update_time():
try:
# check indexing times of `journal' and `reportnumber`
# indexes, and only fetch records which have been indexed
sql = "SELECT DATE_FORMAT(MIN(last_updated), " \
"'%%Y-%%m-%%d %%H:%%i:%%s') FROM idxINDEX WHERE name IN (%s,%s)"
index_update_time = run_sql(sql, ('journal', 'reportnumber'), 1)[0][0]
except IndexError:
write_message("Not running citation indexer since journal/reportnumber"
" indexes are not created yet.")
index_update_time = "0000-00-00 00:00:00"
return index_update_time
def get_modified_recs(bibrank_method_lastupdate, indexes_lastupdate):
"""Get records to be updated by bibrank indexing
Return the list of records which have been modified between the last
execution of bibrank method and the latest journal/report index updates.
The result is expected to have ascending id order.
"""
query = """SELECT id FROM bibrec
WHERE modification_date >= %s
AND modification_date < %s
ORDER BY id ASC"""
records = run_sql(query, (bibrank_method_lastupdate, indexes_lastupdate))
return [r[0] for r in records]
def last_updated_result(rank_method_code):
""" return the last value of dictionary in rnkMETHODDATA table if it
exists and initialize the value of last updated records by zero,
otherwise an initial dictionary with zero as value for all recids
"""
query = """SELECT relevance_data FROM rnkMETHOD, rnkMETHODDATA WHERE
rnkMETHOD.id = rnkMETHODDATA.id_rnkMETHOD
AND rnkMETHOD.Name = '%s'""" % rank_method_code
try:
rdict = run_sql(query)[0][0]
except IndexError:
dic = {}
else:
dic = deserialize_via_marshal(rdict)
return dic
def format_journal(format_string, mappings):
"""format the publ infostring according to the format"""
def replace(char, data):
return data.get(char, char)
return ''.join(replace(c, mappings) for c in format_string)
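`format_journal` maps every character of the format string through a dictionary, leaving unmapped characters untouched; standalone, the same logic is just:

```python
def format_journal(format_string, mappings):
    # Replace each character that has a mapping; keep others verbatim,
    # so separators like spaces and parentheses survive as-is.
    return ''.join(mappings.get(c, c) for c in format_string)
```

With the `p`/`v`/`y`/`c` convention used below, `format_journal('p v (y) c', {'p': 'Phys.Lett.', 'v': 'B482', 'y': '2000', 'c': '417'})` gives `'Phys.Lett. B482 (2000) 417'`.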
def get_tags_config(config):
"""Fetch needs config from our config file"""
# Probably "citation" unless this file gets renamed
function = config.get("rank_method", "function")
write_message("config function %s" % function, verbose=9)
tags = {}
# 037a: contains (often) the "hep-ph/0501084" tag of THIS record
try:
tag = config.get(function, "primary_report_number")
except ConfigParser.NoOptionError:
tags['record_pri_number'] = None
else:
tags['record_pri_number'] = tagify(parse_tag(tag))
# 088a: additional short identifier for the record
try:
tag = config.get(function, "additional_report_number")
except ConfigParser.NoOptionError:
tags['record_add_number'] = None
else:
tags['record_add_number'] = tagify(parse_tag(tag))
# 999C5r. this is in the reference list, refers to other records.
# Looks like: hep-ph/0408002
try:
tag = config.get(function, "reference_via_report_number")
except ConfigParser.NoOptionError:
tags['refs_report_number'] = None
else:
tags['refs_report_number'] = tagify(parse_tag(tag))
# 999C5s. this is in the reference list, refers to other records.
# Looks like: Phys.Rev.,A21,78
try:
tag = config.get(function, "reference_via_pubinfo")
except ConfigParser.NoOptionError:
tags['refs_journal'] = None
else:
tags['refs_journal'] = tagify(parse_tag(tag))
# 999C5a. this is in the reference list, refers to other records.
# Looks like: 10.1007/BF03170733
try:
tag = config.get(function, "reference_via_doi")
except ConfigParser.NoOptionError:
tags['refs_doi'] = None
else:
tags['refs_doi'] = tagify(parse_tag(tag))
# Fields needed to construct the journals for this record
try:
tag = {
'pages': config.get(function, "pubinfo_journal_page"),
'year': config.get(function, "pubinfo_journal_year"),
'journal': config.get(function, "pubinfo_journal_title"),
'volume': config.get(function, "pubinfo_journal_volume"),
}
except ConfigParser.NoOptionError:
tags['publication'] = None
else:
tags['publication'] = {
'pages': tagify(parse_tag(tag['pages'])),
'year': tagify(parse_tag(tag['year'])),
'journal': tagify(parse_tag(tag['journal'])),
'volume': tagify(parse_tag(tag['volume'])),
}
# Fields needed to lookup the DOIs
tags['doi'] = get_field_tags('doi')
# 999C5s. A standardized way of writing a reference in the reference list.
# Like: Nucl. Phys. B 710 (2000) 371
try:
tags['publication_format'] = config.get(function,
"pubinfo_journal_format")
except ConfigParser.NoOptionError:
tags['publication_format'] = CFG_JOURNAL_PUBINFO_STANDARD_FORM
# Print values of tags for debugging
write_message("tag values: %r" % [tags], verbose=9)
return tags
def get_journal_info(recid, tags):
record_info = []
# TODO: handle records with multiple journals
tagsvalues = {} # we store the tags and their values here
# like c->444 y->1999 p->"journal of foo",
# v->20
tmp = get_fieldvalues(recid, tags['publication']['journal'])
if tmp:
tagsvalues["p"] = tmp[0]
tmp = get_fieldvalues(recid, tags['publication']['volume'])
if tmp:
tagsvalues["v"] = tmp[0]
tmp = get_fieldvalues(recid, tags['publication']['year'])
if tmp:
tagsvalues["y"] = tmp[0]
tmp = get_fieldvalues(recid, tags['publication']['pages'])
if tmp:
# if the page numbers have "x-y" take just x
pages = tmp[0]
hpos = pages.find("-")
if hpos > 0:
pages = pages[:hpos]
tagsvalues["c"] = pages
# check if we have the required data
ok = True
for c in tags['publication_format']:
if c in ('p', 'v', 'y', 'c'):
if c not in tagsvalues:
ok = False
if ok:
publ = format_journal(tags['publication_format'], tagsvalues)
record_info += [publ]
alt_volume = get_alt_volume(tagsvalues['v'])
if alt_volume:
tagsvalues2 = tagsvalues.copy()
tagsvalues2['v'] = alt_volume
publ = format_journal(tags['publication_format'], tagsvalues2)
record_info += [publ]
# Add codens
for coden in get_kb_mappings('CODENS',
value=tagsvalues['p']):
tagsvalues2 = tagsvalues.copy()
tagsvalues2['p'] = coden['key']
publ = format_journal(tags['publication_format'], tagsvalues2)
record_info += [publ]
return record_info
def get_alt_volume(volume):
alt_volume = None
if re.match(ur'[a-zA-Z]\d+', volume, re.U|re.I):
alt_volume = volume[1:] + volume[0]
elif re.match(ur'\d+[a-zA-Z]', volume, re.U|re.I):
alt_volume = volume[-1] + volume[:-1]
return alt_volume
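`get_alt_volume` moves a leading or trailing letter to the other end of the volume string, so that both spellings ('B20' and '20B') are tried when matching journal references; a standalone version mirroring the original regexes:

```python
import re

def get_alt_volume(volume):
    # 'B20' -> '20B', '20B' -> 'B20', plain numeric volumes -> None.
    if re.match(r'[a-zA-Z]\d+', volume, re.U | re.I):
        return volume[1:] + volume[0]
    if re.match(r'\d+[a-zA-Z]', volume, re.U | re.I):
        return volume[-1] + volume[:-1]
    return None
```

Like the original, the patterns are unanchored at the end, so only the prefix shape of the volume is checked.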
def get_citation_informations(recid_list, tags, fetch_catchup_info=True):
"""scans the collections searching references (999C5x -fields) and
citations for items in the recid_list
returns a 4 list of dictionaries that contains the citation information
of cds records
examples: [ {} {} {} {} ]
[ {5: 'SUT-DP-92-70-5'},
{ 93: ['astro-ph/9812088']},
{ 93: ['Phys. Rev. Lett. 96 (2006) 081301'] }, {} ]
NB: stuff here is for analysing new or changed records.
see "ref_analyzer" for more.
"""
begin_time = os.times()[4]
records_info = {
'report-numbers': {},
'journals': {},
'doi': {},
}
references_info = {
'report-numbers': {},
'journals': {},
'doi': {},
}
# perform quick check to see if there are some records with
# reference tags, because otherwise get.cit.inf would be slow even
# if there is nothing to index:
if run_sql("SELECT value FROM bib%sx WHERE tag=%%s LIMIT 1" % tags['refs_journal'][0:2],
(tags['refs_journal'], )) or \
run_sql("SELECT value FROM bib%sx WHERE tag=%%s LIMIT 1" % tags['refs_report_number'][0:2],
(tags['refs_report_number'], )):
done = 0 # for status reporting
for recid in recid_list:
if done % 10 == 0:
task_sleep_now_if_required()
# in fact we can sleep any time here
if done % 1000 == 0:
mesg = "get cit.inf done %s of %s" % (done, len(recid_list))
write_message(mesg)
task_update_progress(mesg)
done += 1
if recid in INTBITSET_OF_DELETED_RECORDS:
# do not treat this record since it was deleted; we
# skip it like this in case it was only soft-deleted
# e.g. via bibedit (i.e. when collection tag 980 is
# DELETED but other tags like report number or journal
# publication info remained the same, so the calls to
# get_fieldvalues() below would return old values)
continue
if tags['refs_report_number']:
references_info['report-numbers'][recid] \
= get_fieldvalues(recid,
tags['refs_report_number'],
sort=False)
msg = "references_info['report-numbers'][%s] = %r" \
% (recid, references_info['report-numbers'][recid])
write_message(msg, verbose=9)
if tags['refs_journal']:
references_info['journals'][recid] = []
for ref in get_fieldvalues(recid,
tags['refs_journal'],
sort=False):
try:
# Inspire specific parsing
journal, volume, page = ref.split(',')
except ValueError:
pass
else:
alt_volume = get_alt_volume(volume)
if alt_volume:
alt_ref = ','.join([journal, alt_volume, page])
references_info['journals'][recid] += [alt_ref]
references_info['journals'][recid] += [ref]
msg = "references_info['journals'][%s] = %r" \
% (recid, references_info['journals'][recid])
write_message(msg, verbose=9)
if tags['refs_doi']:
references_info['doi'][recid] \
= get_fieldvalues(recid, tags['refs_doi'], sort=False)
msg = "references_info['doi'][%s] = %r" \
% (recid, references_info['doi'][recid])
write_message(msg, verbose=9)
if not fetch_catchup_info:
# We do not need the extra info
continue
if tags['record_pri_number'] or tags['record_add_number']:
records_info['report-numbers'][recid] = []
if tags['record_pri_number']:
records_info['report-numbers'][recid] \
+= get_fieldvalues(recid,
tags['record_pri_number'],
sort=False)
if tags['record_add_number']:
records_info['report-numbers'][recid] \
+= get_fieldvalues(recid,
tags['record_add_number'],
sort=False)
msg = "records_info[%s]['report-numbers'] = %r" \
% (recid, records_info['report-numbers'][recid])
write_message(msg, verbose=9)
if tags['doi']:
records_info['doi'][recid] = []
for tag in tags['doi']:
records_info['doi'][recid] += get_fieldvalues(recid,
tag,
sort=False)
msg = "records_info[%s]['doi'] = %r" \
% (recid, records_info['doi'][recid])
write_message(msg, verbose=9)
# get a combination of
# journal vol (year) pages
if tags['publication']:
records_info['journals'][recid] = get_journal_info(recid, tags)
msg = "records_info[%s]['journals'] = %r" \
% (recid, records_info['journals'][recid])
write_message(msg, verbose=9)
else:
mesg = "Warning: there are no records with tag values for " \
"%s or %s. Nothing to do." % \
(tags['refs_journal'], tags['refs_report_number'])
write_message(mesg)
mesg = "get cit.inf done fully"
write_message(mesg)
task_update_progress(mesg)
end_time = os.times()[4]
write_message("Execution time for generating citation info "
"from record: %.2f sec" % (end_time - begin_time))
return records_info, references_info
def standardize_report_number(report_number):
# Remove category for arxiv papers
report_number = re.sub(ur'(?:arXiv:)?(\d{4}\.\d{4}) \[[a-zA-Z\.-]+\]',
ur'arXiv:\g<1>',
report_number,
flags=re.I | re.U)
return report_number
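The arXiv normalisation above rewrites e.g. `1234.5678 [hep-ph]` to `arXiv:1234.5678`; a standalone sketch (note the flags must go through the `flags=` keyword, since the fourth positional argument of `re.sub` is `count`, not `flags`):

```python
import re

def standardize_report_number(report_number):
    # Strip the category suffix from arXiv identifiers and force the
    # canonical 'arXiv:' prefix; other report numbers pass through.
    return re.sub(r'(?:arXiv:)?(\d{4}\.\d{4}) \[[a-zA-Z\.-]+\]',
                  r'arXiv:\g<1>',
                  report_number,
                  flags=re.I | re.U)
```

Non-arXiv report numbers such as `CERN-TH-4859/87` are returned unchanged.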
def ref_analyzer(citation_informations, dicts,
updated_recids, tags, do_catchup=True):
"""Analyze the citation informations and calculate the citation weight
and cited by list dictionary.
"""
citations_weight = dicts['cites_weight']
citations = dicts['cites']
references = dicts['refs']
selfcites = dicts['selfcites']
selfrefs = dicts['selfrefs']
authorcites = dicts['authorcites']
def step(msg_prefix, recid, done, total):
if done % 30 == 0:
task_sleep_now_if_required()
if done % 1000 == 0:
mesg = "%s done %s of %s" % (msg_prefix, done, total)
write_message(mesg)
task_update_progress(mesg)
write_message("Processing: %s" % recid, verbose=9)
def add_to_dicts(citer, cited):
# Make sure we don't add ourselves
# Workaround till we know why we are adding ourselves.
if citer == cited:
return
if cited not in citations_weight:
citations_weight[cited] = 0
# Citations and citations weight
if citer not in citations.setdefault(cited, []):
citations[cited].append(citer)
citations_weight[cited] += 1
# References
if cited not in references.setdefault(citer, []):
references[citer].append(cited)
# dict of recid -> institute_give_publ_id
records_info, references_info = citation_informations
t1 = os.times()[4]
write_message("Phase 0: temporarily remove changed records from " \
"citation dictionaries; they will be filled later")
if do_catchup:
for somerecid in updated_recids:
try:
del citations[somerecid]
except KeyError:
pass
for somerecid in updated_recids:
try:
del references[somerecid]
except KeyError:
pass
# Try to find references based on 999C5r
# e.g. 8 -> ([astro-ph/9889],[hep-ph/768])
# meaning: rec 8 contains these in bibliography
write_message("Phase 1: Report numbers references")
done = 0
for thisrecid, refnumbers in references_info['report-numbers'].iteritems():
step("Report numbers references", thisrecid, done,
len(references_info['report-numbers']))
done += 1
for refnumber in (r for r in refnumbers if r):
field = 'reportnumber'
refnumber = standardize_report_number(refnumber)
# Search for "hep-th/5644654 or such" in existing records
recids = get_recids_matching_query(p=refnumber, f=field)
write_message("These match searching %s in %s: %s" % \
(refnumber, field, list(recids)), verbose=9)
if not recids:
insert_into_missing(thisrecid, refnumber)
else:
remove_from_missing(refnumber)
if len(recids) > 1:
store_citation_warning('multiple-matches', refnumber)
msg = "Whoops: record '%d' report number value '%s' " \
"matches many records; taking only the first one. %s" % \
(thisrecid, refnumber, repr(recids))
write_message(msg, stream=sys.stderr)
for recid in list(recids)[:1]: # take only the first one
add_to_dicts(thisrecid, recid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
t2 = os.times()[4]
# Try to find references based on 999C5s
# e.g. Phys.Rev.Lett. 53 (1986) 2285
write_message("Phase 2: Journal references")
done = 0
for thisrecid, refs in references_info['journals'].iteritems():
step("Journal references", thisrecid, done,
len(references_info['journals']))
done += 1
for reference in (r for r in refs if r):
p = reference
field = 'journal'
# check reference value to see whether it is well formed:
if not re_CFG_JOURNAL_PUBINFO_STANDARD_FORM_REGEXP_CHECK.match(p):
store_citation_warning('not-well-formed', p)
msg = "Whoops, record '%d' reference value '%s' " \
"is not well formed; skipping it." % (thisrecid, p)
write_message(msg, stream=sys.stderr)
continue # skip this ill-formed value
recids = search_unit(p, field) - INTBITSET_OF_DELETED_RECORDS
write_message("These match searching %s in %s: %s" \
% (reference, field, list(recids)), verbose=9)
if not recids:
insert_into_missing(thisrecid, p)
else:
remove_from_missing(p)
if len(recids) > 1:
store_citation_warning('multiple-matches', p)
msg = "Whoops: record '%d' reference value '%s' " \
"matches many records; taking only the first one. %s" % \
(thisrecid, p, repr(recids))
write_message(msg, stream=sys.stderr)
for recid in list(recids)[:1]: # take only the first one
add_to_dicts(thisrecid, recid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
t3 = os.times()[4]
# Try to find references based on 999C5a
# e.g. 10.1007/BF03170733
write_message("Phase 3: DOI references")
done = 0
for thisrecid, refs in references_info['doi'].iteritems():
step("DOI references", thisrecid, done, len(references_info['doi']))
done += 1
for reference in (r for r in refs if r):
p = reference
field = 'doi'
recids = get_recids_matching_query(p, field)
write_message("These match searching %s in %s: %s" \
% (reference, field, list(recids)), verbose=9)
if not recids:
insert_into_missing(thisrecid, p)
else:
remove_from_missing(p)
if len(recids) > 1:
store_citation_warning('multiple-matches', p)
msg = "Whoops: record '%d' DOI value '%s' " \
"matches many records; taking only the first one. %s" % \
(thisrecid, p, repr(recids))
write_message(msg, stream=sys.stderr)
for recid in list(recids)[:1]: # take only the first one
add_to_dicts(thisrecid, recid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
t4 = os.times()[4]
# Search for stuff like CERN-TH-4859/87 in list of refs
write_message("Phase 4: report numbers catchup")
done = 0
for thisrecid, reportcodes in records_info['report-numbers'].iteritems():
step("Report numbers catchup", thisrecid, done,
len(records_info['report-numbers']))
done += 1
for reportcode in (r for r in reportcodes if r):
if reportcode.startswith('arXiv'):
std_reportcode = standardize_report_number(reportcode)
report_pattern = r'^%s( *\[[a-zA-Z.-]*\])?' % \
re.escape(std_reportcode)
recids = get_recids_matching_query(report_pattern,
tags['refs_report_number'],
'r')
else:
recids = get_recids_matching_query(reportcode,
tags['refs_report_number'],
'e')
for recid in recids:
add_to_dicts(recid, thisrecid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
# Find this record's pubinfo in other records' bibliography
write_message("Phase 5: journals catchup")
done = 0
t5 = os.times()[4]
for thisrecid, rec_journals in records_info['journals'].iteritems():
step("Journals catchup", thisrecid, done,
len(records_info['journals']))
done += 1
for journal in rec_journals:
journal = journal.replace("\"", "")
# Search the publication string like
# Phys. Lett., B 482 (2000) 417 in 999C5s
recids = search_unit(p=journal, f=tags['refs_journal'], m='a') \
- INTBITSET_OF_DELETED_RECORDS
write_message("These records match %s in %s: %s" \
% (journal, tags['refs_journal'], list(recids)), verbose=9)
for recid in recids:
add_to_dicts(recid, thisrecid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
write_message("Phase 6: DOI catchup")
done = 0
t6 = os.times()[4]
for thisrecid, dois in records_info['doi'].iteritems():
step("DOI catchup", thisrecid, done, len(records_info['doi']))
done += 1
for doi in dois:
# Search the publication string like
# Phys. Lett., B 482 (2000) 417 in 999C5a
recids = search_unit(p=doi, f=tags['refs_doi'], m='a') \
- INTBITSET_OF_DELETED_RECORDS
write_message("These records match %s in %s: %s" \
% (doi, tags['refs_doi'], list(recids)), verbose=9)
for recid in recids:
add_to_dicts(recid, thisrecid)
mesg = "done fully"
write_message(mesg)
task_update_progress(mesg)
write_message("Phase 7: remove empty lists from dicts")
# Remove empty lists in citation and reference
keys = citations.keys()
for k in keys:
if not citations[k]:
del citations[k]
keys = references.keys()
for k in keys:
if not references[k]:
del references[k]
if task_get_task_param('verbose') >= 3:
# Print only X first to prevent flood
write_message("citation_list (x is cited by y):")
write_message(dict(islice(citations.iteritems(), 10)))
write_message("size: %s" % len(citations))
write_message("reference_list (x cites y):")
write_message(dict(islice(references.iteritems(), 10)))
write_message("size: %s" % len(references))
write_message("selfcitedbydic (x is cited by y and one of the " \
"authors of x same as y's):")
write_message(dict(islice(selfcites.iteritems(), 10)))
write_message("size: %s" % len(selfcites))
write_message("selfdic (x cites y and one of the authors of x " \
"same as y's):")
write_message(dict(islice(selfrefs.iteritems(), 10)))
write_message("size: %s" % len(selfrefs))
write_message("authorcitdic (author is cited in recs):")
write_message(dict(islice(authorcites.iteritems(), 10)))
write_message("size: %s" % len(authorcites))
t7 = os.times()[4]
write_message("Execution time for analyzing the citation information " \
"generating the dictionary:")
write_message("... checking ref report numbers: %.2f sec" % (t2-t1))
write_message("... checking ref journals: %.2f sec" % (t3-t2))
write_message("... checking ref DOI: %.2f sec" % (t4-t3))
write_message("... checking rec report numbers: %.2f sec" % (t5-t4))
write_message("... checking rec journals: %.2f sec" % (t6-t5))
write_message("... checking rec DOI: %.2f sec" % (t7-t6))
write_message("... total time of ref_analyze: %.2f sec" % (t7-t1))
return citations_weight, citations, references, selfcites, \
selfrefs, authorcites
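The core bookkeeping in `add_to_dicts` keeps three structures consistent: the inverse citation lists, the forward reference lists, and a per-record citation weight. A minimal standalone version of that update (with the dictionaries passed explicitly instead of captured via closure):

```python
def add_citation(citer, cited, citations, references, weight):
    # Skip self-citations, mirroring the workaround in add_to_dicts.
    if citer == cited:
        return
    weight.setdefault(cited, 0)
    # Citations and citation weight: record citer only once.
    if citer not in citations.setdefault(cited, []):
        citations[cited].append(citer)
        weight[cited] += 1
    # References: record cited only once.
    if cited not in references.setdefault(citer, []):
        references[citer].append(cited)
```

Because membership is checked before appending, feeding the same (citer, cited) pair twice leaves the weight unchanged.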
def store_dicts(dicts):
"""Insert the reference and citation list into the database"""
insert_into_cit_db(dicts['refs'], "reversedict")
insert_into_cit_db(dicts['cites'], "citationdict")
insert_into_cit_db(dicts['selfcites'], "selfcitedbydict")
insert_into_cit_db(dicts['selfrefs'], "selfcitdict")
def insert_into_cit_db(dic, name):
"""Stores citation dictionary in the database"""
ndate = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
s = serialize_via_marshal(dic)
write_message("size of %s %s" % (name, len(s)))
# check that this column really exists
run_sql("""REPLACE INTO rnkCITATIONDATA(object_name, object_value,
last_updated) VALUES (%s, %s, %s)""", (name, s, ndate))
def get_cit_dict(name):
"""get a named citation dict from the db"""
cdict = run_sql("""SELECT object_value FROM rnkCITATIONDATA
WHERE object_name = %s""", (name, ))
if cdict and cdict[0] and cdict[0][0]:
dict_from_db = deserialize_via_marshal(cdict[0][0])
else:
dict_from_db = {}
return dict_from_db
def get_initial_author_dict():
"""read author->citedinlist dict from the db"""
adict = {}
try:
ah = run_sql("SELECT aterm,hitlist FROM rnkAUTHORDATA")
for (a, h) in ah:
adict[a] = deserialize_via_marshal(h)
return adict
except:
register_exception(prefix="could not read rnkAUTHORDATA",
alert_admin=True)
return {}
def insert_into_missing(recid, report):
"""put the referingrecordnum-publicationstring into
the "we are missing these" table"""
if len(report) >= 255:
# Invalid report, it is too long
# and does not fit in the database column
# (currently varchar 255)
return
wasalready = run_sql("""SELECT id_bibrec
FROM rnkCITATIONDATAEXT
WHERE id_bibrec = %s
AND extcitepubinfo = %s""",
(recid, report))
if not wasalready:
run_sql("""INSERT INTO rnkCITATIONDATAEXT(id_bibrec, extcitepubinfo)
VALUES (%s,%s)""", (recid, report))
def remove_from_missing(report):
"""remove the recid-ref -pairs from the "missing" table for report x: prob
in the case ref got in our library collection"""
run_sql("""DELETE FROM rnkCITATIONDATAEXT
WHERE extcitepubinfo = %s""", (report,))
def create_analysis_tables():
"""temporary simple table + index"""
sql1 = "CREATE TABLE IF NOT EXISTS tmpcit (citer mediumint(10), \
cited mediumint(10)) TYPE=MyISAM"
sql2 = "CREATE UNIQUE INDEX citercited ON tmpcit(citer, cited)"
sql3 = "CREATE INDEX citer ON tmpcit(citer)"
sql4 = "CREATE INDEX cited ON tmpcit(cited)"
run_sql(sql1)
run_sql(sql2)
run_sql(sql3)
run_sql(sql4)
def write_citer_cited(citer, cited):
"""write an entry to tmp table"""
run_sql("INSERT INTO tmpcit(citer, cited) VALUES (%s,%s)", (citer, cited))
def print_missing(num):
"""
Print the contents of rnkCITATIONDATAEXT table containing external
records that were cited by NUM or more internal records.
NUM is by default taken from the -E command line option.
"""
if not num:
num = task_get_option("print-extcites")
write_message("Listing external papers cited by %i or more \
internal records:" % num)
res = run_sql("""SELECT COUNT(id_bibrec), extcitepubinfo
FROM rnkCITATIONDATAEXT
GROUP BY extcitepubinfo HAVING COUNT(id_bibrec) >= %s
ORDER BY COUNT(id_bibrec) DESC""", (num,))
for (cnt, brec) in res:
print str(cnt)+"\t"+brec
write_message("Listing done.")
def tagify(parsedtag):
"""aux auf to make '100__a' out of ['100','','','a']"""
tag = ""
for t in parsedtag:
if t == '':
t = '_'
tag += t
return tag
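# A minimal standalone sketch of the same '100__a' construction as tagify()
# above (hypothetical helper name, for illustration only):
def _tagify_sketch(parsedtag):
    """Join tag components, replacing empty indicators with '_'."""
    return "".join(t if t != '' else '_' for t in parsedtag)
# e.g. _tagify_sketch(['100', '', '', 'a']) gives '100__a'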
def store_citation_warning(warning_type, cit_info):
r = run_sql("""SELECT 1 FROM rnkCITATIONDATAERR
WHERE type = %s
AND citinfo = %s""", (warning_type, cit_info))
if not r:
run_sql("""INSERT INTO rnkCITATIONDATAERR (type, citinfo)
VALUES (%s, %s)""", (warning_type, cit_info))
diff --git a/invenio/legacy/bibrank/citerank_indexer.py b/invenio/legacy/bibrank/citerank_indexer.py
index 4a8ecbf5a..41e961f4c 100644
--- a/invenio/legacy/bibrank/citerank_indexer.py
+++ b/invenio/legacy/bibrank/citerank_indexer.py
@@ -1,891 +1,891 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Implementation of different ranking methods based on
the citation graph:
- citation count/ time decayed citation count
- pagerank / pagerank with external citations
- time decayed pagerank
"""
# pylint: disable=E0611
import ConfigParser
from math import exp
import datetime
import time
import re
import sys
try:
from numpy import array, ones, zeros, int32, float32, sqrt, dot
import_numpy = 1
except ImportError:
import_numpy = 0
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from invenio.legacy.dbquery import run_sql, serialize_via_marshal, \
deserialize_via_marshal
-from invenio.bibtask import write_message
+from invenio.legacy.bibsched.bibtask import write_message
from invenio.config import CFG_ETCDIR
def get_citations_from_file(filename):
"""gets the citation data (who cites who) from a file and returns
- a dictionary of type x:{x1,x2..},
where x is cited by x1,x2..
- a dictionary of type a:{b},
where recid 'a' is associated with an index 'b' """
cit = {}
dict_of_ids = {}
count = 0
try:
citation_file = open(filename, "r")
except StandardError:
write_message("Cannot find file: %s" % filename, sys.stderr)
raise StandardError
for line in citation_file:
tokens = line.strip().split()
recid_cites = int(tokens[0])
recid_cited = int(tokens[1])
if recid_cited not in cit:
cit[recid_cited] = []
#without this, duplicates might be introduced
if recid_cites not in cit[recid_cited] and recid_cites != recid_cited:
cit[recid_cited].append(recid_cites)
if recid_cites not in dict_of_ids:
dict_of_ids[recid_cites] = count
count += 1
if recid_cited not in dict_of_ids:
dict_of_ids[recid_cited] = count
count += 1
citation_file.close()
write_message("Citation data collected from file: %s" %filename, verbose=2)
write_message("Ids and recids corespondace: %s" \
%str(dict_of_ids), verbose=9)
write_message("Citations: %s" % str(cit), verbose=9)
return cit, dict_of_ids
def get_citations_from_db():
"""gets the citation data (who cites who) from the rnkCITATIONDATA table,
and returns:
-a dictionary of type x:{x1,x2..}, where x is cited by x1,x2..
-a dict of type a:{b} where recid 'a' is associated with an index 'b'"""
dict_of_ids = {}
count = 0
query = "select object_value from rnkCITATIONDATA \
where object_name = 'citationdict'"
cit_compressed = run_sql(query)
cit = []
if cit_compressed and cit_compressed[0] and cit_compressed[0][0]:
cit = deserialize_via_marshal(cit_compressed[0][0])
if cit:
for item in cit:
#check for duplicates in citation dictionary
cit[item] = set(cit[item])
if item in cit[item]:
cit[item].remove(item)
if item not in dict_of_ids:
dict_of_ids[item] = count
count += 1
for value in cit[item]:
if value not in dict_of_ids:
dict_of_ids[value] = count
count += 1
write_message("Citation data collected\
from rnkCITATIONDATA", verbose=2)
write_message("Ids and recids corespondace: %s" \
% str(dict_of_ids), verbose=9)
write_message("Citations: %s" % str(cit), verbose=9)
return cit, dict_of_ids
else:
write_message("Error while extracting citation data \
from rnkCITATIONDATA table", verbose=1)
else:
write_message("Error while extracting citation data \
from rnkCITATIONDATA table", verbose=1)
return {}, {}
def construct_ref_array(cit, dict_of_ids, len_):
"""returns an array with the number of references that each recid has """
ref = array((), int32)
ref = zeros(len_, int32)
for key in cit:
for value in cit[key]:
ref[dict_of_ids[value]] += 1
write_message("Number of references: %s" %str(ref), verbose=9)
write_message("Finished computing total number \
of references for each paper.", verbose=5)
return ref
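# A plain-dict sketch of the same per-paper reference counting, assuming
# numpy is unavailable (hypothetical helper, for illustration only):
def _ref_counts_sketch(cit, dict_of_ids):
    """Count, for each paper index, how many references it makes.

    cit maps a cited recid to the list of recids citing it, so every
    cited -> citer edge is one outgoing reference of the citer."""
    ref = dict((index, 0) for index in dict_of_ids.values())
    for cited in cit:
        for citer in cit[cited]:
            ref[dict_of_ids[citer]] += 1
    return ref
# e.g. cit = {2: [1, 3]} gives one reference each to recids 1 and 3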
def get_external_links_from_file(filename, ref, dict_of_ids):
"""returns a dictionary containing the number of
external links for each recid
external link=citation that is not in our database """
ext_links = {}
#format: ext_links[dict_of_ids[recid]]=number of total external links
try:
external_file = open(filename, "r")
except StandardError:
write_message("Cannot find file: %s" % filename, sys.stderr)
raise StandardError
for line in external_file:
tokens = line.strip().split()
recid = int(tokens[0])
nr_of_external = int(tokens[1])
ext_links[dict_of_ids[recid]] = nr_of_external - ref[dict_of_ids[recid]]
if ext_links[dict_of_ids[recid]] < 0:
ext_links[dict_of_ids[recid]] = 0
external_file.close()
write_message("External link information extracted", verbose=2)
return ext_links
def get_external_links_from_db_old(ref, dict_of_ids, reference_indicator):
"""returns a dictionary containing the number of
external links for each recid
external link=citation that is not in our database """
ext_links = {}
reference_tag_regex = reference_indicator + "[a-z]"
for recid in dict_of_ids:
query = "select COUNT(DISTINCT field_number) from bibrec_bib99x \
where id_bibrec='%s' and id_bibxxx in \
(select id from bib99x where tag RLIKE '%s');" \
% (str(recid), reference_tag_regex)
result_set = run_sql(query)
if result_set:
total_links = int(result_set[0][0])
internal_links = ref[dict_of_ids[recid]]
ext_links[dict_of_ids[recid]] = total_links - internal_links
if ext_links[dict_of_ids[recid]] < 0:
ext_links[dict_of_ids[recid]] = 0
else:
ext_links[dict_of_ids[recid]] = 0
write_message("External link information extracted", verbose=2)
write_message("External links: %s" % str(ext_links), verbose=9)
return ext_links
def get_external_links_from_db(ref, dict_of_ids, reference_indicator):
"""returns a dictionary containing the number of
external links for each recid
external link=citation that is not in our database """
ext_links = {}
dict_all_ref = {}
for recid in dict_of_ids:
dict_all_ref[recid] = 0
ext_links[dict_of_ids[recid]] = 0
reference_db_id = reference_indicator[0:2]
reference_tag_regex = reference_indicator + "[a-z]"
tag_list = run_sql("select id from bib" + reference_db_id + \
"x where tag RLIKE %s", (reference_tag_regex, ))
tag_set = set()
for tag in tag_list:
tag_set.add(tag[0])
ref_list = run_sql("select id_bibrec, id_bibxxx, field_number from \
bibrec_bib" + reference_db_id + "x group by \
id_bibrec, field_number")
for item in ref_list:
recid = int(item[0])
id_bib = int(item[1])
if recid in dict_of_ids and id_bib in tag_set:
dict_all_ref[recid] += 1
for recid in dict_of_ids:
total_links = dict_all_ref[recid]
internal_links = ref[dict_of_ids[recid]]
ext_links[dict_of_ids[recid]] = total_links - internal_links
if ext_links[dict_of_ids[recid]] < 0:
ext_links[dict_of_ids[recid]] = 0
write_message("External link information extracted", verbose=2)
write_message("External links: %s" % str(ext_links), verbose=9)
return ext_links
def avg_ext_links_with_0(ext_links):
"""returns the average number of external links per paper
including in the counting the papers with 0 external links"""
total = 0.0
for item in ext_links:
total += ext_links[item]
avg_ext = total/len(ext_links)
write_message("The average number of external links per paper (including \
papers with 0 external links) is: %s" % str(avg_ext), verbose=3)
return avg_ext
def avg_ext_links_without_0(ext_links):
"""returns the average number of external links per paper
excluding in the counting the papers with 0 external links"""
count = 0.0
total = 0.0
for item in ext_links:
if ext_links[item] != 0:
count += 1
total += ext_links[item]
avg_ext = total/count
write_message("The average number of external links per paper (excluding \
papers with 0 external links) is: %s" % str(avg_ext), verbose=3)
return avg_ext
def leaves(ref):
"""returns the number of papers that do not cite any other paper"""
nr_of_leaves = 0
for i in ref:
if i == 0:
nr_of_leaves += 1
write_message("The number of papers that do not cite \
any other papers: %s" % str(leaves), verbose=3)
return nr_of_leaves
def get_dates_from_file(filename, dict_of_ids):
"""Returns the year of the publication for each paper.
In case the year is not in the db, the year of the submission is taken"""
dates = {}
# the format is: dates[dict_of_ids[recid]] = year
try:
dates_file = open(filename, "r")
except StandardError:
write_message("Cannot find file: %s" % filename, sys.stderr)
raise StandardError
for line in dates_file:
tokens = line.strip().split()
recid = int(tokens[0])
year = int(tokens[1])
dates[dict_of_ids[recid]] = year
dates_file.close()
write_message("Dates extracted", verbose=2)
write_message("Dates dictionary %s" % str(dates), verbose=9)
return dates
def get_dates_from_db(dict_of_ids, publication_year_tag, creation_date_tag):
"""Returns the year of the publication for each paper.
In case the year is not in the db, the year of the submission is taken"""
current_year = int(datetime.datetime.now().strftime("%Y"))
publication_year_db_id = publication_year_tag[0:2]
creation_date_db_id = creation_date_tag[0:2]
total = 0
count = 0
dict_of_dates = {}
for recid in dict_of_ids:
dict_of_dates[recid] = 0
date_list = run_sql("select id, tag, value from bib" + \
publication_year_db_id + "x where tag=%s", \
(publication_year_tag, ))
date_dict = {}
for item in date_list:
date_dict[int(item[0])] = item[2]
pattern = re.compile('.*(\d{4}).*')
date_list = run_sql("select id_bibrec, id_bibxxx, field_number \
from bibrec_bib" + publication_year_db_id +"x")
for item in date_list:
recid = int(item[0])
id_ = int(item[1])
if id_ in date_dict and recid in dict_of_dates:
reg = pattern.match(date_dict[id_])
if reg:
date = int(reg.group(1))
if date > 1000 and date <= current_year:
dict_of_dates[recid] = date
total += date
count += 1
not_covered = []
for recid in dict_of_dates:
if dict_of_dates[recid] == 0:
not_covered.append(recid)
date_list = run_sql("select id, tag, value from bib" + \
creation_date_db_id + "x where tag=%s", \
(creation_date_tag, ))
date_dict = {}
for item in date_list:
date_dict[int(item[0])] = item[2]
date_list = run_sql("select id_bibrec, id_bibxxx, field_number \
from bibrec_bib" + creation_date_db_id + "x")
for item in date_list:
recid = int(item[0])
id_ = int(item[1])
if id_ in date_dict and recid in not_covered:
date = int(str(date_dict[id_])[0:4])
if date > 1000 and date <= current_year:
dict_of_dates[recid] = date
total += date
count += 1
dates = {}
med = total/count
for recid in dict_of_dates:
if dict_of_dates[recid] == 0:
dates[dict_of_ids[recid]] = med
else:
dates[dict_of_ids[recid]] = dict_of_dates[recid]
write_message("Dates extracted", verbose=2)
write_message("Dates dictionary %s" % str(dates), verbose=9)
return dates
def construct_sparse_matrix(cit, ref, dict_of_ids, len_, damping_factor):
"""returns several structures needed in the calculation
of the PAGERANK method; using these structures, we don't need
to keep the full matrix in memory"""
sparse = {}
for item in cit:
for value in cit[item]:
sparse[(dict_of_ids[item], dict_of_ids[value])] = \
damping_factor * 1.0/ref[dict_of_ids[value]]
semi_sparse = []
for j in range(len_):
if ref[j] == 0:
semi_sparse.append(j)
semi_sparse_coeficient = damping_factor/len_
#zero_coeficient = (1-damping_factor)/len_
write_message("Sparse information calculated", verbose=3)
return sparse, semi_sparse, semi_sparse_coeficient
def construct_sparse_matrix_ext(cit, ref, ext_links, dict_of_ids, alpha, beta):
"""if x doesn't cite anyone: cites everyone : 1/len_ -- should be used!
returns several structures needed in the calculation
of the PAGERANK_EXT method"""
len_ = len(dict_of_ids)
sparse = {}
semi_sparse = {}
sparse[0, 0] = 1.0 - alpha
for j in range(len_):
sparse[j+1, 0] = alpha/(len_)
if j not in ext_links:
sparse[0, j+1] = beta/(len_ + beta)
else:
if ext_links[j] == 0:
sparse[0, j+1] = beta/(len_ + beta)
else:
aux = beta * ext_links[j]
if ref[j] == 0:
sparse[0, j+1] = aux/(aux + len_)
else:
sparse[0, j+1] = aux/(aux + ref[j])
if ref[j] == 0:
semi_sparse[j+1] = (1.0 - sparse[0, j + 1])/len_
for item in cit:
for value in cit[item]:
sparse[(dict_of_ids[item] + 1, dict_of_ids[value] + 1)] = \
(1.0 - sparse[0, dict_of_ids[value] + 1])/ref[dict_of_ids[value]]
#for i in range(len_ + 1):
# a = ""
# for j in range (len_ + 1):
# if (i,j) in sparse:
# a += str(sparse[(i,j)]) + "\t"
# else:
# a += "0\t"
# print a
#print semi_sparse
write_message("Sparse information calculated", verbose=3)
return sparse, semi_sparse
def construct_sparse_matrix_time(cit, ref, dict_of_ids, \
damping_factor, date_coef):
"""returns several structures needed in the calculation of the PAGERANK_time
method; using these structures,
we don't need to keep the full matrix in memory"""
len_ = len(dict_of_ids)
sparse = {}
for item in cit:
for value in cit[item]:
sparse[(dict_of_ids[item], dict_of_ids[value])] = damping_factor * \
date_coef[dict_of_ids[value]]/ref[dict_of_ids[value]]
semi_sparse = []
for j in range(len_):
if ref[j] == 0:
semi_sparse.append(j)
semi_sparse_coeficient = damping_factor/len_
#zero_coeficient = (1-damping_factor)/len_
write_message("Sparse information calculated", verbose=3)
return sparse, semi_sparse, semi_sparse_coeficient
def statistics_on_sparse(sparse):
"""returns the number of papers that cite themselves"""
count_diag = 0
for (i, j) in sparse.keys():
if i == j:
count_diag += 1
write_message("The number of papers that cite themselves: %s" % \
str(count_diag), verbose=3)
return count_diag
def pagerank(conv_threshold, check_point, len_, sparse, \
semi_sparse, semi_sparse_coef):
"""the core function of the PAGERANK method
returns an array with the ranks corresponding to each recid"""
weights_old = ones((len_), float32) # initial weights
weights_new = array((), float32)
converged = False
nr_of_check_points = 0
difference = len_
while not converged:
nr_of_check_points += 1
for step in (range(check_point)):
weights_new = zeros((len_), float32)
for (i, j) in sparse.keys():
weights_new[i] += sparse[(i, j)]*weights_old[j]
semi_total = 0.0
for j in semi_sparse:
semi_total += weights_old[j]
weights_new = weights_new + semi_sparse_coef * semi_total + \
(1.0/len_ - semi_sparse_coef) * sum(weights_old)
if step == check_point - 1:
diff = weights_new - weights_old
difference = sqrt(dot(diff, diff))/len_
write_message("Finished step: %s, %s " \
%(str(check_point*(nr_of_check_points-1) + step), \
str(difference)), verbose=5)
weights_old = weights_new.copy()
converged = (difference < conv_threshold)
write_message("PageRank calculated for all recids finnished in %s steps. \
The threshold was %s" % (str(nr_of_check_points), str(difference)),\
verbose=2)
return weights_old
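# A dense-matrix power-iteration sketch of the update performed above
# (hypothetical helper; the production code uses a sparse dict plus
# correction terms for papers without references):
def _power_iteration_sketch(matrix, n, steps):
    """Repeatedly apply weights <- matrix . weights from all-ones weights."""
    weights = [1.0] * n
    for _ in range(steps):
        weights = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
    return weights
# with the identity matrix the weights stay at 1.0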
def pagerank_ext(conv_threshold, check_point, len_, sparse, semi_sparse):
"""the core function of the PAGERANK_EXT method
returns an array with the ranks corresponding to each recid"""
weights_old = array((), float32)
weights_old = ones((len_), float32)
weights_new = array((), float32)
converged = False
nr_of_check_points = 0
difference = len_
while not converged:
nr_of_check_points += 1
for step in (range(check_point)):
weights_new = zeros((len_), float32)
for (i, j) in sparse.keys():
weights_new[i] += sparse[(i, j)]*weights_old[j]
total_sum = 0.0
for j in semi_sparse:
total_sum += semi_sparse[j]*weights_old[j]
weights_new[1:len_] = weights_new[1:len_] + total_sum
if step == check_point - 1:
diff = weights_new - weights_old
difference = sqrt(dot(diff, diff))/len_
write_message("Finished step: %s, %s " \
% (str(check_point*(nr_of_check_points-1) + step), \
str(difference)), verbose=5)
weights_old = weights_new.copy()
converged = (difference < conv_threshold)
write_message("PageRank calculated for all recids finnished in %s steps. \
The threshold was %s" % (str(nr_of_check_points), \
str(difference)), verbose=2)
#return weights_old[1:len_]/(len_ - weights_old[0])
return weights_old[1:len_]
def pagerank_time(conv_threshold, check_point, len_, \
sparse, semi_sparse, semi_sparse_coeficient, date_coef):
"""the core function of the PAGERANK_TIME method: pageRank + time decay
returns an array with the ranks corresponding to each recid"""
weights_old = array((), float32)
weights_old = ones((len_), float32) # initial weights
weights_new = array((), float32)
converged = False
nr_of_check_points = 0
difference = len_
while not converged:
nr_of_check_points += 1
for step in (range(check_point)):
weights_new = zeros((len_), float32)
for (i, j) in sparse.keys():
weights_new[i] += sparse[(i, j)]*weights_old[j]
semi_total = 0.0
for j in semi_sparse:
semi_total += weights_old[j]*date_coef[j]
zero_total = 0.0
for i in range(len_):
zero_total += weights_old[i]*date_coef[i]
#dates = array(date_coef.keys())
#zero_total = dot(weights_old, dates)
weights_new = weights_new + semi_sparse_coeficient * semi_total + \
(1.0/len_ - semi_sparse_coeficient) * zero_total
if step == check_point - 1:
diff = weights_new - weights_old
difference = sqrt(dot(diff, diff))/len_
write_message("Finished step: %s, %s " \
% (str(check_point*(nr_of_check_points-1) + step), \
str(difference)), verbose=5)
weights_old = weights_new.copy()
converged = (difference < conv_threshold)
write_message("PageRank calculated for all recids finnished in %s steps.\
The threshold was %s" % (str(nr_of_check_points), \
str(difference)), verbose=2)
return weights_old
def citation_rank_time(cit, dict_of_ids, date_coef, dates, decimals):
"""returns a dictionary recid:weight based on the total number of
citations as function of time"""
dict_of_ranks = {}
for key in dict_of_ids:
if key in cit:
dict_of_ranks[key] = 0
for recid in cit[key]:
dict_of_ranks[key] += date_coef[dict_of_ids[recid]]
dict_of_ranks[key] = round(dict_of_ranks[key], decimals) \
+ dates[dict_of_ids[key]]* pow(10, 0-4-decimals)
else:
dict_of_ranks[key] = dates[dict_of_ids[key]]* pow(10, 0-4-decimals)
write_message("Citation rank calculated", verbose=2)
return dict_of_ranks
def get_ranks(weights, dict_of_ids, mult, dates, decimals):
"""returns a dictionary recid:value, where value is the weight of the
recid paper; the second order is the reverse time order,
from recent to past"""
dict_of_ranks = {}
for item in dict_of_ids:
dict_of_ranks[item] = round(weights[dict_of_ids[item]]* mult, decimals)\
+ dates[dict_of_ids[item]]* pow(10, 0-4-decimals)
#dict_of_ranks[item] = weights[dict_of_ids[item]]
return dict_of_ranks
def sort_weights(dict_of_ranks):
"""sorts the recids based on weights(first order)
and on dates(second order)"""
ranks_by_citations = sorted(dict_of_ranks.keys(), lambda x, y: \
cmp(dict_of_ranks[y], dict_of_ranks[x]))
return ranks_by_citations
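# The cmp-based sort above is Python 2 only; a key-based sketch
# (hypothetical helper) producing the same descending-weight order:
def _sort_weights_sketch(dict_of_ranks):
    """Sort recids from highest to lowest weight."""
    return sorted(dict_of_ranks.keys(),
                  key=lambda recid: dict_of_ranks[recid], reverse=True)
# e.g. {1: 0.5, 2: 0.9, 3: 0.1} sorts to [2, 1, 3]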
def normalize_weights(dict_of_ranks):
"""the weights should be normalized to 100, so they woun't be
different from the weights from other ranking methods"""
max_weight = 0.0
for recid in dict_of_ranks:
weight = dict_of_ranks[recid]
if weight > max_weight:
max_weight = weight
for recid in dict_of_ranks:
dict_of_ranks[recid] = round(dict_of_ranks[recid] * 100.0/max_weight, 3)
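# A standalone sketch (hypothetical helper) of the same 0-100 scaling,
# returning a new dict instead of mutating in place:
def _normalize_sketch(dict_of_ranks):
    """Scale weights so the maximum becomes 100, rounded to 3 decimals."""
    max_weight = max(dict_of_ranks.values())
    return dict((recid, round(weight * 100.0 / max_weight, 3))
                for recid, weight in dict_of_ranks.items())
# e.g. {1: 2.0, 2: 4.0} normalizes to {1: 50.0, 2: 100.0}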
def write_first_ranks_to_file(ranks_by_citations, dict_of_ranks, \
nr_of_ranks, filename):
"""Writes the first n results of the ranking method into a file"""
try:
ranks_file = open(filename, "w")
except StandardError:
write_message("Problems with file: %s" % filename, sys.stderr)
raise StandardError
for i in range(nr_of_ranks):
ranks_file.write(str(i+1) + "\t" + str(ranks_by_citations[i]) + \
"\t" + str(dict_of_ranks[ranks_by_citations[i]]) + "\n")
ranks_file.close()
write_message("The first %s pairs recid:rank in the ranking order \
are written into this file: %s" % (nr_of_ranks, filename), verbose=2)
def del_rank_method_data(rank_method_code):
"""Delete the data for a rank method from rnkMETHODDATA table"""
id_ = run_sql("SELECT id from rnkMETHOD where name=%s", (rank_method_code, ))
run_sql("DELETE FROM rnkMETHODDATA WHERE id_rnkMETHOD=%s", (id_[0][0], ))
def into_db(dict_of_ranks, rank_method_code):
"""Writes into the rnkMETHODDATA table the ranking results"""
method_id = run_sql("SELECT id from rnkMETHOD where name=%s", \
(rank_method_code, ))
del_rank_method_data(rank_method_code)
serialized_data = serialize_via_marshal(dict_of_ranks)
method_id_str = str(method_id[0][0])
run_sql("INSERT INTO rnkMETHODDATA(id_rnkMETHOD, relevance_data) \
VALUES(%s, %s) ", (method_id_str, serialized_data, ))
date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
run_sql("UPDATE rnkMETHOD SET last_updated=%s WHERE name=%s", \
(date, rank_method_code))
write_message("Finished writing the ranks into rnkMETHOD table", verbose=5)
def run_pagerank(cit, dict_of_ids, len_, ref, damping_factor, \
conv_threshold, check_point, dates):
"""returns the final form of the ranks when using pagerank method"""
write_message("Running the PageRank method", verbose=5)
sparse, semi_sparse, semi_sparse_coeficient = \
construct_sparse_matrix(cit, ref, dict_of_ids, len_, damping_factor)
weights = pagerank(conv_threshold, check_point, len_, \
sparse, semi_sparse, semi_sparse_coeficient)
dict_of_ranks = get_ranks(weights, dict_of_ids, 1, dates, 2)
return dict_of_ranks
def run_pagerank_ext(cit, dict_of_ids, ref, ext_links, \
conv_threshold, check_point, alpha, beta, dates):
"""returns the final form of the ranks when using pagerank_ext method"""
write_message("Running the PageRank with external links method", verbose=5)
len_ = len(dict_of_ids)
sparse, semi_sparse = construct_sparse_matrix_ext(cit, ref, \
ext_links, dict_of_ids, alpha, beta)
weights = pagerank_ext(conv_threshold, check_point, \
len_ + 1, sparse, semi_sparse)
dict_of_ranks = get_ranks(weights, dict_of_ids, 1, dates, 2)
return dict_of_ranks
def run_pagerank_time(cit, dict_of_ids, len_, ref, damping_factor, \
conv_threshold, check_point, date_coef, dates):
"""returns the final form of the ranks when using
pagerank + time decay method"""
write_message("Running the PageRank_time method", verbose=5)
sparse, semi_sparse, semi_sparse_coeficient = \
construct_sparse_matrix_time(cit, ref, dict_of_ids, \
damping_factor, date_coef)
weights = pagerank_time(conv_threshold, check_point, len_, \
sparse, semi_sparse, semi_sparse_coeficient, date_coef)
dict_of_ranks = get_ranks(weights, dict_of_ids, 100000, dates, 2)
return dict_of_ranks
def run_citation_rank_time(cit, dict_of_ids, date_coef, dates):
"""returns the final form of the ranks when using citation count
as function of time method"""
write_message("Running the citation rank with time decay method", verbose=5)
dict_of_ranks = citation_rank_time(cit, dict_of_ids, date_coef, dates, 2)
return dict_of_ranks
def spearman_rank_correlation_coef(rank1, rank2, len_):
"""rank1 and rank2 are arrays containing the recids in the ranking order
returns the correlation coefficient (-1 <= c <= 1) between 2 rankings
the closer c is to 1, the more correlated the two ranking methods are"""
total = 0
for i in range(len_):
rank_value = rank2.index(rank1[i])
total += (i - rank_value)*(i - rank_value)
return 1 - (6.0 * total) / (len_*(len_*len_ - 1))
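# A worked example of the Spearman formula above, 1 - 6*sum(d^2)/(n(n^2-1)),
# as a standalone sketch (hypothetical helper):
def _spearman_sketch(rank1, rank2):
    """Correlation between two rankings given as lists of recids."""
    n = len(rank1)
    total = sum((i - rank2.index(rank1[i])) ** 2 for i in range(n))
    return 1 - (6.0 * total) / (n * (n * n - 1))
# identical rankings give 1.0, fully reversed rankings give -1.0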
def remove_loops(cit, dates, dict_of_ids):
"""when using time decay, new papers that are part of a loop
are accumulating a lot of fake weight"""
new_cit = {}
for recid in cit:
new_cit[recid] = []
for cited_by in cit[recid]:
if dates[dict_of_ids[cited_by]] >= dates[dict_of_ids[recid]]:
if cited_by in cit:
if recid not in cit[cited_by]:
new_cit[recid].append(cited_by)
else:
write_message("Loop removed: %s <-> %s" \
%(cited_by, recid), verbose=9)
else:
new_cit[recid].append(cited_by)
else:
write_message("Loop removed: %s <-> %s" \
%(cited_by, recid), verbose=9)
write_message("Simple loops removed", verbose=5)
return new_cit
def calculate_time_weights(len_, time_decay, dates):
"""calculates the time coeficients for each paper"""
current_year = int(datetime.datetime.now().strftime("%Y"))
date_coef = {}
for j in range(len_):
date_coef[j] = exp(time_decay*(dates[j] - current_year))
write_message("Time weights calculated", verbose=5)
write_message("Time weights: %s" % str(date_coef), verbose=9)
return date_coef
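# The decay above, exp(time_decay * (year - current_year)), sketched as a
# standalone helper (hypothetical name): papers from the current year get
# weight 1.0 and older papers decay towards 0.
def _time_weight_sketch(year, current_year, time_decay):
    from math import exp
    return exp(time_decay * (year - current_year))
# e.g. a paper 10 years old with time_decay=0.5 gets exp(-5)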
def get_dates(function, config, dict_of_ids):
"""returns a dictionary containing the year of
publishing for each paper"""
try:
file_for_dates = config.get(function, "file_with_dates")
dates = get_dates_from_file(file_for_dates, dict_of_ids)
except (ConfigParser.NoOptionError, StandardError), err:
write_message("If you want to read the dates from file set up the \
'file_with_dates' variable in the config file [%s]" % err, verbose=3)
try:
publication_year_tag = config.get(function, "publication_year_tag")
dummy = int(publication_year_tag[0:3])
except (ConfigParser.NoOptionError, StandardError):
write_message("You need to set up correctly the publication_year_tag \
in the cfg file", sys.stderr)
raise Exception
try:
creation_date_tag = config.get(function, "creation_date_tag")
dummy = int(creation_date_tag[0:3])
except (ConfigParser.NoOptionError, StandardError):
write_message("You need to set up correctly the creation_date_tag \
in the cfg file", sys.stderr)
raise Exception
dates = get_dates_from_db(dict_of_ids, publication_year_tag, \
creation_date_tag)
return dates
def citerank(rank_method_code):
"""new ranking method based on the citation graph"""
write_message("Running rank method: %s" % rank_method_code, verbose=0)
if not import_numpy:
write_message('The numpy package could not be imported. \
This package is compulsory for running the citerank methods.')
return
try:
file_ = CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"
config = ConfigParser.ConfigParser()
config.readfp(open(file_))
except StandardError:
write_message("Cannot find configuration file: %s" % file_, sys.stderr)
raise StandardError
# the file for citations needs to have the following format:
#each line needs to be x[tab]y, where x cites y; x,y are recids
function = config.get("rank_method", "function")
try:
file_for_citations = config.get(function, "file_with_citations")
cit, dict_of_ids = get_citations_from_file(file_for_citations)
except (ConfigParser.NoOptionError, StandardError), err:
write_message("If you want to read the citation data from file set up \
the file_with_citations parameter in the config file [%s]" % err, verbose=2)
cit, dict_of_ids = get_citations_from_db()
len_ = len(dict_of_ids.keys())
write_message("Number of nodes(papers) to rank : %s" % str(len_), verbose=3)
if len_ == 0:
write_message("No citation data found, nothing to be done.")
return
try:
method = config.get(function, "citerank_method")
except ConfigParser.NoOptionError, err:
write_message("Exception: %s " %err, sys.stderr)
raise Exception
write_message("Running %s method." % method, verbose=2)
dates = get_dates(function, config, dict_of_ids)
if method == "citation_time":
try:
time_decay = float(config.get(function, "time_decay"))
except (ConfigParser.NoOptionError, ValueError), err:
write_message("Exception: %s" % err, sys.stderr)
raise Exception
date_coef = calculate_time_weights(len_, time_decay, dates)
#cit = remove_loops(cit, dates, dict_of_ids)
dict_of_ranks = \
run_citation_rank_time(cit, dict_of_ids, date_coef, dates)
else:
try:
conv_threshold = float(config.get(function, "conv_threshold"))
check_point = int(config.get(function, "check_point"))
damping_factor = float(config.get(function, "damping_factor"))
write_message("Parameters: d = %s, conv_threshold = %s, \
check_point = %s" %(str(damping_factor), \
str(conv_threshold), str(check_point)), verbose=5)
except (ConfigParser.NoOptionError, StandardError), err:
write_message("Exception: %s" % err, sys.stderr)
raise Exception
if method == "pagerank_classic":
ref = construct_ref_array(cit, dict_of_ids, len_)
use_ext_cit = ""
try:
use_ext_cit = config.get(function, "use_external_citations")
write_message("Pagerank will use external citations: %s" \
%str(use_ext_cit), verbose=5)
except (ConfigParser.NoOptionError, StandardError), err:
write_message("%s" % err, verbose=2)
if use_ext_cit == "yes":
try:
ext_citation_file = config.get(function, "ext_citation_file")
ext_links = get_external_links_from_file(ext_citation_file,
ref, dict_of_ids)
except (ConfigParser.NoOptionError, StandardError):
write_message("If you want to read the external citation \
data from file set up the ext_citation_file parameter in the config. file", \
verbose=3)
try:
reference_tag = config.get(function, "ext_reference_tag")
dummy = int(reference_tag[0:3])
except (ConfigParser.NoOptionError, StandardError):
write_message("You need to set up correctly the \
reference_tag in the cfg file", sys.stderr)
raise Exception
ext_links = get_external_links_from_db(ref, \
dict_of_ids, reference_tag)
avg = avg_ext_links_with_0(ext_links)
if avg < 1:
write_message("This method can't be ran. There is not \
enough information about the external citation. Hint: check the reference tag", \
sys.stderr)
raise Exception
avg_ext_links_without_0(ext_links)
try:
alpha = float(config.get(function, "ext_alpha"))
beta = float(config.get(function, "ext_beta"))
except (ConfigParser.NoOptionError, StandardError), err:
write_message("Exception: %s" % err, sys.stderr)
raise Exception
dict_of_ranks = run_pagerank_ext(cit, dict_of_ids, ref, \
ext_links, conv_threshold, check_point, alpha, beta, dates)
else:
dict_of_ranks = run_pagerank(cit, dict_of_ids, len_, ref, \
damping_factor, conv_threshold, check_point, dates)
elif method == "pagerank_time":
try:
time_decay = float(config.get(function, "time_decay"))
write_message("Parameter: time_decay = %s" \
%str(time_decay), verbose=5)
except (ConfigParser.NoOptionError, StandardError), err:
write_message("Exception: %s" % err, sys.stderr)
raise Exception
date_coef = calculate_time_weights(len_, time_decay, dates)
cit = remove_loops(cit, dates, dict_of_ids)
ref = construct_ref_array(cit, dict_of_ids, len_)
dict_of_ranks = run_pagerank_time(cit, dict_of_ids, len_, ref, \
damping_factor, conv_threshold, check_point, date_coef, dates)
else:
write_message("Error: Unknown ranking method. \
Please check the ranking_method parameter in the config. file.", sys.stderr)
raise Exception
try:
filename_ranks = config.get(function, "output_ranks_to_filename")
max_ranks = config.get(function, "output_rank_limit")
if not max_ranks.isdigit():
max_ranks = len_
else:
max_ranks = int(max_ranks)
if max_ranks > len_:
max_ranks = len_
ranks = sort_weights(dict_of_ranks)
write_message("Ranks: %s" % str(ranks), verbose=9)
write_first_ranks_to_file(ranks, dict_of_ranks, \
max_ranks, filename_ranks)
except (ConfigParser.NoOptionError, StandardError):
write_message("If you want the ranks to be printed in a file you have \
to set output_ranks_to_filename and output_rank_limit \
parameters in the configuration file", verbose=3)
normalize_weights(dict_of_ranks)
into_db(dict_of_ranks, rank_method_code)
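The `run_pagerank` call above performs the classic damped power iteration over the citation graph. As a point of reference, here is a minimal standalone sketch of that iteration (Python 3; parameter names mirror the config options above, but this is not the Invenio implementation):

```python
def pagerank(outlinks, damping=0.85, conv_threshold=1e-8, max_iter=100):
    """Minimal damped PageRank power iteration.

    outlinks maps each node to the list of nodes it cites.
    Returns a dict of node -> rank, with ranks summing to ~1.
    """
    nodes = list(outlinks)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Every node keeps the (1 - damping) teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in outlinks.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its mass uniformly.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        diff = sum(abs(new_rank[x] - rank[x]) for x in nodes)
        rank = new_rank
        if diff < conv_threshold:
            break
    return rank

ranks = pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]})
```

In the code above, `conv_threshold` plays the same role as the `conv_threshold` config parameter: iteration stops once the total rank movement per step falls below it.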
diff --git a/invenio/legacy/bibrank/cli.py b/invenio/legacy/bibrank/cli.py
index e81831df5..46e3c75a4 100644
--- a/invenio/legacy/bibrank/cli.py
+++ b/invenio/legacy/bibrank/cli.py
@@ -1,295 +1,294 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibRank ranking daemon.
Usage: bibrank [options]
Ranking examples:
bibrank -wjif -a --id=0-30000,30001-860000 --verbose=9
bibrank -wjif -d --modified='2002-10-27 13:57:26'
bibrank -wwrd --rebalance --collection=Articles
bibrank -wwrd -a -i 234-250,293,300-500 -u admin
Ranking options:
-w, --run=r1[,r2] runs each rank method in the order given
-c, --collection=c1[,c2] select according to collection
-i, --id=low[-high] select according to doc recID
-m, --modified=from[,to] select according to modification date
-l, --lastupdate select according to last update
-a, --add add or update words for selected records
-d, --del delete words for selected records
-S, --stat show statistics for a method
-R, --recalculate recalculate weight data, used by word frequency
and citation methods, should be used if ca 1%
of the documents have been changed since last
time -R was used
-E, --extcites=NUM print the top entries of the external cites table.
These are entries that should be entered in
your collection, since they have been cited
by NUM or more other records present in the
system. Useful for cataloguers to input
external papers manually.
Repairing options:
-k, --check check consistency for all records in the table(s)
check if update of ranking data is necessary
-r, --repair try to repair all records in the table(s)
Scheduling options:
-u, --user=USER user name to store task, password needed
-s, --sleeptime=SLEEP time after which to repeat tasks (no)
e.g.: 1s, 30m, 24h, 7d
-t, --time=TIME moment for the task to be active (now)
e.g.: +15s, 5m, 3h , 2002-10-27 13:57:26
General options:
-h, --help print this help and exit
-V, --version print version and exit
-v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
"""
__revision__ = "$Id$"
import sys
import traceback
import ConfigParser
from invenio.config import CFG_ETCDIR
from invenio.legacy.dbquery import run_sql
from invenio.ext.logging import register_exception
-from invenio.bibtask import task_init, write_message, task_get_option, \
+from invenio.legacy.bibsched.bibtask import task_init, write_message, task_get_option, \
task_set_option, get_datetime, task_update_status, \
task_sleep_now_if_required
+from invenio.base.factory import with_app_context
# pylint: disable=W0611
# Disabling unused import pylint check, since these are needed to get
# imported here, and are called later dynamically.
-from invenio.bibrank_tag_based_indexer import \
+from invenio.legacy.bibrank.tag_based_indexer import \
single_tag_rank_method, \
citation, \
download_weight_filtering_user, \
download_weight_total, \
file_similarity_by_times_downloaded, \
index_term_count
-from invenio.bibrank_word_indexer import word_similarity #@UnusedImport
-from invenio.bibrank_citerank_indexer import citerank #@UnusedImport
+from invenio.legacy.bibrank.word_indexer import word_similarity #@UnusedImport
+from invenio.legacy.bibrank.citerank_indexer import citerank #@UnusedImport
from invenio.solrutils_bibrank_indexer import word_similarity_solr #@UnusedImport
from invenio.xapianutils_bibrank_indexer import word_similarity_xapian #@UnusedImport
-from invenio.bibrank_selfcites_task import process_updates as selfcites
+from invenio.legacy.bibrank.selfcites_task import process_updates as selfcites
# pylint: enable=W0611
nb_char_in_line = 50 # for verbose pretty printing
chunksize = 1000 # default size of chunks that the records will be treated by
base_process_size = 4500 # process base size
def split_ranges(parse_string):
"""Split ranges of numbers"""
recIDs = []
ranges = parse_string.split(",")
for rang in ranges:
tmp_recIDs = rang.split("-")
if len(tmp_recIDs)==1:
recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[0])])
else:
if int(tmp_recIDs[0]) > int(tmp_recIDs[1]): # sanity check
tmp = tmp_recIDs[0]
tmp_recIDs[0] = tmp_recIDs[1]
tmp_recIDs[1] = tmp
recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[1])])
return recIDs
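The range-splitting logic above can be condensed into a standalone (Python 3) form. This is a simplified sketch of the same behavior, not the Invenio original:

```python
def split_ranges(parse_string):
    """Split a comma-separated list of recID ranges into [low, high] pairs.

    A single number N becomes [N, N]; a reversed range like "500-300"
    is swapped so that low <= high, mirroring the sanity check above.
    """
    rec_ids = []
    for rang in parse_string.split(","):
        parts = [int(p) for p in rang.split("-")]
        if len(parts) == 1:
            rec_ids.append([parts[0], parts[0]])
        else:
            rec_ids.append([min(parts), max(parts)])
    return rec_ids

print(split_ranges("234-250,293,500-300"))
# [[234, 250], [293, 293], [300, 500]]
```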
def get_date_range(var):
"Returns the two dates contained as a low,high tuple"
limits = var.split(",")
if len(limits)==1:
low = get_datetime(limits[0])
return low, None
if len(limits)==2:
low = get_datetime(limits[0])
high = get_datetime(limits[1])
return low, high
def task_run_core():
"""Run the indexing task. The row argument is the BibSched task
queue row, containing id, arguments, etc.
Return 1 in case of success and 0 in case of failure.
"""
if not task_get_option("run"):
task_set_option("run", [name[0] for name in run_sql("SELECT name from rnkMETHOD")])
try:
for key in task_get_option("run"):
task_sleep_now_if_required(can_stop_too=True)
write_message("")
filename = CFG_ETCDIR + "/bibrank/" + key + ".cfg"
write_message("Getting configuration from file: %s" % filename,
verbose=9)
config = ConfigParser.ConfigParser()
try:
config.readfp(open(filename))
except StandardError, e:
write_message("Cannot find configurationfile: %s. "
"The rankmethod may also not be registered using "
"the BibRank Admin Interface." % filename, sys.stderr)
raise StandardError
#Using the function variable to call the function related to the
#rank method
cfg_function = config.get("rank_method", "function")
func_object = globals().get(cfg_function)
if func_object:
func_object(key)
else:
write_message("Cannot run method '%s', no function to call"
% key)
except StandardError, e:
write_message("\nException caught: %s" % e, sys.stderr)
write_message(traceback.format_exc()[:-1])
register_exception()
task_update_status("ERROR")
sys.exit(1)
return True
+@with_app_context()
def main():
"""Main that construct all the bibtask."""
task_init(authorization_action='runbibrank',
authorization_msg="BibRank Task Submission",
description="""Ranking examples:
bibrank -wjif -a --id=0-30000,30001-860000 --verbose=9
bibrank -wjif -d --modified='2002-10-27 13:57:26'
bibrank -wjif --rebalance --collection=Articles
bibrank -wsbr -a -i 234-250,293,300-500 -u admin
bibrank -u admin -w citation -E 10
bibrank -u admin -w citation -A
""",
help_specific_usage="""Ranking options:
-w, --run=r1[,r2] runs each rank method in the order given
-c, --collection=c1[,c2] select according to collection
-i, --id=low[-high] select according to doc recID
-m, --modified=from[,to] select according to modification date
-l, --lastupdate select according to last update
-a, --add add or update words for selected records
-d, --del delete words for selected records
-S, --stat show statistics for a method
-R, --recalculate recalculate weight data, used by word frequency
and citation methods, should be used if ca 1%
of the documents have been changed since last
time -R was used. NOTE: This will replace the
entire set of weights, regardless of date/id
selection.
-E, --extcites=NUM print the top entries of the external cites table.
These are entries that should be entered in
your collection, since they have been cited
by NUM or more other records present in the
system. Useful for cataloguers to input
external papers manually.
-A --author-citations Calculate author citations.
Repairing options:
-k, --check check consistency for all records in the table(s)
check if update of ranking data is necessary
-r, --repair try to repair all records in the table(s)
""",
version=__revision__,
specific_params=("AE:ladSi:m:c:kUrRM:f:w:", [
"author-citations",
"print-extcites=",
"lastupdate",
"add",
"del",
"repair",
"maxmem",
"flush",
"stat",
"rebalance",
"id=",
"collection=",
"check",
"modified=",
"update",
"run="]),
task_submit_elaborate_specific_parameter_fnc=
task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
def task_submit_elaborate_specific_parameter(key, value, opts, dummy):
"""Elaborate a specific parameter of CLI bibrank."""
if key in ("-a", "--add"):
task_set_option("cmd", "add")
if ("-x","") in opts or ("--del","") in opts:
raise StandardError, "--add incompatible with --del"
elif key in ("--run", "-w"):
task_set_option("run", [])
run = value.split(",")
for run_key in range(0, len(run)):
task_get_option('run').append(run[run_key])
elif key in ("-r", "--repair"):
task_set_option("cmd", "repair")
elif key in ("-E", "--print-extcites"):
try:
task_set_option("print-extcites", int(value))
except:
task_set_option("print-extcites", 10) # default fallback value
task_set_option("cmd", "print-missing")
elif key in ("-A", "--author-citations"):
task_set_option("author-citations", "1")
elif key in ("-d", "--del"):
task_set_option("cmd", "del")
elif key in ("-k", "--check"):
task_set_option("cmd", "check")
elif key in ("-S", "--stat"):
task_set_option("cmd", "stat")
elif key in ("-i", "--id"):
task_set_option("id", task_get_option("id") + split_ranges(value))
task_set_option("last_updated", "")
elif key in ("-c", "--collection"):
task_set_option("collection", value)
elif key in ("-R", "--rebalance"):
task_set_option("quick", "no")
elif key in ("-f", "--flush"):
task_set_option("flush", int(value))
elif key in ("-M", "--maxmem"):
task_set_option("maxmem", int(value))
if task_get_option("maxmem") < base_process_size + 1000:
raise StandardError, "Memory usage should be higher than %d kB" % \
(base_process_size + 1000)
elif key in ("-m", "--modified"):
task_set_option("modified", get_date_range(value))#2002-10-27 13:57:26)
task_set_option("last_updated", "")
elif key in ("-l", "--lastupdate"):
task_set_option("last_updated", "last_updated")
else:
return False
return True
-
-if __name__ == "__main__":
- main()
diff --git a/invenio/legacy/bibrank/record_sorter.py b/invenio/legacy/bibrank/record_sorter.py
index 2ddf5b7c0..abd232fd0 100644
--- a/invenio/legacy/bibrank/record_sorter.py
+++ b/invenio/legacy/bibrank/record_sorter.py
@@ -1,442 +1,442 @@
# -*- coding: utf-8 -*-
## Ranking of records using different parameters and methods on the fly.
##
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import string
import time
import math
import re
import ConfigParser
import copy
from invenio.config import \
CFG_SITE_LANG, \
CFG_ETCDIR, \
CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS
from invenio.legacy.dbquery import run_sql, deserialize_via_marshal, wash_table_column_name
from invenio.ext.logging import register_exception
from invenio.legacy.webpage import adderrorbox
-from invenio.bibindex_engine_stemmer import stem
-from invenio.bibindex_engine_stopwords import is_stopword
+from invenio.legacy.bibindex.engine_stemmer import stem
+from invenio.legacy.bibindex.engine_stopwords import is_stopword
from invenio.legacy.bibrank.citation_searcher import get_cited_by, get_cited_by_weight
from invenio.intbitset import intbitset
from invenio.legacy.bibrank.word_searcher import find_similar
# Do not remove these lines, it is necessary for func_object = globals().get(function)
from invenio.legacy.bibrank.word_searcher import word_similarity
-from invenio.solrutils_bibrank_searcher import word_similarity_solr
-from invenio.xapianutils_bibrank_searcher import word_similarity_xapian
+from invenio.legacy.miscutil.solrutils_bibrank_searcher import word_similarity_solr
+from invenio.legacy.miscutil.xapianutils_bibrank_searcher import word_similarity_xapian
def compare_on_val(first, second):
return cmp(second[1], first[1])
def check_term(term, col_size, term_rec, max_occ, min_occ, termlength):
"""Check if the tem is valid for use
term - the term to check
col_size - the number of records in database
term_rec - the number of records which contains this term
max_occ - max frequency of the term allowed
min_occ - min frequence of the term allowed
termlength - the minimum length of the terms allowed"""
try:
if is_stopword(term) or (len(term) <= termlength) or ((float(term_rec) / float(col_size)) >= max_occ) or ((float(term_rec) / float(col_size)) <= min_occ):
return ""
if int(term):
return ""
except StandardError, e:
pass
return "true"
def create_external_ranking_settings(rank_method_code, config):
methods[rank_method_code]['fields'] = dict()
sections = config.sections()
field_pattern = re.compile('field[0-9]+')
for section in sections:
if field_pattern.search(section):
field_name = config.get(section, 'name')
methods[rank_method_code]['fields'][field_name] = dict()
for option in config.options(section):
if option != 'name':
create_external_ranking_option(section, option, methods[rank_method_code]['fields'][field_name], config)
elif section == 'find_similar_to_recid':
methods[rank_method_code][section] = dict()
for option in config.options(section):
create_external_ranking_option(section, option, methods[rank_method_code][section], config)
elif section == 'field_settings':
for option in config.options(section):
create_external_ranking_option(section, option, methods[rank_method_code], config)
def create_external_ranking_option(section, option, dictionary, config):
value = config.get(section, option)
if value.isdigit():
value = int(value)
dictionary[option] = value
def create_rnkmethod_cache():
"""Create cache with vital information for each rank method."""
global methods
bibrank_meths = run_sql("SELECT name from rnkMETHOD")
methods = {}
global voutput
voutput = ""
for (rank_method_code,) in bibrank_meths:
try:
file = CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"
config = ConfigParser.ConfigParser()
config.readfp(open(file))
except StandardError, e:
pass
cfg_function = config.get("rank_method", "function")
if config.has_section(cfg_function):
methods[rank_method_code] = {}
methods[rank_method_code]["function"] = cfg_function
methods[rank_method_code]["prefix"] = config.get(cfg_function, "relevance_number_output_prologue")
methods[rank_method_code]["postfix"] = config.get(cfg_function, "relevance_number_output_epilogue")
methods[rank_method_code]["chars_alphanumericseparators"] = r"[1234567890\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~]"
else:
raise Exception("Error in configuration file: %s" % (CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"))
i18n_names = run_sql("""SELECT ln,value from rnkMETHODNAME,rnkMETHOD where id_rnkMETHOD=rnkMETHOD.id and rnkMETHOD.name=%s""", (rank_method_code,))
for (ln, value) in i18n_names:
methods[rank_method_code][ln] = value
if config.has_option(cfg_function, "table"):
methods[rank_method_code]["rnkWORD_table"] = config.get(cfg_function, "table")
query = "SELECT count(*) FROM %sR" % wash_table_column_name(methods[rank_method_code]["rnkWORD_table"][:-1])
methods[rank_method_code]["col_size"] = run_sql(query)[0][0]
if config.has_option(cfg_function, "stemming") and config.get(cfg_function, "stemming"):
try:
methods[rank_method_code]["stemmer"] = config.get(cfg_function, "stemming")
except Exception,e:
pass
if config.has_option(cfg_function, "stopword"):
methods[rank_method_code]["stopwords"] = config.get(cfg_function, "stopword")
if config.has_section("find_similar"):
methods[rank_method_code]["max_word_occurence"] = float(config.get("find_similar", "max_word_occurence"))
methods[rank_method_code]["min_word_occurence"] = float(config.get("find_similar", "min_word_occurence"))
methods[rank_method_code]["min_word_length"] = int(config.get("find_similar", "min_word_length"))
methods[rank_method_code]["min_nr_words_docs"] = int(config.get("find_similar", "min_nr_words_docs"))
methods[rank_method_code]["max_nr_words_upper"] = int(config.get("find_similar", "max_nr_words_upper"))
methods[rank_method_code]["max_nr_words_lower"] = int(config.get("find_similar", "max_nr_words_lower"))
methods[rank_method_code]["default_min_relevance"] = int(config.get("find_similar", "default_min_relevance"))
if cfg_function in ('word_similarity_solr', 'word_similarity_xapian'):
create_external_ranking_settings(rank_method_code, config)
if config.has_section("combine_method"):
i = 1
methods[rank_method_code]["combine_method"] = []
while config.has_option("combine_method", "method%s" % i):
methods[rank_method_code]["combine_method"].append(string.split(config.get("combine_method", "method%s" % i), ","))
i += 1
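The cache above is driven by per-method `.cfg` files read with ConfigParser: a `rank_method` section names the function, and a section of that name holds its options. A minimal Python 3 (`configparser`) sketch of reading such a hypothetical file:

```python
import configparser

# Hypothetical rank-method config, shaped like an etc/bibrank/*.cfg file.
CFG = """
[rank_method]
function = word_similarity

[word_similarity]
relevance_number_output_prologue = (
relevance_number_output_epilogue = )
"""

config = configparser.ConfigParser()
config.read_string(CFG)

# The function name doubles as the section holding that method's options.
function = config.get("rank_method", "function")
prefix = config.get(function, "relevance_number_output_prologue")
postfix = config.get(function, "relevance_number_output_epilogue")
```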
def is_method_valid(colID, rank_method_code):
"""
Check if RANK_METHOD_CODE method is valid for the collection given.
If colID is None, then check for existence regardless of collection.
"""
if colID is None:
return run_sql("SELECT COUNT(*) FROM rnkMETHOD WHERE name=%s", (rank_method_code,))[0][0]
enabled_colls = dict(run_sql("SELECT id_collection, score from collection_rnkMETHOD,rnkMETHOD WHERE id_rnkMETHOD=rnkMETHOD.id AND name=%s", (rank_method_code,)))
try:
colID = int(colID)
except TypeError:
return 0
if enabled_colls.has_key(colID):
return 1
else:
while colID:
colID = run_sql("SELECT id_dad FROM collection_collection WHERE id_son=%s", (colID,))
if colID and enabled_colls.has_key(colID[0][0]):
return 1
elif colID:
colID = colID[0][0]
return 0
def get_bibrank_methods(colID, ln=CFG_SITE_LANG):
"""
Return a list of rank methods enabled for collection colID and the
name of them in the language defined by the ln parameter.
"""
if not globals().has_key('methods'):
create_rnkmethod_cache()
avail_methods = []
for (rank_method_code, options) in methods.iteritems():
if options.has_key("function") and is_method_valid(colID, rank_method_code):
if options.has_key(ln):
avail_methods.append((rank_method_code, options[ln]))
elif options.has_key(CFG_SITE_LANG):
avail_methods.append((rank_method_code, options[CFG_SITE_LANG]))
else:
avail_methods.append((rank_method_code, rank_method_code))
return avail_methods
def rank_records(rank_method_code, rank_limit_relevance, hitset_global, pattern=[], verbose=0, field='', rg=None, jrec=None):
"""rank_method_code, e.g. `jif' or `sbr' (word frequency vector model)
rank_limit_relevance, e.g. `23' for `nbc' (number of citations) or `0.10' for `vec'
hitset, search engine hits;
pattern, search engine query or record ID (you check the type)
verbose, verbose level
output:
list of records
list of rank values
prefix
postfix
verbose_output"""
voutput = ""
configcreated = ""
starttime = time.time()
afterfind = starttime - time.time()
aftermap = starttime - time.time()
try:
hitset = copy.deepcopy(hitset_global) #we are receiving a global hitset
if not globals().has_key('methods'):
create_rnkmethod_cache()
function = methods[rank_method_code]["function"]
#we get 'citation' method correctly here
func_object = globals().get(function)
if verbose > 0:
voutput += "function: %s <br/> " % function
voutput += "pattern: %s <br/>" % str(pattern)
if func_object and pattern and pattern[0][0:6] == "recid:" and function == "word_similarity":
result = find_similar(rank_method_code, pattern[0][6:], hitset, rank_limit_relevance, verbose, methods)
elif rank_method_code == "citation":
#we get rank_method_code correctly here. pattern[0] is the search word - not used by find_cit
p = ""
if pattern and pattern[0]:
p = pattern[0][6:]
result = find_citations(rank_method_code, p, hitset, verbose)
elif func_object:
if function == "word_similarity":
result = func_object(rank_method_code, pattern, hitset, rank_limit_relevance, verbose, methods)
elif function in ("word_similarity_solr", "word_similarity_xapian"):
if not rg:
rg = CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS
if not jrec:
jrec = 0
ranked_result_amount = rg + jrec
if verbose > 0:
voutput += "Ranked result amount: %s<br/><br/>" % ranked_result_amount
if verbose > 0:
voutput += "field: %s<br/>" % field
if function == "word_similarity_solr":
if verbose > 0:
voutput += "In Solr part:<br/>"
result = word_similarity_solr(pattern, hitset, methods[rank_method_code], verbose, field, ranked_result_amount)
if function == "word_similarity_xapian":
if verbose > 0:
voutput += "In Xapian part:<br/>"
result = word_similarity_xapian(pattern, hitset, methods[rank_method_code], verbose, field, ranked_result_amount)
else:
result = func_object(rank_method_code, pattern, hitset, rank_limit_relevance, verbose)
else:
result = rank_by_method(rank_method_code, pattern, hitset, rank_limit_relevance, verbose)
except Exception, e:
register_exception()
result = (None, "", adderrorbox("An error occured when trying to rank the search result "+rank_method_code, ["Unexpected error: %s<br />" % (e,)]), voutput)
afterfind = time.time() - starttime
if result[0] and result[1]: #split into two lists for search_engine
results_similar_recIDs = map(lambda x: x[0], result[0])
results_similar_relevances = map(lambda x: x[1], result[0])
result = (results_similar_recIDs, results_similar_relevances, result[1], result[2], "%s" % configcreated + result[3])
aftermap = time.time() - starttime;
else:
result = (None, None, result[1], result[2], result[3])
#add stuff from here into voutput from result
tmp = voutput+result[4]
if verbose > 0:
tmp += "<br/>Elapsed time after finding: "+str(afterfind)+"\nElapsed after mapping: "+str(aftermap)
result = (result[0],result[1],result[2],result[3],tmp)
#dbg = string.join(map(str,methods[rank_method_code].items()))
#result = (None, "", adderrorbox("Debug ",rank_method_code+" "+dbg),"",voutput);
return result
def combine_method(rank_method_code, pattern, hitset, rank_limit_relevance,verbose):
"""combining several methods into one based on methods/percentage in config file"""
global voutput
result = {}
try:
for (method, percent) in methods[rank_method_code]["combine_method"]:
function = methods[method]["function"]
func_object = globals().get(function)
percent = int(percent)
if func_object:
this_result = func_object(method, pattern, hitset, rank_limit_relevance, verbose)[0]
else:
this_result = rank_by_method(method, pattern, hitset, rank_limit_relevance, verbose)[0]
for i in range(0, len(this_result)):
(recID, value) = this_result[i]
if value > 0:
result[recID] = result.get(recID, 0) + int((float(i) / len(this_result)) * float(percent))
result = result.items()
result.sort(lambda x, y: cmp(x[1], y[1]))
return (result, "(", ")", voutput)
except Exception, e:
return (None, "Warning: %s method cannot be used for ranking your query." % rank_method_code, "", voutput)
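`combine_method` merges several rankings by position rather than raw score: each record earns `position / len * percent` points per sub-method, so a method contributes in proportion to its configured weight regardless of its score scale. A sketch of that merge over hypothetical `(reclist, percent)` pairs (not the Invenio API):

```python
def combine_rankings(rankings):
    """Merge (reclist, percent) pairs into one position-weighted ranking.

    Each reclist is sorted ascending by relevance, so a later position
    (higher i) means a better record and contributes more points.
    """
    combined = {}
    for reclist, percent in rankings:
        n = len(reclist)
        for i, (rec_id, value) in enumerate(reclist):
            if value > 0:
                combined[rec_id] = combined.get(rec_id, 0) + int(i / n * percent)
    # Return ascending by combined score, matching the original sort order.
    return sorted(combined.items(), key=lambda item: item[1])

merged = combine_rankings([
    ([(1, 5), (2, 9)], 60),   # method A, weight 60%
    ([(2, 3), (1, 7)], 40),   # method B, weight 40%
])
```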
def rank_by_method(rank_method_code, lwords, hitset, rank_limit_relevance,verbose):
"""Ranking of records based on predetermined values.
input:
rank_method_code - the code of the method, from the name field in rnkMETHOD, used to get predetermined values from
rnkMETHODDATA
lwords - a list of words from the query
hitset - a list of hits for the query found by search_engine
rank_limit_relevance - show only records with a rank value above this
verbose - verbose value
output:
reclist - a list of sorted records, with unsorted added to the end: [[23,34], [344,24], [1,01]]
prefix - what to show before the rank value
postfix - what to show after the rank value
voutput - contains extra information, content dependent on verbose value"""
global voutput
voutput = ""
rnkdict = run_sql("SELECT relevance_data FROM rnkMETHODDATA,rnkMETHOD where rnkMETHOD.id=id_rnkMETHOD and rnkMETHOD.name=%s", (rank_method_code,))
if not rnkdict:
return (None, "Warning: Could not load ranking data for method %s." % rank_method_code, "", voutput)
max_recid = 0
res = run_sql("SELECT max(id) FROM bibrec")
if res and res[0][0]:
max_recid = int(res[0][0])
lwords_hitset = None
for j in range(0, len(lwords)): #find which docs to search based on ranges..should be done in search_engine...
if lwords[j] and lwords[j][:6] == "recid:":
if not lwords_hitset:
lwords_hitset = intbitset()
lword = lwords[j][6:]
if string.find(lword, "->") > -1:
lword = string.split(lword, "->")
if int(lword[0]) >= max_recid or int(lword[1]) >= max_recid + 1:
return (None, "Warning: Given record IDs are out of range.", "", voutput)
for i in range(int(lword[0]), int(lword[1])):
lwords_hitset.add(int(i))
elif lword < max_recid + 1:
lwords_hitset.add(int(lword))
else:
return (None, "Warning: Given record IDs are out of range.", "", voutput)
rnkdict = deserialize_via_marshal(rnkdict[0][0])
if verbose > 0:
voutput += "<br />Running rank method: %s, using rank_by_method function in bibrank_record_sorter<br />" % rank_method_code
voutput += "Ranking data loaded, size of structure: %s<br />" % len(rnkdict)
lrecIDs = list(hitset)
if verbose > 0:
voutput += "Number of records to rank: %s<br />" % len(lrecIDs)
reclist = []
reclist_addend = []
if not lwords_hitset: #rank all docs, can this be sped up using something other than a for loop?
for recID in lrecIDs:
if rnkdict.has_key(recID):
reclist.append((recID, rnkdict[recID]))
del rnkdict[recID]
else:
reclist_addend.append((recID, 0))
else: #rank docs in hitset, can this be sped up using something other than a for loop?
for recID in lwords_hitset:
if rnkdict.has_key(recID) and recID in hitset:
reclist.append((recID, rnkdict[recID]))
del rnkdict[recID]
elif recID in hitset:
reclist_addend.append((recID, 0))
if verbose > 0:
voutput += "Number of records ranked: %s<br />" % len(reclist)
voutput += "Number of records not ranked: %s<br />" % len(reclist_addend)
reclist.sort(lambda x, y: cmp(x[1], y[1]))
return (reclist_addend + reclist, methods[rank_method_code]["prefix"], methods[rank_method_code]["postfix"], voutput)
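`rank_by_method` merges precomputed weights with the hitset: records without a stored weight are prepended with weight 0, and the rest are sorted ascending so the best record ends up last, as the search engine expects. A condensed sketch of that merge (plain lists and dicts in place of the `rnkMETHODDATA` structures):

```python
def rank_hits(hitset, weights):
    """Order hits by precomputed weight, unranked records first with weight 0."""
    ranked = [(rec_id, weights[rec_id]) for rec_id in hitset if rec_id in weights]
    unranked = [(rec_id, 0) for rec_id in hitset if rec_id not in weights]
    ranked.sort(key=lambda item: item[1])  # ascending: best record last
    return unranked + ranked

rank_hits([4, 1, 7], {1: 30, 7: 12})
# [(4, 0), (7, 12), (1, 30)]
```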
def find_citations(rank_method_code, recID, hitset, verbose):
"""Rank by the amount of citations."""
#calculate the cited-by values for all the members of the hitset
#returns: ((recordid,weight),prefix,postfix,message)
global voutput
voutput = ""
#If the recID is numeric, return only stuff that cites it. Otherwise return
#stuff that cites hitset
#try to convert to int
recisint = True
recidint = 0
try:
recidint = int(recID)
except:
recisint = False
ret = []
if recisint:
myrecords = get_cited_by(recidint) #this is a simple list
ret = get_cited_by_weight(myrecords)
else:
ret = get_cited_by_weight(hitset)
ret.sort(lambda x,y:cmp(x[1],y[1])) #ascending by the second member of the tuples
if verbose > 0:
voutput = voutput+"\nrecID "+str(recID)+" is int: "+str(recisint)+" hitset "+str(hitset)+"\n"+"find_citations retlist "+str(ret)
#voutput = voutput + str(ret)
if ret:
return (ret,"(", ")", "")
else:
return ((),"", "", "")
diff --git a/invenio/legacy/bibrank/scripts/bibrank.py b/invenio/legacy/bibrank/scripts/bibrank.py
index bfde1736f..4432457e4 100644
--- a/invenio/legacy/bibrank/scripts/bibrank.py
+++ b/invenio/legacy/bibrank/scripts/bibrank.py
@@ -1,299 +1,26 @@
-## -*- mode: python; coding: utf-8; -*-
+# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
-## Copyright (C) 2007, 2008, 2010, 2011 CERN.
+## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-"""
-BibRank ranking daemon.
-
-Usage: bibrank [options]
- Ranking examples:
- bibrank -wjif -a --id=0-30000,30001-860000 --verbose=9
- bibrank -wjif -d --modified='2002-10-27 13:57:26'
- bibrank -wwrd --rebalance --collection=Articles
- bibrank -wwrd -a -i 234-250,293,300-500 -u admin
-
- Ranking options:
- -w, --run=r1[,r2] runs each rank method in the order given
-
- -c, --collection=c1[,c2] select according to collection
- -i, --id=low[-high] select according to doc recID
- -m, --modified=from[,to] select according to modification date
- -l, --lastupdate select according to last update
-
- -a, --add add or update words for selected records
- -d, --del delete words for selected records
- -S, --stat show statistics for a method
-
- -R, --recalculate recalculate weigth data, used by word frequency
- and citation methods, should be used if ca 1%
- of the document has been changed since last
- time -R was used
-
- -E, --extcites=NUM print the top entries of the external cites table.
- These are entries that should be entered in
- your collection, since they have been cited
- by NUM or more other records present in the
- system. Useful for cataloguers to input
- external papers manually.
-
- Repairing options:
- -k, --check check consistency for all records in the table(s)
- check if update of ranking data is necessary
- -r, --repair try to repair all records in the table(s)
- Scheduling options:
- -u, --user=USER user name to store task, password needed
- -s, --sleeptime=SLEEP time after which to repeat tasks (no)
- e.g.: 1s, 30m, 24h, 7d
- -t, --time=TIME moment for the task to be active (now)
- e.g.: +15s, 5m, 3h , 2002-10-27 13:57:26
- General options:
- -h, --help print this help and exit
- -V, --version print version and exit
- -v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
-"""
-
-__revision__ = "$Id$"
-
-
-import sys
-import traceback
-import ConfigParser
-
-from invenio.config import CFG_ETCDIR
-from invenio.legacy.dbquery import run_sql
-from invenio.ext.logging import register_exception
-from invenio.bibtask import task_init, write_message, task_get_option, \
- task_set_option, get_datetime, task_update_status, \
- task_sleep_now_if_required
-
-# pylint: disable=W0611
-# Disabling unused import pylint check, since these are needed to get
-# imported here, and are called later dynamically.
-from invenio.legacy.bibrank.tag_based_indexer import \
- single_tag_rank_method, \
- citation, \
- download_weight_filtering_user, \
- download_weight_total, \
- file_similarity_by_times_downloaded, \
- index_term_count
-from invenio.legacy.bibrank.word_indexer import word_similarity #@UnusedImport
-from invenio.legacy.bibrank.citerank_indexer import citerank #@UnusedImport
-from invenio.solrutils_bibrank_indexer import word_similarity_solr #@UnusedImport
-from invenio.xapianutils_bibrank_indexer import word_similarity_xapian #@UnusedImport
-from invenio.legacy.bibrank.selfcites_task import process_updates as selfcites
-# pylint: enable=W0611
-
-
-nb_char_in_line = 50 # for verbose pretty printing
-chunksize = 1000 # default size of chunks that the records will be treated by
-base_process_size = 4500 # process base size
-
-def split_ranges(parse_string):
- """Split ranges of numbers"""
- recIDs = []
- ranges = parse_string.split(",")
- for rang in ranges:
- tmp_recIDs = rang.split("-")
-
- if len(tmp_recIDs)==1:
- recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[0])])
- else:
- if int(tmp_recIDs[0]) > int(tmp_recIDs[1]): # sanity check
- tmp = tmp_recIDs[0]
- tmp_recIDs[0] = tmp_recIDs[1]
- tmp_recIDs[1] = tmp
- recIDs.append([int(tmp_recIDs[0]), int(tmp_recIDs[1])])
- return recIDs
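The removed `split_ranges` above parses a CLI range expression into `[low, high]` pairs. A minimal standalone sketch of the same parsing logic (a hypothetical modern rewrite, not the Invenio function itself):

```python
def split_ranges(parse_string):
    """Split a string like "1-5,7,10-12" into [low, high] pairs.

    Single numbers become [n, n]; reversed bounds such as "12-10"
    are swapped, mirroring the sanity check in the original.
    """
    rec_ids = []
    for part in parse_string.split(","):
        bounds = part.split("-")
        if len(bounds) == 1:
            low = high = int(bounds[0])
        else:
            low, high = sorted(int(b) for b in bounds[:2])
        rec_ids.append([low, high])
    return rec_ids
```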
-
-def get_date_range(var):
- "Returns the two dates contained as a low,high tuple"
- limits = var.split(",")
- if len(limits)==1:
- low = get_datetime(limits[0])
- return low, None
- if len(limits)==2:
- low = get_datetime(limits[0])
- high = get_datetime(limits[1])
- return low, high
-
-def task_run_core():
- """Run the indexing task. The row argument is the BibSched task
-    queue row, containing id, arguments, etc.
- Return 1 in case of success and 0 in case of failure.
- """
- if not task_get_option("run"):
- task_set_option("run", [name[0] for name in run_sql("SELECT name from rnkMETHOD")])
-
- try:
- for key in task_get_option("run"):
- task_sleep_now_if_required(can_stop_too=True)
- write_message("")
- filename = CFG_ETCDIR + "/bibrank/" + key + ".cfg"
- write_message("Getting configuration from file: %s" % filename,
- verbose=9)
- config = ConfigParser.ConfigParser()
- try:
- config.readfp(open(filename))
- except StandardError, e:
-            write_message("Cannot find configuration file: %s. "
-                          "The rank method may also not be registered using "
-                          "the BibRank Admin Interface." % filename, sys.stderr)
- raise StandardError
-
- #Using the function variable to call the function related to the
- #rank method
- cfg_function = config.get("rank_method", "function")
- func_object = globals().get(cfg_function)
- if func_object:
- func_object(key)
- else:
- write_message("Cannot run method '%s', no function to call"
- % key)
- except StandardError, e:
- write_message("\nException caught: %s" % e, sys.stderr)
- write_message(traceback.format_exc()[:-1])
- register_exception()
- task_update_status("ERROR")
- sys.exit(1)
-
- return True
-
-def main():
-    """Main function that constructs and runs the bibtask."""
- task_init(authorization_action='runbibrank',
- authorization_msg="BibRank Task Submission",
- description="""Ranking examples:
- bibrank -wjif -a --id=0-30000,30001-860000 --verbose=9
- bibrank -wjif -d --modified='2002-10-27 13:57:26'
- bibrank -wjif --rebalance --collection=Articles
- bibrank -wsbr -a -i 234-250,293,300-500 -u admin
- bibrank -u admin -w citation -E 10
- bibrank -u admin -w citation -A
-""",
- help_specific_usage="""Ranking options:
- -w, --run=r1[,r2] runs each rank method in the order given
-
- -c, --collection=c1[,c2] select according to collection
- -i, --id=low[-high] select according to doc recID
- -m, --modified=from[,to] select according to modification date
- -l, --lastupdate select according to last update
-
- -a, --add add or update words for selected records
- -d, --del delete words for selected records
- -S, --stat show statistics for a method
-
- -R, --recalculate recalculate weight data, used by word frequency
- and citation methods, should be used if ca 1%
- of the documents have been changed since last
- time -R was used. NOTE: This will replace the
- entire set of weights, regardless of date/id
- selection.
-
- -E, --extcites=NUM print the top entries of the external cites table.
- These are entries that should be entered in
- your collection, since they have been cited
- by NUM or more other records present in the
- system. Useful for cataloguers to input
- external papers manually.
-
- -A --author-citations Calculate author citations.
-
- Repairing options:
- -k, --check check consistency for all records in the table(s)
- check if update of ranking data is necessary
- -r, --repair try to repair all records in the table(s)
-""",
- version=__revision__,
- specific_params=("AE:ladSi:m:c:kUrRM:f:w:", [
- "author-citations",
- "print-extcites=",
- "lastupdate",
- "add",
- "del",
- "repair",
- "maxmem",
- "flush",
- "stat",
- "rebalance",
- "id=",
- "collection=",
- "check",
- "modified=",
- "update",
- "run="]),
- task_submit_elaborate_specific_parameter_fnc=
- task_submit_elaborate_specific_parameter,
- task_run_fnc=task_run_core)
-
-
-def task_submit_elaborate_specific_parameter(key, value, opts, dummy):
- """Elaborate a specific parameter of CLI bibrank."""
- if key in ("-a", "--add"):
- task_set_option("cmd", "add")
- if ("-x","") in opts or ("--del","") in opts:
- raise StandardError, "--add incompatible with --del"
- elif key in ("--run", "-w"):
- task_set_option("run", [])
- run = value.split(",")
- for run_key in range(0, len(run)):
- task_get_option('run').append(run[run_key])
- elif key in ("-r", "--repair"):
- task_set_option("cmd", "repair")
- elif key in ("-E", "--print-extcites"):
- try:
- task_set_option("print-extcites", int(value))
- except:
- task_set_option("print-extcites", 10) # default fallback value
- task_set_option("cmd", "print-missing")
- elif key in ("-A", "--author-citations"):
- task_set_option("author-citations", "1")
- elif key in ("-d", "--del"):
- task_set_option("cmd", "del")
- elif key in ("-k", "--check"):
- task_set_option("cmd", "check")
- elif key in ("-S", "--stat"):
- task_set_option("cmd", "stat")
- elif key in ("-i", "--id"):
- task_set_option("id", task_get_option("id") + split_ranges(value))
- task_set_option("last_updated", "")
- elif key in ("-c", "--collection"):
- task_set_option("collection", value)
- elif key in ("-R", "--rebalance"):
- task_set_option("quick", "no")
- elif key in ("-f", "--flush"):
- task_set_option("flush", int(value))
- elif key in ("-M", "--maxmem"):
- task_set_option("maxmem", int(value))
- if task_get_option("maxmem") < base_process_size + 1000:
- raise StandardError, "Memory usage should be higher than %d kB" % \
- (base_process_size + 1000)
- elif key in ("-m", "--modified"):
- task_set_option("modified", get_date_range(value))#2002-10-27 13:57:26)
- task_set_option("last_updated", "")
- elif key in ("-l", "--lastupdate"):
- task_set_option("last_updated", "last_updated")
- else:
- return False
- return True
-
-
from invenio.base.factory import with_app_context
+
@with_app_context()
-if __name__ == "__main__":
- main()
+def main():
+ from invenio.legacy.bibrank.cli import main
+ return main()
diff --git a/invenio/legacy/bibrank/selfcites_indexer.py b/invenio/legacy/bibrank/selfcites_indexer.py
index 24efad11d..92fd12023 100644
--- a/invenio/legacy/bibrank/selfcites_indexer.py
+++ b/invenio/legacy/bibrank/selfcites_indexer.py
@@ -1,345 +1,345 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Self-citations indexer
We store the records and authors in a form that is faster to access
than querying the bibrec tables directly.
We have 3 tables:
1. rnkAUTHORS to associate records to authors in a speedy way
2. rnkEXTENDEDAUTHORS to associate co-authors with bibrecs
for a given bibrec, it provides a fast way to access all the authors of
the bibrec but also the people they have written papers with
3. rnkSELFCITES used by search_engine_summarizer for displaying the self-
citations count.
"""
from itertools import chain
import ConfigParser
from invenio.modules.formatter.utils import parse_tag
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.legacy.bibrank.citation_indexer import tagify
from invenio.config import CFG_ETCDIR, \
CFG_BIBRANK_SELFCITES_USE_BIBAUTHORID, \
CFG_BIBRANK_SELFCITES_PRECOMPUTE
from invenio.legacy.dbquery import run_sql
-from invenio.bibauthorid_searchinterface import get_personids_from_bibrec
+from invenio.legacy.bibauthorid.searchinterface import get_personids_from_bibrec
from invenio.legacy.bibrank.citation_searcher import get_cited_by
def load_config_file(key):
"""Load the config file containing the authors and co-authors tags"""
filename = CFG_ETCDIR + "/bibrank/" + key + ".cfg"
config = ConfigParser.ConfigParser()
try:
config.readfp(open(filename))
except StandardError:
raise Exception('Unable to load config file %s' % filename)
return config
def get_personids_from_record(record):
"""Returns all the personids associated with a record.
For performance reasons we limit the result to 20 authors; beyond
that, an empty set is returned
"""
ids = get_personids_from_bibrec(record)
if 0 < len(ids) <= 20:
person_ids = set(ids)
else:
person_ids = set()
return person_ids
def get_authors_tags():
"""
Get the tags for main author, coauthors, alternative authors from config
"""
config = load_config_file('citation')
function = config.get("rank_method", "function")
tags_names = [
'first_author',
'additional_author',
'alternative_author_name',
'collaboration_name',
]
tags = {}
for t in tags_names:
r_tag = config.get(function, t)
tags[t] = tagify(parse_tag(r_tag))
return tags
def get_authors_from_record(recID, tags,
use_bibauthorid=CFG_BIBRANK_SELFCITES_USE_BIBAUTHORID):
"""Get all authors for a record
We need this function because there are 3 different types of authors,
and to fetch each of them we need to look through different MARC tags
"""
if use_bibauthorid:
authors = get_personids_from_record(recID)
else:
authors_list = chain(
get_fieldvalues(recID, tags['first_author']),
get_fieldvalues(recID, tags['additional_author']),
get_fieldvalues(recID, tags['alternative_author_name']))
authors = set(hash(author) for author in list(authors_list)[:20])
return authors
def get_collaborations_from_record(recID, tags):
"""Get all collaborations for a record"""
return get_fieldvalues(recID, tags['collaboration_name'])
def compute_self_citations(recid, tags, authors_fun):
"""Compute the self-citations
We return the total numbers of citations minus the number of self-citations
Args:
 - recid: record id
 - tags: the tag numbers for author, coauthors and collaborations,
   required since they depend on how the MARC was defined
 - authors_fun: function used to fetch the (co)authors of a citing
   record (allows switching between the simple and friends algorithms)
"""
citers = get_cited_by(recid)
if not citers:
return set()
self_citations = set()
authors = frozenset(get_authors_from_record(recid, tags))
collaborations = None
if not authors or len(authors) > 20:
collaborations = frozenset(
get_collaborations_from_record(recid, tags))
if collaborations:
# Use collaborations names
for cit in citers:
cit_collaborations = frozenset(
get_collaborations_from_record(cit, tags))
if collaborations.intersection(cit_collaborations):
self_citations.add(cit)
else:
# Use authors names
for cit in citers:
cit_authors = get_authors_from_record(cit, tags)
if (not authors or len(cit_authors) > 20) and \
get_collaborations_from_record(cit, tags):
# Record from a collaboration that cites
# a record from an author, it's fine
pass
else:
cit_coauthors = frozenset(authors_fun(cit, tags))
if authors.intersection(cit_coauthors):
self_citations.add(cit)
return self_citations
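The core idea of `compute_self_citations` above is that a citing record counts as a self-citation when its author set intersects the cited record's author set. Stripped of the caching, collaboration and >20-author special cases, that test can be illustrated with a minimal standalone sketch (the data shapes here are hypothetical):

```python
def self_citations(cited_authors, citers):
    """Return the ids of citing records that share at least one author.

    `cited_authors` is the author set of the cited record;
    `citers` maps citing record id -> set of author identifiers.
    """
    cited = frozenset(cited_authors)
    return {cit for cit, authors in citers.items() if cited & set(authors)}
```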
def fetch_references(recid):
"""Fetch the references stored in the self-citations table for given record
We need to store the references to make sure that when we do incremental
updates of the table, we update all the related records properly
"""
sql = "SELECT `references` FROM rnkSELFCITES WHERE id_bibrec = %s"
try:
references = run_sql(sql, (recid, ))[0][0]
except IndexError:
references = ''
if references:
ids = set(int(ref) for ref in references.split(','))
else:
ids = set()
return ids
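The parsing done by `fetch_references` on the stored `references` column (a comma-separated list of record ids, possibly empty) amounts to the following sketch, with the SQL lookup factored out:

```python
def parse_references(references):
    """Turn a stored string like "12,34,56" into a set of ints.

    An empty or missing value yields an empty set, matching the
    IndexError fallback in fetch_references.
    """
    if not references:
        return set()
    return set(int(ref) for ref in references.split(","))
```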
def get_precomputed_self_cites_list(recids):
"""Fetch pre-computed self-cites data for given records"""
in_sql = ','.join('%s' for dummy in recids)
sql = """SELECT id_bibrec, count
FROM rnkSELFCITES
WHERE id_bibrec IN (%s)""" % in_sql
return run_sql(sql, recids)
def get_precomputed_self_cites(recid):
"""Fetch pre-computed self-cites data for given record"""
sql = "SELECT count FROM rnkSELFCITES WHERE id_bibrec = %s"
try:
r = run_sql(sql, (recid, ))[0][0]
except IndexError:
r = None
return r
def compute_friends_self_citations(recid, tags):
def coauthors(recid, tags):
return set(get_record_coauthors(recid)) \
| set(get_authors_from_record(recid, tags))
return compute_self_citations(recid, tags, coauthors)
def compute_simple_self_citations(recid, tags):
"""Simple compute self-citations
The purpose of this algorithm is to provide an alternate way to compute
self-citations that we can use at runtime.
Here, we only check for authors citing themselves.
"""
return compute_self_citations(recid, tags, get_authors_from_record)
def get_self_citations_count(recids, algorithm='simple',
precompute=CFG_BIBRANK_SELFCITES_PRECOMPUTE):
"""Depending on the site configuration, we either:
* compute self-citations (using a simple algorithm)
* or fetch self-citations from pre-computed table"""
total_cites = 0
if not precompute:
tags = get_authors_tags()
selfcites_fun = ALL_ALGORITHMS[algorithm]
for recid in recids:
citers = get_cited_by(recid)
self_cites = selfcites_fun(recid, tags)
total_cites += len(citers) - len(self_cites)
else:
results = get_precomputed_self_cites_list(recids)
results_dict = {}
for r in results:
results_dict[r[0]] = r[1]
for r in recids:
citers = get_cited_by(r)
self_cites = results_dict.get(r, 0)
total_cites += len(citers) - self_cites
return total_cites
def update_self_cites_tables(recid, config, tags):
"""For a given record update all self-cites table if needed"""
authors = get_authors_from_record(recid, tags)
if 0 < len(authors) <= 20:
# Update records cache table
deleted_authors, added_authors = store_record(recid, authors)
if deleted_authors or added_authors:
# Update extended authors table
store_record_coauthors(recid,
authors,
deleted_authors,
added_authors,
config)
def store_record(recid, authors):
"""
For a given record, updates if needed the db table (rnkRECORDSCACHE)
storing the association of recids and authorids
Returns true if the database has been modified
"""
sql = 'SELECT authorid FROM rnkRECORDSCACHE WHERE id_bibrec = %s'
rows = run_sql(sql, (recid, ))
old_authors = set(r[0] for r in rows)
if authors != old_authors:
deleted_authors = old_authors.difference(authors)
added_authors = authors.difference(old_authors)
for authorid in deleted_authors:
run_sql("""DELETE FROM rnkRECORDSCACHE
WHERE id_bibrec = %s""", (recid, ))
for authorid in added_authors:
run_sql("""INSERT IGNORE INTO rnkRECORDSCACHE (id_bibrec, authorid)
VALUES (%s,%s)""", (recid, authorid))
return deleted_authors, added_authors
return set(), set()
def get_author_coauthors_list(personids, config):
"""
Get all the authors that have written a paper with any of the given authors
"""
personids = list(personids)
if not personids:
return ()
cluster_threshold = config['friends_threshold']
in_sql = ','.join('%s' for r in personids)
coauthors = (r[0] for r in run_sql("""
SELECT a.authorid FROM rnkRECORDSCACHE as a
JOIN rnkRECORDSCACHE as b ON a.id_bibrec = b.id_bibrec
WHERE b.authorid IN (%s)
GROUP BY a.authorid
HAVING count(a.authorid) >= %s""" % (in_sql, cluster_threshold),
personids))
return chain(personids, coauthors)
def store_record_coauthors(recid, authors, deleted_authors,
added_authors, config):
"""Fill table used by get_record_coauthors()"""
if deleted_authors:
to_process = authors
else:
to_process = added_authors
for personid in get_author_coauthors_list(deleted_authors, config):
run_sql('DELETE FROM rnkEXTENDEDAUTHORS WHERE'\
' id = %s AND authorid = %s', (recid, personid))
for personid in get_author_coauthors_list(to_process, config):
run_sql('INSERT IGNORE INTO rnkEXTENDEDAUTHORS (id, authorid) ' \
'VALUES (%s,%s)', (recid, personid))
def get_record_coauthors(recid):
"""
Get all the authors that have written a paper with any of the authors of
given bibrec
"""
sql = 'SELECT authorid FROM rnkEXTENDEDAUTHORS WHERE id = %s'
return (r[0] for r in run_sql(sql, (recid, )))
ALL_ALGORITHMS = {
'friends': compute_friends_self_citations,
'simple': compute_simple_self_citations,
}
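The `ALL_ALGORITHMS` mapping above is a plain name-to-function dispatch table. The same pattern, with a safe fallback to the simple algorithm as done by `get_citations_fun` in `selfcites_task.py`, looks like this in isolation (the algorithm functions here are placeholders, not the real computations):

```python
def friends_algorithm(recid):
    # Placeholder for compute_friends_self_citations
    return "friends:%s" % recid

def simple_algorithm(recid):
    # Placeholder for compute_simple_self_citations
    return "simple:%s" % recid

ALGORITHMS = {
    "friends": friends_algorithm,
    "simple": simple_algorithm,
}

def get_algorithm(name):
    # Unknown names fall back to the simple algorithm,
    # mirroring get_citations_fun().
    return ALGORITHMS.get(name, simple_algorithm)
```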
diff --git a/invenio/legacy/bibrank/selfcites_task.py b/invenio/legacy/bibrank/selfcites_task.py
index e5618121a..849b7feda 100644
--- a/invenio/legacy/bibrank/selfcites_task.py
+++ b/invenio/legacy/bibrank/selfcites_task.py
@@ -1,325 +1,325 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Self citations task
Stores self-citations in a table for quick access
"""
import sys
import ConfigParser
from datetime import datetime
from invenio.config import CFG_BIBRANK_SELFCITES_USE_BIBAUTHORID, \
CFG_ETCDIR
-from invenio.bibtask import task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_set_option, \
task_get_option, write_message, \
task_sleep_now_if_required, \
task_update_progress
from invenio.legacy.dbquery import run_sql
from invenio.utils.shell import split_cli_ids_arg
from invenio.legacy.bibrank.selfcites_indexer import update_self_cites_tables, \
compute_friends_self_citations, \
compute_simple_self_citations, \
get_authors_tags
from invenio.legacy.bibrank.citation_searcher import get_refers_to
from invenio.bibauthorid_daemon import get_user_log as bibauthorid_user_log
from invenio.legacy.bibrank.citation_indexer import get_bibrankmethod_lastupdate
HELP_MESSAGE = """
Scheduled (daemon) self cites options:
-a, --new Run on all newly inserted records.
-m, --modified Run on all newly modified records.
-r, --recids Record id for extraction.
-c, --collections Entire Collection for extraction.
--rebuild Rebuild pre-computed tables
* rnkRECORDSCACHE
* rnkEXTENDEDAUTHORS
* rnkSELFCITES
Examples:
(run a daemon job)
selfcites -a
(run on a set of records)
selfcites --recids 1,2 -r 3
(run on a collection)
selfcites --collections "Reports"
"""
"Shown when passed options are invalid or -h is specified in the CLI"
DESCRIPTION = """This task handles the self-citations computation
It is run on modified records so that it can update the tables used for
displaying info in the citesummary format
"""
"Description of the task"
NAME = 'selfcites'
def check_options():
"""Check command line options"""
if not task_get_option('new') \
and not task_get_option('modified') \
and not task_get_option('recids') \
and not task_get_option('collections') \
and not task_get_option('rebuild'):
print >>sys.stderr, 'Error: No input file specified, you need' \
' to specify which files to run on'
return False
return True
def parse_option(key, value, dummy, args):
"""Parse command line options"""
if args:
# There should be no standalone arguments for any selfcites job
# This will catch args before the job is shipped to Bibsched
raise StandardError("Error: Unrecognised argument '%s'." % args[0])
if key in ('-a', '--new'):
task_set_option('new', True)
elif key in ('-m', '--modified'):
task_set_option('modified', True)
elif key == '--rebuild':
task_set_option('rebuild', True)
elif key in ('-c', '--collections'):
collections = task_get_option('collections')
if not collections:
collections = set()
task_set_option('collections', collections)
collections.update(split_cli_ids_arg(value))
elif key in ('-r', '--recids'):
recids = task_get_option('recids')
if not recids:
recids = set()
task_set_option('recids', recids)
recids.update(split_cli_ids_arg(value))
return True
def compute_and_store_self_citations(recid, tags, citations_fun,
verbose=False):
"""Compute and store self-cites in a table
Args:
- recid
- tags: used when bibauthorid is deactivated, see get_authors_tags()
in bibrank_selfcites_indexer
"""
assert recid
if verbose:
write_message("* processing %s" % recid)
references = get_refers_to(recid)
recids_to_check = set([recid]) | set(references)
placeholders = ','.join('%s' for r in recids_to_check)
rec_row = run_sql("SELECT MAX(`modification_date`) FROM `bibrec`"
" WHERE `id` IN (%s)" % placeholders, recids_to_check)
try:
rec_timestamp = rec_row[0]
except IndexError:
write_message("record not found")
return
cached_citations_row = run_sql("SELECT `count` FROM `rnkSELFCITES`"
" WHERE `last_updated` >= %s" \
" AND `id_bibrec` = %s", (rec_timestamp[0], recid))
if cached_citations_row and cached_citations_row[0][0]:
if verbose:
write_message("%s found (cached)" % cached_citations_row[0])
else:
cites = citations_fun(recid, tags)
sql = """REPLACE INTO rnkSELFCITES (`id_bibrec`, `count`, `references`,
`last_updated`) VALUES (%s, %s, %s, NOW())"""
references_string = ','.join(str(r) for r in references)
run_sql(sql, (recid, len(cites), references_string))
if verbose:
write_message("%s found" % len(cites))
def rebuild_tables(config):
task_update_progress('emptying tables')
empty_self_cites_tables()
task_update_progress('filling tables')
fill_self_cites_tables(config)
return True
def fetch_bibauthorid_last_update():
bibauthorid_log = bibauthorid_user_log(userinfo='daemon',
action='PID_UPDATE',
only_most_recent=True)
try:
bibauthorid_end_date = bibauthorid_log[0][2]
except IndexError:
bibauthorid_end_date = datetime(year=1, month=1, day=1)
return bibauthorid_end_date
def fetch_index_update():
"""Fetch last runtime of given task"""
end_date = get_bibrankmethod_lastupdate('citation')
if CFG_BIBRANK_SELFCITES_USE_BIBAUTHORID:
bibauthorid_end_date = fetch_bibauthorid_last_update()
end_date = min(end_date, bibauthorid_end_date)
return end_date
def fetch_records(start_date, end_date):
"""Filter records not indexed out of recids
We need to run after the bibauthorid and bibrank citation indexer tasks
"""
sql = """SELECT `id` FROM `bibrec`
WHERE `modification_date` <= %s
AND `modification_date` > %s"""
records = run_sql(sql, (end_date,
start_date))
return [r[0] for r in records]
def fetch_concerned_records(name):
start_date = get_bibrankmethod_lastupdate(name)
end_date = fetch_index_update()
return fetch_records(start_date, end_date)
def store_last_updated(name, date):
run_sql("UPDATE rnkMETHOD SET last_updated=%s WHERE name=%s", (date, name))
def read_configuration(rank_method_code):
filename = CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"
config = ConfigParser.ConfigParser()
try:
config.readfp(open(filename))
except StandardError:
write_message("Cannot find configuration file: %s" % filename, sys.stderr)
raise
return config
def process_updates(rank_method_code):
"""
This is what gets executed first when the task is started.
It handles the --rebuild option. If that option is not specified
we fall back to the process_one()
"""
selfcites_config = read_configuration(rank_method_code)
config = {
'algorithm': selfcites_config.get(rank_method_code, "algorithm"),
'friends_threshold': selfcites_config.get(rank_method_code, "friends_threshold")
}
begin_date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
quick = task_get_option("quick") != "no"
if not quick:
return rebuild_tables(config)
write_message("Starting")
tags = get_authors_tags()
recids = fetch_concerned_records(rank_method_code)
citations_fun = get_citations_fun(config['algorithm'])
write_message("recids %s" % str(recids))
total = len(recids)
for count, recid in enumerate(recids):
task_sleep_now_if_required(can_stop_too=True)
msg = "Extracting for %s (%d/%d)" % (recid, count + 1, total)
task_update_progress(msg)
write_message(msg)
process_one(recid, tags, citations_fun)
store_last_updated(rank_method_code, begin_date)
write_message("Complete")
return True
def get_citations_fun(algorithm):
if algorithm == 'friends':
citations_fun = compute_friends_self_citations
else:
citations_fun = compute_simple_self_citations
return citations_fun
def process_one(recid, tags, citations_fun):
"""Self-cites core func, executed on each recid"""
# First update this record then all its references
compute_and_store_self_citations(recid, tags, citations_fun)
references = get_refers_to(recid)
for recordid in references:
compute_and_store_self_citations(recordid, tags, citations_fun)
def empty_self_cites_tables():
"""
This will empty all the self-cites tables
The purpose is to rebuild the tables from scratch in case there is a
problem with them: inconsistencies, corruption, ...
"""
run_sql('TRUNCATE rnkSELFCITES')
run_sql('TRUNCATE rnkEXTENDEDAUTHORS')
run_sql('TRUNCATE rnkRECORDSCACHE')
def fill_self_cites_tables(config):
"""
This will fill the self-cites tables with data
The purpose of this function is to fill these tables on a website that
never ran the self-cites daemon
"""
algorithm = config['algorithm']
tags = get_authors_tags()
all_ids = [r[0] for r in run_sql('SELECT id FROM bibrec ORDER BY id')]
citations_fun = get_citations_fun(algorithm)
write_message('using %s' % citations_fun.__name__)
if algorithm == 'friends':
# We only need this table for the friends algorithm or similar ones
# Fill intermediary tables
for index, recid in enumerate(all_ids):
if index % 1000 == 0:
msg = 'intermediate %d/%d' % (index, len(all_ids))
task_update_progress(msg)
write_message(msg)
task_sleep_now_if_required()
update_self_cites_tables(recid, config, tags)
# Fill self-cites table
for index, recid in enumerate(all_ids):
if index % 1000 == 0:
msg = 'final %d/%d' % (index, len(all_ids))
task_update_progress(msg)
write_message(msg)
task_sleep_now_if_required()
compute_and_store_self_citations(recid, tags, citations_fun)
diff --git a/invenio/legacy/bibrank/tag_based_indexer.py b/invenio/legacy/bibrank/tag_based_indexer.py
index aaa5f1ac4..d8e83c98a 100644
--- a/invenio/legacy/bibrank/tag_based_indexer.py
+++ b/invenio/legacy/bibrank/tag_based_indexer.py
@@ -1,533 +1,533 @@
# -*- coding: utf-8 -*-
## Ranking of records using different parameters and methods.
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
import sys
import time
import traceback
import ConfigParser
from invenio.config import \
CFG_SITE_LANG, \
CFG_ETCDIR, \
CFG_PREFIX
from invenio.legacy.search_engine import perform_request_search
from invenio.legacy.bibrank.citation_indexer import get_citation_weight, print_missing, get_cit_dict, insert_into_cit_db
from invenio.legacy.bibrank.downloads_indexer import *
from invenio.legacy.dbquery import run_sql, serialize_via_marshal, deserialize_via_marshal, \
wash_table_column_name, get_table_update_time
from invenio.ext.logging import register_exception
-from invenio.bibtask import task_get_option, write_message, task_sleep_now_if_required
-from invenio.bibindex_engine import create_range_list
+from invenio.legacy.bibsched.bibtask import task_get_option, write_message, task_sleep_now_if_required
+from invenio.legacy.bibindex.engine import create_range_list
from invenio.intbitset import intbitset
options = {}
def remove_auto_cites(dic):
"""Remove auto-cites and dedupe."""
for key in dic.keys():
new_list = dic.fromkeys(dic[key]).keys()
try:
new_list.remove(key)
except ValueError:
pass
dic[key] = new_list
return dic
def citation_repair_exec():
"""Repair citation ranking method"""
## repair citations
for rowname in ["citationdict","reversedict"]:
## get dic
dic = get_cit_dict(rowname)
## repair
write_message("Repairing %s" % rowname)
dic = remove_auto_cites(dic)
## store healthy citation dic
insert_into_cit_db(dic, rowname)
return
def download_weight_filtering_user_repair_exec ():
"""Repair download weight filtering user ranking method"""
write_message("Repairing for this ranking method is not defined. Skipping.")
return
def download_weight_total_repair_exec():
"""Repair download weight total ranking method"""
write_message("Repairing for this ranking method is not defined. Skipping.")
return
def file_similarity_by_times_downloaded_repair_exec():
"""Repair file similarity by times downloaded ranking method"""
write_message("Repairing for this ranking method is not defined. Skipping.")
return
def single_tag_rank_method_repair_exec():
"""Repair single tag ranking method"""
write_message("Repairing for this ranking method is not defined. Skipping.")
return
def citation_exec(rank_method_code, name, config):
"""Rank method for citation analysis"""
#first check if this is a specific task
if task_get_option("cmd") == "print-missing":
num = task_get_option("num")
print_missing(num)
else:
dic, index_update_time = get_citation_weight(rank_method_code, config)
if dic:
if task_get_option("id") or task_get_option("collection") or \
task_get_option("modified"):
# the user has asked to citation-index specific records
# only, so we should not update citation indexer's
# last run time stamp information
index_update_time = None
intoDB(dic, index_update_time, rank_method_code)
else:
write_message("No need to update the indexes for citations.")
def download_weight_filtering_user(run):
return bibrank_engine(run)
def download_weight_total(run):
return bibrank_engine(run)
def file_similarity_by_times_downloaded(run):
return bibrank_engine(run)
def download_weight_filtering_user_exec (rank_method_code, name, config):
"""Ranking by number of downloads per user.
Only one full-text download is taken into account for one
specific user IP address"""
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
time1 = time.time()
dic = fromDB(rank_method_code)
last_updated = get_lastupdated(rank_method_code)
keys = new_downloads_to_index(last_updated)
filter_downloads_per_hour(keys, last_updated)
dic = get_download_weight_filtering_user(dic, keys)
intoDB(dic, begin_date, rank_method_code)
time2 = time.time()
return {"time":time2-time1}
def download_weight_total_exec(rank_method_code, name, config):
"""Ranking by total number of downloads, without checking the user IP:
if a user downloads the same full-text document 3 times, it is counted as 3 downloads"""
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
time1 = time.time()
dic = fromDB(rank_method_code)
last_updated = get_lastupdated(rank_method_code)
keys = new_downloads_to_index(last_updated)
filter_downloads_per_hour(keys, last_updated)
dic = get_download_weight_total(dic, keys)
intoDB(dic, begin_date, rank_method_code)
time2 = time.time()
return {"time":time2-time1}
def file_similarity_by_times_downloaded_exec(rank_method_code, name, config):
"""Update the dictionary {recid: [(recid, nb page similarity), ...]}"""
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
time1 = time.time()
dic = fromDB(rank_method_code)
last_updated = get_lastupdated(rank_method_code)
keys = new_downloads_to_index(last_updated)
filter_downloads_per_hour(keys, last_updated)
dic = get_file_similarity_by_times_downloaded(dic, keys)
intoDB(dic, begin_date, rank_method_code)
time2 = time.time()
return {"time":time2-time1}
def single_tag_rank_method_exec(rank_method_code, name, config):
"""Creating the rank method data"""
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
rnkset = {}
rnkset_old = fromDB(rank_method_code)
rnkset_new = single_tag_rank(config)
rnkset = union_dicts(rnkset_old, rnkset_new)
intoDB(rnkset, begin_date, rank_method_code)
def single_tag_rank(config):
"""Connect the given tag with the data from the kb file given"""
write_message("Loading knowledgebase file", verbose=9)
kb_data = {}
records = []
write_message("Reading knowledgebase file: %s" % \
config.get(config.get("rank_method", "function"), "kb_src"))
input = open(config.get(config.get("rank_method", "function"), "kb_src"), 'r')
data = input.readlines()
for line in data:
if not line[0:1] == "#":
parts = string.split(string.strip(line), "---")
kb_data[string.strip(parts[0])] = parts[1]
write_message("Number of lines read from knowledgebase file: %s" % len(kb_data))
tag = config.get(config.get("rank_method", "function"), "tag")
tags = config.get(config.get("rank_method", "function"), "check_mandatory_tags").split(", ")
if tags == ['']:
tags = ""
records = []
for (recids, recide) in options["recid_range"]:
task_sleep_now_if_required(can_stop_too=True)
write_message("......Processing records #%s-%s" % (recids, recide))
recs = run_sql("SELECT id_bibrec, value FROM bib%sx, bibrec_bib%sx WHERE tag=%%s AND id_bibxxx=id and id_bibrec >=%%s and id_bibrec<=%%s" % (tag[0:2], tag[0:2]), (tag, recids, recide))
valid = intbitset(trailing_bits=1)
valid.discard(0)
for key in tags:
newset = intbitset()
newset += [recid[0] for recid in (run_sql("SELECT id_bibrec FROM bib%sx, bibrec_bib%sx WHERE id_bibxxx=id AND tag=%%s AND id_bibxxx=id and id_bibrec >=%%s and id_bibrec<=%%s" % (tag[0:2], tag[0:2]), (key, recids, recide)))]
valid.intersection_update(newset)
if tags:
recs = filter(lambda x: x[0] in valid, recs)
records = records + list(recs)
write_message("Number of records found with the necessary tags: %s" % len(records))
records = filter(lambda x: x[0] in options["validset"], records)
rnkset = {}
for key, value in records:
if kb_data.has_key(value):
if not rnkset.has_key(key):
rnkset[key] = float(kb_data[value])
else:
if kb_data.has_key(rnkset[key]) and float(kb_data[value]) > float((rnkset[key])[1]):
rnkset[key] = float(kb_data[value])
else:
rnkset[key] = 0
write_message("Number of records available in rank method: %s" % len(rnkset))
return rnkset
def get_lastupdated(rank_method_code):
"""Get the last time the rank method was updated"""
res = run_sql("SELECT rnkMETHOD.last_updated FROM rnkMETHOD WHERE name=%s", (rank_method_code, ))
if res:
return res[0][0]
else:
raise Exception("Is this the first run? Please do a complete update.")
def intoDB(dict, date, rank_method_code):
"""Insert the rank method data into the database"""
mid = run_sql("SELECT id from rnkMETHOD where name=%s", (rank_method_code, ))
del_rank_method_codeDATA(rank_method_code)
serdata = serialize_via_marshal(dict)
midstr = str(mid[0][0])
run_sql("INSERT INTO rnkMETHODDATA(id_rnkMETHOD, relevance_data) VALUES (%s,%s)", (midstr, serdata,))
if date:
run_sql("UPDATE rnkMETHOD SET last_updated=%s WHERE name=%s", (date, rank_method_code))
def fromDB(rank_method_code):
"""Get the data for a rank method"""
id = run_sql("SELECT id from rnkMETHOD where name=%s", (rank_method_code, ))
res = run_sql("SELECT relevance_data FROM rnkMETHODDATA WHERE id_rnkMETHOD=%s", (id[0][0], ))
if res:
return deserialize_via_marshal(res[0][0])
else:
return {}
def del_rank_method_codeDATA(rank_method_code):
"""Delete the data for a rank method"""
id = run_sql("SELECT id from rnkMETHOD where name=%s", (rank_method_code, ))
run_sql("DELETE FROM rnkMETHODDATA WHERE id_rnkMETHOD=%s", (id[0][0], ))
def del_recids(rank_method_code, range_rec):
"""Delete some records from the rank method"""
id = run_sql("SELECT id from rnkMETHOD where name=%s", (rank_method_code, ))
res = run_sql("SELECT relevance_data FROM rnkMETHODDATA WHERE id_rnkMETHOD=%s", (id[0][0], ))
if res:
rec_dict = deserialize_via_marshal(res[0][0])
write_message("Old size: %s" % len(rec_dict))
for (recids, recide) in range_rec:
for i in range(int(recids), int(recide)):
if rec_dict.has_key(i):
del rec_dict[i]
write_message("New size: %s" % len(rec_dict))
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
intoDB(rec_dict, begin_date, rank_method_code)
else:
write_message("Create before deleting!")
def union_dicts(dict1, dict2):
"Returns union of the two dicts."
union_dict = {}
for (key, value) in dict1.iteritems():
union_dict[key] = value
for (key, value) in dict2.iteritems():
union_dict[key] = value
return union_dict
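# Illustrative sketch (comment only, not part of the module): because dict2
# is copied over dict1, union_dicts gives dict2 priority on key collisions:
#   union_dicts({1: 0.5, 2: 0.7}, {2: 0.9, 3: 0.1})
#   => {1: 0.5, 2: 0.9, 3: 0.1}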
def rank_method_code_statistics(rank_method_code):
"""Print statistics"""
method = fromDB(rank_method_code)
max = -999999
maxcount = 0
min = 999999
mincount = 0
for (recID, value) in method.iteritems():
if value < min and value > 0:
min = value
if value > max:
max = value
for (recID, value) in method.iteritems():
if value == min:
mincount += 1
if value == max:
maxcount += 1
write_message("Showing statistic for selected method")
write_message("Method name: %s" % getName(rank_method_code))
write_message("Short name: %s" % rank_method_code)
write_message("Last run: %s" % get_lastupdated(rank_method_code))
write_message("Number of records: %s" % len(method))
write_message("Lowest value: %s - Number of records: %s" % (min, mincount))
write_message("Highest value: %s - Number of records: %s" % (max, maxcount))
write_message("Divided into 10 sets:")
for i in range(1, 11):
setcount = 0
distinct_values = {}
lower = -1.0 + ((float(max + 1) / 10)) * (i - 1)
upper = -1.0 + ((float(max + 1) / 10)) * i
for (recID, value) in method.iteritems():
if value >= lower and value <= upper:
setcount += 1
distinct_values[value] = 1
write_message("Set %s (%s-%s) %s Distinct values: %s" % (i, lower, upper, len(distinct_values), setcount))
def check_method(rank_method_code):
write_message("Checking rank method...")
if len(fromDB(rank_method_code)) == 0:
write_message("Rank method not yet executed, please run it to create the necessary data.")
else:
if len(add_recIDs_by_date(rank_method_code)) > 0:
write_message("Records modified, update recommended")
else:
write_message("No records modified, update not necessary")
def bibrank_engine(run):
"""Run the indexing task.
Return 1 in case of success and 0 in case of failure.
"""
startCreate = time.time()
try:
options["run"] = []
options["run"].append(run)
for rank_method_code in options["run"]:
task_sleep_now_if_required(can_stop_too=True)
cfg_name = getName(rank_method_code)
write_message("Running rank method: %s." % cfg_name)
file = CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"
config = ConfigParser.ConfigParser()
try:
config.readfp(open(file))
except StandardError, e:
write_message("Cannot find configurationfile: %s" % file, sys.stderr)
raise StandardError
cfg_short = rank_method_code
cfg_function = config.get("rank_method", "function") + "_exec"
cfg_repair_function = config.get("rank_method", "function") + "_repair_exec"
cfg_name = getName(cfg_short)
options["validset"] = get_valid_range(rank_method_code)
if task_get_option("collection"):
l_of_colls = string.split(task_get_option("collection"), ", ")
recIDs = perform_request_search(c=l_of_colls)
recIDs_range = []
for recID in recIDs:
recIDs_range.append([recID, recID])
options["recid_range"] = recIDs_range
elif task_get_option("id"):
options["recid_range"] = task_get_option("id")
elif task_get_option("modified"):
options["recid_range"] = add_recIDs_by_date(rank_method_code, task_get_option("modified"))
elif task_get_option("last_updated"):
options["recid_range"] = add_recIDs_by_date(rank_method_code)
else:
write_message("No records specified, updating all", verbose=2)
min_id = run_sql("SELECT min(id) from bibrec")[0][0]
max_id = run_sql("SELECT max(id) from bibrec")[0][0]
options["recid_range"] = [[min_id, max_id]]
if task_get_option("quick") == "no":
write_message("Recalculate parameter not used, parameter ignored.", verbose=9)
if task_get_option("cmd") == "del":
del_recids(cfg_short, options["recid_range"])
elif task_get_option("cmd") == "add":
func_object = globals().get(cfg_function)
func_object(rank_method_code, cfg_name, config)
elif task_get_option("cmd") == "stat":
rank_method_code_statistics(rank_method_code)
elif task_get_option("cmd") == "check":
check_method(rank_method_code)
elif task_get_option("cmd") == "print-missing":
func_object = globals().get(cfg_function)
func_object(rank_method_code, cfg_name, config)
elif task_get_option("cmd") == "repair":
func_object = globals().get(cfg_repair_function)
func_object()
else:
write_message("Invalid command found processing %s" % rank_method_code, sys.stderr)
raise StandardError
except StandardError, e:
write_message("\nException caught: %s" % e, sys.stderr)
write_message(traceback.format_exc()[:-1])
register_exception()
raise StandardError
if task_get_option("verbose"):
showtime((time.time() - startCreate))
return 1
def get_valid_range(rank_method_code):
"""Return a range of records"""
write_message("Getting records from collections enabled for rank method.", verbose=9)
res = run_sql("SELECT collection.name FROM collection, collection_rnkMETHOD, rnkMETHOD WHERE collection.id=id_collection and id_rnkMETHOD=rnkMETHOD.id and rnkMETHOD.name=%s", (rank_method_code, ))
l_of_colls = []
for coll in res:
l_of_colls.append(coll[0])
if len(l_of_colls) > 0:
recIDs = perform_request_search(c=l_of_colls)
else:
recIDs = []
valid = intbitset()
valid += recIDs
return valid
def add_recIDs_by_date(rank_method_code, dates=""):
"""Return recID range from records modified between DATES[0] and DATES[1].
If DATES is not set, then add records modified since the last run of
the ranking method RANK_METHOD_CODE.
"""
if not dates:
try:
dates = (get_lastupdated(rank_method_code), '')
except Exception:
dates = ("0000-00-00 00:00:00", '')
if dates[0] is None:
dates = ("0000-00-00 00:00:00", '')
query = """SELECT b.id FROM bibrec AS b WHERE b.modification_date >= %s"""
if dates[1]:
query += " and b.modification_date <= %s"
query += " ORDER BY b.id ASC"""
if dates[1]:
res = run_sql(query, (dates[0], dates[1]))
else:
res = run_sql(query, (dates[0], ))
alist = create_range_list([row[0] for row in res])
if not alist:
write_message("No new records added since last time method was run")
return alist
def getName(rank_method_code, ln=CFG_SITE_LANG, type='ln'):
"""Returns the name of the method if it exists"""
try:
rnkid = run_sql("SELECT id FROM rnkMETHOD where name=%s", (rank_method_code, ))
if rnkid:
rnkid = str(rnkid[0][0])
res = run_sql("SELECT value FROM rnkMETHODNAME where type=%s and ln=%s and id_rnkMETHOD=%s", (type, ln, rnkid))
if not res:
res = run_sql("SELECT value FROM rnkMETHODNAME WHERE ln=%s and id_rnkMETHOD=%s and type=%s", (CFG_SITE_LANG, rnkid, type))
if not res:
return rank_method_code
return res[0][0]
else:
raise Exception
except Exception:
write_message("Cannot run rank method, either given code for method is wrong, or it has not been added using the webinterface.")
raise Exception
def single_tag_rank_method(run):
return bibrank_engine(run)
def showtime(timeused):
"""Show time used for method"""
write_message("Time used: %d second(s)." % timeused, verbose=9)
def citation(run):
return bibrank_engine(run)
# Hack to put index based sorting here, but this is very similar to tag
#based method and should re-use a lot of this code, so better to have here
#than separate
#
def index_term_count_exec(rank_method_code, name, config):
"""Creating the rank method data"""
write_message("Recreating index weighting data")
begin_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
# we must recalculate these every time for all records, since the
# weighting of a record is determined by the index entries of _other_
# records
rnkset = calculate_index_term_count(config)
intoDB(rnkset, begin_date, rank_method_code)
def calculate_index_term_count(config):
"""Calculate the weight of a record set based on number of enries of a
tag from the record in another index...useful for authority files"""
records = []
if config.has_section("index_term_count"):
index = config.get("index_term_count","index_table_name")
tag = config.get("index_term_count","index_term_value_from_tag")
# check against possible SQL injection:
dummy = get_table_update_time(index)
tag = wash_table_column_name(tag)
else:
raise Exception("Config file " + config + " does not have index_term_count section")
return()
task_sleep_now_if_required(can_stop_too=True)
write_message("......Processing all records")
query = "SELECT id_bibrec, value FROM bib%sx, bibrec_bib%sx WHERE tag=%%s AND id_bibxxx=id" % \
(tag[0:2], tag[0:2]) # we checked that tag is safe
records = list(run_sql(query, (tag,)))
write_message("Number of records found with the necessary tags: %s" % len(records))
rnkset = {}
for key, value in records:
hits = 0
if len(value):
query = "SELECT hitlist from %s where term = %%s" % index # we checked that index is a table
row = run_sql(query, (value,))
if row and row[0] and row[0][0]:
#has to be prepared for corrupted data!
try:
hits = len(intbitset(row[0][0]))
except:
hits = 0
rnkset[key] = hits
write_message("Number of records available in rank method: %s" % len(rnkset))
return rnkset
def index_term_count(run):
return bibrank_engine(run)
diff --git a/invenio/legacy/bibrank/word_indexer.py b/invenio/legacy/bibrank/word_indexer.py
index e67ece704..352c756e8 100644
--- a/invenio/legacy/bibrank/word_indexer.py
+++ b/invenio/legacy/bibrank/word_indexer.py
@@ -1,1195 +1,1195 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import sys
import time
import urllib
import math
import re
import ConfigParser
from invenio.config import \
CFG_SITE_LANG, \
CFG_ETCDIR
from invenio.legacy.search_engine import perform_request_search, wash_index_term
from invenio.legacy.dbquery import run_sql, DatabaseError, serialize_via_marshal, deserialize_via_marshal
-from invenio.bibindex_engine_stemmer import is_stemmer_available_for_language, stem
-from invenio.bibindex_engine_stopwords import is_stopword
-from invenio.bibindex_engine import beautify_range_list, \
+from invenio.legacy.bibindex.engine_stemmer import is_stemmer_available_for_language, stem
+from invenio.legacy.bibindex.engine_stopwords import is_stopword
+from invenio.legacy.bibindex.engine import beautify_range_list, \
kill_sleepy_mysql_threads, create_range_list
-from invenio.bibtask import write_message, task_get_option, task_update_progress, \
+from invenio.legacy.bibsched.bibtask import write_message, task_get_option, task_update_progress, \
task_update_status, task_sleep_now_if_required
from invenio.intbitset import intbitset
from invenio.ext.logging import register_exception
from invenio.utils.text import strip_accents
options = {} # global variable to hold task options
## safety parameters concerning DB thread-multiplication problem:
CFG_CHECK_MYSQL_THREADS = 0 # to check or not to check the problem?
CFG_MAX_MYSQL_THREADS = 50 # how many threads (connections) we consider as still safe
CFG_MYSQL_THREAD_TIMEOUT = 20 # we'll kill threads that were sleeping for more than X seconds
## override urllib's default password-asking behaviour:
class MyFancyURLopener(urllib.FancyURLopener):
def prompt_user_passwd(self, host, realm):
# supply some dummy credentials by default
return ("mysuperuser", "mysuperpass")
def http_error_401(self, url, fp, errcode, errmsg, headers):
# do not bother with protected pages
raise IOError, (999, 'unauthorized access')
return None
#urllib._urlopener = MyFancyURLopener()
nb_char_in_line = 50 # for verbose pretty printing
chunksize = 1000 # default size of chunks that the records will be treated by
base_process_size = 4500 # process base size
## Dictionary merging functions
def dict_union(list1, list2):
"Returns union of the two dictionaries."
union_dict = {}
for (e, count) in list1.iteritems():
union_dict[e] = count
for (e, count) in list2.iteritems():
if not union_dict.has_key(e):
union_dict[e] = count
else:
union_dict[e] = (union_dict[e][0] + count[0], count[1])
#for (e, count) in list2.iteritems():
# list1[e] = (list1.get(e, (0, 0))[0] + count[0], count[1])
#return list1
return union_dict
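# Illustrative sketch (comment only, not part of the module): entries are
# (weight, flag) tuples, and dict_union sums the weights of keys present
# in both inputs while keeping the second input's flag:
#   dict_union({'word': (2, 0)}, {'word': (3, 0), 'other': (1, 0)})
#   => {'word': (5, 0), 'other': (1, 0)}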
# tagToFunctions mapping. It offers an indirection level necessary for
# indexing fulltext. The default is get_words_from_phrase.
tagToWordsFunctions = {}
def get_words_from_phrase(phrase, weight, lang="",
chars_punctuation=r"[\.\,\:\;\?\!\"]",
chars_alphanumericseparators=r"[1234567890\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~]",
split=str.split):
"Returns list of words from phrase 'phrase'."
words = {}
phrase = strip_accents(phrase)
phrase = phrase.lower()
#Getting rid of strange characters
phrase = re.sub("&eacute;", 'e', phrase)
phrase = re.sub("&egrave;", 'e', phrase)
phrase = re.sub("&agrave;", 'a', phrase)
phrase = re.sub("&nbsp;", ' ', phrase)
phrase = re.sub("&laquo;", ' ', phrase)
phrase = re.sub("&raquo;", ' ', phrase)
phrase = re.sub("&ecirc;", ' ', phrase)
phrase = re.sub("&amp;", ' ', phrase)
if phrase.find("</") > -1:
#Most likely html, remove html code
phrase = re.sub("(?s)<[^>]*>|&#?\w+;", ' ', phrase)
#removes http links
phrase = re.sub("(?s)http://[^( )]*", '', phrase)
phrase = re.sub(chars_punctuation, ' ', phrase)
#This way, characters standing alone (like "c", "a", "b") are not added to the index, but combinations such as "c++" or "c$" are.
for word in split(phrase):
if options["remove_stopword"] == "True" and not is_stopword(word) and check_term(word, 0):
if lang and lang !="none" and options["use_stemming"]:
word = stem(word, lang)
if not words.has_key(word):
words[word] = (0, 0)
else:
if not words.has_key(word):
words[word] = (0, 0)
words[word] = (words[word][0] + weight, 0)
elif options["remove_stopword"] == "True" and not is_stopword(word):
phrase = re.sub(chars_alphanumericseparators, ' ', word)
for word_ in split(phrase):
if lang and lang !="none" and options["use_stemming"]:
word_ = stem(word_, lang)
if word_:
if not words.has_key(word_):
words[word_] = (0,0)
words[word_] = (words[word_][0] + weight, 0)
return words
class WordTable:
"A class to hold the words table."
def __init__(self, tablename, fields_to_index, separators="[^\s]"):
"Creates words table instance."
self.tablename = tablename
self.recIDs_in_mem = []
self.fields_to_index = fields_to_index
self.separators = separators
self.value = {}
def get_field(self, recID, tag):
"""Returns list of values of the MARC-21 'tag' fields for the
record 'recID'."""
out = []
bibXXx = "bib" + tag[0] + tag[1] + "x"
bibrec_bibXXx = "bibrec_" + bibXXx
query = """SELECT value FROM %s AS b, %s AS bb
WHERE bb.id_bibrec=%s AND bb.id_bibxxx=b.id
AND tag LIKE '%s'""" % (bibXXx, bibrec_bibXXx, recID, tag);
res = run_sql(query)
for row in res:
out.append(row[0])
return out
def clean(self):
"Cleans the words table."
self.value={}
def put_into_db(self, mode="normal"):
"""Updates the current words table in the corresponding DB
rnkWORD table. Mode 'normal' means normal execution,
mode 'emergency' means words index reverting to old state.
"""
write_message("%s %s wordtable flush started" % (self.tablename,mode))
write_message('...updating %d words into %sR started' % \
(len(self.value), self.tablename[:-1]))
task_update_progress("%s flushed %d/%d words" % (self.tablename, 0, len(self.value)))
self.recIDs_in_mem = beautify_range_list(self.recIDs_in_mem)
if mode == "normal":
for group in self.recIDs_in_mem:
query = """UPDATE %sR SET type='TEMPORARY' WHERE id_bibrec
BETWEEN '%d' AND '%d' AND type='CURRENT'""" % \
(self.tablename[:-1], group[0], group[1])
write_message(query, verbose=9)
run_sql(query)
nb_words_total = len(self.value)
nb_words_report = int(nb_words_total/10)
nb_words_done = 0
for word in self.value.keys():
self.put_word_into_db(word, self.value[word])
nb_words_done += 1
if nb_words_report!=0 and ((nb_words_done % nb_words_report) == 0):
write_message('......processed %d/%d words' % (nb_words_done, nb_words_total))
task_update_progress("%s flushed %d/%d words" % (self.tablename, nb_words_done, nb_words_total))
write_message('...updating %d words into %s ended' % \
(nb_words_total, self.tablename), verbose=9)
#if options["verbose"]:
# write_message('...updating reverse table %sR started' % self.tablename[:-1])
if mode == "normal":
for group in self.recIDs_in_mem:
query = """UPDATE %sR SET type='CURRENT' WHERE id_bibrec
BETWEEN '%d' AND '%d' AND type='FUTURE'""" % \
(self.tablename[:-1], group[0], group[1])
write_message(query, verbose=9)
run_sql(query)
query = """DELETE FROM %sR WHERE id_bibrec
BETWEEN '%d' AND '%d' AND type='TEMPORARY'""" % \
(self.tablename[:-1], group[0], group[1])
write_message(query, verbose=9)
run_sql(query)
write_message('End of updating wordTable into %s' % self.tablename, verbose=9)
elif mode == "emergency":
write_message("emergency")
for group in self.recIDs_in_mem:
query = """UPDATE %sR SET type='CURRENT' WHERE id_bibrec
BETWEEN '%d' AND '%d' AND type='TEMPORARY'""" % \
(self.tablename[:-1], group[0], group[1])
write_message(query, verbose=9)
run_sql(query)
query = """DELETE FROM %sR WHERE id_bibrec
BETWEEN '%d' AND '%d' AND type='FUTURE'""" % \
(self.tablename[:-1], group[0], group[1])
write_message(query, verbose=9)
run_sql(query)
write_message('End of emergency flushing wordTable into %s' % self.tablename, verbose=9)
#if options["verbose"]:
# write_message('...updating reverse table %sR ended' % self.tablename[:-1])
self.clean()
self.recIDs_in_mem = []
write_message("%s %s wordtable flush ended" % (self.tablename, mode))
task_update_progress("%s flush ended" % (self.tablename))
def load_old_recIDs(self,word):
"""Load existing hitlist for the word from the database index files."""
query = "SELECT hitlist FROM %s WHERE term=%%s" % self.tablename
res = run_sql(query, (word,))
if res:
return deserialize_via_marshal(res[0][0])
else:
return None
def merge_with_old_recIDs(self,word,recIDs, set):
"""Merge the system numbers stored in memory (hash of recIDs with value[0] > 0 or -1
according to whether to add/delete them) with those stored in the database index
and received in set universe of recIDs for the given word.
Return 0 in case no change was done to SET, return 1 in case SET was changed.
"""
set_changed_p = 0
for recID,sign in recIDs.iteritems():
if sign[0] == -1 and set.has_key(recID):
# delete recID if existent in set and if marked as to be deleted
del set[recID]
set_changed_p = 1
elif sign[0] > -1 and not set.has_key(recID):
# add recID if not existent in set and if marked as to be added
set[recID] = sign
set_changed_p = 1
elif sign[0] > -1 and sign[0] != set[recID][0]:
set[recID] = sign
set_changed_p = 1
return set_changed_p
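# Illustrative sketch (comment only, not part of the module): given the
# in-memory signs
#   recIDs = {10: (-1, 0), 11: (1, 0)}
# and a stored hitlist set = {10: (2, 0)}, the merge deletes recID 10
# (sign -1 and present in set) and inserts recID 11 (sign > -1 and absent),
# leaving set == {11: (1, 0)} and returning 1 to signal the hitlist changed.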
def put_word_into_db(self, word, recIDs, split=str.split):
"""Flush a single word to the database and delete it from memory"""
set = self.load_old_recIDs(word)
#write_message("%s %s" % (word, self.value[word]))
if set is not None: # merge the word recIDs found in memory:
options["modified_words"][word] = 1
if not self.merge_with_old_recIDs(word, recIDs, set):
# nothing to update:
write_message("......... unchanged hitlist for ``%s''" % word, verbose=9)
pass
else:
# yes there were some new words:
write_message("......... updating hitlist for ``%s''" % word, verbose=9)
run_sql("UPDATE %s SET hitlist=%%s WHERE term=%%s" % self.tablename,
(serialize_via_marshal(set), word))
else: # the word is new, will create new set:
write_message("......... inserting hitlist for ``%s''" % word, verbose=9)
set = self.value[word]
if len(set) > 0:
#new word, add to list
options["modified_words"][word] = 1
try:
run_sql("INSERT INTO %s (term, hitlist) VALUES (%%s, %%s)" % self.tablename,
(word, serialize_via_marshal(set)))
except Exception, e:
## FIXME: This is for debugging encoding errors
register_exception(prefix="Error when putting the term '%s' into db (hitlist=%s): %s\n" % (repr(word), set, e), alert_admin=True)
if not set: # never store empty words
run_sql("DELETE from %s WHERE term=%%s" % self.tablename,
(word,))
del self.value[word]
def display(self):
"Displays the word table."
keys = self.value.keys()
keys.sort()
for k in keys:
write_message("%s: %s" % (k, self.value[k]))
def count(self):
"Returns the number of words in the table."
return len(self.value)
def info(self):
"Prints some information on the words table."
write_message("The words table contains %d words." % self.count())
def lookup_words(self, word=""):
"Lookup word from the words table."
if not word:
done = 0
while not done:
try:
word = raw_input("Enter word: ")
done = 1
except (EOFError, KeyboardInterrupt):
return
if self.value.has_key(word):
write_message("The word '%s' is found %d times." \
% (word, len(self.value[word])))
else:
write_message("The word '%s' does not exist in the word file."\
% word)
def update_last_updated(self, rank_method_code, starting_time=None):
"""Update last_updated column of the index table in the database.
Puts starting time there so that if the task was interrupted for record download,
the records will be reindexed next time."""
if starting_time is None:
return None
write_message("updating last_updated to %s..." % starting_time, verbose=9)
return run_sql("UPDATE rnkMETHOD SET last_updated=%s WHERE name=%s",
(starting_time, rank_method_code,))
def add_recIDs(self, recIDs):
"""Fetches records which id in the recIDs arange list and adds
them to the wordTable. The recIDs arange list is of the form:
[[i1_low,i1_high],[i2_low,i2_high], ..., [iN_low,iN_high]].
"""
global chunksize
flush_count = 0
records_done = 0
records_to_go = 0
for arange in recIDs:
records_to_go = records_to_go + arange[1] - arange[0] + 1
time_started = time.time() # will measure profile time
for arange in recIDs:
i_low = arange[0]
chunksize_count = 0
while i_low <= arange[1]:
# calculate chunk group of recIDs and treat it:
i_high = min(i_low+task_get_option("flush")-flush_count-1,arange[1])
i_high = min(i_low+chunksize-chunksize_count-1, i_high)
try:
self.chk_recID_range(i_low, i_high)
except StandardError, e:
write_message("Exception caught: %s" % e, sys.stderr)
register_exception()
task_update_status("ERROR")
sys.exit(1)
write_message("%s adding records #%d-#%d started" % \
(self.tablename, i_low, i_high))
if CFG_CHECK_MYSQL_THREADS:
kill_sleepy_mysql_threads()
task_update_progress("%s adding recs %d-%d" % (self.tablename, i_low, i_high))
self.del_recID_range(i_low, i_high)
just_processed = self.add_recID_range(i_low, i_high)
flush_count = flush_count + i_high - i_low + 1
chunksize_count = chunksize_count + i_high - i_low + 1
records_done = records_done + just_processed
write_message("%s adding records #%d-#%d ended " % \
(self.tablename, i_low, i_high))
if chunksize_count >= chunksize:
chunksize_count = 0
# flush if necessary:
if flush_count >= task_get_option("flush"):
self.put_into_db()
self.clean()
write_message("%s backing up" % (self.tablename))
flush_count = 0
self.log_progress(time_started,records_done,records_to_go)
# iterate:
i_low = i_high + 1
if flush_count > 0:
self.put_into_db()
self.log_progress(time_started,records_done,records_to_go)
def add_recIDs_by_date(self, dates=""):
"""Add recIDs modified between DATES[0] and DATES[1].
If DATES is not set, then add records modified since the last run of
the ranking method.
"""
if not dates:
write_message("Using the last update time for the rank method")
query = """SELECT last_updated FROM rnkMETHOD WHERE name='%s'
""" % options["current_run"]
res = run_sql(query)
if not res:
return
if not res[0][0]:
dates = ("0000-00-00",'')
else:
dates = (res[0][0],'')
query = """SELECT b.id FROM bibrec AS b WHERE b.modification_date >=
'%s'""" % dates[0]
if dates[1]:
query += "and b.modification_date <= '%s'" % dates[1]
query += " ORDER BY b.id ASC"""
res = run_sql(query)
alist = create_range_list([row[0] for row in res])
if not alist:
write_message( "No new records added. %s is up to date" % self.tablename)
else:
self.add_recIDs(alist)
return alist
def add_recID_range(self, recID1, recID2):
"""Add records from RECID1 to RECID2."""
wlist = {}
normalize = {}
self.recIDs_in_mem.append([recID1,recID2])
# secondly fetch all needed tags:
for (tag, weight, lang) in self.fields_to_index:
if tag in tagToWordsFunctions.keys():
get_words_function = tagToWordsFunctions[tag]
else:
get_words_function = get_words_from_phrase
bibXXx = "bib" + tag[0] + tag[1] + "x"
bibrec_bibXXx = "bibrec_" + bibXXx
query = """SELECT bb.id_bibrec,b.value FROM %s AS b, %s AS bb
WHERE bb.id_bibrec BETWEEN %d AND %d
AND bb.id_bibxxx=b.id AND tag LIKE '%s'""" % (bibXXx, bibrec_bibXXx, recID1, recID2, tag)
res = run_sql(query)
nb_total_to_read = len(res)
verbose_idx = 0 # for verbose pretty printing
for row in res:
recID, phrase = row
if recID in options["validset"]:
if not wlist.has_key(recID): wlist[recID] = {}
new_words = get_words_function(phrase, weight, lang) # ,self.separators
wlist[recID] = dict_union(new_words,wlist[recID])
# were there some words for these recIDs found?
if len(wlist) == 0: return 0
recIDs = wlist.keys()
for recID in recIDs:
# was this record marked as deleted?
if "DELETED" in self.get_field(recID, "980__c"):
wlist[recID] = {}
write_message("... record %d was declared deleted, removing its word list" % recID, verbose=9)
write_message("... record %d, termlist: %s" % (recID, wlist[recID]), verbose=9)
# put words into reverse index table with FUTURE status:
for recID in recIDs:
run_sql("INSERT INTO %sR (id_bibrec,termlist,type) VALUES (%%s,%%s,'FUTURE')" % self.tablename[:-1],
(recID, serialize_via_marshal(wlist[recID])))
# ... and, for new records, enter the CURRENT status as empty:
try:
run_sql("INSERT INTO %sR (id_bibrec,termlist,type) VALUES (%%s,%%s,'CURRENT')" % self.tablename[:-1],
(recID, serialize_via_marshal([])))
except DatabaseError:
# okay, it's an already existing record, no problem
pass
# put words into memory word list:
put = self.put
for recID in recIDs:
for (w, count) in wlist[recID].iteritems():
put(recID, w, count)
return len(recIDs)
def log_progress(self, start, done, todo):
"""Calculate progress and store it.
start: start time,
done: records processed,
todo: total number of records"""
time_elapsed = time.time() - start
# consistency check
if time_elapsed == 0 or done > todo:
return
time_recs_per_min = done/(time_elapsed/60.0)
write_message("%d records took %.1f seconds to complete.(%1.f recs/min)"\
% (done, time_elapsed, time_recs_per_min))
if time_recs_per_min:
write_message("Estimated runtime: %.1f minutes" % \
((todo-done)/time_recs_per_min))
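# Illustrative sketch (comment only, not part of the module): with
# done=100 records in 50 elapsed seconds, the rate is 100/(50/60.0) =
# 120 recs/min, so log_progress(start, 100, 400) would estimate
# (400-100)/120 = 2.5 minutes of runtime remaining.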
def put(self, recID, word, sign):
"Adds/deletes a word to the word list."
try:
word = wash_index_term(word)
if self.value.has_key(word):
# the word 'word' exist already: update sign
self.value[word][recID] = sign
# PROBLEM ?
else:
self.value[word] = {recID: sign}
except:
write_message("Error: Cannot put word %s with sign %d for recID %s." % (word, sign, recID))
def del_recIDs(self, recIDs):
"""Fetches records which id in the recIDs range list and adds
them to the wordTable. The recIDs range list is of the form:
[[i1_low,i1_high],[i2_low,i2_high], ..., [iN_low,iN_high]].
"""
count = 0
for range in recIDs:
self.del_recID_range(range[0],range[1])
count = count + range[1] - range[0]
self.put_into_db()
def del_recID_range(self, low, high):
"""Deletes records with 'recID' system number between low
and high from memory words index table."""
write_message("%s fetching existing words for records #%d-#%d started" % \
(self.tablename, low, high), verbose=3)
self.recIDs_in_mem.append([low,high])
query = """SELECT id_bibrec,termlist FROM %sR as bb WHERE bb.id_bibrec
BETWEEN '%d' AND '%d'""" % (self.tablename[:-1], low, high)
recID_rows = run_sql(query)
for recID_row in recID_rows:
recID = recID_row[0]
wlist = deserialize_via_marshal(recID_row[1])
for word in wlist:
self.put(recID, word, (-1, 0))
write_message("%s fetching existing words for records #%d-#%d ended" % \
(self.tablename, low, high), verbose=3)
def report_on_table_consistency(self):
"""Check reverse words index tables (e.g. rnkWORD01R) for
interesting states such as 'TEMPORARY' state.
Prints small report (no of words, no of bad words).
"""
# find number of words:
query = """SELECT COUNT(*) FROM %s""" % (self.tablename)
res = run_sql(query, None, 1)
if res:
nb_words = res[0][0]
else:
nb_words = 0
# find number of records:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR""" % (self.tablename[:-1])
res = run_sql(query, None, 1)
if res:
nb_records = res[0][0]
else:
nb_records = 0
# report stats:
write_message("%s contains %d words from %d records" % (self.tablename, nb_words, nb_records))
# find possible bad states in reverse tables:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR WHERE type <> 'CURRENT'""" % (self.tablename[:-1])
res = run_sql(query)
if res:
nb_bad_records = res[0][0]
else:
nb_bad_records = 999999999
if nb_bad_records:
write_message("EMERGENCY: %s needs to repair %d of %d index records" % \
(self.tablename, nb_bad_records, nb_records))
else:
write_message("%s is in consistent state" % (self.tablename))
return nb_bad_records
def repair(self):
"""Repair the whole table"""
# find possible bad states in reverse tables:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR WHERE type <> 'CURRENT'""" % (self.tablename[:-1])
res = run_sql(query, None, 1)
if res:
nb_bad_records = res[0][0]
else:
nb_bad_records = 0
# find number of records:
query = """SELECT COUNT(DISTINCT(id_bibrec)) FROM %sR""" % (self.tablename[:-1])
res = run_sql(query)
if res:
nb_records = res[0][0]
else:
nb_records = 0
if nb_bad_records == 0:
return
query = """SELECT id_bibrec FROM %sR WHERE type <> 'CURRENT' ORDER BY id_bibrec""" \
% (self.tablename[:-1])
res = run_sql(query)
recIDs = create_range_list([row[0] for row in res])
flush_count = 0
records_done = 0
records_to_go = 0
for range in recIDs:
records_to_go = records_to_go + range[1] - range[0] + 1
time_started = time.time() # will measure profile time
for range in recIDs:
i_low = range[0]
chunksize_count = 0
while i_low <= range[1]:
# calculate chunk group of recIDs and treat it:
i_high = min(i_low+task_get_option("flush")-flush_count-1,range[1])
i_high = min(i_low+chunksize-chunksize_count-1, i_high)
try:
self.fix_recID_range(i_low, i_high)
except StandardError, e:
write_message("Exception caught: %s" % e, sys.stderr)
register_exception()
task_update_status("ERROR")
sys.exit(1)
flush_count = flush_count + i_high - i_low + 1
chunksize_count = chunksize_count + i_high - i_low + 1
records_done = records_done + i_high - i_low + 1
if chunksize_count >= chunksize:
chunksize_count = 0
# flush if necessary:
if flush_count >= task_get_option("flush"):
self.put_into_db("emergency")
self.clean()
flush_count = 0
self.log_progress(time_started,records_done,records_to_go)
# iterate:
i_low = i_high + 1
if flush_count > 0:
self.put_into_db("emergency")
self.log_progress(time_started,records_done,records_to_go)
write_message("%s inconsistencies repaired." % self.tablename)
def chk_recID_range(self, low, high):
"""Check if the reverse index table is in proper state"""
## check db
query = """SELECT COUNT(*) FROM %sR WHERE type <> 'CURRENT'
AND id_bibrec BETWEEN '%d' AND '%d'""" % (self.tablename[:-1], low, high)
res = run_sql(query, None, 1)
if res[0][0]==0:
write_message("%s for %d-%d is in consistent state"%(self.tablename,low,high))
return # okay, words table is consistent
## inconsistency detected!
write_message("EMERGENCY: %s inconsistencies detected..." % self.tablename)
write_message("""EMERGENCY: Errors found. You should check consistency of the %s - %sR tables.\nRunning 'bibrank --repair' is recommended.""" \
% (self.tablename, self.tablename[:-1]))
raise StandardError
def fix_recID_range(self, low, high):
"""Try to fix reverse index database consistency (e.g. table rnkWORD01R) in the low,high doc-id range.
Possible states for a recID follow:
CUR TMP FUT: very bad things have happened: warn!
CUR TMP : very bad things have happened: warn!
CUR FUT: delete FUT (crash before flushing)
CUR : database is ok
TMP FUT: add TMP to memory and del FUT from memory
flush (revert to old state)
TMP : very bad things have happened: warn!
FUT: very bad things have happened: warn!
"""
state = {}
query = "SELECT id_bibrec,type FROM %sR WHERE id_bibrec BETWEEN '%d' AND '%d'"\
% (self.tablename[:-1], low, high)
res = run_sql(query)
for row in res:
if not state.has_key(row[0]):
state[row[0]]=[]
state[row[0]].append(row[1])
ok = 1 # will hold info on whether we will be able to repair
for recID in state.keys():
if not 'TEMPORARY' in state[recID]:
if 'FUTURE' in state[recID]:
if 'CURRENT' not in state[recID]:
write_message("EMERGENCY: Index record %d is in inconsistent state. Can't repair it" % recID)
ok = 0
else:
write_message("EMERGENCY: Inconsistency in index record %d detected" % recID)
query = """DELETE FROM %sR
WHERE id_bibrec='%d'""" % (self.tablename[:-1], recID)
run_sql(query)
write_message("EMERGENCY: Inconsistency in index record %d repaired." % recID)
else:
if 'FUTURE' in state[recID] and not 'CURRENT' in state[recID]:
self.recIDs_in_mem.append([recID,recID])
# Get the words file
query = """SELECT type,termlist FROM %sR
WHERE id_bibrec='%d'""" % (self.tablename[:-1], recID)
write_message(query, verbose=9)
res = run_sql(query)
for row in res:
wlist = deserialize_via_marshal(row[1])
write_message("Words are %s " % wlist, verbose=9)
if row[0] == 'TEMPORARY':
sign = 1
else:
sign = -1
for word in wlist:
self.put(recID, word, wlist[word])
else:
write_message("EMERGENCY: %s for %d is in inconsistent state. Couldn't repair it." % (self.tablename, recID))
ok = 0
if not ok:
write_message("""EMERGENCY: Unrepairable errors found. You should check consistency
of the %s - %sR tables. Deleting affected TEMPORARY and FUTURE entries
from these tables is recommended; see the BibIndex Admin Guide.
(The repairing procedure is similar for bibrank word indexes.)""" % (self.tablename, self.tablename[:-1]))
raise StandardError
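# --- Illustrative sketch (not part of the original module) ---------------
# The repair decision table documented in fix_recID_range() above can be
# expressed as a small standalone helper; the function name and the action
# labels below are our own, chosen only for illustration.
def classify_index_record_state(states):
    """Map the set of row types present for one recID to a repair action."""
    states = set(states)
    if 'TEMPORARY' not in states:
        if 'FUTURE' in states:
            if 'CURRENT' in states:
                return 'delete-future'  # CUR FUT: crash before flushing
            return 'unrepairable'       # FUT alone: warn
        return 'ok'                     # CUR alone: consistent
    if 'FUTURE' in states and 'CURRENT' not in states:
        return 'revert'                 # TMP FUT: replay TMP, drop FUT
    return 'unrepairable'               # CUR TMP [FUT] or TMP alone: warn
# -------------------------------------------------------------------------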
def word_index(run):
"""Run the indexing task. The row argument is the BibSched task
queue row, containing if, arguments, etc.
Return 1 in case of success and 0 in case of failure.
"""
global languages
max_recid = 0
res = run_sql("SELECT max(id) FROM bibrec")
if res and res[0][0]:
max_recid = int(res[0][0])
options["run"] = []
options["run"].append(run)
for rank_method_code in options["run"]:
task_sleep_now_if_required(can_stop_too=True)
method_starting_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
write_message("Running rank method: %s" % getName(rank_method_code))
try:
file = CFG_ETCDIR + "/bibrank/" + rank_method_code + ".cfg"
config = ConfigParser.ConfigParser()
config.readfp(open(file))
except StandardError, e:
write_message("Cannot find configurationfile: %s" % file, sys.stderr)
raise StandardError
options["current_run"] = rank_method_code
options["modified_words"] = {}
options["table"] = config.get(config.get("rank_method", "function"), "table")
options["use_stemming"] = config.get(config.get("rank_method","function"),"stemming")
options["remove_stopword"] = config.get(config.get("rank_method","function"),"stopword")
tags = get_tags(config) #get the tags to include
options["validset"] = get_valid_range(rank_method_code) #get the records from the collections the method is enabled for
function = config.get("rank_method","function")
wordTable = WordTable(options["table"], tags)
wordTable.report_on_table_consistency()
try:
if task_get_option("cmd") == "del":
if task_get_option("id"):
wordTable.del_recIDs(task_get_option("id"))
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("collection"):
l_of_colls = task_get_option("collection").split(",")
recIDs = perform_request_search(c=l_of_colls)
recIDs_range = []
for recID in recIDs:
recIDs_range.append([recID,recID])
wordTable.del_recIDs(recIDs_range)
task_sleep_now_if_required(can_stop_too=True)
else:
write_message("Missing IDs of records to delete from index %s.", wordTable.tablename,
sys.stderr)
raise StandardError
elif task_get_option("cmd") == "add":
if task_get_option("id"):
wordTable.add_recIDs(task_get_option("id"))
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("collection"):
l_of_colls = task_get_option("collection").split(",")
recIDs = perform_request_search(c=l_of_colls)
recIDs_range = []
for recID in recIDs:
recIDs_range.append([recID,recID])
wordTable.add_recIDs(recIDs_range)
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("last_updated"):
wordTable.add_recIDs_by_date("")
# only update last_updated if run via automatic mode:
wordTable.update_last_updated(rank_method_code, method_starting_time)
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("modified"):
wordTable.add_recIDs_by_date(task_get_option("modified"))
task_sleep_now_if_required(can_stop_too=True)
else:
wordTable.add_recIDs([[0,max_recid]])
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "repair":
wordTable.repair()
check_rnkWORD(options["table"])
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "check":
check_rnkWORD(options["table"])
options["modified_words"] = {}
task_sleep_now_if_required(can_stop_too=True)
elif task_get_option("cmd") == "stat":
rank_method_code_statistics(options["table"])
task_sleep_now_if_required(can_stop_too=True)
else:
write_message("Invalid command found processing %s" % \
wordTable.tablename, sys.stderr)
raise StandardError
update_rnkWORD(options["table"], options["modified_words"])
task_sleep_now_if_required(can_stop_too=True)
except StandardError, e:
register_exception(alert_admin=True)
write_message("Exception caught: %s" % e, sys.stderr)
sys.exit(1)
wordTable.report_on_table_consistency()
# We are done. State it in the database, close and quit
return 1
def get_tags(config):
"""Get the tags that should be used creating the index and each tag's parameter"""
tags = []
function = config.get("rank_method","function")
i = 1
shown_error = 0
#try:
if 1:
while config.has_option(function,"tag%s"% i):
tag = config.get(function, "tag%s" % i)
tag = tag.split(",")
tag[1] = int(tag[1].strip())
tag[2] = tag[2].strip()
#check if stemmer for language is available
if config.get(function, "stemming") and stem("information", "en") != "inform":
if shown_error == 0:
write_message("Warning: Stemming not working. Please check it out!")
shown_error = 1
elif tag[2] and tag[2] != "none" and config.get(function,"stemming") and not is_stemmer_available_for_language(tag[2]):
write_message("Warning: Stemming not available for language '%s'." % tag[2])
tags.append(tag)
i += 1
#except Exception:
# write_message("Could not read data from configuration file, please check for errors")
# raise StandardError
return tags
def get_valid_range(rank_method_code):
"""Returns which records are valid for this rank method, according to which collections it is enabled for."""
#if options["verbose"] >=9:
# write_message("Getting records from collections enabled for rank method.")
#res = run_sql("SELECT collection.name FROM collection,collection_rnkMETHOD,rnkMETHOD WHERE collection.id=id_collection and id_rnkMETHOD=rnkMETHOD.id and rnkMETHOD.name='%s'" % rank_method_code)
#l_of_colls = []
#for coll in res:
# l_of_colls.append(coll[0])
#if len(l_of_colls) > 0:
# recIDs = perform_request_search(c=l_of_colls)
#else:
# recIDs = []
valid = intbitset(trailing_bits=1)
valid.discard(0)
#valid.addlist(recIDs)
return valid
def check_term(term, termlength):
"""Check if term contains not allowed characters, or for any other reasons for not using this term."""
try:
if len(term) <= termlength:
return False
reg = re.compile(r"[1234567890\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~]")
if re.search(reg, term):
return False
term = str.replace(term, "-", "")
term = str.replace(term, ".", "")
term = str.replace(term, ",", "")
if int(term):
return False
except StandardError, e:
pass
return True
def check_rnkWORD(table):
"""Checks for any problems in rnkWORD tables."""
i = 0
errors = {}
termslist = run_sql("SELECT term FROM %s" % table)
N = run_sql("select max(id_bibrec) from %sR" % table[:-1])[0][0]
write_message("Checking integrity of rank values in %s" % table)
terms = map(lambda x: x[0], termslist)
while i < len(terms):
query_params = ()
for j in range(i, ((i+5000)< len(terms) and (i+5000) or len(terms))):
query_params += (terms[j],)
terms_docs = run_sql("SELECT term, hitlist FROM %s WHERE term IN (%s)" % (table, (len(query_params)*"%s,")[:-1]),
query_params)
for (t, hitlist) in terms_docs:
term_docs = deserialize_via_marshal(hitlist)
if (term_docs.has_key("Gi") and term_docs["Gi"][1] == 0) or not term_docs.has_key("Gi"):
write_message("ERROR: Missing value for term: %s (%s) in %s: %s" % (t, repr(t), table, len(term_docs)))
errors[t] = 1
i += 5000
write_message("Checking integrity of rank values in %sR" % table[:-1])
i = 0
while i < N:
docs_terms = run_sql("SELECT id_bibrec, termlist FROM %sR WHERE id_bibrec>=%s and id_bibrec<=%s" % (table[:-1], i, i+5000))
for (j, termlist) in docs_terms:
termlist = deserialize_via_marshal(termlist)
for (t, tf) in termlist.iteritems():
if tf[1] == 0 and not errors.has_key(t):
errors[t] = 1
write_message("ERROR: Gi missing for record %s and term: %s (%s) in %s" % (j,t,repr(t), table))
terms_docs = run_sql("SELECT term, hitlist FROM %s WHERE term=%%s" % table, (t,))
termlist = deserialize_via_marshal(terms_docs[0][1])
i += 5000
if len(errors) == 0:
write_message("No direct errors found, but nonconsistent data may exist.")
else:
write_message("%s errors found during integrity check, repair and rebalancing recommended." % len(errors))
options["modified_words"] = errors
def rank_method_code_statistics(table):
"""Shows some statistics about this rank method."""
maxID = run_sql("select max(id) from %s" % table)
maxID = maxID[0][0]
terms = {}
Gi = {}
write_message("Showing statistics of terms in index:")
write_message("Important: For the 'Least used terms', the number of terms is shown first, and the number of occurences second.")
write_message("Least used terms---Most important terms---Least important terms")
i = 0
while i < maxID:
terms_docs=run_sql("SELECT term, hitlist FROM %s WHERE id>= %s and id < %s" % (table, i, i + 10000))
for (t, hitlist) in terms_docs:
term_docs=deserialize_via_marshal(hitlist)
terms[len(term_docs)] = terms.get(len(term_docs), 0) + 1
if term_docs.has_key("Gi"):
Gi[t] = term_docs["Gi"]
i=i + 10000
terms=terms.items()
terms.sort(lambda x, y: cmp(y[1], x[1]))
Gi=Gi.items()
Gi.sort(lambda x, y: cmp(y[1], x[1]))
for i in range(0, 20):
write_message("%s/%s---%s---%s" % (terms[i][0],terms[i][1], Gi[i][0],Gi[len(Gi) - i - 1][0]))
def update_rnkWORD(table, terms):
"""Updates rnkWORDF and rnkWORDR with Gi and Nj values. For each term in rnkWORDF, a Gi value for the term is added. And for each term in each document, the Nj value for that document is added. In rnkWORDR, the Gi value for each term in each document is added. For description on how things are computed, look in the hacking docs.
table - name of forward index to update
terms - modified terms"""
stime = time.time()
Gi = {}
Nj = {}
N = run_sql("select count(id_bibrec) from %sR" % table[:-1])[0][0]
if len(terms) == 0 and task_get_option("quick") == "yes":
write_message("No terms to process, ending...")
return ""
elif task_get_option("quick") == "yes": #not used -R option, fast calculation (not accurate)
write_message("Beginning post-processing of %s terms" % len(terms))
#Locating all documents related to the modified/new/deleted terms, if fast update,
#only take into account new/modified occurrences
write_message("Phase 1: Finding records containing modified terms")
terms = terms.keys()
i = 0
while i < len(terms):
terms_docs = get_from_forward_index(terms, i, (i+5000), table)
for (t, hitlist) in terms_docs:
term_docs = deserialize_via_marshal(hitlist)
if term_docs.has_key("Gi"):
del term_docs["Gi"]
for (j, tf) in term_docs.iteritems():
if (task_get_option("quick") == "yes" and tf[1] == 0) or task_get_option("quick") == "no":
Nj[j] = 0
write_message("Phase 1: ......processed %s/%s terms" % ((i+5000>len(terms) and len(terms) or (i+5000)), len(terms)))
i += 5000
write_message("Phase 1: Finished finding records containing modified terms")
#Find all terms in the records found in last phase
write_message("Phase 2: Finding all terms in affected records")
records = Nj.keys()
i = 0
while i < len(records):
docs_terms = get_from_reverse_index(records, i, (i + 5000), table)
for (j, termlist) in docs_terms:
doc_terms = deserialize_via_marshal(termlist)
for (t, tf) in doc_terms.iteritems():
Gi[t] = 0
write_message("Phase 2: ......processed %s/%s records " % ((i+5000>len(records) and len(records) or (i+5000)), len(records)))
i += 5000
write_message("Phase 2: Finished finding all terms in affected records")
else: #recalculate
max_id = run_sql("SELECT MAX(id) FROM %s" % table)
max_id = max_id[0][0]
write_message("Beginning recalculation of %s terms" % max_id)
terms = []
i = 0
while i < max_id:
terms_docs = get_from_forward_index_with_id(i, (i+5000), table)
for (t, hitlist) in terms_docs:
Gi[t] = 0
term_docs = deserialize_via_marshal(hitlist)
if term_docs.has_key("Gi"):
del term_docs["Gi"]
for (j, tf) in term_docs.iteritems():
Nj[j] = 0
write_message("Phase 1: ......processed %s/%s terms" % ((i+5000)>max_id and max_id or (i+5000), max_id))
i += 5000
write_message("Phase 1: Finished finding which records contains which terms")
write_message("Phase 2: Jumping over..already done in phase 1 because of -R option")
terms = Gi.keys()
Gi = {}
i = 0
if task_get_option("quick") == "no":
#Calculating Fi and Gi value for each term
write_message("Phase 3: Calculating importance of all affected terms")
while i < len(terms):
terms_docs = get_from_forward_index(terms, i, (i+5000), table)
for (t, hitlist) in terms_docs:
term_docs = deserialize_via_marshal(hitlist)
if term_docs.has_key("Gi"):
del term_docs["Gi"]
Fi = 0
Gi[t] = 1
for (j, tf) in term_docs.iteritems():
Fi += tf[0]
for (j, tf) in term_docs.iteritems():
if tf[0] != Fi:
Gi[t] = Gi[t] + ((float(tf[0]) / Fi) * math.log(float(tf[0]) / Fi) / math.log(2)) / math.log(N)
write_message("Phase 3: ......processed %s/%s terms" % ((i+5000>len(terms) and len(terms) or (i+5000)), len(terms)))
i += 5000
write_message("Phase 3: Finished calculating importance of all affected terms")
else:
#Using the existing Gi value instead of calculating a new one. Loses some accuracy.
write_message("Phase 3: Getting approximate importance of all affected terms")
while i < len(terms):
terms_docs = get_from_forward_index(terms, i, (i+5000), table)
for (t, hitlist) in terms_docs:
term_docs = deserialize_via_marshal(hitlist)
if term_docs.has_key("Gi"):
Gi[t] = term_docs["Gi"][1]
elif len(term_docs) == 1:
Gi[t] = 1
else:
Fi = 0
Gi[t] = 1
for (j, tf) in term_docs.iteritems():
Fi += tf[0]
for (j, tf) in term_docs.iteritems():
if tf[0] != Fi:
Gi[t] = Gi[t] + ((float(tf[0]) / Fi) * math.log(float(tf[0]) / Fi) / math.log(2)) / math.log(N)
write_message("Phase 3: ......processed %s/%s terms" % ((i+5000>len(terms) and len(terms) or (i+5000)), len(terms)))
i += 5000
write_message("Phase 3: Finished getting approximate importance of all affected terms")
write_message("Phase 4: Calculating normalization value for all affected records and updating %sR" % table[:-1])
records = Nj.keys()
i = 0
while i < len(records):
#Calculating the normalization value for each document, and adding the Gi value to each term in each document.
docs_terms = get_from_reverse_index(records, i, (i + 5000), table)
for (j, termlist) in docs_terms:
doc_terms = deserialize_via_marshal(termlist)
try:
for (t, tf) in doc_terms.iteritems():
if Gi.has_key(t):
Nj[j] = Nj.get(j, 0) + math.pow(Gi[t] * (1 + math.log(tf[0])), 2)
Git = int(math.floor(Gi[t]*100))
if Git >= 0:
Git += 1
doc_terms[t] = (tf[0], Git)
else:
Nj[j] = Nj.get(j, 0) + math.pow(tf[1] * (1 + math.log(tf[0])), 2)
Nj[j] = 1.0 / math.sqrt(Nj[j])
Nj[j] = int(Nj[j] * 100)
if Nj[j] >= 0:
Nj[j] += 1
run_sql("UPDATE %sR SET termlist=%%s WHERE id_bibrec=%%s" % table[:-1],
(serialize_via_marshal(doc_terms), j))
except (ZeroDivisionError, OverflowError), e:
## This is to try to isolate division by zero errors.
register_exception(prefix="Error when analysing the record %s (%s): %s\n" % (j, repr(docs_terms), e), alert_admin=True)
write_message("Phase 4: ......processed %s/%s records" % ((i+5000>len(records) and len(records) or (i+5000)), len(records)))
i += 5000
write_message("Phase 4: Finished calculating normalization value for all affected records and updating %sR" % table[:-1])
write_message("Phase 5: Updating %s with new normalization values" % table)
i = 0
terms = Gi.keys()
while i < len(terms):
#Adding the Gi value to each term, and adding the normalization value to each term in each document.
terms_docs = get_from_forward_index(terms, i, (i+5000), table)
for (t, hitlist) in terms_docs:
try:
term_docs = deserialize_via_marshal(hitlist)
if term_docs.has_key("Gi"):
del term_docs["Gi"]
for (j, tf) in term_docs.iteritems():
if Nj.has_key(j):
term_docs[j] = (tf[0], Nj[j])
Git = int(math.floor(Gi[t]*100))
if Git >= 0:
Git += 1
term_docs["Gi"] = (0, Git)
run_sql("UPDATE %s SET hitlist=%%s WHERE term=%%s" % table,
(serialize_via_marshal(term_docs), t))
except (ZeroDivisionError, OverflowError), e:
register_exception(prefix="Error when analysing the term %s (%s): %s\n" % (t, repr(terms_docs), e), alert_admin=True)
write_message("Phase 5: ......processed %s/%s terms" % ((i+5000>len(terms) and len(terms) or (i+5000)), len(terms)))
i += 5000
write_message("Phase 5: Finished updating %s with new normalization values" % table)
write_message("Time used for post-processing: %.1fmin" % ((time.time() - stime) / 60))
write_message("Finished post-processing")
def get_from_forward_index(terms, start, stop, table):
terms_docs = ()
for j in range(start, (stop < len(terms) and stop or len(terms))):
terms_docs += run_sql("SELECT term, hitlist FROM %s WHERE term=%%s" % table,
(terms[j],))
return terms_docs
def get_from_forward_index_with_id(start, stop, table):
terms_docs = run_sql("SELECT term, hitlist FROM %s WHERE id BETWEEN %s AND %s" % (table, start, stop))
return terms_docs
def get_from_reverse_index(records, start, stop, table):
current_recs = "%s" % records[start:stop]
current_recs = current_recs[1:-1]
docs_terms = run_sql("SELECT id_bibrec, termlist FROM %sR WHERE id_bibrec IN (%s)" % (table[:-1], current_recs))
return docs_terms
#def test_word_separators(phrase="hep-th/0101001"):
#"""Tests word separating policy on various input."""
#print "%s:" % phrase
#gwfp = get_words_from_phrase(phrase)
#for (word, count) in gwfp.iteritems():
#print "\t-> %s - %s" % (word, count)
def getName(methname, ln=CFG_SITE_LANG, type='ln'):
"""Returns the name of the rank method, either in default language or given language.
methname = short name of the method
ln - the language to get the name in
type - which name "type" to get."""
try:
rnkid = run_sql("SELECT id FROM rnkMETHOD where name='%s'" % methname)
if rnkid:
rnkid = str(rnkid[0][0])
res = run_sql("SELECT value FROM rnkMETHODNAME where type='%s' and ln='%s' and id_rnkMETHOD=%s" % (type, ln, rnkid))
if not res:
res = run_sql("SELECT value FROM rnkMETHODNAME WHERE ln='%s' and id_rnkMETHOD=%s and type='%s'" % (CFG_SITE_LANG, rnkid, type))
if not res:
return methname
return res[0][0]
else:
raise Exception
except Exception, e:
write_message("Cannot run rank method, either given code for method is wrong, or it has not been added using the webinterface.")
raise Exception
def word_similarity(run):
"""Call correct method"""
return word_index(run)
diff --git a/invenio/legacy/bibrank/word_searcher.py b/invenio/legacy/bibrank/word_searcher.py
index b32f08b9d..9b5c35e09 100644
--- a/invenio/legacy/bibrank/word_searcher.py
+++ b/invenio/legacy/bibrank/word_searcher.py
@@ -1,333 +1,333 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import string
import time
import math
import re
from invenio.legacy.dbquery import run_sql, deserialize_via_marshal
-from invenio.bibindex_engine_stemmer import stem
-from invenio.bibindex_engine_stopwords import is_stopword
+from invenio.legacy.bibindex.engine_stemmer import stem
+from invenio.legacy.bibindex.engine_stopwords import is_stopword
def find_similar(rank_method_code, recID, hitset, rank_limit_relevance,verbose, methods):
"""Finding terms to use for calculating similarity. Terms are taken from the recid given, returns a list of recids's and relevance,
input:
rank_method_code - the code of the method, from the name field in rnkMETHOD
recID - records to use for find similar
hitset - a list of hits for the query found by search_engine
rank_limit_relevance - show only records with a rank value above this
verbose - verbose value
output:
reclist - a list of sorted records: [[23,34], [344,24], [1,01]]
prefix - what to show before the rank value
postfix - what to show after the rank value
voutput - contains extra information, content dependent on verbose value"""
startCreate = time.time()
global voutput
voutput = ""
if verbose > 0:
voutput += "<br />Running rank method: %s, using find_similar/word_frequency in bibrank_record_sorter<br />" % rank_method_code
rank_limit_relevance = methods[rank_method_code]["default_min_relevance"]
try:
recID = int(recID)
except Exception,e :
return (None, "Warning: Error in record ID, please check that a number is given.", "", voutput)
rec_terms = run_sql("""SELECT termlist FROM %sR WHERE id_bibrec=%%s""" % methods[rank_method_code]["rnkWORD_table"][:-1], (recID,))
if not rec_terms:
return (None, "Warning: Requested record does not seem to exist.", "", voutput)
rec_terms = deserialize_via_marshal(rec_terms[0][0])
#Get all documents using terms from the selected documents
if len(rec_terms) == 0:
return (None, "Warning: Record specified has no content indexed for use with this method.", "", voutput)
else:
terms = "%s" % rec_terms.keys()
terms_recs = dict(run_sql("""SELECT term, hitlist FROM %s WHERE term IN (%s)""" % (methods[rank_method_code]["rnkWORD_table"], terms[1:len(terms) - 1])))
tf_values = {}
#Calculate all term frequencies
for (term, tf) in rec_terms.iteritems():
if len(term) >= methods[rank_method_code]["min_word_length"] and terms_recs.has_key(term) and tf[1] != 0:
tf_values[term] = int((1 + math.log(tf[0])) * tf[1]) #calculate term weight
tf_values = tf_values.items()
tf_values.sort(lambda x, y: cmp(y[1], x[1])) #sort based on weight
lwords = []
stime = time.time()
(recdict, rec_termcount) = ({}, {})
for (t, tf) in tf_values: #t=term, tf=term frequency
term_recs = deserialize_via_marshal(terms_recs[t])
if len(tf_values) <= methods[rank_method_code]["max_nr_words_lower"] or (len(term_recs) >= methods[rank_method_code]["min_nr_words_docs"] and (((float(len(term_recs)) / float(methods[rank_method_code]["col_size"])) <= methods[rank_method_code]["max_word_occurence"]) and ((float(len(term_recs)) / float(methods[rank_method_code]["col_size"])) >= methods[rank_method_code]["min_word_occurence"]))): #too complicated...something must be done
lwords.append((t, methods[rank_method_code]["rnkWORD_table"])) #list of terms used
(recdict, rec_termcount) = calculate_record_relevance_findsimilar((t, round(tf, 4)) , term_recs, hitset, recdict, rec_termcount, verbose, "true") #true tells the function to not calculate all unimportant terms
if len(tf_values) > methods[rank_method_code]["max_nr_words_lower"] and (len(lwords) == methods[rank_method_code]["max_nr_words_upper"] or tf < 0):
break
if len(recdict) == 0 or len(lwords) == 0:
return (None, "Could not find similar documents for this query.", "", voutput)
else: #sort if we got something to sort
(reclist, hitset) = sort_record_relevance_findsimilar(recdict, rec_termcount, hitset, rank_limit_relevance, verbose)
if verbose > 0:
voutput += "<br />Number of terms: %s<br />" % run_sql("SELECT count(id) FROM %s" % methods[rank_method_code]["rnkWORD_table"])[0][0]
voutput += "Number of terms to use for query: %s<br />" % len(lwords)
voutput += "Terms: %s<br />" % lwords
voutput += "Current number of recIDs: %s<br />" % (methods[rank_method_code]["col_size"])
voutput += "Prepare time: %s<br />" % (str(time.time() - startCreate))
voutput += "Total time used: %s<br />" % (str(time.time() - startCreate))
rank_method_stat(rank_method_code, reclist, lwords)
return (reclist[:len(reclist)], methods[rank_method_code]["prefix"], methods[rank_method_code]["postfix"], voutput)
def calculate_record_relevance_findsimilar(term, invidx, hitset, recdict, rec_termcount, verbose, quick=None):
"""Calculating the relevance of the documents based on the input, calculates only one word
term - (term, query term factor) the term and its importance in the overall search
invidx - {recid: tf, Gi: norm value} The Gi value is used as an idf value
hitset - a hitset with records that are allowed to be ranked
recdict - contains currently ranked records, is returned with new values
rec_termcount - {recid: count} the number of terms in this record that matches the query
verbose - verbose value
quick - if quick=yes only terms with a positive qtf are used, to limit the number of records to sort"""
(t, qtf) = term
if invidx.has_key("Gi"): #Gi = weigth for this term, created by bibrank_word_indexer
Gi = invidx["Gi"][1]
del invidx["Gi"]
else: #if not existing, bibrank should be run with -R
return (recdict, rec_termcount)
if not quick or (qtf >= 0 or (qtf < 0 and len(recdict) == 0)):
#Only accept records existing in the hitset received from the search engine
for (j, tf) in invidx.iteritems():
if j in hitset: #only include docs found by search_engine based on query
#calculate rank value
recdict[j] = recdict.get(j, 0) + int((1 + math.log(tf[0])) * Gi * tf[1] * qtf)
rec_termcount[j] = rec_termcount.get(j, 0) + 1 #number of terms from query in document
elif quick: #much used term, do not include all records, only use already existing ones
for (j, tf) in recdict.iteritems(): #i.e: if doc contains important term, also count unimportant
if invidx.has_key(j):
tf = invidx[j]
recdict[j] = recdict[j] + int((1 + math.log(tf[0])) * Gi * tf[1] * qtf)
rec_termcount[j] = rec_termcount.get(j, 0) + 1 #number of terms from query in document
return (recdict, rec_termcount)
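# --- Illustrative sketch (not part of the original module) ---------------
# The per-term rank contribution accumulated in
# calculate_record_relevance_findsimilar() above is, for each record j:
#   score_j += int((1 + ln(tf)) * Gi * Nj * qtf)
# where tf is the in-record term frequency, Gi the term importance, Nj the
# record's normalization value and qtf the query-term factor. The helper
# name below is ours, for illustration only.
import math

def rank_contribution(tf, Gi, Nj, qtf):
    """One term's contribution to a record's rank value."""
    return int((1 + math.log(tf)) * Gi * Nj * qtf)
# -------------------------------------------------------------------------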
def sort_record_relevance_findsimilar(recdict, rec_termcount, hitset, rank_limit_relevance, verbose):
"""Sorts the dictionary and returns records with a relevance higher than the given value.
recdict - {recid: value} unsorted
rank_limit_relevance - a value > 0 usually
verbose - verbose value"""
startCreate = time.time()
voutput = ""
reclist = []
#Multiply by the number of query terms that occur in each record
for j in recdict.keys():
if recdict[j] > 0 and rec_termcount[j] > 1:
recdict[j] = math.log((recdict[j] * rec_termcount[j]))
else:
recdict[j] = 0
hitset -= recdict.keys()
#gives each record a score between 0-100
divideby = max(recdict.values())
for (j, w) in recdict.iteritems():
w = int(w * 100 / divideby)
if w >= rank_limit_relevance:
reclist.append((j, w))
#sort scores
reclist.sort(lambda x, y: cmp(x[1], y[1]))
if verbose > 0:
voutput += "Number of records sorted: %s<br />" % len(reclist)
voutput += "Sort time: %s<br />" % (str(time.time() - startCreate))
return (reclist, hitset)
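The 0-100 normalization and cut-off above can be sketched on its own; the raw scores and the `rank_limit_relevance` threshold below are made up:

```python
# Hypothetical raw relevance values per record ID.
recdict = {11: 4.2, 22: 1.1, 33: 2.8}
rank_limit_relevance = 30  # keep only records scoring at least 30 out of 100

divideby = max(recdict.values())  # the best raw score maps to 100
reclist = []
for recid, raw in recdict.items():
    score = int(raw * 100 / divideby)
    if score >= rank_limit_relevance:
        reclist.append((recid, score))
reclist.sort(key=lambda pair: pair[1])  # ascending by score, as above
```

Record 22 falls below the threshold and is dropped; the survivors come out sorted with the best score last.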
def word_similarity(rank_method_code, lwords, hitset, rank_limit_relevance, verbose, methods):
"""Ranking a records containing specified words and returns a sorted list.
input:
rank_method_code - the code of the method, from the name field in rnkMETHOD
lwords - a list of words from the query
hitset - a list of hits for the query found by search_engine
rank_limit_relevance - show only records with a rank value above this
verbose - verbose value
output:
reclist - a list of sorted records: [[23,34], [344,24], [1,01]]
prefix - what to show before the rank value
postfix - what to show after the rank value
voutput - contains extra information, content dependent on verbose value"""
voutput = ""
startCreate = time.time()
if verbose > 0:
voutput += "<br />Running rank method: %s, using word_frequency function in bibrank_record_sorter<br />" % rank_method_code
lwords_old = lwords
lwords = []
#Check terms, remove non-alphanumeric characters. Use both unstemmed and stemmed versions of all terms.
for i in range(0, len(lwords_old)):
term = string.lower(lwords_old[i])
if not methods[rank_method_code]["stopwords"] == "True" or methods[rank_method_code]["stopwords"] and not is_stopword(term):
lwords.append((term, methods[rank_method_code]["rnkWORD_table"]))
terms = string.split(string.lower(re.sub(methods[rank_method_code]["chars_alphanumericseparators"], ' ', term)))
for term in terms:
if methods[rank_method_code].has_key("stemmer"): # stem word
term = stem(string.replace(term, ' ', ''), methods[rank_method_code]["stemmer"])
if lwords_old[i] != term: #add if stemmed word is different than original word
lwords.append((term, methods[rank_method_code]["rnkWORD_table"]))
(recdict, rec_termcount, lrecIDs_remove) = ({}, {}, {})
#For each term, if accepted, get a list of the records using the term
#then calculate the relevance for each term before sorting the list of records
for (term, table) in lwords:
term_recs = run_sql("""SELECT term, hitlist FROM %s WHERE term=%%s""" % methods[rank_method_code]["rnkWORD_table"], (term,))
if term_recs: #if term exists in database, use for ranking
term_recs = deserialize_via_marshal(term_recs[0][1])
(recdict, rec_termcount) = calculate_record_relevance((term, int(term_recs["Gi"][1])) , term_recs, hitset, recdict, rec_termcount, verbose, quick=None)
del term_recs
if len(recdict) == 0 or (len(lwords) == 1 and lwords[0] == ""):
return (None, "Records not ranked. The query is not detailed enough, or not enough records found, for ranking to be possible.", "", voutput)
else: #sort if we got something to sort
(reclist, hitset) = sort_record_relevance(recdict, rec_termcount, hitset, rank_limit_relevance, verbose)
#Add any documents not ranked to the end of the list
if hitset:
lrecIDs = list(hitset) #using 2-3mb
reclist = zip(lrecIDs, [0] * len(lrecIDs)) + reclist #using 6mb
if verbose > 0:
voutput += "<br />Current number of recIDs: %s<br />" % (methods[rank_method_code]["col_size"])
voutput += "Number of terms: %s<br />" % run_sql("SELECT count(id) FROM %s" % methods[rank_method_code]["rnkWORD_table"])[0][0]
voutput += "Terms: %s<br />" % lwords
voutput += "Prepare and pre calculate time: %s<br />" % (str(time.time() - startCreate))
voutput += "Total time used: %s<br />" % (str(time.time() - startCreate))
voutput += str(reclist) + "<br />"
rank_method_stat(rank_method_code, reclist, lwords)
return (reclist, methods[rank_method_code]["prefix"], methods[rank_method_code]["postfix"], voutput)
def calculate_record_relevance(term, invidx, hitset, recdict, rec_termcount, verbose, quick=None):
"""Calculating the relevance of the documents based on the input, calculates only one word
term - (term, query term factor) the term and its importance in the overall search
invidx - {recid: tf, Gi: norm value} The Gi value is used as a idf value
hitset - a hitset with records that are allowed to be ranked
recdict - contains currently ranked records, is returned with new values
rec_termcount - {recid: count} the number of terms in this record that matches the query
verbose - verbose value
quick - if quick=yes only terms with a positive qtf is used, to limit the number of records to sort"""
(t, qtf) = term
if invidx.has_key("Gi"):#Gi = weigth for this term, created by bibrank_word_indexer
Gi = invidx["Gi"][1]
del invidx["Gi"]
else: #if not existing, bibrank should be run with -R
return (recdict, rec_termcount)
if not quick or (qtf >= 0 or (qtf < 0 and len(recdict) == 0)):
#Only accept records existing in the hitset received from the search engine
for (j, tf) in invidx.iteritems():
if j in hitset: #only include docs found by search_engine based on query
try: #calculate rank value
recdict[j] = recdict.get(j, 0) + int(math.log(tf[0] * Gi * tf[1] * qtf))
except ValueError: #math.log fails for a non-positive product (e.g. negative qtf)
return (recdict, rec_termcount)
rec_termcount[j] = rec_termcount.get(j, 0) + 1 #number of terms from query in document
elif quick: #much used term, do not include all records, only use already existing ones
for (j, tf) in recdict.iteritems(): #i.e: if doc contains important term, also count unimportant
if invidx.has_key(j):
tf = invidx[j]
recdict[j] = recdict.get(j, 0) + int(math.log(tf[0] * Gi * tf[1] * qtf))
rec_termcount[j] = rec_termcount.get(j, 0) + 1 #number of terms from query in document
return (recdict, rec_termcount)
def sort_record_relevance(recdict, rec_termcount, hitset, rank_limit_relevance, verbose):
"""Sorts the dictionary and returns records with a relevance higher than the given value.
recdict - {recid: value} unsorted
rank_limit_relevance - a value > 0 usually
verbose - verbose value"""
startCreate = time.time()
voutput = ""
reclist = []
#remove all ranked documents so that unranked can be added to the end
hitset -= recdict.keys()
#gives each record a score between 0-100
divideby = max(recdict.values())
for (j, w) in recdict.iteritems():
w = int(w * 100 / divideby)
if w >= rank_limit_relevance:
reclist.append((j, w))
#sort scores
reclist.sort(lambda x, y: cmp(x[1], y[1]))
if verbose > 0:
voutput += "Number of records sorted: %s<br />" % len(reclist)
voutput += "Sort time: %s<br />" % (str(time.time() - startCreate))
return (reclist, hitset)
def rank_method_stat(rank_method_code, reclist, lwords):
"""Shows some statistics about the searchresult.
rank_method_code - name field from rnkMETHOD
reclist - a list of sorted and ranked records
lwords - the words in the query"""
voutput = ""
if len(reclist) > 20:
j = 20
else:
j = len(reclist)
voutput += "<br />Rank statistics:<br />"
for i in range(1, j + 1):
voutput += "%s,Recid:%s,Score:%s<br />" % (i,reclist[len(reclist) - i][0],reclist[len(reclist) - i][1])
for (term, table) in lwords:
term_recs = run_sql("""SELECT hitlist FROM %s WHERE term=%%s""" % table, (term,))
if term_recs:
term_recs = deserialize_via_marshal(term_recs[0][0])
if term_recs.has_key(reclist[len(reclist) - i][0]):
voutput += "%s-%s / " % (term, term_recs[reclist[len(reclist) - i][0]])
voutput += "<br />"
voutput += "<br />Score variation:<br />"
count = {}
for i in range(0, len(reclist)):
count[reclist[i][1]] = count.get(reclist[i][1], 0) + 1
i = 100
while i >= 0:
if count.has_key(i):
voutput += "%s-%s<br />" % (i, count[i])
i -= 1
#TODO: use Cython instead of Psyco
diff --git a/invenio/legacy/bibsched/bibtask.py b/invenio/legacy/bibsched/bibtask.py
index f4ca04563..c0043fca9 100644
--- a/invenio/legacy/bibsched/bibtask.py
+++ b/invenio/legacy/bibsched/bibtask.py
@@ -1,1186 +1,1186 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Task Class.
BibTask class.
A BibTask is an executable under CFG_BINDIR, whose name is stored in
bibtask_config.CFG_BIBTASK_VALID_TASKS.
A valid task must call the task_init function with the proper parameters.
Generic task-related parameters (user, sleeptime, runtime, task_id, task_name,
verbose)
go to _TASK_PARAMS global dictionary accessible through task_get_task_param.
Options specific to the particular BibTask go to the _OPTIONS global dictionary
and are accessible via task_get_option/task_set_option.
In order to log something properly, just use write_message(s) with the desired
verbose level.
task_update_status and task_update_progress can be used to update the status
of the task (DONE, FAILED, DONE WITH ERRORS...) and its progress
(1 out of 100...) within the bibsched monitor.
It is possible to enqueue a BibTask via API call by means of
task_low_level_submission.
"""
__revision__ = "$Id$"
import getopt
import getpass
import marshal
import os
import pwd
import re
import signal
import sys
import time
import datetime
import traceback
import logging
import logging.handlers
import random
from socket import gethostname
from invenio.legacy.dbquery import run_sql, _db_login
from invenio.modules.access.engine import acc_authorize_action
from invenio.config import CFG_PREFIX, CFG_BINDIR, CFG_LOGDIR, \
CFG_BIBSCHED_PROCESS_USER, CFG_TMPDIR, CFG_SITE_SUPPORT_EMAIL
from invenio.ext.logging import register_exception
from invenio.modules.access.local_config import CFG_EXTERNAL_AUTH_USING_SSO, \
CFG_EXTERNAL_AUTHENTICATION
from invenio.legacy.webuser import get_user_preferences, get_email
-from invenio.bibtask_config import CFG_BIBTASK_VALID_TASKS, \
+from invenio.legacy.bibsched.bibtask_config import CFG_BIBTASK_VALID_TASKS, \
CFG_BIBTASK_DEFAULT_TASK_SETTINGS, CFG_BIBTASK_FIXEDTIMETASKS
from invenio.utils.date import parse_runtime_limit
from invenio.utils.shell import escape_shell_arg
from invenio.ext.email import send_email
-from invenio.bibsched import bibsched_set_host, \
+from invenio.legacy.bibsched.scripts.bibsched import bibsched_set_host, \
bibsched_get_host
# Global _TASK_PARAMS dictionary.
_TASK_PARAMS = {
'version': '',
'task_stop_helper_fnc': None,
'task_name': os.path.basename(sys.argv[0]),
'task_specific_name': '',
'task_id': 0,
'user': '',
# If the task is not initialized (usually a developer debugging
# a single method), output all messages.
'verbose': 9,
'sleeptime': '',
'runtime': time.strftime("%Y-%m-%d %H:%M:%S"),
'priority': 0,
'runtime_limit': None,
'profile': [],
'post-process': [],
'sequence-id':None,
'stop_queue_on_error': False,
'fixed_time': False,
'email_logs_to': [],
}
# Global _OPTIONS dictionary.
_OPTIONS = {}
# Which tasks don't need to ask the user for authorization?
CFG_VALID_PROCESSES_NO_AUTH_NEEDED = ("bibupload", )
CFG_TASK_IS_NOT_A_DEAMON = ("bibupload", )
def fix_argv_paths(paths, argv=None):
"""Given the argv vector of cli parameters, and a list of path that
can be relative and may have been specified within argv,
it substitute all the occurencies of these paths in argv.
argv is changed in place and returned.
"""
if argv is None:
argv = sys.argv
for path in paths:
for count in xrange(len(argv)):
if path == argv[count]:
argv[count] = os.path.abspath(path)
return argv
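A standalone sketch of the substitution performed by `fix_argv_paths`; the helper name `fix_paths` and the argument values are illustrative:

```python
import os

def fix_paths(paths, argv):
    """Replace each exact occurrence of a relative path in argv with its absolute form."""
    for path in paths:
        for i in range(len(argv)):
            if argv[i] == path:
                argv[i] = os.path.abspath(path)
    return argv

argv = ['bibupload', 'records.xml', '--verbose']
fix_paths(['records.xml'], argv)  # argv[1] becomes an absolute path, in place
```

Only arguments that match a listed path exactly are rewritten; flags and other arguments pass through untouched.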
def task_low_level_submission(name, user, *argv):
"""Let special lowlevel enqueuing of a task on the bibsche queue.
@param name: is the name of the bibtask. It must be a valid executable under
C{CFG_BINDIR}.
@type name: string
@param user: is a string that will appear as the "user" submitting the task.
Since tasks are submitted via the API it makes sense to set the
user to the name of the module/function that called
task_low_level_submission.
@type user: string
@param argv: are all the additional CLI parameters that would have been
passed on the CLI (one parameter per variable).
e.g.:
>>> task_low_level_submission('bibupload', 'admin', '-a', '/tmp/z.xml')
@type: strings
@return: the task identifier when the task is correctly enqueued.
@rtype: int
@note: use absolute paths in argv
"""
def get_priority(argv):
"""Try to get the priority by analysing the arguments."""
priority = 0
argv = list(argv)
while True:
try:
opts, args = getopt.gnu_getopt(argv, 'P:', ['priority='])
except getopt.GetoptError, err:
## We remove one by one all the non recognized parameters
if len(err.opt) > 1:
argv = [arg for arg in argv if arg != '--%s' % err.opt and not arg.startswith('--%s=' % err.opt)]
else:
argv = [arg for arg in argv if not arg.startswith('-%s' % err.opt)]
else:
break
for opt in opts:
if opt[0] in ('-P', '--priority'):
try:
priority = int(opt[1])
except ValueError:
pass
return priority
def get_special_name(argv):
"""Try to get the special name by analysing the arguments."""
special_name = ''
argv = list(argv)
while True:
try:
opts, args = getopt.gnu_getopt(argv, 'N:', ['name='])
except getopt.GetoptError, err:
## We remove one by one all the non recognized parameters
if len(err.opt) > 1:
argv = [arg for arg in argv if arg != '--%s' % err.opt and not arg.startswith('--%s=' % err.opt)]
else:
argv = [arg for arg in argv if not arg.startswith('-%s' % err.opt)]
else:
break
for opt in opts:
if opt[0] in ('-N', '--name'):
special_name = opt[1]
return special_name
def get_runtime(argv):
"""Try to get the runtime by analysing the arguments."""
runtime = time.strftime("%Y-%m-%d %H:%M:%S")
argv = list(argv)
while True:
try:
opts, args = getopt.gnu_getopt(argv, 't:', ['runtime='])
except getopt.GetoptError, err:
## We remove one by one all the non recognized parameters
if len(err.opt) > 1:
argv = [arg for arg in argv if arg != '--%s' % err.opt and not arg.startswith('--%s=' % err.opt)]
else:
argv = [arg for arg in argv if not arg.startswith('-%s' % err.opt)]
else:
break
for opt in opts:
if opt[0] in ('-t', '--runtime'):
try:
runtime = get_datetime(opt[1])
except ValueError:
pass
return runtime
def get_sleeptime(argv):
"""Try to get the runtime by analysing the arguments."""
sleeptime = ""
argv = list(argv)
while True:
try:
opts, args = getopt.gnu_getopt(argv, 's:', ['sleeptime='])
except getopt.GetoptError, err:
## We remove one by one all the non recognized parameters
if len(err.opt) > 1:
argv = [arg for arg in argv if arg != '--%s' % err.opt and not arg.startswith('--%s=' % err.opt)]
else:
argv = [arg for arg in argv if not arg.startswith('-%s' % err.opt)]
else:
break
for opt in opts:
if opt[0] in ('-s', '--sleeptime'):
try:
sleeptime = opt[1]
except ValueError:
pass
return sleeptime
def get_sequenceid(argv):
"""Try to get the sequenceid by analysing the arguments."""
sequenceid = None
argv = list(argv)
while True:
try:
opts, args = getopt.gnu_getopt(argv, 'I:', ['sequence-id='])
except getopt.GetoptError, err:
## We remove one by one all the non recognized parameters
if len(err.opt) > 1:
argv = [arg for arg in argv if arg != '--%s' % err.opt and not arg.startswith('--%s=' % err.opt)]
else:
argv = [arg for arg in argv if not arg.startswith('-%s' % err.opt)]
else:
break
for opt in opts:
if opt[0] in ('-I', '--sequence-id'):
try:
sequenceid = opt[1]
except ValueError:
pass
return sequenceid
task_id = None
try:
if not name in CFG_BIBTASK_VALID_TASKS:
raise StandardError('%s is not a valid task name' % name)
new_argv = []
for arg in argv:
if isinstance(arg, unicode):
arg = arg.encode('utf8')
new_argv.append(arg)
argv = new_argv
priority = get_priority(argv)
special_name = get_special_name(argv)
runtime = get_runtime(argv)
sleeptime = get_sleeptime(argv)
sequenceid = get_sequenceid(argv)
argv = tuple([os.path.join(CFG_BINDIR, name)] + list(argv))
if special_name:
name = '%s:%s' % (name, special_name)
verbose_argv = 'Will execute: %s' % ' '.join([escape_shell_arg(str(arg)) for arg in argv])
## submit task:
task_id = run_sql("""INSERT INTO schTASK (proc,user,
runtime,sleeptime,status,progress,arguments,priority,sequenceid)
VALUES (%s,%s,%s,%s,'WAITING',%s,%s,%s,%s)""",
(name, user, runtime, sleeptime, verbose_argv[:254], marshal.dumps(argv), priority, sequenceid))
except Exception:
register_exception(alert_admin=True)
if task_id:
run_sql("""DELETE FROM schTASK WHERE id=%s""", (task_id, ))
raise
return task_id
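All the nested helpers above (`get_priority`, `get_special_name`, `get_runtime`, ...) share one pattern: call `getopt.gnu_getopt` in a loop, dropping each unrecognized option until parsing succeeds. A self-contained sketch of that loop for the priority case, using modern `except ... as` syntax and a made-up argv (`extract_priority` is a hypothetical name, not part of the module):

```python
import getopt

def extract_priority(argv):
    """Extract -P/--priority from argv, skipping over unknown options."""
    priority = 0
    argv = list(argv)
    while True:
        try:
            opts, _args = getopt.gnu_getopt(argv, 'P:', ['priority='])
        except getopt.GetoptError as err:
            # Remove the offending option and retry the parse.
            if len(err.opt) > 1:
                argv = [a for a in argv
                        if a != '--%s' % err.opt and not a.startswith('--%s=' % err.opt)]
            else:
                argv = [a for a in argv if not a.startswith('-%s' % err.opt)]
        else:
            break
    for name, value in opts:
        if name in ('-P', '--priority'):
            try:
                priority = int(value)
            except ValueError:
                pass
    return priority
```

The retry loop is what lets each helper look at a full task command line without knowing the task's own specific options.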
def bibtask_allocate_sequenceid(curdir=None):
"""
Returns an almost unique number to be used as a task sequence ID.
In WebSubmit functions, set C{curdir} to the curdir (!) to read
the shared sequence ID for all functions of this submission (reading
"access number").
@param curdir: in WebSubmit functions (ONLY) the value retrieved
from the curdir parameter of the function
@return: an integer for the sequence ID. 0 is returned if the
sequence ID could not be allocated
@rtype: int
"""
if curdir:
try:
fd = file(os.path.join(curdir, 'access'), "r")
access = fd.readline().strip()
fd.close()
return access.replace("_", "")[-9:]
except:
return 0
else:
return random.randrange(1, 4294967296)
def setup_loggers(task_id=None):
"""Sets up the logging system."""
logger = logging.getLogger()
for handler in logger.handlers:
## Let's clean the handlers in case some piece of code has already
## fired any write_message, i.e. any call to debug, info, etc.
## which triggered a call to logging.basicConfig()
logger.removeHandler(handler)
formatter = logging.Formatter('%(asctime)s --> %(message)s', '%Y-%m-%d %H:%M:%S')
if task_id is not None:
err_logger = logging.handlers.RotatingFileHandler(os.path.join(CFG_LOGDIR, 'bibsched_task_%d.err' % _TASK_PARAMS['task_id']), 'a', 1*1024*1024, 10)
log_logger = logging.handlers.RotatingFileHandler(os.path.join(CFG_LOGDIR, 'bibsched_task_%d.log' % _TASK_PARAMS['task_id']), 'a', 1*1024*1024, 10)
log_logger.setFormatter(formatter)
log_logger.setLevel(logging.DEBUG)
err_logger.setFormatter(formatter)
err_logger.setLevel(logging.WARNING)
logger.addHandler(err_logger)
logger.addHandler(log_logger)
stdout_logger = logging.StreamHandler(sys.stdout)
stdout_logger.setFormatter(formatter)
stdout_logger.setLevel(logging.DEBUG)
stderr_logger = logging.StreamHandler(sys.stderr)
stderr_logger.setFormatter(formatter)
stderr_logger.setLevel(logging.WARNING)
logger.addHandler(stderr_logger)
logger.addHandler(stdout_logger)
logger.setLevel(logging.INFO)
return logger
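Stripped of the rotating file handlers, the console part of the layout above looks like this; the logger name is illustrative and the factory function is a sketch, not part of the module:

```python
import logging
import sys

def make_logger():
    """Minimal sketch of the stdout/stderr handler split used by setup_loggers."""
    logger = logging.getLogger("bibtask_sketch")
    logger.handlers = []  # start from a clean slate, as setup_loggers does
    fmt = logging.Formatter('%(asctime)s --> %(message)s', '%Y-%m-%d %H:%M:%S')
    out = logging.StreamHandler(sys.stdout)
    out.setFormatter(fmt)
    out.setLevel(logging.DEBUG)    # everything the logger accepts goes to stdout...
    err = logging.StreamHandler(sys.stderr)
    err.setFormatter(fmt)
    err.setLevel(logging.WARNING)  # ...warnings and above are duplicated to stderr
    logger.addHandler(out)
    logger.addHandler(err)
    logger.setLevel(logging.INFO)
    return logger
```

Filtering happens twice: the logger level gates what is emitted at all, and each handler level decides which stream receives it.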
def task_init(
authorization_action="",
authorization_msg="",
description="",
help_specific_usage="",
version=__revision__,
specific_params=("", []),
task_stop_helper_fnc=None,
task_submit_elaborate_specific_parameter_fnc=None,
task_submit_check_options_fnc=None,
task_run_fnc=None):
""" Initialize a BibTask.
@param authorization_action: is the name of the authorization action
connected with this task;
@param authorization_msg: is the header printed when asking for an
authorization password;
@param description: is the generic description printed in the usage page;
@param help_specific_usage: is the specific parameter help
@param task_stop_helper_fnc: is a function that will be called
whenever the task is stopped
@param task_submit_elaborate_specific_parameter_fnc: will be called passing
a key and a value, for parsing specific cli parameters. Must return True if
it has recognized the parameter. Must eventually update the options with
task_set_option;
@param task_submit_check_options: must check the validity of options (via
task_get_option) once all the options were parsed;
@param task_run_fnc: will be called as the main core function. Must return
False in case of errors.
"""
global _TASK_PARAMS, _OPTIONS
_TASK_PARAMS = {
"version" : version,
"task_stop_helper_fnc" : task_stop_helper_fnc,
"task_name" : os.path.basename(sys.argv[0]),
"task_specific_name" : '',
"user" : '',
"verbose" : 1,
"sleeptime" : '',
"runtime" : time.strftime("%Y-%m-%d %H:%M:%S"),
"priority" : 0,
"runtime_limit" : None,
"profile" : [],
"post-process": [],
"sequence-id": None,
"stop_queue_on_error": False,
"fixed_time": False,
}
to_be_submitted = True
if len(sys.argv) == 2 and sys.argv[1].isdigit():
_TASK_PARAMS['task_id'] = int(sys.argv[1])
argv = _task_get_options(_TASK_PARAMS['task_id'], _TASK_PARAMS['task_name'])
to_be_submitted = False
else:
argv = sys.argv
setup_loggers(_TASK_PARAMS.get('task_id'))
task_name = os.path.basename(sys.argv[0])
if task_name not in CFG_BIBTASK_VALID_TASKS or os.path.realpath(os.path.join(CFG_BINDIR, task_name)) != os.path.realpath(sys.argv[0]):
raise OSError("%s is not in the allowed modules" % sys.argv[0])
from invenio.ext.logging import wrap_warn
wrap_warn()
if type(argv) is dict:
# FIXME: REMOVE AFTER MAJOR RELEASE 1.0
# This is needed for old task submitted before CLI parameters
# where stored in DB and _OPTIONS dictionary was stored instead.
_OPTIONS = argv
else:
try:
_task_build_params(_TASK_PARAMS['task_name'], argv, description,
help_specific_usage, version, specific_params,
task_submit_elaborate_specific_parameter_fnc,
task_submit_check_options_fnc)
except (SystemExit, Exception), err:
if not to_be_submitted:
register_exception(alert_admin=True)
write_message("Error in parsing the parameters: %s." % err, sys.stderr)
write_message("Exiting.", sys.stderr)
task_update_status("ERROR")
raise
write_message('argv=%s' % (argv, ), verbose=9)
write_message('_OPTIONS=%s' % (_OPTIONS, ), verbose=9)
write_message('_TASK_PARAMS=%s' % (_TASK_PARAMS, ), verbose=9)
if to_be_submitted:
_task_submit(argv, authorization_action, authorization_msg)
else:
try:
try:
if task_get_task_param('profile'):
try:
from cStringIO import StringIO
import pstats
filename = os.path.join(CFG_TMPDIR, 'bibsched_task_%s.pyprof' % _TASK_PARAMS['task_id'])
existing_sorts = pstats.Stats.sort_arg_dict_default.keys()
required_sorts = []
profile_dump = []
for sort in task_get_task_param('profile'):
if sort not in existing_sorts:
sort = 'cumulative'
if sort not in required_sorts:
required_sorts.append(sort)
if sys.hexversion < 0x02050000:
import hotshot
import hotshot.stats
pr = hotshot.Profile(filename)
ret = pr.runcall(_task_run, task_run_fnc)
for sort_type in required_sorts:
tmp_out = sys.stdout
sys.stdout = StringIO()
hotshot.stats.load(filename).strip_dirs().sort_stats(sort_type).print_stats()
# pylint: disable=E1103
# This is a hack. sys.stdout is a StringIO in this case.
profile_dump.append(sys.stdout.getvalue())
# pylint: enable=E1103
sys.stdout = tmp_out
else:
import cProfile
pr = cProfile.Profile()
ret = pr.runcall(_task_run, task_run_fnc)
pr.dump_stats(filename)
for sort_type in required_sorts:
strstream = StringIO()
pstats.Stats(filename, stream=strstream).strip_dirs().sort_stats(sort_type).print_stats()
profile_dump.append(strstream.getvalue())
profile_dump = '\n'.join(profile_dump)
profile_dump += '\nYou can use profile=%s' % existing_sorts
open(os.path.join(CFG_LOGDIR, 'bibsched_task_%d.log' % _TASK_PARAMS['task_id']), 'a').write("%s" % profile_dump)
os.remove(filename)
except ImportError:
ret = _task_run(task_run_fnc)
write_message("ERROR: The Python Profiler is not installed!", stream=sys.stderr)
else:
ret = _task_run(task_run_fnc)
if not ret:
write_message("Error occurred. Exiting.", sys.stderr)
except Exception, e:
register_exception(alert_admin=True)
write_message("Unexpected error occurred: %s." % e, sys.stderr)
write_message("Traceback is:", sys.stderr)
write_messages(''.join(traceback.format_tb(sys.exc_info()[2])), sys.stderr)
write_message("Exiting.", sys.stderr)
task_update_status("ERROR")
finally:
_task_email_logs()
logging.shutdown()
def _task_build_params(
task_name,
argv,
description="",
help_specific_usage="",
version=__revision__,
specific_params=("", []),
task_submit_elaborate_specific_parameter_fnc=None,
task_submit_check_options_fnc=None):
""" Build the BibTask params.
@param argv: a list of string as in sys.argv
@param description: is the generic description printed in the usage page;
@param help_specific_usage: is the specific parameter help
@param task_submit_elaborate_specific_parameter_fnc: will be called passing
a key and a value, for parsing specific cli parameters. Must return True if
it has recognized the parameter. Must eventually update the options with
task_set_option;
@param task_submit_check_options: must check the validity of options (via
task_get_option) once all the options were parsed;
"""
global _OPTIONS
_OPTIONS = {}
if task_name in CFG_BIBTASK_DEFAULT_TASK_SETTINGS:
_OPTIONS.update(CFG_BIBTASK_DEFAULT_TASK_SETTINGS[task_name])
# set user-defined options:
try:
(short_params, long_params) = specific_params
opts, args = getopt.gnu_getopt(argv[1:], "hVv:u:s:t:P:N:L:I:" +
short_params, [
"help",
"version",
"verbose=",
"user=",
"sleep=",
"runtime=",
"priority=",
"name=",
"limit=",
"profile=",
"post-process=",
"sequence-id=",
"stop-on-error",
"continue-on-error",
"fixed-time",
"email-logs-to="
] + long_params)
except getopt.GetoptError, err:
_usage(1, err, help_specific_usage=help_specific_usage, description=description)
try:
for opt in opts:
if opt[0] in ("-h", "--help"):
_usage(0, help_specific_usage=help_specific_usage, description=description)
elif opt[0] in ("-V", "--version"):
print _TASK_PARAMS["version"]
sys.exit(0)
elif opt[0] in ("-u", "--user"):
_TASK_PARAMS["user"] = opt[1]
elif opt[0] in ("-v", "--verbose"):
_TASK_PARAMS["verbose"] = int(opt[1])
elif opt[0] in ("-s", "--sleeptime"):
if task_name not in CFG_TASK_IS_NOT_A_DEAMON:
get_datetime(opt[1]) # see if it is a valid shift
_TASK_PARAMS["sleeptime"] = opt[1]
elif opt[0] in ("-t", "--runtime"):
_TASK_PARAMS["runtime"] = get_datetime(opt[1])
elif opt[0] in ("-P", "--priority"):
_TASK_PARAMS["priority"] = int(opt[1])
elif opt[0] in ("-N", "--name"):
_TASK_PARAMS["task_specific_name"] = opt[1]
elif opt[0] in ("-L", "--limit"):
_TASK_PARAMS["runtime_limit"] = parse_runtime_limit(opt[1])
elif opt[0] in ("--profile", ):
_TASK_PARAMS["profile"] += opt[1].split(',')
elif opt[0] in ("--post-process", ):
_TASK_PARAMS["post-process"] += [opt[1]]
elif opt[0] in ("-I","--sequence-id"):
_TASK_PARAMS["sequence-id"] = opt[1]
elif opt[0] in ("--stop-on-error", ):
_TASK_PARAMS["stop_queue_on_error"] = True
elif opt[0] in ("--continue-on-error", ):
_TASK_PARAMS["stop_queue_on_error"] = False
elif opt[0] in ("--fixed-time", ):
_TASK_PARAMS["fixed_time"] = True
elif opt[0] in ("--email-logs-to",):
_TASK_PARAMS["email_logs_to"] = opt[1].split(',')
elif not callable(task_submit_elaborate_specific_parameter_fnc) or \
not task_submit_elaborate_specific_parameter_fnc(opt[0],
opt[1], opts, args):
_usage(1, help_specific_usage=help_specific_usage, description=description)
except StandardError, e:
_usage(e, help_specific_usage=help_specific_usage, description=description)
if callable(task_submit_check_options_fnc):
if not task_submit_check_options_fnc():
_usage(1, help_specific_usage=help_specific_usage, description=description)
def task_set_option(key, value):
"""Set an value to key in the option dictionary of the task"""
global _OPTIONS
try:
_OPTIONS[key] = value
except NameError:
_OPTIONS = {key : value}
def task_get_option(key, default=None):
"""Returns the value corresponding to key in the option dictionary of the task"""
try:
return _OPTIONS.get(key, default)
except NameError:
return default
def task_has_option(key):
"""Map the has_key query to _OPTIONS"""
try:
return _OPTIONS.has_key(key)
except NameError:
return False
def task_get_task_param(key, default=None):
"""Returns the value corresponding to the particular task param"""
try:
return _TASK_PARAMS.get(key, default)
except NameError:
return default
def task_set_task_param(key, value):
"""Set the value corresponding to the particular task param"""
global _TASK_PARAMS
try:
_TASK_PARAMS[key] = value
except NameError:
_TASK_PARAMS = {key : value}
def task_update_progress(msg):
"""Updates progress information in the BibSched task table."""
write_message("Updating task progress to %s." % msg, verbose=9)
if "task_id" in _TASK_PARAMS:
return run_sql("UPDATE schTASK SET progress=%s where id=%s",
(msg, _TASK_PARAMS["task_id"]))
def task_update_status(val):
"""Updates status information in the BibSched task table."""
write_message("Updating task status to %s." % val, verbose=9)
if "task_id" in _TASK_PARAMS:
return run_sql("UPDATE schTASK SET status=%s where id=%s",
(val, _TASK_PARAMS["task_id"]))
def task_read_status():
"""Read status information in the BibSched task table."""
res = run_sql("SELECT status FROM schTASK where id=%s",
(_TASK_PARAMS['task_id'],), 1)
try:
out = res[0][0]
except:
out = 'UNKNOWN'
return out
def write_messages(msgs, stream=None, verbose=1):
"""Write many messages through write_message"""
if stream is None:
stream = sys.stdout
for msg in msgs.split('\n'):
write_message(msg, stream, verbose)
def write_message(msg, stream=None, verbose=1):
"""Write message and flush output stream (may be sys.stdout or sys.stderr).
Useful for debugging stuff.
@note: msg can be a callable with no parameters. In this case it is
called in order to obtain the string to be printed.
"""
if stream is None:
stream = sys.stdout
if msg and _TASK_PARAMS['verbose'] >= verbose:
if callable(msg):
msg = msg()
if stream == sys.stdout:
logging.info(msg)
elif stream == sys.stderr:
logging.error(msg)
else:
sys.stderr.write("Unknown stream %s. [must be sys.stdout or sys.stderr]\n" % stream)
else:
logging.debug(msg)
_RE_SHIFT = re.compile(r"([-+]?)(\d+)([dhms])")
def get_datetime(var, format_string="%Y-%m-%d %H:%M:%S", now=None):
"""Returns a date string according to the format string.
It can handle normal date strings and shifts with respect
to now."""
date = now or datetime.datetime.now()
factors = {"d": 24 * 3600, "h": 3600, "m": 60, "s": 1}
m = _RE_SHIFT.match(var)
if m:
sign = m.groups()[0] == "-" and -1 or 1
factor = factors[m.groups()[2]]
value = float(m.groups()[1])
delta = sign * factor * value
while delta > 0 and date < datetime.datetime.now():
date = date + datetime.timedelta(seconds=delta)
date = date.strftime(format_string)
else:
date = time.strptime(var, format_string)
date = time.strftime(format_string, date)
return date
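The shift syntax accepted by `get_datetime` ('+1d', '-30m', '45s', ...) can be sketched separately; `shift_seconds` is a hypothetical helper, not part of the module, using the same pattern and unit factors as above:

```python
import re

# Same shift pattern as _RE_SHIFT: optional sign, digits, unit letter.
shift_re = re.compile(r"([-+]?)(\d+)([dhms])")
factors = {"d": 24 * 3600, "h": 3600, "m": 60, "s": 1}

def shift_seconds(var):
    """Return the signed offset in seconds for a shift like '+1d' or '-30m'."""
    m = shift_re.match(var)
    if not m:
        raise ValueError("not a shift: %r" % var)
    sign = -1 if m.group(1) == "-" else 1
    return sign * factors[m.group(3)] * int(m.group(2))
```

A string without a sign counts as a positive shift, mirroring the `sign` handling in `get_datetime`.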
def task_sleep_now_if_required(can_stop_too=False):
"""This function should be called during safe state of BibTask,
e.g. after flushing caches or outside of run_sql calls.
"""
status = task_read_status()
write_message('Entering task_sleep_now_if_required with status=%s' % status, verbose=9)
if status == 'ABOUT TO SLEEP':
write_message("sleeping...")
task_update_status("SLEEPING")
signal.signal(signal.SIGTSTP, _task_sig_dumb)
os.kill(os.getpid(), signal.SIGSTOP)
time.sleep(1)
if task_read_status() == 'NOW STOP':
if can_stop_too:
write_message("stopped")
task_update_status("STOPPED")
sys.exit(0)
else:
write_message("stopping as soon as possible...")
task_update_status('ABOUT TO STOP')
else:
write_message("... continuing...")
task_update_status("CONTINUING")
signal.signal(signal.SIGTSTP, _task_sig_sleep)
elif status == 'ABOUT TO STOP':
if can_stop_too:
write_message("stopped")
task_update_status("STOPPED")
sys.exit(0)
else:
## I am a capricious baby. At least I am going to sleep :-)
write_message("sleeping...")
task_update_status("SLEEPING")
signal.signal(signal.SIGTSTP, _task_sig_dumb)
os.kill(os.getpid(), signal.SIGSTOP)
time.sleep(1)
## Putting back the status to "ABOUT TO STOP"
write_message("... continuing...")
task_update_status("ABOUT TO STOP")
signal.signal(signal.SIGTSTP, _task_sig_sleep)
if can_stop_too:
runtime_limit = task_get_option("limit")
if runtime_limit is not None:
if not (runtime_limit[0] <= datetime.datetime.now() <= runtime_limit[1]):
write_message("stopped (outside runtime limit)")
task_update_status("STOPPED")
sys.exit(0)
def authenticate(user, authorization_action, authorization_msg=""):
"""Authenticate the user against the user database.
Check for its password, if it exists.
Check for authorization_action access rights.
Return user name upon authorization success,
do system exit upon authorization failure.
"""
#FIXME
return user
# With SSO it's impossible to check for pwd
if CFG_EXTERNAL_AUTH_USING_SSO or os.path.basename(sys.argv[0]) in CFG_VALID_PROCESSES_NO_AUTH_NEEDED:
return user
if authorization_msg:
print authorization_msg
print "=" * len(authorization_msg)
if user == "":
print >> sys.stdout, "\rUsername: ",
try:
user = sys.stdin.readline().lower().strip()
except EOFError:
sys.stderr.write("\n")
sys.exit(1)
except KeyboardInterrupt:
sys.stderr.write("\n")
sys.exit(1)
else:
print >> sys.stdout, "\rUsername:", user
## first check user:
# p_un passed may be an email or a nickname:
res = run_sql("select id from user where email=%s", (user,), 1) + \
run_sql("select id from user where nickname=%s", (user,), 1)
if not res:
print "Sorry, %s does not exist." % user
sys.exit(1)
else:
uid = res[0][0]
ok = False
login_method = get_user_preferences(uid)['login_method']
if not CFG_EXTERNAL_AUTHENTICATION[login_method]:
#Local authentication, let's see if we want passwords.
res = run_sql("select id from user where id=%s "
"and password=AES_ENCRYPT(email,'')",
(uid,), 1)
if res:
ok = True
if not ok:
try:
password_entered = getpass.getpass()
except EOFError:
sys.stderr.write("\n")
sys.exit(1)
except KeyboardInterrupt:
sys.stderr.write("\n")
sys.exit(1)
if not CFG_EXTERNAL_AUTHENTICATION[login_method]:
res = run_sql("select id from user where id=%s "
"and password=AES_ENCRYPT(email, %s)",
(uid, password_entered), 1)
if res:
ok = True
else:
if CFG_EXTERNAL_AUTHENTICATION[login_method].auth_user(get_email(uid), password_entered):
ok = True
if not ok:
print "Sorry, wrong credentials for %s." % user
sys.exit(1)
else:
## secondly check authorization for the authorization_action:
(auth_code, auth_message) = acc_authorize_action(uid, authorization_action)
if auth_code != 0:
print auth_message
sys.exit(1)
return user
def _task_submit(argv, authorization_action, authorization_msg):
"""Submits task to the BibSched task queue. This is what people will
be invoking via command line."""
    ## check whom we are running as:
    check_running_process_user()
    ## sanity check: remove any leftover "task" option:
## authenticate user:
_TASK_PARAMS['user'] = authenticate(_TASK_PARAMS["user"], authorization_action, authorization_msg)
## submit task:
if _TASK_PARAMS['task_specific_name']:
task_name = '%s:%s' % (_TASK_PARAMS['task_name'], _TASK_PARAMS['task_specific_name'])
else:
task_name = _TASK_PARAMS['task_name']
write_message("storing task options %s\n" % argv, verbose=9)
verbose_argv = 'Will execute: %s' % ' '.join([escape_shell_arg(str(arg)) for arg in argv])
_TASK_PARAMS['task_id'] = run_sql("""INSERT INTO schTASK (proc,user,
runtime,sleeptime,status,progress,arguments,priority,sequenceid)
VALUES (%s,%s,%s,%s,'WAITING',%s,%s,%s,%s)""",
(task_name, _TASK_PARAMS['user'], _TASK_PARAMS["runtime"],
_TASK_PARAMS["sleeptime"], verbose_argv, marshal.dumps(argv), _TASK_PARAMS['priority'], _TASK_PARAMS['sequence-id']))
## update task number:
write_message("Task #%d submitted." % _TASK_PARAMS['task_id'])
return _TASK_PARAMS['task_id']
def _task_get_options(task_id, task_name):
"""Returns options for the task 'id' read from the BibSched task
queue table."""
out = {}
res = run_sql("SELECT arguments FROM schTASK WHERE id=%s AND proc LIKE %s",
(task_id, task_name+'%'))
try:
out = marshal.loads(res[0][0])
    except StandardError:
write_message("Error: %s task %d does not seem to exist." \
% (task_name, task_id), sys.stderr)
task_update_status('ERROR')
sys.exit(1)
write_message('Options retrieved: %s' % (out, ), verbose=9)
return out
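The arguments column stores the task's argv serialized with marshal; the round trip between `_task_submit` and `_task_get_options` can be sketched in isolation (hypothetical values, not a real schTASK row):

```python
import marshal

# A task's command line as it would be stored in schTASK.arguments.
argv = ['bibtaskex', '-n', '30', '-v', '9']

# What _task_submit writes to the database ...
blob = marshal.dumps(argv)

# ... and what _task_get_options reads back.
restored = marshal.loads(blob)
```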
def _task_email_logs():
"""
In case this was requested, emails the logs.
"""
email_logs_to = task_get_task_param('email_logs_to')
if not email_logs_to:
return
status = task_read_status()
task_name = task_get_task_param('task_name')
task_specific_name = task_get_task_param('task_specific_name')
if task_specific_name:
task_name += ':' + task_specific_name
runtime = task_get_task_param('runtime')
title = "Execution of %s: %s" % (task_name, status)
body = """
Attached you can find the stdout and stderr logs of the execution of
name: %s
id: %s
runtime: %s
options: %s
status: %s
""" % (task_name, _TASK_PARAMS['task_id'], runtime, _OPTIONS, status)
err_file = os.path.join(CFG_LOGDIR, 'bibsched_task_%d.err' % _TASK_PARAMS['task_id'])
log_file = os.path.join(CFG_LOGDIR, 'bibsched_task_%d.log' % _TASK_PARAMS['task_id'])
return send_email(CFG_SITE_SUPPORT_EMAIL, email_logs_to, title, body, attachments=[(log_file, 'text/plain'), (err_file, 'text/plain')])
def _task_run(task_run_fnc):
"""Runs the task by fetching arguments from the BibSched task queue.
This is what BibSched will be invoking via daemon call.
The task prints Fibonacci numbers for up to NUM on the stdout, and some
messages on stderr.
@param task_run_fnc: will be called as the main core function. Must return
False in case of errors.
Return True in case of success and False in case of failure."""
- from invenio.bibtasklet import _TASKLETS
+ from invenio.legacy.bibsched.bibtasklet import _TASKLETS
## We prepare the pid file inside /prefix/var/run/taskname_id.pid
check_running_process_user()
try:
pidfile_name = os.path.join(CFG_PREFIX, 'var', 'run',
'bibsched_task_%d.pid' % _TASK_PARAMS['task_id'])
pidfile = open(pidfile_name, 'w')
pidfile.write(str(os.getpid()))
pidfile.close()
except OSError:
register_exception(alert_admin=True)
task_update_status("ERROR")
return False
## check task status:
task_status = task_read_status()
if task_status not in ("WAITING", "SCHEDULED"):
write_message("Error: The task #%d is %s. I expected WAITING or SCHEDULED." %
(_TASK_PARAMS['task_id'], task_status), sys.stderr)
return False
time_now = datetime.datetime.now()
if _TASK_PARAMS['runtime_limit'] is not None and os.environ.get('BIBSCHED_MODE', 'manual') != 'manual':
if not _TASK_PARAMS['runtime_limit'][0][0] <= time_now <= _TASK_PARAMS['runtime_limit'][0][1]:
if time_now <= _TASK_PARAMS['runtime_limit'][0][0]:
new_runtime = _TASK_PARAMS['runtime_limit'][0][0].strftime("%Y-%m-%d %H:%M:%S")
else:
new_runtime = _TASK_PARAMS['runtime_limit'][1][0].strftime("%Y-%m-%d %H:%M:%S")
progress = run_sql("SELECT progress FROM schTASK WHERE id=%s", (_TASK_PARAMS['task_id'], ))
if progress:
progress = progress[0][0]
else:
progress = ''
g = re.match(r'Postponed (\d+) time\(s\)', progress)
if g:
postponed_times = int(g.group(1))
else:
postponed_times = 0
if _TASK_PARAMS['sequence-id']:
## Also postponing other dependent tasks.
run_sql("UPDATE schTASK SET runtime=%s, progress=%s WHERE sequenceid=%s AND status='WAITING'", (new_runtime, 'Postponed as task %s' % _TASK_PARAMS['task_id'], _TASK_PARAMS['sequence-id'])) # kwalitee: disable=sql
run_sql("UPDATE schTASK SET runtime=%s, status='WAITING', progress=%s, host='' WHERE id=%s", (new_runtime, 'Postponed %d time(s)' % (postponed_times + 1), _TASK_PARAMS['task_id'])) # kwalitee: disable=sql
write_message("Task #%d postponed because outside of runtime limit" % _TASK_PARAMS['task_id'])
return True
# Make sure the host field is updated
# It will not be updated properly when we run
# a task from the cli (without using the bibsched monitor)
host = bibsched_get_host(_TASK_PARAMS['task_id'])
if host and host != gethostname():
write_message("Error: The task #%d is bound to %s." %
(_TASK_PARAMS['task_id'], host), sys.stderr)
return False
else:
bibsched_set_host(_TASK_PARAMS['task_id'], gethostname())
## initialize signal handler:
signal.signal(signal.SIGUSR2, signal.SIG_IGN)
signal.signal(signal.SIGTSTP, _task_sig_sleep)
signal.signal(signal.SIGTERM, _task_sig_stop)
signal.signal(signal.SIGQUIT, _task_sig_stop)
signal.signal(signal.SIGABRT, _task_sig_suicide)
signal.signal(signal.SIGINT, _task_sig_stop)
## we can run the task now:
write_message("Task #%d started." % _TASK_PARAMS['task_id'])
task_update_status("RUNNING")
## run the task:
_TASK_PARAMS['task_starting_time'] = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
sleeptime = _TASK_PARAMS['sleeptime']
try:
try:
if callable(task_run_fnc) and task_run_fnc():
task_update_status("DONE")
else:
task_update_status("DONE WITH ERRORS")
except SystemExit:
pass
except:
write_message(traceback.format_exc()[:-1])
register_exception(alert_admin=True)
if task_get_task_param('stop_queue_on_error'):
task_update_status("ERROR")
else:
task_update_status("CERROR")
finally:
task_status = task_read_status()
if sleeptime:
argv = _task_get_options(_TASK_PARAMS['task_id'], _TASK_PARAMS['task_name'])
verbose_argv = 'Will execute: %s' % ' '.join([escape_shell_arg(str(arg)) for arg in argv])
            # Here we check whether the task can shift away or has to be run
            # at a fixed time
if task_get_task_param('fixed_time') or _TASK_PARAMS['task_name'] in CFG_BIBTASK_FIXEDTIMETASKS:
old_runtime = run_sql("SELECT runtime FROM schTASK WHERE id=%s", (_TASK_PARAMS['task_id'], ))[0][0]
else:
old_runtime = None
new_runtime = get_datetime(sleeptime, now=old_runtime)
## The task is a daemon. We resubmit it
if task_status == 'DONE':
## It has finished in a good way. We recycle the database row
run_sql("UPDATE schTASK SET runtime=%s, status='WAITING', progress=%s, host='' WHERE id=%s", (new_runtime, verbose_argv, _TASK_PARAMS['task_id']))
write_message("Task #%d finished and resubmitted." % _TASK_PARAMS['task_id'])
elif task_status == 'STOPPED':
run_sql("UPDATE schTASK SET status='WAITING', progress=%s, host='' WHERE id=%s", (verbose_argv, _TASK_PARAMS['task_id'], ))
write_message("Task #%d stopped and resubmitted." % _TASK_PARAMS['task_id'])
else:
## We keep the bad result and we resubmit with another id.
#res = run_sql('SELECT proc,user,sleeptime,arguments,priority FROM schTASK WHERE id=%s', (_TASK_PARAMS['task_id'], ))
#proc, user, sleeptime, arguments, priority = res[0]
#run_sql("""INSERT INTO schTASK (proc,user,
#runtime,sleeptime,status,arguments,priority)
#VALUES (%s,%s,%s,%s,'WAITING',%s, %s)""",
#(proc, user, new_runtime, sleeptime, arguments, priority))
write_message("Task #%d finished but not resubmitted. [%s]" % (_TASK_PARAMS['task_id'], task_status))
else:
## we are done:
write_message("Task #%d finished. [%s]" % (_TASK_PARAMS['task_id'], task_status))
## Removing the pid
os.remove(pidfile_name)
    # Let's call the post-process tasklets
if task_get_task_param("post-process"):
split = re.compile(r"(bst_.*)\[(.*)\]")
for tasklet in task_get_task_param("post-process"):
if not split.match(tasklet): # wrong syntax
_usage(1, "There is an error in the post processing option "
"for this task.")
aux_tasklet = split.match(tasklet)
_TASKLETS[aux_tasklet.group(1)](**eval("dict(%s)" % (aux_tasklet.group(2))))
return True
def _usage(exitcode=1, msg="", help_specific_usage="", description=""):
"""Prints usage info."""
if msg:
sys.stderr.write("Error: %s.\n" % msg)
sys.stderr.write("Usage: %s [options]\n" % sys.argv[0])
if help_specific_usage:
sys.stderr.write("Command options:\n")
sys.stderr.write(help_specific_usage)
sys.stderr.write(" Scheduling options:\n")
sys.stderr.write(" -u, --user=USER\tUser name under which to submit this"
" task.\n")
sys.stderr.write(" -t, --runtime=TIME\tTime to execute the task. [default=now]\n"
"\t\t\tExamples: +15s, 5m, 3h, 2002-10-27 13:57:26.\n")
sys.stderr.write(" -s, --sleeptime=SLEEP\tSleeping frequency after"
" which to repeat the task.\n"
"\t\t\tExamples: 30m, 2h, 1d. [default=no]\n")
sys.stderr.write(" --fixed-time\t\tAvoid drifting of execution time when using --sleeptime\n")
sys.stderr.write(" -I, --sequence-id=SEQUENCE-ID\tSequence Id of the current process\n")
sys.stderr.write(" -L --limit=LIMIT\tTime limit when it is"
" allowed to execute the task.\n"
"\t\t\tExamples: 22:00-03:00, Sunday 01:00-05:00.\n"
"\t\t\tSyntax: [Wee[kday]] [hh[:mm][-hh[:mm]]].\n")
sys.stderr.write(" -P, --priority=PRI\tTask priority (0=default, 1=higher, etc).\n")
sys.stderr.write(" -N, --name=NAME\tTask specific name (advanced option).\n\n")
sys.stderr.write(" General options:\n")
sys.stderr.write(" -h, --help\t\tPrint this help.\n")
sys.stderr.write(" -V, --version\t\tPrint version information.\n")
sys.stderr.write(" -v, --verbose=LEVEL\tVerbose level (0=min,"
" 1=default, 9=max).\n")
sys.stderr.write(" --profile=STATS\tPrint profile information. STATS is a comma-separated\n\t\t\tlist of desired output stats (calls, cumulative,\n\t\t\tfile, line, module, name, nfl, pcalls, stdname, time).\n")
sys.stderr.write(" --stop-on-error\tIn case of unrecoverable error stop the bibsched queue.\n")
sys.stderr.write(" --continue-on-error\tIn case of unrecoverable error don't stop the bibsched queue.\n")
sys.stderr.write(" --post-process=BIB_TASKLET_NAME[parameters]\tPostprocesses the specified\n\t\t\tbibtasklet with the given parameters between square\n\t\t\tbrackets.\n")
sys.stderr.write("\t\t\tExample:--post-process \"bst_send_email[fromaddr=\n\t\t\t'foo@xxx.com', toaddr='bar@xxx.com', subject='hello',\n\t\t\tcontent='help']\"\n")
sys.stderr.write(" --email-logs-to=EMAILS Sends an email with the results of the execution\n\t\t\tof the task, and attached the logs (EMAILS could be a comma-\n\t\t\tseparated lists of email addresses)\n")
if description:
sys.stderr.write(description)
sys.exit(exitcode)
def _task_sig_sleep(sig, frame):
"""Signal handler for the 'sleep' signal sent by BibSched."""
signal.signal(signal.SIGTSTP, signal.SIG_IGN)
write_message("task_sig_sleep(), got signal %s frame %s"
% (sig, frame), verbose=9)
write_message("sleeping as soon as possible...")
_db_login(relogin=1)
task_update_status("ABOUT TO SLEEP")
def _task_sig_stop(sig, frame):
"""Signal handler for the 'stop' signal sent by BibSched."""
write_message("task_sig_stop(), got signal %s frame %s"
% (sig, frame), verbose=9)
write_message("stopping as soon as possible...")
_db_login(relogin=1) # To avoid concurrency with an interrupted run_sql call
task_update_status("ABOUT TO STOP")
def _task_sig_suicide(sig, frame):
"""Signal handler for the 'suicide' signal sent by BibSched."""
write_message("task_sig_suicide(), got signal %s frame %s"
% (sig, frame), verbose=9)
write_message("suiciding myself now...")
task_update_status("SUICIDING")
write_message("suicided")
_db_login(relogin=1)
task_update_status("SUICIDED")
sys.exit(1)
def _task_sig_dumb(sig, frame):
"""Dumb signal handler."""
pass
_RE_PSLINE = re.compile(r'^\s*(\w+)\s+(\w+)')
def guess_apache_process_user_from_ps():
"""Guess Apache process user by parsing the list of running processes."""
apache_users = []
try:
# Tested on Linux, Sun and MacOS X
for line in os.popen('ps -A -o user,comm').readlines():
g = _RE_PSLINE.match(line)
if g:
username = g.group(1)
process = os.path.basename(g.group(2))
if process in ('apache', 'apache2', 'httpd') :
if username not in apache_users and username != 'root':
apache_users.append(username)
except Exception, e:
print >> sys.stderr, "WARNING: %s" % e
return tuple(apache_users)
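The ps-output parsing above can be exercised against canned lines; this is a sketch with invented sample data, factored so no processes need to be listed:

```python
import os
import re

# Same pattern as _RE_PSLINE: leading whitespace, user column, command column.
_PSLINE = re.compile(r'^\s*(\w+)\s+(\w+)')

def apache_users_from_lines(lines):
    """Collect non-root users running an Apache-like process."""
    users = []
    for line in lines:
        g = _PSLINE.match(line)
        if not g:
            continue
        username = g.group(1)
        process = os.path.basename(g.group(2))
        if process in ('apache', 'apache2', 'httpd'):
            if username != 'root' and username not in users:
                users.append(username)
    return tuple(users)
```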
def guess_apache_process_user():
"""
Return the possible name of the user running the Apache server process.
(Look at running OS processes or look at OS users defined in /etc/passwd.)
"""
apache_users = guess_apache_process_user_from_ps() + ('apache2', 'apache', 'www-data')
for username in apache_users:
try:
userline = pwd.getpwnam(username)
return userline[0]
except KeyError:
pass
print >> sys.stderr, "ERROR: Cannot detect Apache server process user. Please set the correct value in CFG_BIBSCHED_PROCESS_USER."
sys.exit(1)
def check_running_process_user():
"""
Check that the user running this program is the same as the user
configured in CFG_BIBSCHED_PROCESS_USER or as the user running the
Apache webserver process.
"""
running_as_user = pwd.getpwuid(os.getuid())[0]
if CFG_BIBSCHED_PROCESS_USER:
# We have the expected bibsched process user defined in config,
# so check against her, not against Apache.
if running_as_user != CFG_BIBSCHED_PROCESS_USER:
print >> sys.stderr, """ERROR: You must run "%(x_proc)s" as the user set up in your
CFG_BIBSCHED_PROCESS_USER (seems to be "%(x_user)s").
You may want to do "sudo -u %(x_user)s %(x_proc)s ..." to do so.
If you think this is not right, please set CFG_BIBSCHED_PROCESS_USER
appropriately and rerun "inveniocfg --update-config-py".""" % \
{'x_proc': os.path.basename(sys.argv[0]), 'x_user': CFG_BIBSCHED_PROCESS_USER}
sys.exit(1)
elif running_as_user != guess_apache_process_user(): # not defined in config, check against Apache
print >> sys.stderr, """ERROR: You must run "%(x_proc)s" as the same user that runs your Apache server
process (seems to be "%(x_user)s").
You may want to do "sudo -u %(x_user)s %(x_proc)s ..." to do so.
If you think this is not right, please set CFG_BIBSCHED_PROCESS_USER
appropriately and rerun "inveniocfg --update-config-py".""" % \
{'x_proc': os.path.basename(sys.argv[0]), 'x_user': guess_apache_process_user()}
sys.exit(1)
return
diff --git a/invenio/legacy/bibsched/scripts/bibtaskex.py b/invenio/legacy/bibsched/bibtaskex.py
similarity index 96%
copy from invenio/legacy/bibsched/scripts/bibtaskex.py
copy to invenio/legacy/bibsched/bibtaskex.py
index b46d9942f..a21d961dc 100644
--- a/invenio/legacy/bibsched/scripts/bibtaskex.py
+++ b/invenio/legacy/bibsched/bibtaskex.py
@@ -1,98 +1,96 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Task Example.
Demonstrates BibTask <-> BibSched connectivity, signal handling,
error handling, etc.
"""
__revision__ = "$Id$"
import sys
import time
-from invenio.bibtask import task_init, write_message, task_set_option, \
+
+from invenio.legacy.bibsched.bibtask import task_init, write_message, task_set_option, \
task_get_option, task_update_progress, task_has_option, \
task_get_task_param, task_sleep_now_if_required
def fib(n):
"""Returns Fibonacci number for 'n'."""
out = 1
if n >= 2:
out = fib(n-2) + fib(n-1)
return out
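The recursive definition above is exponential in n (fine for a demo task that is meant to take time). For reference, an equivalent iterative sketch, keeping this file's convention that fib(0) == fib(1) == 1, runs in linear time:

```python
def fib_iter(n):
    """Iterative Fibonacci with fib_iter(0) == fib_iter(1) == 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```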
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key it checks it's meaning, eventually using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False, if it doesn't
know that key.
eg:
if key in ('-n', '--number'):
task_set_option('number', value)
return True
return False
"""
if key in ('-n', '--number'):
task_set_option('number', value)
return True
elif key in ('-e', '--error'):
task_set_option('error', True)
return True
return False
def task_run_core():
"""Runs the task by fetching arguments from the BibSched task queue. This is
what BibSched will be invoking via daemon call.
The task prints Fibonacci numbers for up to NUM on the stdout, and some
messages on stderr.
Return 1 in case of success and 0 in case of failure."""
n = int(task_get_option('number'))
write_message("Printing %d Fibonacci numbers." % n, verbose=9)
for i in range(0, n):
if i > 0 and i % 4 == 0:
write_message("Error: water in the CPU. Ignoring and continuing.", sys.stderr, verbose=3)
elif i > 0 and i % 5 == 0:
write_message("Error: floppy drive dropped on the floor. Ignoring and continuing.", sys.stderr)
if task_get_option('error'):
1 / 0
write_message("fib(%d)=%d" % (i, fib(i)))
task_update_progress("Done %d out of %d." % (i, n))
task_sleep_now_if_required(can_stop_too=True)
time.sleep(1)
task_update_progress("Done %d out of %d." % (n, n))
return 1
+
def main():
"""Main that construct all the bibtask."""
task_init(authorization_action='runbibtaskex',
authorization_msg="BibTaskEx Task Submission",
help_specific_usage="""\
-n, --number Print Fibonacci numbers for up to NUM. [default=30]
-e, --error Raise an error from time to time
""",
version=__revision__,
specific_params=("n:e",
["number=", "error"]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
-
-### okay, here we go:
-if __name__ == '__main__':
- main()
diff --git a/invenio/legacy/bibsched/scripts/__init__.py b/invenio/legacy/bibsched/scripts/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/legacy/bibsched/scripts/bibsched.py b/invenio/legacy/bibsched/scripts/bibsched.py
index dccd0ded7..cbb0edcd1 100644
--- a/invenio/legacy/bibsched/scripts/bibsched.py
+++ b/invenio/legacy/bibsched/scripts/bibsched.py
@@ -1,1827 +1,1830 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibSched - task management, scheduling and executing system for Invenio
"""
__revision__ = "$Id$"
import os
import sys
import time
import re
import marshal
import getopt
from itertools import chain
from socket import gethostname
from subprocess import Popen
import signal
-from invenio.bibtask_config import \
+from invenio.base.factory import with_app_context
+
+from invenio.legacy.bibsched.bibtask_config import \
CFG_BIBTASK_VALID_TASKS, \
CFG_BIBTASK_MONOTASKS, \
CFG_BIBTASK_FIXEDTIMETASKS
from invenio.config import \
CFG_PREFIX, \
CFG_BIBSCHED_REFRESHTIME, \
CFG_BIBSCHED_LOG_PAGER, \
CFG_BIBSCHED_EDITOR, \
CFG_BINDIR, \
CFG_LOGDIR, \
CFG_BIBSCHED_GC_TASKS_OLDER_THAN, \
CFG_BIBSCHED_GC_TASKS_TO_REMOVE, \
CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE, \
CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS, \
CFG_SITE_URL, \
CFG_BIBSCHED_NODE_TASKS, \
CFG_BIBSCHED_MAX_ARCHIVED_ROWS_DISPLAY
from invenio.legacy.dbquery import run_sql, real_escape_string
from invenio.utils.text import wrap_text_in_a_box
from invenio.ext.logging import register_exception, register_emergency
from invenio.utils.shell import run_shell_command
CFG_VALID_STATUS = ('WAITING', 'SCHEDULED', 'RUNNING', 'CONTINUING',
'% DELETED', 'ABOUT TO STOP', 'ABOUT TO SLEEP', 'STOPPED',
'SLEEPING', 'KILLED', 'NOW STOP', 'ERRORS REPORTED')
CFG_MOTD_PATH = os.path.join(CFG_PREFIX, "var", "run", "bibsched.motd")
SHIFT_RE = re.compile(r"([-+]?)(\d+)([dhms])")
class RecoverableError(StandardError):
pass
def get_pager():
"""
Return the first available pager.
"""
paths = (
os.environ.get('PAGER', ''),
CFG_BIBSCHED_LOG_PAGER,
'/usr/bin/less',
'/bin/more'
)
for pager in paths:
if os.path.exists(pager):
return pager
def get_editor():
"""
Return the first available editor.
"""
paths = (
os.environ.get('EDITOR', ''),
CFG_BIBSCHED_EDITOR,
'/usr/bin/vim',
'/usr/bin/emacs',
'/usr/bin/vi',
'/usr/bin/nano',
)
for editor in paths:
if os.path.exists(editor):
return editor
def get_datetime(var, format_string="%Y-%m-%d %H:%M:%S"):
"""Returns a date string according to the format string.
It can handle normal date strings and shifts with respect
to now."""
try:
date = time.time()
factors = {"d": 24*3600, "h": 3600, "m": 60, "s": 1}
m = SHIFT_RE.match(var)
if m:
sign = m.groups()[0] == "-" and -1 or 1
factor = factors[m.groups()[2]]
value = float(m.groups()[1])
date = time.localtime(date + sign * factor * value)
date = time.strftime(format_string, date)
else:
date = time.strptime(var, format_string)
date = time.strftime(format_string, date)
return date
except:
return None
def get_my_pid(process, args=''):
if sys.platform.startswith('freebsd'):
command = "ps -o pid,args | grep '%s %s' | grep -v 'grep' | sed -n 1p" % (process, args)
else:
command = "ps -C %s o '%%p%%a' | grep '%s %s' | grep -v 'grep' | sed -n 1p" % (process, process, args)
answer = run_shell_command(command)[1].strip()
if answer == '':
answer = 0
else:
answer = answer[:answer.find(' ')]
return int(answer)
def get_task_pid(task_name, task_id, ignore_error=False):
"""Return the pid of task_name/task_id"""
try:
path = os.path.join(CFG_PREFIX, 'var', 'run', 'bibsched_task_%d.pid' % task_id)
pid = int(open(path).read())
os.kill(pid, signal.SIGUSR2)
return pid
except (OSError, IOError):
if ignore_error:
return 0
register_exception()
return get_my_pid(task_name, str(task_id))
def get_last_taskid():
"""Return the last taskid used."""
return run_sql("SELECT MAX(id) FROM schTASK")[0][0]
def delete_task(task_id):
"""Delete the corresponding task."""
run_sql("DELETE FROM schTASK WHERE id=%s", (task_id, ))
def is_task_scheduled(task_name):
"""Check if a certain task_name is due for execution (WAITING or RUNNING)"""
sql = """SELECT COUNT(proc) FROM schTASK
WHERE proc = %s AND (status='WAITING' OR status='RUNNING')"""
return run_sql(sql, (task_name,))[0][0] > 0
def get_task_ids_by_descending_date(task_name, statuses=['SCHEDULED']):
"""Returns list of task ids, ordered by descending runtime."""
sql = """SELECT id FROM schTASK
WHERE proc=%s AND (%s)
ORDER BY runtime DESC""" \
% " OR ".join(["status = '%s'" % x for x in statuses])
return [x[0] for x in run_sql(sql, (task_name,))]
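The status filter in the query above is built by string interpolation; the clause construction alone can be sketched like this (illustrative only, no database involved):

```python
def build_status_clause(statuses):
    """Build the OR-joined status condition used in the query above."""
    return " OR ".join("status = '%s'" % s for s in statuses)
```

In new code, placeholders (`%s` with a parameter tuple passed to run_sql) would avoid interpolating the status values into the SQL string directly.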
def get_task_options(task_id):
"""Returns options for task_id read from the BibSched task queue table."""
res = run_sql("SELECT arguments FROM schTASK WHERE id=%s", (task_id,))
try:
return marshal.loads(res[0][0])
except IndexError:
return list()
def gc_tasks(verbose=False, statuses=None, since=None, tasks=None): # pylint: disable=W0613
"""Garbage collect the task queue."""
if tasks is None:
tasks = CFG_BIBSCHED_GC_TASKS_TO_REMOVE + CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE
if since is None:
since = '-%id' % CFG_BIBSCHED_GC_TASKS_OLDER_THAN
if statuses is None:
statuses = ['DONE']
statuses = [status.upper() for status in statuses if status.upper() != 'RUNNING']
date = get_datetime(since)
status_query = 'status in (%s)' % ','.join([repr(real_escape_string(status)) for status in statuses])
for task in tasks:
if task in CFG_BIBSCHED_GC_TASKS_TO_REMOVE:
res = run_sql("""DELETE FROM schTASK WHERE proc=%%s AND %s AND
runtime<%%s""" % status_query, (task, date))
write_message('Deleted %s %s tasks (created before %s) with %s'
% (res, task, date, status_query))
elif task in CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE:
run_sql("""INSERT INTO hstTASK(id,proc,host,user,
runtime,sleeptime,arguments,status,progress)
SELECT id,proc,host,user,
runtime,sleeptime,arguments,status,progress
FROM schTASK WHERE proc=%%s AND %s AND
runtime<%%s""" % status_query, (task, date))
res = run_sql("""DELETE FROM schTASK WHERE proc=%%s AND %s AND
runtime<%%s""" % status_query, (task, date))
write_message('Archived %s %s tasks (created before %s) with %s'
% (res, task, date, status_query))
def spawn_task(command, wait=False):
"""
Spawn the provided command in a way that is detached from the current
group. In this way a signal received by bibsched is not going to be
automatically propagated to the spawned process.
"""
def preexec(): # Don't forward signals.
os.setsid()
devnull = open(os.devnull, "w")
process = Popen(command, preexec_fn=preexec, shell=True,
stderr=devnull, stdout=devnull)
if wait:
process.wait()
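The detachment trick above (`preexec_fn` calling `os.setsid`) puts the child in its own session, so a signal delivered to bibsched's process group is not forwarded to it. A minimal standalone sketch of the same pattern (POSIX-only; spawns a no-op shell command):

```python
import os
from subprocess import Popen

def spawn_detached(command):
    """Run 'command' in a new session, discarding its output."""
    devnull = open(os.devnull, "w")
    # os.setsid in the child detaches it from our process group,
    # so signals sent to our group do not reach it.
    process = Popen(command, preexec_fn=os.setsid, shell=True,
                    stderr=devnull, stdout=devnull)
    process.wait()
    devnull.close()
    return process.returncode
```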
def bibsched_get_host(task_id):
"""Retrieve the hostname of the task"""
res = run_sql("SELECT host FROM schTASK WHERE id=%s LIMIT 1", (task_id, ), 1)
if res:
return res[0][0]
def bibsched_set_host(task_id, host=""):
"""Update the progress of task_id."""
return run_sql("UPDATE schTASK SET host=%s WHERE id=%s", (host, task_id))
def bibsched_get_status(task_id):
"""Retrieve the task status."""
res = run_sql("SELECT status FROM schTASK WHERE id=%s LIMIT 1", (task_id, ), 1)
if res:
return res[0][0]
def bibsched_set_status(task_id, status, when_status_is=None):
"""Update the status of task_id."""
if when_status_is is None:
return run_sql("UPDATE schTASK SET status=%s WHERE id=%s",
(status, task_id))
else:
return run_sql("UPDATE schTASK SET status=%s WHERE id=%s AND status=%s",
(status, task_id, when_status_is))
def bibsched_set_progress(task_id, progress):
"""Update the progress of task_id."""
return run_sql("UPDATE schTASK SET progress=%s WHERE id=%s", (progress, task_id))
def bibsched_set_priority(task_id, priority):
"""Update the priority of task_id."""
return run_sql("UPDATE schTASK SET priority=%s WHERE id=%s", (priority, task_id))
def bibsched_send_signal(proc, task_id, sig):
"""Send a signal to a given task."""
if bibsched_get_host(task_id) != gethostname():
return False
pid = get_task_pid(proc, task_id, True)
if pid:
try:
os.kill(pid, sig)
return True
except OSError:
return False
return False
def is_monotask(task_id, proc, runtime, status, priority, host, sequenceid): # pylint: disable=W0613
procname = proc.split(':')[0]
return procname in CFG_BIBTASK_MONOTASKS
def stop_task(other_task_id, other_proc, other_priority, other_status, other_sequenceid): # pylint: disable=W0613
Log("Send STOP signal to #%d (%s) which was in status %s" % (other_task_id, other_proc, other_status))
bibsched_set_status(other_task_id, 'ABOUT TO STOP', other_status)
def sleep_task(other_task_id, other_proc, other_priority, other_status, other_sequenceid): # pylint: disable=W0613
Log("Send SLEEP signal to #%d (%s) which was in status %s" % (other_task_id, other_proc, other_status))
bibsched_set_status(other_task_id, 'ABOUT TO SLEEP', other_status)
class Manager(object):
def __init__(self, old_stdout):
import curses
import curses.panel
from curses.wrapper import wrapper
self.old_stdout = old_stdout
self.curses = curses
self.helper_modules = CFG_BIBTASK_VALID_TASKS
self.running = 1
self.footer_auto_mode = "Automatic Mode [A Manual] [1/2/3 Display] [P Purge] [l/L Log] [O Opts] [E Edit motd] [Q Quit]"
self.footer_select_mode = "Manual Mode [A Automatic] [1/2/3 Display Type] [P Purge] [l/L Log] [O Opts] [E Edit motd] [Q Quit]"
self.footer_waiting_item = "[R Run] [D Delete] [N Priority]"
self.footer_running_item = "[S Sleep] [T Stop] [K Kill]"
self.footer_stopped_item = "[I Initialise] [D Delete] [K Acknowledge]"
self.footer_sleeping_item = "[W Wake Up] [T Stop] [K Kill]"
self.item_status = ""
self.rows = []
self.panel = None
self.display = 2
self.first_visible_line = 0
self.auto_mode = 0
self.currentrow = None
self.current_attr = 0
self.hostname = gethostname()
self.allowed_task_types = CFG_BIBSCHED_NODE_TASKS.get(self.hostname, CFG_BIBTASK_VALID_TASKS)
self.motd = ""
self.header_lines = 2
self.read_motd()
self.selected_line = self.header_lines
wrapper(self.start)
def read_motd(self):
"""Get a fresh motd from disk, if it exists."""
self.motd = ""
self.header_lines = 2
try:
if os.path.exists(CFG_MOTD_PATH):
motd = open(CFG_MOTD_PATH).read().strip()
if motd:
self.motd = "MOTD [%s] " % time.strftime("%Y-%m-%d %H:%M", time.localtime(os.path.getmtime(CFG_MOTD_PATH))) + motd
self.header_lines = 3
except IOError:
pass
def handle_keys(self, char):
if char == -1:
return
if self.auto_mode and (char not in (self.curses.KEY_UP,
self.curses.KEY_DOWN,
self.curses.KEY_PPAGE,
self.curses.KEY_NPAGE,
ord("g"), ord("G"), ord("n"),
ord("q"), ord("Q"), ord("a"),
ord("A"), ord("1"), ord("2"), ord("3"),
ord("p"), ord("P"), ord("o"), ord("O"),
ord("l"), ord("L"), ord("e"), ord("E"))):
self.display_in_footer("in automatic mode")
else:
status = self.currentrow and self.currentrow[5] or None
if char == self.curses.KEY_UP:
self.selected_line = max(self.selected_line - 1,
self.header_lines)
self.repaint()
if char == self.curses.KEY_PPAGE:
self.selected_line = max(self.selected_line - 10,
self.header_lines)
self.repaint()
elif char == self.curses.KEY_DOWN:
self.selected_line = min(self.selected_line + 1,
len(self.rows) + self.header_lines - 1)
self.repaint()
elif char == self.curses.KEY_NPAGE:
self.selected_line = min(self.selected_line + 10,
len(self.rows) + self.header_lines - 1)
self.repaint()
elif char == self.curses.KEY_HOME:
self.first_visible_line = 0
self.selected_line = self.header_lines
elif char == ord("g"):
self.selected_line = self.header_lines
self.repaint()
elif char == ord("G"):
self.selected_line = len(self.rows) + self.header_lines - 1
self.repaint()
elif char in (ord("a"), ord("A")):
self.change_auto_mode()
elif char == ord("l"):
self.openlog()
elif char == ord("L"):
self.openlog(err=True)
elif char in (ord("w"), ord("W")):
self.wakeup()
elif char in (ord("n"), ord("N")):
self.change_priority()
elif char in (ord("r"), ord("R")):
if status in ('WAITING', 'SCHEDULED'):
self.run()
elif char in (ord("s"), ord("S")):
self.sleep()
elif char in (ord("k"), ord("K")):
if status in ('ERROR', 'DONE WITH ERRORS', 'ERRORS REPORTED'):
self.acknowledge()
elif status is not None:
self.kill()
elif char in (ord("t"), ord("T")):
self.stop()
elif char in (ord("d"), ord("D")):
self.delete()
elif char in (ord("i"), ord("I")):
self.init()
elif char in (ord("p"), ord("P")):
self.purge_done()
elif char in (ord("o"), ord("O")):
self.display_task_options()
elif char in (ord("e"), ord("E")):
self.edit_motd()
self.read_motd()
elif char == ord("1"):
self.display = 1
self.first_visible_line = 0
self.selected_line = self.header_lines
# We need to update the display to display done tasks
self.update_rows()
self.repaint()
self.display_in_footer("only done processes are displayed")
elif char == ord("2"):
self.display = 2
self.first_visible_line = 0
self.selected_line = self.header_lines
# We need to update the display to display not done tasks
self.update_rows()
self.repaint()
self.display_in_footer("only not done processes are displayed")
elif char == ord("3"):
self.display = 3
self.first_visible_line = 0
self.selected_line = self.header_lines
# We need to update the display to display archived tasks
self.update_rows()
self.repaint()
self.display_in_footer("only archived processes are displayed")
elif char in (ord("q"), ord("Q")):
if self.curses.panel.top_panel() == self.panel:
self.panel = None
self.curses.panel.update_panels()
else:
self.running = 0
return
def openlog(self, err=False):
task_id = self.currentrow[0]
if err:
logname = os.path.join(CFG_LOGDIR, 'bibsched_task_%d.err' % task_id)
else:
logname = os.path.join(CFG_LOGDIR, 'bibsched_task_%d.log' % task_id)
if os.path.exists(logname):
pager = get_pager()
if os.path.exists(pager):
self.curses.endwin()
os.system('%s %s' % (pager, logname))
print >> self.old_stdout, "\rPress ENTER to continue",
self.old_stdout.flush()
raw_input()
# We need to redraw the bibsched task list
# since we are displaying "Press ENTER to continue"
self.repaint()
else:
self._display_message_box("No pager was found")
def edit_motd(self):
"""Add, delete or change the motd message that will be shown when the
bibsched monitor starts."""
editor = get_editor()
if editor:
previous = self.motd
self.curses.endwin()
os.system("%s %s" % (editor, CFG_MOTD_PATH))
# We need to redraw the MOTD part
self.read_motd()
self.repaint()
if previous[24:] != self.motd[24:]:
if len(previous) == 0:
Log('motd set to "%s"' % self.motd.replace("\n", "|"))
self.selected_line += 1
self.header_lines += 1
elif len(self.motd) == 0:
Log('motd deleted')
self.selected_line -= 1
self.header_lines -= 1
else:
Log('motd changed to "%s"' % self.motd.replace("\n", "|"))
else:
self._display_message_box("No editor was found")
def display_task_options(self):
"""Nicely display information about current process."""
msg = ' id: %i\n\n' % self.currentrow[0]
pid = get_task_pid(self.currentrow[1], self.currentrow[0], True)
if pid is not None:
msg += ' pid: %s\n\n' % pid
msg += ' priority: %s\n\n' % self.currentrow[8]
msg += ' proc: %s\n\n' % self.currentrow[1]
msg += ' user: %s\n\n' % self.currentrow[2]
msg += ' runtime: %s\n\n' % self.currentrow[3].strftime("%Y-%m-%d %H:%M:%S")
msg += ' sleeptime: %s\n\n' % self.currentrow[4]
msg += ' status: %s\n\n' % self.currentrow[5]
msg += ' progress: %s\n\n' % self.currentrow[6]
arguments = marshal.loads(self.currentrow[7])
if type(arguments) is dict:
# FIXME: REMOVE AFTER MAJOR RELEASE 1.0
msg += ' options : %s\n\n' % arguments
else:
msg += 'executable : %s\n\n' % arguments[0]
msg += ' arguments : %s\n\n' % ' '.join(arguments[1:])
msg += '\n\nPress q to quit this panel...'
msg = wrap_text_in_a_box(msg, style='no_border')
rows = msg.split('\n')
height = len(rows) + 2
width = max([len(row) for row in rows]) + 4
try:
self.win = self.curses.newwin(
height,
width,
(self.height - height) / 2 + 1,
(self.width - width) / 2 + 1
)
except self.curses.error:
return
self.panel = self.curses.panel.new_panel(self.win)
self.panel.top()
self.win.border()
i = 1
for row in rows:
self.win.addstr(i, 2, row, self.current_attr)
i += 1
self.win.refresh()
while self.win.getkey() != 'q':
pass
self.panel = None
def count_processes(self, status):
out = 0
res = run_sql("""SELECT COUNT(id) FROM schTASK
WHERE status=%s GROUP BY status""", (status,))
try:
out = res[0][0]
except IndexError:
pass
return out
def change_priority(self):
task_id = self.currentrow[0]
priority = self.currentrow[8]
new_priority = self._display_ask_number_box("Insert the desired \
priority for task %s. The smaller the number, the lower the priority. Note that \
a number less than -10 means the task will always be postponed, while a number \
greater than 10 means some lower-priority tasks could be stopped in \
order to let this task run. The current priority is %s. New value:"
% (task_id, priority))
try:
new_priority = int(new_priority)
except ValueError:
return
bibsched_set_priority(task_id, new_priority)
# We need to update the tasks list with our new priority
# to be able to display it
self.update_rows()
# We need to update the priority number next to the task
self.repaint()
def wakeup(self):
task_id = self.currentrow[0]
process = self.currentrow[1]
status = self.currentrow[5]
#if self.count_processes('RUNNING') + self.count_processes('CONTINUING') >= 1:
#self.display_in_footer("a process is already running!")
if status == "SLEEPING":
if not bibsched_send_signal(process, task_id, signal.SIGCONT):
bibsched_set_status(task_id, "ERROR", "SLEEPING")
self.update_rows()
self.repaint()
self.display_in_footer("process woken up")
else:
self.display_in_footer("process is not sleeping")
self.stdscr.refresh()
def _display_YN_box(self, msg):
"""Utility to display confirmation boxes."""
msg += ' (Y/N)'
msg = wrap_text_in_a_box(msg, style='no_border')
rows = msg.split('\n')
height = len(rows) + 2
width = max([len(row) for row in rows]) + 4
self.win = self.curses.newwin(
height,
width,
(self.height - height) / 2 + 1,
(self.width - width) / 2 + 1
)
self.panel = self.curses.panel.new_panel(self.win)
self.panel.top()
self.win.border()
i = 1
for row in rows:
self.win.addstr(i, 2, row, self.current_attr)
i += 1
self.win.refresh()
try:
while 1:
c = self.win.getch()
if c in (ord('y'), ord('Y')):
return True
elif c in (ord('n'), ord('N')):
return False
finally:
self.panel = None
def _display_ask_number_box(self, msg):
"""Utility to display confirmation boxes."""
msg = wrap_text_in_a_box(msg, style='no_border')
rows = msg.split('\n')
height = len(rows) + 3
width = max([len(row) for row in rows]) + 4
self.win = self.curses.newwin(
height,
width,
(self.height - height) / 2 + 1,
(self.width - width) / 2 + 1
)
self.panel = self.curses.panel.new_panel(self.win)
self.panel.top()
self.win.border()
i = 1
for row in rows:
self.win.addstr(i, 2, row, self.current_attr)
i += 1
self.win.refresh()
self.win.move(height - 2, 2)
self.curses.echo()
ret = self.win.getstr()
self.curses.noecho()
self.panel = None
return ret
def _display_message_box(self, msg):
"""Utility to display message boxes."""
rows = msg.split('\n')
height = len(rows) + 2
width = max([len(row) for row in rows]) + 3
self.win = self.curses.newwin(
height,
width,
(self.height - height) / 2 + 1,
(self.width - width) / 2 + 1
)
self.panel = self.curses.panel.new_panel(self.win)
self.panel.top()
self.win.border()
i = 1
for row in rows:
self.win.addstr(i, 2, row, self.current_attr)
i += 1
self.win.refresh()
self.win.move(height - 2, 2)
self.win.getkey()
self.curses.noecho()
self.panel = None
def purge_done(self):
"""Garbage collector."""
if self._display_YN_box(
"You are going to purge the list of DONE tasks.\n\n"
"%s tasks, submitted since %s days, will be archived.\n\n"
"%s tasks, submitted since %s days, will be deleted.\n\n"
"Are you sure?" % (
', '.join(CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE),
CFG_BIBSCHED_GC_TASKS_OLDER_THAN,
', '.join(CFG_BIBSCHED_GC_TASKS_TO_REMOVE),
CFG_BIBSCHED_GC_TASKS_OLDER_THAN)):
gc_tasks()
# We removed some tasks from our list
self.update_rows()
self.repaint()
self.display_in_footer("DONE processes purged")
def run(self):
task_id = self.currentrow[0]
process = self.currentrow[1].split(':')[0]
status = self.currentrow[5]
if status == "WAITING":
if process in self.helper_modules:
if run_sql("""UPDATE schTASK SET status='SCHEDULED', host=%s
WHERE id=%s and status='WAITING'""",
(self.hostname, task_id)):
program = os.path.join(CFG_BINDIR, process)
command = "%s %s" % (program, str(task_id))
spawn_task(command)
Log("manually running task #%d (%s)" % (task_id, process))
# We changed the status of one of our tasks
self.update_rows()
self.repaint()
else:
## Process already running (typing too quickly on the keyboard?)
pass
else:
self.display_in_footer("Process %s is not in the list of allowed processes." % process)
else:
self.display_in_footer("Process status should be SCHEDULED or WAITING!")
def acknowledge(self):
task_id = self.currentrow[0]
status = self.currentrow[5]
if status in ('ERROR', 'DONE WITH ERRORS', 'ERRORS REPORTED'):
bibsched_set_status(task_id, 'ACK ' + status, status)
self.update_rows()
self.repaint()
self.display_in_footer("Acknowledged error")
def sleep(self):
task_id = self.currentrow[0]
status = self.currentrow[5]
if status in ('RUNNING', 'CONTINUING'):
bibsched_set_status(task_id, 'ABOUT TO SLEEP', status)
self.update_rows()
self.repaint()
self.display_in_footer("SLEEP signal sent to task #%s" % task_id)
else:
self.display_in_footer("Cannot put to sleep non-running processes")
def kill(self):
task_id = self.currentrow[0]
process = self.currentrow[1]
status = self.currentrow[5]
if status in ('RUNNING', 'CONTINUING', 'ABOUT TO STOP', 'ABOUT TO SLEEP', 'SLEEPING'):
if self._display_YN_box("Are you sure you want to kill the %s process %s?" % (process, task_id)):
bibsched_send_signal(process, task_id, signal.SIGKILL)
bibsched_set_status(task_id, 'KILLED')
self.update_rows()
self.repaint()
self.display_in_footer("KILL signal sent to task #%s" % task_id)
else:
self.display_in_footer("Cannot kill non-running processes")
def stop(self):
task_id = self.currentrow[0]
process = self.currentrow[1]
status = self.currentrow[5]
if status in ('RUNNING', 'CONTINUING', 'ABOUT TO SLEEP', 'SLEEPING'):
if status == 'SLEEPING':
bibsched_set_status(task_id, 'NOW STOP', 'SLEEPING')
bibsched_send_signal(process, task_id, signal.SIGCONT)
count = 10
while bibsched_get_status(task_id) == 'NOW STOP':
if count <= 0:
bibsched_set_status(task_id, 'ERROR', 'NOW STOP')
self.update_rows()
self.repaint()
self.display_in_footer("It seems impossible to wakeup this task.")
return
time.sleep(CFG_BIBSCHED_REFRESHTIME)
count -= 1
else:
bibsched_set_status(task_id, 'ABOUT TO STOP', status)
self.update_rows()
self.repaint()
self.display_in_footer("STOP signal sent to task #%s" % task_id)
else:
self.display_in_footer("Cannot stop non-running processes")
def delete(self):
task_id = self.currentrow[0]
status = self.currentrow[5]
if status not in ('RUNNING', 'CONTINUING', 'SLEEPING', 'SCHEDULED', 'ABOUT TO STOP', 'ABOUT TO SLEEP'):
bibsched_set_status(task_id, "%s_DELETED" % status, status)
self.display_in_footer("process deleted")
self.update_rows()
self.repaint()
else:
self.display_in_footer("Cannot delete running processes")
def init(self):
task_id = self.currentrow[0]
status = self.currentrow[5]
if status not in ('RUNNING', 'CONTINUING', 'SLEEPING'):
bibsched_set_status(task_id, "WAITING")
bibsched_set_progress(task_id, "")
bibsched_set_host(task_id, "")
self.update_rows()
self.repaint()
self.display_in_footer("process initialised")
else:
self.display_in_footer("Cannot initialise running processes")
def change_auto_mode(self):
program = os.path.join(CFG_BINDIR, "bibsched")
if self.auto_mode:
COMMAND = "%s -q halt" % program
else:
COMMAND = "%s -q start" % program
os.system(COMMAND)
self.auto_mode = not self.auto_mode
# We need to refresh the color of the header and footer
self.repaint()
def put_line(self, row, header=False, motd=False):
## ROW: (id,proc,user,runtime,sleeptime,status,progress,arguments,priority,host)
## 0 1 2 3 4 5 6 7 8 9
col_w = [8 , 25, 15, 21, 7, 12, 21, 60]
maxx = self.width
if self.y == self.selected_line - self.first_visible_line and self.y > 1:
self.item_status = row[5]
self.currentrow = row
if motd:
attr = self.curses.color_pair(1) + self.curses.A_BOLD
elif self.y == self.header_lines - 2:
if self.auto_mode:
attr = self.curses.color_pair(2) + self.curses.A_STANDOUT + self.curses.A_BOLD
else:
attr = self.curses.color_pair(8) + self.curses.A_STANDOUT + self.curses.A_BOLD
elif row[5] == "DONE":
attr = self.curses.color_pair(5) + self.curses.A_BOLD
elif row[5] == "STOPPED":
attr = self.curses.color_pair(6) + self.curses.A_BOLD
elif row[5].find("ERROR") > -1:
attr = self.curses.color_pair(4) + self.curses.A_BOLD
elif row[5] == "WAITING":
attr = self.curses.color_pair(3) + self.curses.A_BOLD
elif row[5] in ("RUNNING", "CONTINUING"):
attr = self.curses.color_pair(2) + self.curses.A_BOLD
elif not header and row[8]:
attr = self.curses.A_BOLD
else:
attr = self.curses.A_NORMAL
## If the task is not relevant for this instance of BibSched because
## this type of task cannot be run here, or because it is running on
## another machine: make it a different color
if not header and (row[1].split(':')[0] not in self.allowed_task_types or
(row[9] != '' and row[9] != self.hostname)):
attr = self.curses.color_pair(6)
if not row[6]:
nrow = list(row)
nrow[6] = 'Not allowed on this instance'
row = tuple(nrow)
if self.y == self.selected_line - self.first_visible_line and self.y > 1:
self.current_attr = attr
attr += self.curses.A_REVERSE
if header: # Dirty hack; put_line should be refactored.
# row contains one element less: arguments
## !!! FIXME: this block needs a proper refactoring
myline = str(row[0]).ljust(col_w[0]-1)
myline += str(row[1]).ljust(col_w[1]-1)
myline += str(row[2]).ljust(col_w[2]-1)
myline += str(row[3]).ljust(col_w[3]-1)
myline += str(row[4]).ljust(col_w[4]-1)
myline += str(row[5]).ljust(col_w[5]-1)
myline += str(row[6]).ljust(col_w[6]-1)
myline += str(row[7]).ljust(col_w[7]-1)
elif motd:
myline = str(row[0])
else:
## ROW: (id,proc,user,runtime,sleeptime,status,progress,arguments,priority,host)
## 0 1 2 3 4 5 6 7 8 9
priority = str(row[8] and ' [%s]' % row[8] or '')
myline = str(row[0]).ljust(col_w[0])[:col_w[0]-1]
myline += (str(row[1])[:col_w[1]-len(priority)-2] + priority).ljust(col_w[1]-1)
myline += str(row[2]).ljust(col_w[2])[:col_w[2]-1]
myline += str(row[3]).ljust(col_w[3])[:col_w[3]-1]
myline += str(row[4]).ljust(col_w[4])[:col_w[4]-1]
myline += str(row[5]).ljust(col_w[5])[:col_w[5]-1]
myline += str(row[9]).ljust(col_w[6])[:col_w[6]-1]
myline += str(row[6]).ljust(col_w[7])[:col_w[7]-1]
myline = myline.ljust(maxx)
try:
self.stdscr.addnstr(self.y, 0, myline, maxx, attr)
except self.curses.error:
pass
self.y += 1
def display_in_footer(self, footer, i=0, print_time_p=0):
if print_time_p:
footer = "%s %s" % (footer, time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
maxx = self.stdscr.getmaxyx()[1]
footer = footer.ljust(maxx)
if self.auto_mode:
colorpair = 2
else:
colorpair = 1
try:
self.stdscr.addnstr(self.y - i, 0, footer, maxx - 1, self.curses.A_STANDOUT + self.curses.color_pair(colorpair) + self.curses.A_BOLD)
except self.curses.error:
pass
def repaint(self):
if server_pid():
self.auto_mode = 1
else:
if self.auto_mode == 1:
self.curses.beep()
self.auto_mode = 0
self.y = 0
self.stdscr.erase()
self.height, self.width = self.stdscr.getmaxyx()
maxy = self.height - 2
#maxx = self.width
if len(self.motd) > 0:
self.put_line((self.motd.strip().replace("\n", " - ")[:79], "", "", "", "", "", "", "", ""), header=False, motd=True)
self.put_line(("ID", "PROC [PRI]", "USER", "RUNTIME", "SLEEP", "STATUS", "HOST", "PROGRESS"), header=True)
self.put_line(("", "", "", "", "", "", "", ""), header=True)
if self.selected_line > maxy + self.first_visible_line - 1:
self.first_visible_line = self.selected_line - maxy + 1
if self.selected_line < self.first_visible_line + 2:
self.first_visible_line = self.selected_line - 2
for row in self.rows[self.first_visible_line:self.first_visible_line+maxy-2]:
self.put_line(row)
self.y = self.stdscr.getmaxyx()[0] - 1
if self.auto_mode:
self.display_in_footer(self.footer_auto_mode, print_time_p=1)
else:
self.display_in_footer(self.footer_select_mode, print_time_p=1)
footer2 = ""
if self.item_status.find("DONE") > -1 or self.item_status in ("ERROR", "STOPPED", "KILLED", "ERRORS REPORTED"):
footer2 += self.footer_stopped_item
elif self.item_status in ("RUNNING", "CONTINUING", "ABOUT TO STOP", "ABOUT TO SLEEP"):
footer2 += self.footer_running_item
elif self.item_status == "SLEEPING":
footer2 += self.footer_sleeping_item
elif self.item_status == "WAITING":
footer2 += self.footer_waiting_item
self.display_in_footer(footer2, 1)
self.stdscr.refresh()
def update_rows(self):
if self.display == 1:
table = "schTASK"
where = "and (status='DONE' or status LIKE 'ACK%')"
order = "runtime DESC"
limit = ""
elif self.display == 2:
table = "schTASK"
where = "and (status<>'DONE' and status NOT LIKE 'ACK%')"
order = "runtime ASC"
limit = "limit %s" % CFG_BIBSCHED_MAX_ARCHIVED_ROWS_DISPLAY
else:
table = "hstTASK"
order = "runtime DESC"
where = ""
limit = ""
self.rows = run_sql("""SELECT id, proc, user, runtime, sleeptime,
status, progress, arguments, priority, host,
sequenceid
FROM %s
WHERE status NOT LIKE '%%_DELETED' %s
ORDER BY %s
%s""" % (table, where, order, limit))
# Make sure we are not selecting a line that disappeared
self.selected_line = min(self.selected_line,
len(self.rows) + self.header_lines - 1)
def start(self, stdscr):
os.environ['BIBSCHED_MODE'] = 'manual'
if self.curses.has_colors():
self.curses.start_color()
self.curses.init_pair(8, self.curses.COLOR_WHITE, self.curses.COLOR_BLACK)
self.curses.init_pair(1, self.curses.COLOR_WHITE, self.curses.COLOR_RED)
self.curses.init_pair(2, self.curses.COLOR_GREEN, self.curses.COLOR_BLACK)
self.curses.init_pair(3, self.curses.COLOR_MAGENTA, self.curses.COLOR_BLACK)
self.curses.init_pair(4, self.curses.COLOR_RED, self.curses.COLOR_BLACK)
self.curses.init_pair(5, self.curses.COLOR_BLUE, self.curses.COLOR_BLACK)
self.curses.init_pair(6, self.curses.COLOR_CYAN, self.curses.COLOR_BLACK)
self.curses.init_pair(7, self.curses.COLOR_YELLOW, self.curses.COLOR_BLACK)
self.stdscr = stdscr
self.base_panel = self.curses.panel.new_panel(self.stdscr)
self.base_panel.bottom()
self.curses.panel.update_panels()
self.height, self.width = stdscr.getmaxyx()
self.stdscr.erase()
if server_pid():
self.auto_mode = 1
ring = 4
if len(self.motd) > 0:
self._display_message_box(self.motd + "\nPress any key to close")
while self.running:
if ring == 4:
self.read_motd()
self.update_rows()
ring = 0
self.repaint()
ring += 1
char = -1
try:
char = timed_out(self.stdscr.getch, 1)
if char == 27: # escape sequence
char = self.stdscr.getch()
if char == 79: # arrow
char = self.stdscr.getch()
if char == 65: # arrow up
char = self.curses.KEY_UP
elif char == 66: # arrow down
char = self.curses.KEY_DOWN
elif char == 72:
char = self.curses.KEY_PPAGE
elif char == 70:
char = self.curses.KEY_NPAGE
elif char == 91:
char = self.stdscr.getch()
if char == 53:
char = self.stdscr.getch()
if char == 126:
char = self.curses.KEY_HOME
except TimedOutExc:
char = -1
self.handle_keys(char)
class BibSched(object):
def __init__(self, debug=False):
self.debug = debug
self.hostname = gethostname()
self.helper_modules = CFG_BIBTASK_VALID_TASKS
## All the tasks in the queue that the node is allowed to manipulate
self.node_relevant_bibupload_tasks = ()
self.node_relevant_waiting_tasks = ()
self.node_relevant_active_tasks = ()
## All tasks of all nodes
self.active_tasks_all_nodes = ()
self.mono_tasks_all_nodes = ()
self.allowed_task_types = CFG_BIBSCHED_NODE_TASKS.get(self.hostname, CFG_BIBTASK_VALID_TASKS)
os.environ['BIBSCHED_MODE'] = 'automatic'
def tie_task_to_host(self, task_id):
"""Sets the hostname of a task to the machine executing this script
@return: True if the scheduling was successful, False otherwise,
e.g. if the task was scheduled concurrently on a different host.
"""
if not run_sql("""SELECT id FROM schTASK WHERE id=%s AND host=''
AND status='WAITING'""", (task_id, )):
## The task was already tied?
return False
run_sql("""UPDATE schTASK SET host=%s, status='SCHEDULED'
WHERE id=%s AND host='' AND status='WAITING'""",
(self.hostname, task_id))
return bool(run_sql("SELECT id FROM schTASK WHERE id=%s AND host=%s",
(task_id, self.hostname)))
def filter_for_allowed_tasks(self):
""" Removes all tasks that are not allowed in this Invenio instance
"""
def relevant_task(task_id, proc, runtime, status, priority, host, sequenceid): # pylint: disable=W0613
# if host and self.hostname != host:
# return False
procname = proc.split(':')[0]
if procname not in self.allowed_task_types:
return False
return True
def filter_tasks(tasks):
return tuple(t for t in tasks if relevant_task(*t))
self.node_relevant_bibupload_tasks = filter_tasks(self.node_relevant_bibupload_tasks)
self.node_relevant_active_tasks = filter_tasks(self.node_relevant_active_tasks)
self.node_relevant_waiting_tasks = filter_tasks(self.node_relevant_waiting_tasks)
self.node_relevant_sleeping_tasks = filter_tasks(self.node_relevant_sleeping_tasks)
def is_task_safe_to_execute(self, proc1, proc2):
"""Return True when the two tasks can run concurrently."""
return proc1 != proc2 # and not proc1.startswith('bibupload') and not proc2.startswith('bibupload')
def get_tasks_to_sleep_and_stop(self, proc, task_set):
"""Among the task_set, return the list of tasks to stop and the list
of tasks to sleep.
"""
if proc in CFG_BIBTASK_MONOTASKS:
return [], [t for t in task_set
if t[3] not in ('SLEEPING', 'ABOUT TO SLEEP')]
min_prio = None
min_task_id = None
min_proc = None
min_status = None
min_sequenceid = None
to_stop = []
## For all the lower priority tasks...
for (this_task_id, this_proc, this_priority, this_status, this_sequenceid) in task_set:
if not self.is_task_safe_to_execute(this_proc, proc):
to_stop.append((this_task_id, this_proc, this_priority, this_status, this_sequenceid))
elif (min_prio is None or this_priority < min_prio) and \
this_status not in ('SLEEPING', 'ABOUT TO SLEEP'):
## We don't put an already sleeping task to sleep :-)
min_prio = this_priority
min_task_id = this_task_id
min_proc = this_proc
min_status = this_status
min_sequenceid = this_sequenceid
if to_stop:
return to_stop, []
elif min_task_id:
return [], [(min_task_id, min_proc, min_prio, min_status, min_sequenceid)]
else:
return [], []
def split_active_tasks_by_priority(self, task_id, priority):
"""Return two lists: the list of task_ids with lower priority and
those with higher or equal priority."""
higher = []
lower = []
### !!! We already have this in node_relevant_active_tasks
for other_task_id, task_proc, dummy, status, task_priority, task_host, sequenceid in self.node_relevant_active_tasks:
# for other_task_id, task_proc, runtime, status, task_priority, task_host in self.node_relevant_active_tasks:
# for other_task_id, task_proc, task_priority, status in self.get_running_tasks():
if task_id == other_task_id:
continue
if task_priority < priority and task_host == self.hostname:
lower.append((other_task_id, task_proc, task_priority, status, sequenceid))
elif task_host == self.hostname:
higher.append((other_task_id, task_proc, task_priority, status, sequenceid))
return lower, higher
def handle_task(self, task_id, proc, runtime, status, priority, host, sequenceid):
"""Perform needed action of the row representing a task.
Return True when task_status need to be refreshed"""
debug = self.debug
if debug:
Log("task_id: %s, proc: %s, runtime: %s, status: %s, priority: %s, host: %s, sequenceid: %s" %
(task_id, proc, runtime, status, priority, host, sequenceid))
if (task_id, proc, runtime, status, priority, host, sequenceid) in self.node_relevant_active_tasks:
# For multi-node:
# check whether we need to sleep ourselves so that monotasks are able to run
for other_task_id, other_proc, dummy_other_runtime, other_status, other_priority, other_host, other_sequenceid in self.mono_tasks_all_nodes:
if priority < other_priority:
# Sleep ourselves
if status not in ('SLEEPING', 'ABOUT TO SLEEP'):
sleep_task(task_id, proc, priority, status, sequenceid)
return True
return False
elif (task_id, proc, runtime, status, priority, host, sequenceid) in self.node_relevant_waiting_tasks:
if debug:
Log("Trying to run %s" % task_id)
if priority < -10:
if debug:
Log("Cannot run because priority < -10")
return False
lower, higher = self.split_active_tasks_by_priority(task_id, priority)
if debug:
Log('lower: %s' % lower)
Log('higher: %s' % higher)
for other_task_id, other_proc, dummy_other_runtime, other_status, \
other_priority, other_host, other_sequenceid in chain(
self.node_relevant_sleeping_tasks,
self.active_tasks_all_nodes):
if task_id != other_task_id and \
not self.is_task_safe_to_execute(proc, other_proc):
### !!! WE NEED TO CHECK FOR TASKS THAT CAN ONLY BE EXECUTED ON ONE MACHINE AT ONE TIME
### !!! FOR EXAMPLE BIBUPLOADS WHICH NEED TO BE EXECUTED SEQUENTIALLY AND NEVER CONCURRENTLY
## There's at least one higher priority task running that
## cannot run at the same time as the given task.
## We give up
if debug:
Log("Cannot run because task_id: %s, proc: %s is in the queue and incompatible" % (other_task_id, other_proc))
return False
if sequenceid:
## Let's normalize the priority of all tasks in a sequenceid to the
## max priority of the group
max_priority = run_sql("""SELECT MAX(priority) FROM schTASK
WHERE status='WAITING'
AND sequenceid=%s""",
(sequenceid, ))[0][0]
if run_sql("""UPDATE schTASK SET priority=%s
WHERE status='WAITING' AND sequenceid=%s""",
(max_priority, sequenceid)):
Log("Raised all waiting tasks with sequenceid "
"%s to the max priority %s" % (sequenceid, max_priority))
## Some priorities were raised
return True
## Let's normalize the runtime of all tasks in a sequenceid to
## the compatible runtime.
current_runtimes = run_sql("""SELECT id, runtime FROM schTASK WHERE sequenceid=%s AND status='WAITING' ORDER by id""", (sequenceid, ))
runtimes_adjusted = False
if current_runtimes:
last_runtime = current_runtimes[0][1]
for the_task_id, runtime in current_runtimes:
if runtime < last_runtime:
run_sql("""UPDATE schTASK SET runtime=%s WHERE id=%s""", (last_runtime, the_task_id))
if debug:
Log("Adjusted runtime of task_id %s to %s in order to be executed in the correct sequenceid order" % (the_task_id, last_runtime))
runtimes_adjusted = True
runtime = last_runtime
last_runtime = runtime
if runtimes_adjusted:
## Some runtimes have been adjusted
return True
if sequenceid is not None:
for other_task_id, dummy_other_proc, dummy_other_runtime, dummy_other_status, dummy_other_priority, dummy_other_host, other_sequenceid in self.active_tasks_all_nodes:
if sequenceid == other_sequenceid and task_id > other_task_id:
Log('Task %s needs to run after task %s since they have the same sequence id: %s' % (task_id, other_task_id, sequenceid))
## If there is a task with the same sequence id then do not run the current task
return False
if proc in CFG_BIBTASK_MONOTASKS and higher:
## This is a monotask
if debug:
Log("Cannot run because this is a monotask and there are higher priority tasks: %s" % (higher, ))
return False
## No higher priority task has an issue with the given task.
if proc not in CFG_BIBTASK_FIXEDTIMETASKS and len(higher) >= CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS:
if debug:
Log("Cannot run because all resources (%s) are used (%s), higher: %s" % (CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS, len(higher), higher))
return False
## Check for monotasks wanting to run
for other_task_id, other_proc, dummy_other_runtime, other_status, other_priority, other_host, other_sequenceid in self.mono_tasks_all_nodes:
if priority < other_priority:
if debug:
Log("Cannot run because there is a monotask with higher priority: %s %s" % (other_task_id, other_proc))
return False
## We check if it is necessary to stop/put to sleep some lower priority
## task.
tasks_to_stop, tasks_to_sleep = self.get_tasks_to_sleep_and_stop(proc, lower)
if debug:
Log('tasks_to_stop: %s' % tasks_to_stop)
Log('tasks_to_sleep: %s' % tasks_to_sleep)
if tasks_to_stop and priority < 100:
## Only tasks with priority higher than 100 have the power
## to stop other tasks.
if debug:
Log("Cannot run because there are tasks to stop: %s and priority < 100" % tasks_to_stop)
return False
procname = proc.split(':')[0]
if not tasks_to_stop and (not tasks_to_sleep or (proc not in CFG_BIBTASK_MONOTASKS and len(self.node_relevant_active_tasks) < CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS)):
if proc in CFG_BIBTASK_MONOTASKS and self.active_tasks_all_nodes:
if debug:
Log("Cannot run because this is a monotask and there are other tasks running: %s" % (self.node_relevant_active_tasks, ))
return False
def task_in_same_host(dummy_task_id, dummy_proc, dummy_runtime, dummy_status, dummy_priority, host, dummy_sequenceid):
return host == self.hostname
def filter_by_host(tasks):
return tuple(t for t in tasks if task_in_same_host(*t))
node_active_tasks = filter_by_host(self.node_relevant_active_tasks)
if len(node_active_tasks) >= CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS:
if debug:
Log("Cannot run because all resources (%s) are used (%s), active: %s" % (CFG_BIBSCHED_MAX_NUMBER_CONCURRENT_TASKS, len(node_active_tasks), node_active_tasks))
return False
if status in ("SLEEPING", "ABOUT TO SLEEP"):
if host == self.hostname:
## We can only wake up tasks that are running on our own host
for other_task_id, other_proc, dummy_other_runtime, other_status, dummy_other_priority, other_host, dummy_other_sequenceid in self.node_relevant_active_tasks:
## But only if no other tasks are still going to sleep; otherwise
## we might end up stealing the slot from a higher priority task.
if other_task_id != task_id and other_status in ('ABOUT TO SLEEP', 'ABOUT TO STOP') and other_host == self.hostname:
if debug:
Log("Not yet waking up task #%d since there are other tasks (%s #%d) going to sleep (higher priority task incoming?)" % (task_id, other_proc, other_task_id))
return False
bibsched_set_status(task_id, "CONTINUING", status)
if not bibsched_send_signal(proc, task_id, signal.SIGCONT):
bibsched_set_status(task_id, "ERROR", "CONTINUING")
Log("Task #%d (%s) woken up but didn't existed anymore" % (task_id, proc))
return True
Log("Task #%d (%s) woken up" % (task_id, proc))
return True
else:
return False
elif procname in self.helper_modules:
program = os.path.join(CFG_BINDIR, procname)
## Trick to log in bibsched.log the task exiting
exit_str = '&& echo "`date "+%%Y-%%m-%%d %%H:%%M:%%S"` --> Task #%d (%s) exited" >> %s' % (task_id, proc, os.path.join(CFG_LOGDIR, 'bibsched.log'))
command = "%s %s %s" % (program, str(task_id), exit_str)
### Set the task to scheduled and tie it to this host
if self.tie_task_to_host(task_id):
Log("Task #%d (%s) started" % (task_id, proc))
### Release the lock for the BibTask; it is safe to do so now
spawn_task(command, wait=proc in CFG_BIBTASK_MONOTASKS)
count = 10
while run_sql("""SELECT status FROM schTASK
WHERE id=%s AND status='SCHEDULED'""",
(task_id, )):
## Polling to wait for the task to really start,
## in order to avoid race conditions.
if count <= 0:
raise StandardError("Process %s (task_id: %s) was launched but does not seem able to reach RUNNING status." % (proc, task_id))
time.sleep(CFG_BIBSCHED_REFRESHTIME)
count -= 1
return True
else:
raise StandardError("%s is not in the allowed modules" % procname)
else:
## It's still not safe to run the task.
## We first need to stop tasks that should be stopped
## and to put to sleep tasks that should be put to sleep
for t in tasks_to_stop:
stop_task(*t)
for t in tasks_to_sleep:
sleep_task(*t)
time.sleep(CFG_BIBSCHED_REFRESHTIME)
return True
def check_errors(self):
errors = run_sql("""SELECT id,proc,status FROM schTASK
WHERE status = 'ERROR'
OR status = 'DONE WITH ERRORS'
OR status = 'CERROR'""")
if errors:
error_msgs = []
error_recoverable = True
for e_id, e_proc, e_status in errors:
if run_sql("""UPDATE schTASK
SET status='ERRORS REPORTED'
WHERE id = %s AND (status='CERROR'
OR status='ERROR'
OR status='DONE WITH ERRORS')""", [e_id]):
msg = " #%s %s -> %s" % (e_id, e_proc, e_status)
error_msgs.append(msg)
if e_status in ('ERROR', 'DONE WITH ERRORS'):
error_recoverable = False
if error_msgs:
msg = "BibTask with ERRORS:\n%s" % '\n'.join(error_msgs)
if error_recoverable:
raise RecoverableError(msg)
else:
raise StandardError(msg)
def calculate_rows(self):
"""Return all the node_relevant_active_tasks to work on."""
try:
self.check_errors()
except RecoverableError, msg:
register_emergency('Light emergency from %s: BibTask failed: %s' % (CFG_SITE_URL, msg))
max_bibupload_priority, min_bibupload_priority = run_sql(
"""SELECT MAX(priority), MIN(priority)
FROM schTASK
WHERE status IN ('WAITING', 'RUNNING', 'SLEEPING',
'ABOUT TO STOP', 'ABOUT TO SLEEP',
'SCHEDULED', 'CONTINUING')
AND proc = 'bibupload'
AND runtime <= NOW()""")[0]
if max_bibupload_priority > min_bibupload_priority:
run_sql(
"""UPDATE schTASK SET priority = %s
WHERE status IN ('WAITING', 'RUNNING', 'SLEEPING',
'ABOUT TO STOP', 'ABOUT TO SLEEP',
'SCHEDULED', 'CONTINUING')
AND proc = 'bibupload'
AND runtime <= NOW()
AND priority < %s""", (max_bibupload_priority,
max_bibupload_priority))
## The bibupload tasks are sorted by id, which means by the order they were scheduled
self.node_relevant_bibupload_tasks = run_sql(
"""SELECT id, proc, runtime, status, priority, host, sequenceid
FROM schTASK WHERE status IN ('WAITING', 'SLEEPING')
AND proc = 'bibupload'
AND runtime <= NOW()
ORDER BY id ASC LIMIT 1""", n=1)
## The other tasks are sorted by priority
self.node_relevant_waiting_tasks = run_sql(
"""SELECT id, proc, runtime, status, priority, host, sequenceid
FROM schTASK WHERE (status='WAITING' AND runtime <= NOW())
OR status = 'SLEEPING'
ORDER BY priority DESC, runtime ASC, id ASC""")
self.node_relevant_sleeping_tasks = run_sql(
"""SELECT id, proc, runtime, status, priority, host, sequenceid
FROM schTASK WHERE status = 'SLEEPING'
ORDER BY priority DESC, runtime ASC, id ASC""")
self.node_relevant_active_tasks = run_sql(
"""SELECT id, proc, runtime, status, priority, host, sequenceid
FROM schTASK WHERE status IN ('RUNNING', 'CONTINUING',
'SCHEDULED', 'ABOUT TO STOP',
'ABOUT TO SLEEP')""")
self.active_tasks_all_nodes = tuple(self.node_relevant_active_tasks)
self.mono_tasks_all_nodes = tuple(t for t in self.node_relevant_waiting_tasks if is_monotask(*t))
## Remove tasks that can not be executed on this host
self.filter_for_allowed_tasks()
def watch_loop(self):
## Cleaning up scheduled task not run because of bibsched being
## interrupted in the middle.
run_sql("""UPDATE schTASK
SET status = 'WAITING'
WHERE status = 'SCHEDULED'
AND host = %s""", (self.hostname, ))
try:
while True:
if self.debug:
Log("New bibsched cycle")
self.calculate_rows()
## Let's first handle running node_relevant_active_tasks.
for task in self.node_relevant_active_tasks:
if self.handle_task(*task):
break
else:
# If nothing has changed we can go on to run tasks.
for task in self.node_relevant_waiting_tasks:
if task[1] == 'bibupload' and self.node_relevant_bibupload_tasks:
## We switch in bibupload serial mode!
## which means we execute the first next bibupload.
if self.handle_task(*self.node_relevant_bibupload_tasks[0]):
## Something has changed
break
elif self.handle_task(*task):
## Something has changed
break
else:
time.sleep(CFG_BIBSCHED_REFRESHTIME)
except Exception, err:
register_exception(alert_admin=True)
try:
register_emergency('Emergency from %s: BibSched halted: %s' % (CFG_SITE_URL, err))
except NotImplementedError:
pass
raise
class TimedOutExc(Exception):
def __init__(self, value="Timed Out"):
Exception.__init__(self)
self.value = value
def __str__(self):
return repr(self.value)
def timed_out(f, timeout, *args, **kwargs):
def handler(signum, frame): # pylint: disable=W0613
raise TimedOutExc()
old = signal.signal(signal.SIGALRM, handler)
signal.alarm(timeout)
try:
result = f(*args, **kwargs)
finally:
signal.signal(signal.SIGALRM, old)
signal.alarm(0)
return result
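The `timed_out` helper above wraps a call with a SIGALRM-based timeout. A minimal self-contained sketch of the same pattern (Unix-only, since `signal.alarm` is unavailable on Windows, and it must run in the main thread):

```python
import signal
import time

class TimedOutExc(Exception):
    """Raised when the wrapped call exceeds its time budget."""

def timed_out(f, timeout, *args, **kwargs):
    """Run f(*args, **kwargs), raising TimedOutExc after `timeout` seconds."""
    def handler(signum, frame):
        raise TimedOutExc()
    old = signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout)
    try:
        return f(*args, **kwargs)
    finally:
        # Always restore the previous handler and cancel the pending alarm,
        # even if f() raised, so callers are not hit by a stray SIGALRM later.
        signal.signal(signal.SIGALRM, old)
        signal.alarm(0)

# Fast calls complete normally; slow calls raise TimedOutExc.
assert timed_out(lambda x: x * 2, 5, 21) == 42
try:
    timed_out(time.sleep, 1, 3)  # ask sleep(3) to finish within 1 second
    timed_out_hit = False
except TimedOutExc:
    timed_out_hit = True
assert timed_out_hit
```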
def Log(message):
log = open(CFG_LOGDIR + "/bibsched.log", "a")
log.write(time.strftime("%Y-%m-%d %H:%M:%S --> ", time.localtime()))
log.write(message)
log.write("\n")
log.close()
def redirect_stdout_and_stderr():
"Redirect stdout and stderr to the bibsched.log and bibsched.err files."
old_stdout = sys.stdout
old_stderr = sys.stderr
sys.stdout = open(CFG_LOGDIR + "/bibsched.log", "a")
sys.stderr = open(CFG_LOGDIR + "/bibsched.err", "a")
return old_stdout, old_stderr
def restore_stdout_and_stderr(stdout, stderr):
sys.stdout = stdout
sys.stderr = stderr
def usage(exitcode=1, msg=""):
"""Prints usage info."""
if msg:
sys.stderr.write("Error: %s.\n" % msg)
sys.stderr.write("""\
Usage: %s [options] [start|stop|restart|monitor|status]
The following commands are available for bibsched:
start start bibsched in background
stop stop running bibtasks and the bibsched daemon safely
halt halt running bibsched while keeping bibtasks running
restart restart running bibsched
monitor enter the interactive monitor
status get report about current status of the queue
purge purge the scheduler queue from old tasks
General options:
-h, --help \t Print this help.
-V, --version \t Print version information.
-q, --quiet \t Quiet mode
-d, --debug \t Write debugging information in bibsched.log
Status options:
-s, --status=LIST\t Which BibTask status should be considered (default is Running,waiting)
-S, --since=TIME\t How far back to consider tasks, e.g. 30m, 2h, 1d (default
is all)
-t, --tasks=LIST\t Comma separated list of BibTask to consider (default
\t is all)
Purge options:
-s, --status=LIST\t Which BibTask status should be considered (default is DONE)
-S, --since=TIME\t How far back to consider tasks, e.g. 30m, 2h, 1d (default
is %s days)
-t, --tasks=LIST\t Comma separated list of BibTask to consider (default
\t is %s)
""" % (sys.argv[0], CFG_BIBSCHED_GC_TASKS_OLDER_THAN, ','.join(CFG_BIBSCHED_GC_TASKS_TO_REMOVE + CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE)))
sys.exit(exitcode)
pidfile = os.path.join(CFG_PREFIX, 'var', 'run', 'bibsched.pid')
def error(msg):
print >> sys.stderr, "error: %s" % msg
sys.exit(1)
def warning(msg):
print >> sys.stderr, "warning: %s" % msg
def server_pid(ping_the_process=True, check_is_really_bibsched=True):
# The pid must be stored on the filesystem
try:
pid = int(open(pidfile).read())
except IOError:
return None
if ping_the_process:
# Even if the pid is available, we check if it corresponds to an
# actual process, as it might have been killed externally
try:
os.kill(pid, signal.SIGCONT)
except OSError:
warning("pidfile %s found referring to pid %s which is not running" % (pidfile, pid))
return None
if check_is_really_bibsched:
output = run_shell_command("ps p %s -o args=", (str(pid), ))[1]
if 'bibsched' not in output:
warning("pidfile %s found referring to pid %s which does not correspond to bibsched: cmdline is %s" % (pidfile, pid, output))
return None
return pid
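`server_pid` above pings the pid stored in the pidfile with SIGCONT to detect stale entries. The same stale-pidfile check is conventionally done with the null signal, which probes existence and permissions without delivering anything; a minimal version-neutral sketch (the helper name is illustrative):

```python
import errno
import os

def pid_is_alive(pid):
    """Return True if a process with this pid currently exists.

    os.kill(pid, 0) delivers no signal: it only checks that the pid
    exists (ESRCH otherwise) and that we may signal it (EPERM means
    the process exists but belongs to another user).
    """
    try:
        os.kill(pid, 0)
    except OSError as err:
        return err.errno == errno.EPERM  # exists, just not ours
    return True

assert pid_is_alive(os.getpid())      # our own process is always alive
assert not pid_is_alive(2 ** 22 + 1)  # beyond the usual Linux pid_max
```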
def start(verbose=True, debug=False):
""" Fork this process in the background and start processing
requests. The process PID is stored in a pid file, so that it can
be stopped later on."""
if verbose:
sys.stdout.write("starting bibsched: ")
sys.stdout.flush()
pid = server_pid(ping_the_process=False)
if pid:
pid2 = server_pid()
if pid2:
error("another instance of bibsched (pid %d) is running" % pid2)
else:
warning("%s exists but the corresponding bibsched (pid %s) does not seem to be running" % (pidfile, pid))
warning("erasing %s and continuing..." % (pidfile, ))
os.remove(pidfile)
# start the child process using the "double fork" technique
pid = os.fork()
if pid > 0:
sys.exit(0)
os.setsid()
os.chdir('/')
pid = os.fork()
if pid > 0:
if verbose:
sys.stdout.write('pid %d\n' % pid)
Log("daemon started (pid %d)" % pid)
open(pidfile, 'w').write('%d' % pid)
return
sys.stdin.close()
redirect_stdout_and_stderr()
sched = BibSched(debug=debug)
try:
sched.watch_loop()
finally:
try:
os.remove(pidfile)
except OSError:
pass
def halt(verbose=True, soft=False, debug=False): # pylint: disable=W0613
pid = server_pid()
if not pid:
if soft:
print >> sys.stderr, 'bibsched seems not to be running.'
return
else:
error('bibsched seems not to be running.')
try:
os.kill(pid, signal.SIGKILL)
except OSError:
print >> sys.stderr, 'no bibsched process found'
Log("daemon stopped (pid %d)" % pid)
if verbose:
print "stopping bibsched: pid %d" % pid
os.unlink(pidfile)
def monitor(verbose=True, debug=False): # pylint: disable=W0613
old_stdout, old_stderr = redirect_stdout_and_stderr()
try:
Manager(old_stdout)
finally:
restore_stdout_and_stderr(old_stdout, old_stderr)
def write_message(msg, stream=None, verbose=1): # pylint: disable=W0613
"""Write message and flush output stream (may be sys.stdout or sys.stderr).
Useful for debugging stuff."""
if stream is None:
stream = sys.stdout
if msg:
if stream == sys.stdout or stream == sys.stderr:
stream.write(time.strftime("%Y-%m-%d %H:%M:%S --> ",
time.localtime()))
try:
stream.write("%s\n" % msg)
except UnicodeEncodeError:
stream.write("%s\n" % msg.encode('ascii', 'backslashreplace'))
stream.flush()
else:
sys.stderr.write("Unknown stream %s. [must be sys.stdout or sys.stderr]\n" % stream)
def report_queue_status(verbose=True, status=None, since=None, tasks=None): # pylint: disable=W0613
"""
Report about the current status of BibSched queue on standard output.
"""
def report_about_processes(status='RUNNING', since=None, tasks=None):
"""
Helper function to report about processes with the given status.
"""
if tasks is None:
task_query = ''
else:
task_query = 'AND proc IN (%s)' % (
','.join([repr(real_escape_string(task)) for task in tasks]))
if since is None:
since_query = ''
else:
# We're not interested in future tasks
if since.startswith('+') or since.startswith('-'):
since = since[1:]
since = '-' + since
since_query = "AND runtime >= '%s'" % get_datetime(since)
res = run_sql("""SELECT id, proc, user, runtime, sleeptime,
status, progress, priority
FROM schTASK WHERE status=%%s %(task_query)s
%(since_query)s ORDER BY id ASC""" % {
'task_query': task_query,
'since_query' : since_query},
(status,))
write_message("%s processes: %d" % (status, len(res)))
for (proc_id, proc_proc, proc_user, proc_runtime, proc_sleeptime,
proc_status, proc_progress, proc_priority) in res:
write_message(' * ID="%s" PRIORITY="%s" PROC="%s" USER="%s" '
'RUNTIME="%s" SLEEPTIME="%s" STATUS="%s" '
'PROGRESS="%s"' % (proc_id,
proc_priority, proc_proc, proc_user, proc_runtime,
proc_sleeptime, proc_status, proc_progress))
return
write_message("BibSched queue status report for %s:" % gethostname())
mode = server_pid() and "AUTOMATIC" or "MANUAL"
write_message("BibSched queue running mode: %s" % mode)
if status is None:
report_about_processes('Running', since, tasks)
report_about_processes('Waiting', since, tasks)
else:
for state in status:
report_about_processes(state, since, tasks)
write_message("Done.")
def restart(verbose=True, debug=False):
halt(verbose, soft=True, debug=debug)
start(verbose, debug=debug)
def stop(verbose=True, debug=False):
"""
* Stop bibsched
* Send stop signal to all the running tasks
* wait for all the tasks to stop
* return
"""
if verbose:
print "Stopping BibSched if running"
halt(verbose, soft=True, debug=debug)
run_sql("UPDATE schTASK SET status='WAITING' WHERE status='SCHEDULED'")
res = run_sql("""SELECT id, proc, status FROM schTASK
WHERE status NOT LIKE 'DONE'
AND status NOT LIKE '%_DELETED'
AND (status='RUNNING'
OR status='ABOUT TO STOP'
OR status='ABOUT TO SLEEP'
OR status='SLEEPING'
OR status='CONTINUING')""")
if verbose:
print "Stopping all running BibTasks"
for task_id, proc, status in res:
if status == 'SLEEPING':
bibsched_send_signal(proc, task_id, signal.SIGCONT)
time.sleep(CFG_BIBSCHED_REFRESHTIME)
bibsched_set_status(task_id, 'ABOUT TO STOP')
while run_sql("""SELECT id FROM schTASK
WHERE status NOT LIKE 'DONE'
AND status NOT LIKE '%_DELETED'
AND (status='RUNNING'
OR status='ABOUT TO STOP'
OR status='ABOUT TO SLEEP'
OR status='SLEEPING'
OR status='CONTINUING')"""):
if verbose:
sys.stdout.write('.')
sys.stdout.flush()
time.sleep(CFG_BIBSCHED_REFRESHTIME)
if verbose:
print "\nStopped"
Log("BibSched and all BibTasks stopped")
+@with_app_context()
def main():
- from invenio.bibtask import check_running_process_user
+ from invenio.legacy.bibsched.bibtask import check_running_process_user
check_running_process_user()
verbose = True
status = None
since = None
tasks = None
debug = False
try:
opts, args = getopt.gnu_getopt(sys.argv[1:], "hVdqS:s:t:", [
"help", "version", "debug", "quiet", "since=", "status=", "task="])
except getopt.GetoptError, err:
Log("Error: %s" % err)
usage(1, err)
for opt, arg in opts:
if opt in ["-h", "--help"]:
usage(0)
elif opt in ["-V", "--version"]:
print __revision__
sys.exit(0)
elif opt in ['-q', '--quiet']:
verbose = False
elif opt in ['-s', '--status']:
status = arg.split(',')
elif opt in ['-S', '--since']:
since = arg
elif opt in ['-t', '--task']:
tasks = arg.split(',')
elif opt in ['-d', '--debug']:
debug = True
else:
usage(1)
try:
cmd = args[0]
except IndexError:
cmd = 'monitor'
try:
if cmd in ('status', 'purge'):
{'status' : report_queue_status,
'purge' : gc_tasks}[cmd](verbose, status, since, tasks)
else:
{'start': start,
'halt': halt,
'stop': stop,
'restart': restart,
'monitor': monitor}[cmd](verbose=verbose, debug=debug)
except KeyError:
usage(1, 'unknown command: %s' % cmd)
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibsched/scripts/bibtaskex.py b/invenio/legacy/bibsched/scripts/bibtaskex.py
index b46d9942f..4f69c8f1a 100644
--- a/invenio/legacy/bibsched/scripts/bibtaskex.py
+++ b/invenio/legacy/bibsched/scripts/bibtaskex.py
@@ -1,98 +1,26 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
-## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
+## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-"""Invenio Bibliographic Task Example.
+from invenio.base.factory import with_app_context
-Demonstrates BibTask <-> BibSched connectivity, signal handling,
-error handling, etc.
-"""
-
-__revision__ = "$Id$"
-
-import sys
-import time
-from invenio.bibtask import task_init, write_message, task_set_option, \
- task_get_option, task_update_progress, task_has_option, \
- task_get_task_param, task_sleep_now_if_required
-
-def fib(n):
- """Returns Fibonacci number for 'n'."""
- out = 1
- if n >= 2:
- out = fib(n-2) + fib(n-1)
- return out
-
-def task_submit_elaborate_specific_parameter(key, value, opts, args):
- """ Given the string key it checks it's meaning, eventually using the
- value. Usually it fills some key in the options dict.
- It must return True if it has elaborated the key, False, if it doesn't
- know that key.
- eg:
- if key in ('-n', '--number'):
- task_set_option('number', value)
- return True
- return False
- """
- if key in ('-n', '--number'):
- task_set_option('number', value)
- return True
- elif key in ('-e', '--error'):
- task_set_option('error', True)
- return True
- return False
-
-def task_run_core():
- """Runs the task by fetching arguments from the BibSched task queue. This is
- what BibSched will be invoking via daemon call.
- The task prints Fibonacci numbers for up to NUM on the stdout, and some
- messages on stderr.
- Return 1 in case of success and 0 in case of failure."""
- n = int(task_get_option('number'))
- write_message("Printing %d Fibonacci numbers." % n, verbose=9)
- for i in range(0, n):
- if i > 0 and i % 4 == 0:
- write_message("Error: water in the CPU. Ignoring and continuing.", sys.stderr, verbose=3)
- elif i > 0 and i % 5 == 0:
- write_message("Error: floppy drive dropped on the floor. Ignoring and continuing.", sys.stderr)
- if task_get_option('error'):
- 1 / 0
- write_message("fib(%d)=%d" % (i, fib(i)))
- task_update_progress("Done %d out of %d." % (i, n))
- task_sleep_now_if_required(can_stop_too=True)
- time.sleep(1)
- task_update_progress("Done %d out of %d." % (n, n))
- return 1
+@with_app_context()
def main():
- """Main that construct all the bibtask."""
- task_init(authorization_action='runbibtaskex',
- authorization_msg="BibTaskEx Task Submission",
- help_specific_usage="""\
--n, --number Print Fibonacci numbers for up to NUM. [default=30]
--e, --error Raise an error from time to time
-""",
- version=__revision__,
- specific_params=("n:e",
- ["number=", "error"]),
- task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
- task_run_fnc=task_run_core)
-
-### okay, here we go:
-if __name__ == '__main__':
- main()
+ from invenio.legacy.bibsched.bibtaskex import main as bibtaskex_main
+ return bibtaskex_main()
diff --git a/invenio/legacy/bibsched/scripts/bibtasklet.py b/invenio/legacy/bibsched/scripts/bibtasklet.py
index 283870091..61287ab92 100644
--- a/invenio/legacy/bibsched/scripts/bibtasklet.py
+++ b/invenio/legacy/bibsched/scripts/bibtasklet.py
@@ -1,170 +1,171 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Tasklet BibTask.
This is a particular BibTask that executes tasklets, which can be any
function dropped into /opt/cds-invenio/lib/python/invenio/bibsched_tasklets/.
"""
__revision__ = "$Id$"
import sys
from werkzeug.utils import find_modules, import_string
-from invenio.bibtask import task_init, write_message, task_set_option, \
+from invenio.base.factory import with_app_context
+from invenio.legacy.bibsched.bibtask import task_init, write_message, task_set_option, \
task_get_option, task_update_progress
from invenio.utils.autodiscovery.helpers import get_callable_documentation
from invenio.utils.autodiscovery.checkers import check_arguments_compatibility
#from invenio.base.utils import import_module_from_packages
from invenio.utils.datastructures import LazyDict
def _load_tasklets():
"""
Load all the bibsched tasklets into the global variable _TASKLETS.
"""
tasklets = {}
#FIXME
- packages = [import_string('invenio.bibsched_tasklets'), ] #import_module_from_packages('bibsched_tasklets')
+ packages = [import_string('invenio.legacy.bibsched.tasklets'), ] #import_module_from_packages('bibsched_tasklets')
for module in packages:
for tasklet in find_modules(module.__name__):
try:
func = import_string(tasklet + ':' + tasklet.split('.')[-1])
tasklets[tasklet.split('.')[-1]] = func
except:
print 'Failed to load tasklet:', tasklet
return tasklets
_TASKLETS = LazyDict(_load_tasklets)
def cli_list_tasklets():
"""
Print the list of available tasklets and broken tasklets.
"""
print """Available tasklets:"""
for tasklet in _TASKLETS.values():
print get_callable_documentation(tasklet)
print """Broken tasklets:"""
for tasklet_name, error in _TASKLETS.get_broken_plugins().iteritems():
print "%s: %s" % (tasklet_name, error)
sys.exit(0)
def task_submit_elaborate_specific_parameter(key, value,
dummy_opts, dummy_args):
""" Given the string key it checks its meaning, possibly using the
value. Usually it fills some key in the options dict.
It must return True if it has handled the key, and False if it
does not know that key.
eg:
if key in ('-n', '--number'):
task_set_option('number', value)
return True
return False
"""
if key in ('-T', '--tasklet'):
task_set_option('tasklet', value)
return True
elif key in ('-a', '--argument'):
arguments = task_get_option('arguments', {})
try:
key, value = value.split('=', 1)
except ValueError:
print >> sys.stderr, 'ERROR: an argument must be in the form ' \
'param=value, not "%s"' % value
return False
arguments[key] = value
task_set_option('arguments', arguments)
return True
elif key in ('-l', '--list-tasklets'):
cli_list_tasklets()
return True
return False
def task_submit_check_options():
"""
Check if a tasklet has been specified, and if the parameters are good
"""
tasklet = task_get_option('tasklet', None)
arguments = task_get_option('arguments', {})
if not tasklet:
print >> sys.stderr, 'ERROR: no tasklet specified'
return False
elif tasklet not in _TASKLETS:
print >> sys.stderr, 'ERROR: "%s" is not a valid tasklet. Use ' \
'--list-tasklets to obtain a list of the working tasklets.' % \
tasklet
return False
else:
try:
check_arguments_compatibility(_TASKLETS[tasklet], arguments)
except ValueError, err:
print >> sys.stderr, 'ERROR: wrong arguments (%s) specified for ' \
'tasklet "%s": %s' % (arguments, tasklet, err)
return False
return True
def task_run_core():
"""
Run the specific tasklet.
"""
tasklet = task_get_option('tasklet')
arguments = task_get_option('arguments', {})
write_message('Starting tasklet "%s" (with arguments %s)' %
(tasklet, arguments))
task_update_progress('%s started' % tasklet)
ret = _TASKLETS[tasklet](**arguments)
task_update_progress('%s finished' % tasklet)
write_message('Finished tasklet "%s" (with arguments %s)' %
(tasklet, arguments))
if ret is not None:
return ret
return True
-
+@with_app_context()
def main():
"""
Main body of bibtasklet.
"""
task_init(
authorization_action='runbibtaslet',
authorization_msg="BibTaskLet Task Submission",
help_specific_usage="""\
-T, --tasklet Execute the specific tasklet
-a, --argument Specify an argument to be passed to tasklet in the form
param=value, e.g. --argument foo=bar
-l, --list-tasklets List the existing tasklets
""",
version=__revision__,
specific_params=("T:a:l",
["tasklet=", "argument=", "list-tasklets"]),
task_submit_elaborate_specific_parameter_fnc=
task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core,
task_submit_check_options_fnc=task_submit_check_options)
if __name__ == "__main__":
main()
diff --git a/invenio/legacy/bibsched/tasklets/__init__.py b/invenio/legacy/bibsched/tasklets/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/legacy/bibsched/tasklets/bst_fibonacci.py b/invenio/legacy/bibsched/tasklets/bst_fibonacci.py
index faf79f5ed..30b6a2d8b 100644
--- a/invenio/legacy/bibsched/tasklets/bst_fibonacci.py
+++ b/invenio/legacy/bibsched/tasklets/bst_fibonacci.py
@@ -1,59 +1,59 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Tasklet Example.
Demonstrates BibTaskLet <-> BibTask <-> BibSched connectivity
"""
import sys
import time
-from invenio.bibtask import write_message, task_set_option, \
+from invenio.legacy.bibsched.bibtask import write_message, task_set_option, \
task_get_option, task_update_progress, task_has_option, \
task_get_task_param, task_sleep_now_if_required
def fib(n):
"""Returns Fibonacci number for 'n'."""
out = 1
if n >= 2:
out = fib(n-2) + fib(n-1)
return out
def bst_fibonacci(n=30):
"""
Small tasklet that prints the Fibonacci sequence for n.
@param n: how many Fibonacci numbers to print.
@type n: int
"""
## Since it's a tasklet, the parameter might be passed as a string.
## It should then be converted to an int.
n = int(n)
write_message("Printing %d Fibonacci numbers." % n, verbose=9)
for i in range(0, n):
if i > 0 and i % 4 == 0:
write_message("Error: water in the CPU. Ignoring and continuing.", sys.stderr, verbose=3)
elif i > 0 and i % 5 == 0:
write_message("Error: floppy drive dropped on the floor. Ignoring and continuing.", sys.stderr)
write_message("fib(%d)=%d" % (i, fib(i)))
task_update_progress("Done %d out of %d." % (i, n))
task_sleep_now_if_required(can_stop_too=True)
time.sleep(1)
task_update_progress("Done %d out of %d." % (n, n))
return 1
diff --git a/invenio/legacy/bibsched/tasklets/bst_notify_url.py b/invenio/legacy/bibsched/tasklets/bst_notify_url.py
index ecfabf70f..ecbdb30ee 100644
--- a/invenio/legacy/bibsched/tasklets/bst_notify_url.py
+++ b/invenio/legacy/bibsched/tasklets/bst_notify_url.py
@@ -1,133 +1,133 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio Tasklet.
Notify a URL, and post data if wanted.
"""
import urlparse
import urllib2
import time
from invenio.config import \
CFG_SITE_ADMIN_EMAIL, \
CFG_SITE_NAME
-from invenio.bibtask import write_message, \
+from invenio.legacy.bibsched.bibtask import write_message, \
task_sleep_now_if_required
from invenio.ext.email import send_email
def bst_notify_url(url, data=None,
content_type='text/plain',
attempt_times=1,
attempt_sleeptime=10,
admin_emails=None):
"""
Access given URL, and post given data if specified.
@param url: the URL to access
@type url: string
@param data: the data to be posted to the given URL
@type data: string
@param content_type: the content-type header to use to post data
@type content_type: string
@param attempt_times: number of tries
@type attempt_times: int
@param attempt_sleeptime: seconds in between tries
@type attempt_sleeptime: int
@param admin_emails: a comma-separated list of emails to notify in case of failure
@type admin_emails: string or list (as accepted by mailutils.send_email)
If accessing fails, retry up to ATTEMPT_TIMES times, waiting
ATTEMPT_SLEEPTIME seconds in between tries. When the maximum
number of attempts is reached, send an email notification to the
recipients specified in ADMIN_EMAILS.
"""
attempt_times = int(attempt_times)
attempt_sleeptime = int(attempt_sleeptime)
remaining_attempts = attempt_times
success_p = False
reason_failure = ""
write_message("Going to notify URL: %(url)s" % {'url': url})
while not success_p and remaining_attempts > 0:
## <scheme>://<netloc>/<path>?<query>#<fragment>
scheme, netloc, path, query, fragment = urlparse.urlsplit(url)
## See: http://stackoverflow.com/questions/111945/is-there-any-way-to-do-http-put-in-python
if scheme == 'http':
opener = urllib2.build_opener(urllib2.HTTPHandler)
elif scheme == 'https':
opener = urllib2.build_opener(urllib2.HTTPSHandler)
else:
raise ValueError("Scheme not handled %s for url %s" % (scheme, url))
request = urllib2.Request(url, data=data)
if data:
request.add_header('Content-Type', content_type)
request.get_method = lambda: 'POST'
try:
opener.open(request)
success_p = True
except urllib2.URLError, e:
success_p = False
reason_failure = repr(e)
if not success_p:
remaining_attempts -= 1
if remaining_attempts > 0: # sleep only if we shall retry again
task_sleep_now_if_required(can_stop_too=True)
time.sleep(attempt_sleeptime)
# Report about success/failure
if success_p:
write_message("URL successfully notified")
else:
write_message("Failed at notifying URL. Reason:\n%(reason_failure)s" % \
{'reason_failure': reason_failure})
if not success_p and admin_emails:
# We could not access the specified URL. Send an email to the
# specified contacts.
write_message("Notifying by email %(admin_emails)s" % \
{'admin_emails': str(admin_emails)})
subject = "%(CFG_SITE_NAME)s could not contact %(url)s" % \
{'CFG_SITE_NAME': CFG_SITE_NAME,
'url': url}
content = """\n%(CFG_SITE_NAME)s unsuccessfully tried to contact %(url)s.
Number of attempts: %(attempt_times)i. No further attempts will be made.
""" % \
{'CFG_SITE_NAME': CFG_SITE_NAME,
'url': url,
'attempt_times': attempt_times}
if data:
max_data_length = 10000
content += "The following data should have been posted:\n%(data)s%(extension)s" % \
{'data': data[:max_data_length],
'extension': len(data) > max_data_length and ' [...]' or ''}
# Send email. If sending fails, we will stop the queue
return send_email(fromaddr=CFG_SITE_ADMIN_EMAIL,
toaddr=admin_emails,
subject=subject,
content=content)
# We do not really want to stop the queue now, even in case of
# failure as an email would have been sent if necessary.
return 1
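The attempt/sleep loop in `bst_notify_url` above follows a common retry pattern: decrement a counter, sleep only between tries, and keep the last error for reporting. A minimal generic sketch of that pattern (names are illustrative, not part of the Invenio API):

```python
import time

def retry(action, attempts=3, sleeptime=0):
    """Call action() until it succeeds or attempts are exhausted.

    Returns (success, result_or_last_error). Sleeps only *between*
    tries, never after the final failure, mirroring the loop above.
    """
    last_error = None
    for remaining in range(attempts, 0, -1):
        try:
            return True, action()
        except Exception as err:
            last_error = err
            if remaining > 1:  # more tries left: back off, then retry
                time.sleep(sleeptime)
    return False, last_error

calls = []
def flaky():
    """Fail twice, then succeed, like a transiently unreachable URL."""
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient")
    return "ok"

assert retry(flaky, attempts=5) == (True, "ok")
assert len(calls) == 3  # two failures plus the final success
```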
diff --git a/invenio/legacy/bibsched/tasklets/bst_run_bibtask.py b/invenio/legacy/bibsched/tasklets/bst_run_bibtask.py
index 3ae6fb385..f89746c50 100644
--- a/invenio/legacy/bibsched/tasklets/bst_run_bibtask.py
+++ b/invenio/legacy/bibsched/tasklets/bst_run_bibtask.py
@@ -1,47 +1,47 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Tasklet.
Allows a task to immediately run another task after it.
"""
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
def bst_run_bibtask(taskname, user, **args):
"""
Initiate a bibsched task.
@param taskname: name of the task to run
@type taskname: string
@param user: the user to run the task under.
@type user: string
"""
arglist = []
# Transform dict to list: {'a': 0, 'b': 1} -> ['a', 0, 'b', 1]
for name, value in args.items():
if len(name) == 1:
name = '-' + name
else:
name = '--' + name
arglist.append(name)
if value:
arglist.append(value)
task_low_level_submission(taskname, user, *tuple(arglist))
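The keyword-to-CLI transformation above is easy to test on its own. This sketch mirrors the loop (single-letter keys get `-`, longer ones `--`, falsy values contribute only the option name); the helper name is illustrative:

```python
def args_to_arglist(args):
    """Flatten a kwargs dict into a CLI-style argument list.

    {'v': 5, 'name': 'x'} -> ['-v', 5, '--name', 'x']; falsy values are
    treated as bare flags and contribute only the option name.
    """
    arglist = []
    for name, value in args.items():
        arglist.append('-' + name if len(name) == 1 else '--' + name)
        if value:
            arglist.append(value)
    return arglist
```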
diff --git a/invenio/legacy/bibsched/tasklets/bst_twitter_fetcher.py b/invenio/legacy/bibsched/tasklets/bst_twitter_fetcher.py
index a2a0fe491..b251dfdd7 100644
--- a/invenio/legacy/bibsched/tasklets/bst_twitter_fetcher.py
+++ b/invenio/legacy/bibsched/tasklets/bst_twitter_fetcher.py
@@ -1,174 +1,174 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Twitter fetcher
In order to schedule fetching tweets you can type at the command line:
$ sudo -u www-data /opt/invenio/bin/bibtasklet -T bst_twitter_fetcher -uadmin -s5m -a "query=YOURQUERY"
"""
## Here we import the Twitter APIs
import twitter
import re
import os
import sys
import tempfile
import time
import sys
## Here are some good Invenio APIs
from invenio.config import CFG_TMPDIR
## BibRecord -> to create MARCXML records
from invenio.legacy.bibrecord import record_add_field, record_xml_output
## BibTask -> to manipulate Bibliographic Tasks
-from invenio.bibtask import task_low_level_submission, write_message, task_update_progress
+from invenio.legacy.bibsched.bibtask import task_low_level_submission, write_message, task_update_progress
## BibDocFile to manipulate documents
-from invenio.bibdocfile import check_valid_url
+from invenio.legacy.bibdocfile.api import check_valid_url
## WebSearch to search for previous tweets
from invenio.legacy.search_engine import perform_request_search, get_fieldvalues
_TWITTER_API = twitter.Api()
def get_tweets(query):
"""
This is how simple it is to fetch tweets :-)
"""
## We shall skip tweets that are already in the system.
previous_tweets = perform_request_search(p='980__a:"TWEET" 980__b:"%s"' % query, sf='970__a', so='a')
if previous_tweets:
## A bit of an algorithm to retrieve the last Tweet ID that was stored
## in our records
since_id = int(get_fieldvalues(previous_tweets[0], '970__a')[0])
else:
since_id = 0
final_results = []
results = list(_TWITTER_API.Search(query, rpp=100, since_id=since_id).results)
final_results.extend(results)
page = 1
while len(results) == 100: ## We stop when there are fewer than 100 results per page
page += 1
results = list(_TWITTER_API.Search(query, rpp=100, since_id=since_id, page=page).results)
final_results.extend(results)
return final_results
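The paging loop in `get_tweets` (keep requesting pages while a full page of 100 items comes back) is a generic pattern. In this sketch `fetch_page` is a stand-in for the Twitter search call, not a real API:

```python
def fetch_all_pages(fetch_page, page_size=100):
    """Accumulate results page by page until a short page signals the end.

    fetch_page(page) must return a list of at most page_size items.
    """
    results = []
    page = 1
    while True:
        batch = fetch_page(page)
        results.extend(batch)
        if len(batch) < page_size:  # a short page means we are done
            break
        page += 1
    return results
```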
_RE_GET_HTTP = re.compile("(https?://.+?)(\s|$)")
_RE_TAGS = re.compile("([#@]\w+)")
def tweet_to_record(tweet, query):
"""
Transform a tweet into a record.
@note: you may want to highly customize this.
"""
rec = {}
## Let's normalize the body of the tweet.
text = tweet.text.encode('UTF-8')
text = text.replace('&gt;', '>')
text = text.replace('&lt;', '<')
text = text.replace('&quot;', "'")
text = text.replace('&amp;', '&')
## Let's add the creation date
try:
creation_date = time.strptime(tweet.created_at, '%a, %d %b %Y %H:%M:%S +0000')
except ValueError:
creation_date = time.strptime(tweet.created_at, '%a %b %d %H:%M:%S +0000 %Y')
record_add_field(rec, '260__c', time.strftime('%Y-%m-%dT%H:%M:%SZ', creation_date))
## Let's add the Tweet ID
record_add_field(rec, '970', subfields=[('a', str(tweet.id))])
## Let's add the body of the tweet as an abstract
record_add_field(rec, '520', subfields=[('a', text)])
## Let's re-add the body of the tweet as a title.
record_add_field(rec, '245', subfields=[('a', text)])
## Let's fetch information about the user
try:
user = _TWITTER_API.GetUser(tweet.from_user)
## Let's add the user name as author of the tweet
record_add_field(rec, '100', subfields=[('a', str(user.name.encode('UTF-8')))])
## Let's fetch the icon of the user profile, and let's upload it as
## an image (and an icon of itself)
record_add_field(rec, 'FFT', subfields=[('a', user.profile.image_url.encode('UTF-8')), ('x', user.profile.image_url.encode('UTF-8'))])
except Exception, err:
write_message("WARNING: issue when fetching the user: %s" % err, stream=sys.stderr)
if hasattr(tweet, 'iso_language_code'):
## Let's add the language of the Tweet if available (also this depends)
## on the kind of Twitter API call we used
record_add_field(rec, '045', subfields=[('a', tweet.iso_language_code.encode('UTF-8'))])
## Let's tag this record as a TWEET so that later we can build a collection
## out of these records.
record_add_field(rec, '980', subfields=[('a', 'TWEET'), ('b', query)])
## Some smart manipulations: let's parse out URLs and tags from the body
## of the Tweet.
for url in _RE_GET_HTTP.findall(text):
url = url[0]
record_add_field(rec, '856', '4', subfields=[('u', url)])
for tag in _RE_TAGS.findall(text):
## And here we add the keywords.
record_add_field(rec, '653', '1', subfields=[('a', tag), ('9', 'TWITTER')])
## Finally we shall serialize everything to MARCXML
return record_xml_output(rec)
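The body normalisation and the two regexes used by `tweet_to_record` can be exercised independently. The sketch below reuses the same patterns on a plain `str` (the original works on UTF-8 encoded bytes):

```python
import re

_RE_GET_HTTP = re.compile(r"(https?://.+?)(\s|$)")
_RE_TAGS = re.compile(r"([#@]\w+)")

def parse_tweet_text(text):
    """Unescape the HTML entities and pull out the URLs and #/@ tags."""
    for entity, char in (('&gt;', '>'), ('&lt;', '<'),
                         ('&quot;', "'"), ('&amp;', '&')):
        text = text.replace(entity, char)
    urls = [match[0] for match in _RE_GET_HTTP.findall(text)]
    tags = _RE_TAGS.findall(text)
    return text, urls, tags
```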
def bst_twitter_fetcher(query):
"""
Fetch the tweets matching the query and upload them into Invenio.
@param query: the Twitter search query
"""
## We prepare a temporary MARCXML file to upload.
fd, name = tempfile.mkstemp(suffix='.xml', prefix='tweets', dir=CFG_TMPDIR)
tweets = get_tweets(query)
if tweets:
os.write(fd, """<collection>\n""")
for i, tweet in enumerate(tweets):
## For every tweet we transform it to MARCXML and we dump it in the file.
task_update_progress('DONE: tweet %s out of %s' % (i, len(tweets)))
os.write(fd, tweet_to_record(tweet, query))
os.write(fd, """</collection>\n""")
os.close(fd)
## Invenio magic: we schedule an upload of the created MARCXML to be inserted
## ASAP in the system.
task_low_level_submission('bibupload', 'admin', '-i', '-r', name, '-P5')
write_message("Uploaded file %s with %s new tweets about %s" % (name, len(tweets), query))
else:
write_message("No new tweets about %s" % query)
if __name__ == '__main__':
if len(sys.argv) == 2:
bst_twitter_fetcher(sys.argv[1])
else:
print "USAGE: %s TWITTER_QUERY" % sys.argv[0]
sys.exit(1)
diff --git a/invenio/legacy/bibsched/tasklets/bst_weblinkback_updater.py b/invenio/legacy/bibsched/tasklets/bst_weblinkback_updater.py
index ab2a55ca9..94894afe3 100644
--- a/invenio/legacy/bibsched/tasklets/bst_weblinkback_updater.py
+++ b/invenio/legacy/bibsched/tasklets/bst_weblinkback_updater.py
@@ -1,62 +1,62 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio.bibtask import write_message
-from invenio.weblinkback_config import CFG_WEBLINKBACK_TYPE
-from invenio.weblinkback import update_linkbacks, \
+from invenio.legacy.bibsched.bibtask import write_message
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_TYPE
+from invenio.legacy.weblinkback.api import update_linkbacks, \
delete_linkbacks_on_blacklist, \
send_pending_linkbacks_notification
def bst_weblinkback_updater(mode):
"""
Update linkbacks
@param mode: 1 delete rejected, broken and pending linkbacks whose URLs are on the blacklist
2 update page titles of new linkbacks
3 update page titles of old linkbacks
4 update manually set page titles
5 detect and disable broken linkbacks
6 send notification email for all pending linkbacks
@type mode: int
"""
mode = int(mode)
if mode == 1:
write_message("Starting to delete rejected and pending linkbacks whose URLs are on the blacklist")
delete_linkbacks_on_blacklist()
write_message("Finished deleting rejected and pending linkbacks whose URLs are on the blacklist")
elif mode == 2:
write_message("Starting to update the page titles of new linkbacks")
update_linkbacks(1)
write_message("Finished updating the page titles of new linkbacks")
elif mode == 3:
write_message("Starting to update the page titles of old linkbacks")
update_linkbacks(2)
write_message("Finished updating the page titles of old linkbacks")
elif mode == 4:
write_message("Starting to update manually set page titles")
update_linkbacks(3)
write_message("Finished updating manually set page titles")
elif mode == 5:
write_message("Starting to detect and disable broken linkbacks")
update_linkbacks(4)
write_message("Finished detecting and disabling broken linkbacks")
elif mode == 6:
write_message("Starting to send notification email")
send_pending_linkbacks_notification(CFG_WEBLINKBACK_TYPE['TRACKBACK'])
write_message("Finished sending notification email")
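The `if/elif` chain above maps each numeric mode onto one action; a dispatch table keeps that mapping declarative. A minimal sketch with stand-in handlers (the real handlers are the `update_linkbacks` and blacklist calls):

```python
def make_dispatcher(handlers):
    """Build a runner that maps integer modes to zero-argument handlers."""
    def run(mode):
        mode = int(mode)
        if mode not in handlers:
            raise ValueError('unknown mode: %s' % mode)
        return handlers[mode]()
    return run
```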
diff --git a/invenio/legacy/bibsched/webinterface.py b/invenio/legacy/bibsched/webinterface.py
index 4ed680a5c..67bf5a7d4 100644
--- a/invenio/legacy/bibsched/webinterface.py
+++ b/invenio/legacy/bibsched/webinterface.py
@@ -1,106 +1,106 @@
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0103
"""Invenio Interface for bibsched live view."""
import sys
if sys.hexversion < 0x2060000:
try:
import simplejson as json
simplejson_available = True
except ImportError:
# Okay, no Ajax app will be possible, but continue anyway,
# since this package is only recommended, not mandatory.
simplejson_available = False
else:
import json
simplejson_available = True
from invenio.config import CFG_SITE_URL
from invenio.modules.access.engine import acc_authorize_action
from invenio.ext.legacy.handler import WebInterfaceDirectory
from invenio.legacy.bibrank.adminlib import tupletotable
from invenio.legacy.webpage import page
-from invenio.bibsched_webapi import get_javascript, get_bibsched_tasks, \
+from invenio.legacy.bibsched.webapi import get_javascript, get_bibsched_tasks, \
get_bibsched_mode, get_css, get_motd_msg
from invenio.legacy.webuser import page_not_authorized
import time
class WebInterfaceBibSchedPages(WebInterfaceDirectory):
"""Defines the set of /bibsched pages."""
_exports = ['',]
def __init__(self):
"""Initialize."""
pass
def index(self, req, form):
""" Display live BibSched queue
"""
referer = '/admin2/bibsched/'
navtrail = (' <a class="navtrail" href=\"%s/help/admin\">Admin Area</a> '
) % CFG_SITE_URL
auth_code, auth_message = acc_authorize_action(req, 'cfgbibsched')
if auth_code != 0:
return page_not_authorized(req=req, referer=referer,
text=auth_message, navtrail=navtrail)
bibsched_tasks = get_bibsched_tasks()
header = ["ID", "Name", "Priority", "User", "Time", "Status",
"Progress"]
map_status_css = {'WAITING': 'task_waiting', 'RUNNING': 'task_running',
'DONE WITH ERRORS': 'task_error'}
bibsched_error = False
motd_msg = get_motd_msg()
actions = []
body_content = ''
if len(motd_msg) > 0:
body_content += '<div class="clean_error">' + motd_msg + '</div><br />'
if not form.has_key('jsondata'):
body_content = '<div id="bibsched_table">'
if len(bibsched_tasks) > 0:
for task in bibsched_tasks:
(tskid, proc, priority, user, runtime, status, progress) = task
actions.append([tskid, proc, priority, user, runtime,
'<span class=%s>' % (status in map_status_css and
map_status_css[status] or '') + (status !="" and
status or '') + '</span>', (progress !="" and
progress or '')])
if 'ERROR' in status:
bibsched_error = True
body_content += tupletotable(header=header, tuple=actions,
alternate_row_colors_p=True)
if bibsched_error:
body_content += '<br /><img src="%s"><span class="bibsched_status"> The queue contains errors</span><br />' % ("/img/aid_reject.png")
else:
body_content += '<br /><img src="%s"><span class="bibsched_status"> BibSched is working without errors</span><br />' % ("/img/aid_check.png")
body_content += '<br /><span class="mode">Mode: %s</span>' % (get_bibsched_mode())
body_content += '<br /><br /><span class="last_updated">Last updated: %s</span>' % (time.strftime("%a %b %d, %Y %-I:%M:%S %p", time.localtime(time.time())))
if form.has_key('jsondata'):
json_response = {}
json_response.update({'bibsched': body_content})
return json.dumps(json_response)
else:
body_content += '</div>'
return page(title = "BibSched live view",
body = body_content,
errors = [],
warnings = [],
metaheaderadd = get_javascript() + get_css(),
req = req)
diff --git a/invenio/legacy/bibsort/daemon.py b/invenio/legacy/bibsort/daemon.py
index 80f462b94..3c7722aee 100644
--- a/invenio/legacy/bibsort/daemon.py
+++ b/invenio/legacy/bibsort/daemon.py
@@ -1,373 +1,373 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Usage: bibsort [options]
BibSort tool
Options:
-h, --help show this help message and exit
-l, --load-config Loads the configuration from bibsort.cfg into the
database
-d, --dump-config Outputs a database dump in form of a config file
-p, --print-sorting-methods
Prints the available sorting methods
-R, --rebalance Runs the sorting methods given in '--methods' and
rebalances all the buckets.
If no method is specified, the rebalance will be done
for all the methods in the config file.
-S, --update-sorting Runs the sorting methods given in '--methods' for the
recids given in '--id'.
If no method is specified, the update will be done for
all the methods in the config file.
If no recids are specified, the update will be done
for all the records that have been
modified/inserted from the last run of the sorting.
If you want to run the sorting for all records, you
should use the '-R' option
-M, --methods=METHODS Specify the sorting methods for which the
update_sorting or rebalancing will run
(ex: --methods=method1,method2,method3).
-i, --id=RECIDS Specify the records for which the update_sorting will
run (ex: --id=1,2-56,72)
"""
__revision__ = "$Id$"
import sys
import optparse
import time
import ConfigParser
from invenio.utils.date import strftime
from invenio.legacy.dbquery import run_sql, Error
from invenio.config import CFG_ETCDIR
from invenio.bibsort_engine import run_bibsort_update, \
run_bibsort_rebalance
-from invenio.bibtask import task_init, write_message, \
+from invenio.legacy.bibsched.bibtask import task_init, write_message, \
task_set_option, task_get_option
def load_configuration():
"""Loads the configuration for the bibsort.cfg file into the database"""
config_file = CFG_ETCDIR + "/bibsort/bibsort.cfg"
write_message('Reading config data from: %s' %config_file)
config = ConfigParser.ConfigParser()
try:
config.readfp(open(config_file))
except StandardError, err:
write_message("Cannot find configuration file: %s" \
%config_file, stream=sys.stderr)
return False
to_insert = []
for section in config.sections():
try:
name = config.get(section, "name")
definition = config.get(section, "definition")
washer = config.get(section, "washer")
except (ConfigParser.NoOptionError, StandardError), err:
write_message("For each sort_field you need to define at least \
the name, the washer and the definition. \
[error: %s]" %err, stream=sys.stderr)
return False
to_insert.append((name, definition, washer))
# all the values were correctly read from the config file
run_sql("TRUNCATE TABLE bsrMETHOD")
write_message('Old data has been deleted from bsrMETHOD table', verbose=5)
for row in to_insert:
run_sql("INSERT INTO bsrMETHOD(name, definition, washer) \
VALUES (%s, %s, %s)", (row[0], row[1], row[2]))
write_message('Method %s has been inserted into bsrMETHOD table' \
%row[0], verbose=5)
return True
def dump_configuration():
"""Creates a dump of the data existing in the bibsort tables"""
try:
results = run_sql("SELECT id, name, definition, washer FROM bsrMETHOD")
except Error, err:
write_message("The error: [%s] occurred while trying to get \
the bibsort data from the database." %err, sys.stderr)
return False
write_message('The bibsort data has been read from the database.', verbose=5)
if results:
config = ConfigParser.ConfigParser()
for item in results:
section = "sort_field_%s" % item[0]
config.add_section(section)
config.set(section, "name", item[1])
config.set(section, "definition", item[2])
config.set(section, "washer", item[3])
output_file_name = CFG_ETCDIR + '/bibsort/bibsort_db_dump_%s.cfg' % \
strftime("%d%m%Y%H%M%S", time.localtime())
write_message('Opening the output file %s' %output_file_name)
try:
output_file = open(output_file_name, 'w')
config.write(output_file)
output_file.close()
except IOError, err:
write_message('Cannot operate on the configuration file %s [%s].' \
%(output_file_name, err), stream=sys.stderr)
return False
write_message('Configuration data dumped to file.')
else:
write_message("The bsrMETHOD table does not contain any data.")
return True
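`dump_configuration` is a straight DB-rows-to-INI serialisation; the round trip is easy to check with `configparser` and an in-memory buffer. Names and sample values below are illustrative:

```python
import configparser
import io

def rows_to_ini(rows):
    """Serialise (id, name, definition, washer) rows into an INI string."""
    config = configparser.ConfigParser()
    for row_id, name, definition, washer in rows:
        section = 'sort_field_%s' % row_id
        config.add_section(section)
        config.set(section, 'name', name)
        config.set(section, 'definition', definition)
        config.set(section, 'washer', washer)
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()
```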
def update_sorting(methods, recids):
"""Runs the updating of the sorting tables for methods and recids
Recids is a list of integer numbers(record ids)
but can also contain intervals"""
method_list = []
if methods:
method_list = methods.strip().split(',')
recid_list = []
if recids:
cli_recid_list = recids.strip().split(',')
for recid in cli_recid_list:
if recid.find('-') > 0:
rec_range = recid.split('-')
try:
recid_min = int(rec_range[0])
recid_max = int(rec_range[1])
for rec in range(recid_min, recid_max + 1):
recid_list.append(rec)
except ValueError, err:
write_message("Error: [%s] occurred while trying \
to parse the recids argument." %err, sys.stderr)
return False
else:
recid_list.append(int(recid))
return run_bibsort_update(recid_list, method_list)
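The `--id=1,2-56,72` parsing inside `update_sorting` makes a useful standalone helper; this sketch follows the same splitting logic (the function name is illustrative):

```python
def parse_recid_ranges(recids):
    """Expand a '1,2-5,72' style string into a list of record ids."""
    recid_list = []
    for chunk in recids.strip().split(','):
        if '-' in chunk[1:]:  # a range such as 2-56
            lo, hi = chunk.split('-', 1)
            recid_list.extend(range(int(lo), int(hi) + 1))
        else:
            recid_list.append(int(chunk))
    return recid_list
```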
def rebalance(methods):
"""Runs the complete sorting and rebalancing of buckets for
the methods specified in 'methods' argument"""
method_list = []
if methods:
method_list = methods.strip().split(',')
return run_bibsort_rebalance(method_list)
def print_sorting_methods():
"""Outputs the available sorting methods from the DB"""
try:
results = run_sql("SELECT name FROM bsrMETHOD")
except Error, err:
write_message("The error: [%s] occurred while trying to \
get the bibsort data from the database." %err)
return False
if results:
methods = []
for result in results:
methods.append(result[0])
if len(methods) > 0:
write_message('Methods: %s' %methods)
else:
write_message("There are no sorting methods configured.")
return True
# main with option parser
# to be used in case the connection with bibsched is not wanted
def main_op():
"""Runs program and handles command line options"""
option_parser = optparse.OptionParser(description="""BibSort tool""")
option_parser.add_option('-L', '--load-config', action = 'store_true', \
help = 'Loads the configuration from bibsort.cfg into the database')
option_parser.add_option('-D', '--dump-config', action = 'store_true', \
help = 'Outputs a database dump in form of a config file')
option_parser.add_option('-P', '--print-sorting-methods',
action = 'store_true', \
help = "Prints the available sorting methods")
option_parser.add_option('-R', '--rebalance', action = 'store_true', \
help = "Runs the sorting methods given in '--methods' and rebalances all the buckets. If no method is specified, the rebalance will be done for all the methods in the config file.")
option_parser.add_option('-S', '--update-sorting', action = 'store_true', \
help = "Runs the sorting methods given in '--methods' for the recids given in '--id'. If no method is specified, the update will be done for all the methods in the config file. If no recids are specified, the update will be done for all the records that have been modified/inserted from the last run of the sorting. If you want to run the sorting for all records, you should use the '-R' option")
option_parser.add_option('--methods', action = 'store', dest = 'methods', \
metavar = 'METHODS', \
help = "Specify the sorting methods for which the update_sorting or rebalancing will run (ex: --methods=method1,method2,method3).")
option_parser.add_option('--id', action = 'store', dest = 'recids', \
metavar = 'RECIDS', \
help = "Specify the records for which the update_sorting will run (ex: --id=1,2-56,72) ")
options, dummy = option_parser.parse_args()
if options.load_config and options.dump_config:
option_parser.error('.. conflicting options, please add only one')
elif options.rebalance and options.update_sorting:
option_parser.error('..conflicting options, please add only one')
elif (options.load_config or options.dump_config) and \
(options.rebalance or options.update_sorting):
option_parser.error('..conflicting options, please add only one')
if options.load_config:
load_configuration()
elif options.dump_config:
dump_configuration()
elif options.update_sorting:
update_sorting(options.methods, options.recids)
elif options.rebalance:
rebalance(options.methods)
elif options.print_sorting_methods:
print_sorting_methods()
else:
option_parser.print_help()
def main():
"""Main function that constructs the bibtask"""
task_init(authorization_action='runbibsort',
authorization_msg="BibSort Task Submission",
description = "",
help_specific_usage="""
Specific options:
-l, --load-config Loads the configuration from bibsort.cfg into the
database
-d, --dump-config Outputs a database dump in form of a config file
-p, --print-sorting-methods
Prints the available sorting methods
-R, --rebalance Runs the sorting methods given in '--methods' and
rebalances all the buckets. If no method is
specified, the rebalance will be done for all
the methods in the config file.
-S, --update-sorting Runs the sorting methods given in '--methods' for the
recids given in '--id'. If no method is
specified, the update will be done for all the
methods in the config file. If no recids are
specified, the update will be done for all the records
that have been modified/inserted from the last
run of the sorting. If you want to run the
sorting for all records, you should use the '-R'
option
-M, --methods=METHODS Specify the sorting methods for which the
update_sorting or rebalancing will run (ex:
--methods=method1,method2,method3).
-i, --id=RECIDS Specify the records for which the update_sorting will
run (ex: --id=1,2-56,72)
""",
version=__revision__,
specific_params=("ldpRSM:i:",
["load-config",
"dump-config",
"print-sorting-methods",
"rebalance",
"update-sorting",
"methods=",
"id="]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
def task_submit_elaborate_specific_parameter(key, value, opts, dummy_args):
"""Given the string key it checks its meaning, possibly using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False if it doesn't
know that key."""
#Load configuration
if key in ('-l', '--load-config'):
task_set_option('cmd', 'load')
if ('-d', '') in opts or ('--dump-config', '') in opts:
raise StandardError(".. conflicting options, please add only one")
#Dump configuration
elif key in ('-d', '--dump-config'):
task_set_option('cmd', 'dump')
#Print sorting methods
elif key in ('-p', '--print-sorting-methods'):
task_set_option('cmd', 'print')
#Rebalance
elif key in ('-R', '--rebalance'):
task_set_option('cmd', 'rebalance')
if ('-S', '') in opts or ('--update-sorting', '') in opts:
raise StandardError(".. conflicting options, please add only one")
#Update sorting
elif key in ('-S', '--update-sorting'):
task_set_option('cmd', 'sort')
#Define methods
elif key in ('-M', '--methods'):
task_set_option('methods', value)
#Define records
elif key in ('-i', '--id'):
task_set_option('recids', value)
else:
return False
return True
def task_run_core():
"""Reimplement to add the body of the task"""
write_message("bibsort starting..")
cmd = task_get_option('cmd')
methods = task_get_option('methods')
recids = task_get_option('recids')
write_message("Task parameters: command=%s ; methods=%s ; recids=%s" \
% (cmd, methods, recids), verbose=2)
executed_correctly = False
# if no command is defined, run sorting
if not cmd:
cmd = 'sort'
if cmd == 'load':
write_message('Starting loading the configuration \
from the cfg file to the db.', verbose=5)
executed_correctly = load_configuration()
if executed_correctly:
write_message('Loading completed.', verbose=5)
elif cmd == 'dump':
write_message('Starting dumping the configuration \
from the db into the cfg file.', verbose=5)
executed_correctly = dump_configuration()
if executed_correctly:
write_message('Dumping completed.', verbose=5)
elif cmd == 'print':
executed_correctly = print_sorting_methods()
elif cmd == 'sort':
write_message('Starting sorting.', verbose=5)
executed_correctly = update_sorting(methods, recids)
if executed_correctly:
write_message('Sorting completed.', verbose=5)
elif cmd == 'rebalance':
write_message('Starting rebalancing the sorting buckets.', verbose=5)
executed_correctly = rebalance(methods)
if executed_correctly:
write_message('Rebalancing completed.', verbose=5)
else:
write_message("This action is not possible. \
See the --help for available actions.", sys.stderr)
write_message('bibsort exiting..')
return executed_correctly
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibsort/engine.py b/invenio/legacy/bibsort/engine.py
index 8c5fbb4fe..d4564ffa1 100644
--- a/invenio/legacy/bibsort/engine.py
+++ b/invenio/legacy/bibsort/engine.py
@@ -1,945 +1,945 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibSort Engine"""
import sys
import time
from invenio.utils.date import datetime, strftime
from invenio.legacy.dbquery import deserialize_via_marshal, \
serialize_via_marshal, run_sql, Error
from invenio.legacy.search_engine import get_field_tags, search_pattern
from invenio.intbitset import intbitset
-from invenio.bibtask import write_message, task_update_progress, \
+from invenio.legacy.bibsched.bibtask import write_message, task_update_progress, \
task_sleep_now_if_required
from invenio.config import CFG_BIBSORT_BUCKETS, CFG_CERN_SITE
from invenio.bibsort_washer import BibSortWasher, \
InvenioBibSortWasherNotImplementedError
import invenio.legacy.template
websearch_templates = invenio.legacy.template.load('websearch')
#The space distance between elements, to make inserts faster
CFG_BIBSORT_WEIGHT_DISTANCE = 8
def get_bibsort_methods_details(method_list = None):
"""Returns the id, definition, and washer for the methods in method_list.
If no method_list is specified: we get all the data from bsrMETHOD table"""
bibsort_methods = {}
errors = False
results = []
if not method_list:
try:
results = run_sql("SELECT id, name, definition, washer \
FROM bsrMETHOD")
except Error, err:
write_message("The error: [%s] occurred while trying to read " \
"the bibsort data from the database." \
%err, stream=sys.stderr)
return {}, True
if not results:
write_message("The bsrMETHOD table is empty.")
return {}, errors
else:
for method in method_list:
try:
res = run_sql("""SELECT id, name, definition, washer \
FROM bsrMETHOD where name = %s""", (method, ))
except Error, err:
write_message("The error: [%s] occurred while trying to get " \
"the bibsort data from the database for method %s." \
%(err, method), stream=sys.stderr)
errors = True
if not res:
write_message("No information for method: %s." % method)
else:
results.append(res[0])
for item in results:
bibsort_methods.setdefault(item[1], {})['id'] = item[0]
bibsort_methods[item[1]]['definition'] = item[2]
bibsort_methods[item[1]]['washer'] = item[3]
return bibsort_methods, errors
def get_all_recids(including_deleted=True):#6.68s on cdsdev
"""Returns a list of all records available in the system"""
res = run_sql("SELECT id FROM bibrec")
if not res:
return intbitset([])
all_recs = intbitset(res)
if not including_deleted: # we want to exclude deleted records
if CFG_CERN_SITE:
deleted = search_pattern(p='980__:"DELETED" OR 980__:"DUMMY"')
else:
deleted = search_pattern(p='980__:"DELETED"')
all_recs.difference_update(deleted)
return all_recs
def get_max_recid():
"""Returns the max id in bibrec - good approximation
for the total number of records"""
try:
return run_sql("SELECT MAX(id) FROM bibrec")[0][0]
except IndexError:
return 0
def _get_values_from_marc_tag(tag, recids):
'''Finds the value for a specific tag'''
digits = tag[0:2]
try:
intdigits = int(digits)
if intdigits < 0 or intdigits > 99:
raise ValueError
except ValueError:
# invalid tag value asked for
        # use 'digits' here: 'intdigits' is undefined when int() itself failed
        write_message('You have asked for an invalid tag value ' \
                      '[tag=%s; digits=%s].' %(tag, digits), verbose=5)
return []
bx = "bib%sx" % digits
bibx = "bibrec_bib%sx" % digits
max_recid = get_max_recid()
if len(recids) == 1:
to_append = '= %s'
query_params = [recids.tolist()[0]]
elif len(recids) < max_recid/3:
        # if we have less than one third of the records, use IN
        # (this really depends on how large the repository is..)
to_append = 'IN %s'
query_params = [tuple(recids)]
else:
        # MySQL might crash on big queries, better use BETWEEN
to_append = 'BETWEEN %s AND %s'
query_params = [1, max_recid]
query = 'SELECT bibx.id_bibrec, bx.value \
FROM %s AS bx, %s AS bibx \
WHERE bibx.id_bibrec %s \
AND bx.id = bibx.id_bibxxx \
AND bx.tag LIKE %%s' % (bx, bibx, to_append)
query_params.append(tag)
res = run_sql(query, tuple(query_params))
return res
def get_data_for_definition_marc(tags, recids):
    '''Having a list of tags and a list of recids, it returns a dictionary
    with the values corresponding to the tags'''
#x = all_recids; [get_fieldvalues(recid, '037__a') for recid in x]
#user: 140s, sys: 21s, total: 160s - cdsdev
if isinstance(recids, (int, long)):
recids = intbitset([recids, ])
# for each recid we need only one value
#on which we sort, so we can stop looking for a value
# as soon as we find one
tag_index = 0
field_data_dict = {}
while len(recids) > 0 and tag_index < len(tags):
write_message('%s records queried for values for tags %s.' \
%(len(recids), tags), verbose=5)
res = _get_values_from_marc_tag(tags[tag_index], recids)
res_dict = dict(res)
#field_data_dict.update(res_dict)
#we can not use this, because res_dict might contain recids
#that are already in field_data_dict, and we should not overwrite their value
field_data_dict = dict(res_dict, **field_data_dict)
#there might be keys that we do not want (ex: using 'between')
#so we should remove them
res_dict_keys = intbitset(res_dict.keys())
recids_not_needed = res_dict_keys.difference(recids)
for recid in recids_not_needed:
del field_data_dict[recid]
#update the recids to contain only the recid that do not have values yet
recids.difference_update(res_dict_keys)
tag_index += 1
return field_data_dict
def get_data_for_definition_rnk(method_name, rnk_name):
'''Returns the dictionary with data for method_name ranking method'''
try:
res = run_sql('SELECT d.relevance_data \
from rnkMETHODDATA d, rnkMETHOD r WHERE \
d.id_rnkMETHOD = r.id AND \
r.name = %s', (rnk_name, ))
if res and res[0]:
write_message('Data extracted from table rnkMETHODDATA for sorting method %s' \
%method_name, verbose=5)
return deserialize_via_marshal(res[0][0])
except Error, err:
        write_message("No data could be found for sorting method %s. " \
                      "The following error occurred: [%s]" \
                      %(method_name, err), stream=sys.stderr)
return {}
def get_data_for_definition_bibrec(column_name, recids_copy):
    '''Having a column_name and a list of recids, it returns a dictionary
    mapping each recid to its corresponding value from the column'''
dict_column = {}
for recid in recids_copy:
creation_date = run_sql('SELECT %s from bibrec WHERE id = %%s' %column_name, (recid, ))[0][0]
new_creation_date = datetime(creation_date.year,creation_date.month,creation_date.day, \
creation_date.hour,creation_date.minute, creation_date.second)
dict_column[recid] = new_creation_date.strftime('%Y%m%d%H%M%S')
return dict_column
def get_field_data(recids, method_name, definition):
"""Returns the data associated with the definition for recids.
The returned dictionary will contain ONLY the recids for which
a value has been found in the database.
"""
recids_copy = recids.copy()
#if we are dealing with a MARC definition
if definition.startswith('MARC'):
tags = definition.replace('MARC:', '').replace(' ', '').strip().split(',')
if not tags:
write_message('No MARC tags found for method %s.' \
%method_name, verbose=5)
return {}
write_message('The following MARC tags will be queried: %s' %tags, \
verbose=5)
return get_data_for_definition_marc(tags, recids_copy)
#if we are dealing with tags (ex: author, title)
elif definition.startswith('FIELD'):
tags = get_field_tags(definition.replace('FIELD:', '').strip())
if not tags:
write_message('No tags found for method %s.' \
%method_name, verbose=5)
return {}
write_message('The following tags will be queried: %s' %tags, verbose=5)
return get_data_for_definition_marc(tags, recids_copy)
# if we are dealing with ranking data
elif definition.startswith('RNK'):
rnk_name = definition.replace('RNK:', '').strip()
return get_data_for_definition_rnk(method_name, rnk_name)
# if we are looking into bibrec table
elif definition.startswith('BIBREC'):
column_name = definition.replace('BIBREC:', '').strip()
return get_data_for_definition_bibrec(column_name, recids_copy)
else:
        write_message("The definition %s for method %s could not be recognized" \
                      %(definition, method_name), stream=sys.stderr)
return {}
def apply_washer(data_dict, washer):
'''The values are filtered using the washer function'''
if not washer:
return
if washer.strip() == 'NOOP':
return
washer = washer.split(':')[0]#in case we have a locale defined
try:
method = BibSortWasher(washer)
write_message('Washer method found: %s' %washer, verbose=5)
for recid in data_dict:
new_val = method.get_transformed_value(data_dict[recid])
data_dict[recid] = new_val
except InvenioBibSortWasherNotImplementedError, err:
write_message("Washer %s is not implemented [%s]." \
%(washer, err), stream=sys.stderr)
def locale_for_sorting(washer):
"""Identifies if any specific locale should be used, and it returns it"""
if washer.find(":") > -1:
lang = washer[washer.index(':')+1:]
return websearch_templates.tmpl_localemap.get(lang, websearch_templates.tmpl_default_locale)
return None
def run_sorting_method(recids, method_name, method_id, definition, washer):
"""Does the actual sorting for the method_name
for all the records in the database"""
run_sorting_for_rnk = False
if definition.startswith('RNK'):
run_sorting_for_rnk = True
field_data_dictionary = get_field_data(recids, method_name, definition)
if not field_data_dictionary:
write_message("POSSIBLE ERROR: The sorting method --%s-- has no data!" \
%method_name)
return True
apply_washer(field_data_dictionary, washer)
#do we have any locale constraint?
sorting_locale = locale_for_sorting(washer)
sorted_data_list, sorted_data_dict = \
sort_dict(field_data_dictionary, CFG_BIBSORT_WEIGHT_DISTANCE, run_sorting_for_rnk, sorting_locale)
executed = write_to_methoddata_table(method_id, field_data_dictionary, \
sorted_data_dict, sorted_data_list)
if not executed:
return False
if CFG_BIBSORT_BUCKETS > 1:
bucket_dict, bucket_last_rec_dict = split_into_buckets(sorted_data_list, len(sorted_data_list))
for idx in bucket_dict:
executed = write_to_buckets_table(method_id, idx, bucket_dict[idx], \
sorted_data_dict[bucket_last_rec_dict[idx]])
if not executed:
return False
else:
executed = write_to_buckets_table(method_id, 1, intbitset(sorted_data_list), \
sorted_data_list[-1])
if not executed:
return False
return True
def write_to_methoddata_table(id_method, data_dict, data_dict_ordered, data_list_sorted, update_timestamp=True):
    """Serialize the data and write it to the bsrMETHODDATA table"""
write_message('Starting serializing the data..', verbose=5)
serialized_data_dict = serialize_via_marshal(data_dict)
serialized_data_dict_ordered = serialize_via_marshal(data_dict_ordered)
serialized_data_list_sorted = serialize_via_marshal(data_list_sorted)
write_message('Serialization completed.', verbose=5)
date = strftime("%Y-%m-%d %H:%M:%S", time.localtime())
if not update_timestamp:
try:
            date = run_sql('SELECT last_updated from bsrMETHODDATA WHERE id_bsrMETHOD = %s', (id_method, ))[0][0]
except IndexError:
pass # keep the generated date
write_message("Starting writing the data for method_id=%s " \
"to the database (table bsrMETHODDATA)" %id_method, verbose=5)
try:
write_message('Deleting old data..', verbose=5)
run_sql("DELETE FROM bsrMETHODDATA WHERE id_bsrMETHOD = %s", (id_method, ))
write_message('Inserting new data..', verbose=5)
run_sql("INSERT into bsrMETHODDATA \
(id_bsrMETHOD, data_dict, data_dict_ordered, data_list_sorted, last_updated) \
VALUES (%s, %s, %s, %s, %s)", \
(id_method, serialized_data_dict, serialized_data_dict_ordered, \
serialized_data_list_sorted, date, ))
except Error, err:
        write_message("The error [%s] occurred when inserting new bibsort data "\
                      "into the bsrMETHODDATA table" %err, sys.stderr)
return False
write_message('Writing to the bsrMETHODDATA successfully completed.', \
verbose=5)
return True
def write_to_buckets_table(id_method, bucket_no, bucket_data, bucket_last_value, update_timestamp=True):
    """Serialize the data and write it to the bsrMETHODDATABUCKET table"""
write_message('Writing the data for bucket number %s for ' \
'method_id=%s to the database' \
%(bucket_no, id_method), verbose=5)
write_message('Serializing data for bucket number %s' %bucket_no, verbose=5)
serialized_bucket_data = bucket_data.fastdump()
date = strftime("%Y-%m-%d %H:%M:%S", time.localtime())
if not update_timestamp:
try:
            date = run_sql('SELECT last_updated from bsrMETHODDATABUCKET WHERE id_bsrMETHOD = %s and bucket_no = %s', \
(id_method, bucket_no))[0][0]
except IndexError:
pass # keep the generated date
try:
write_message('Deleting old data.', verbose=5)
run_sql("DELETE FROM bsrMETHODDATABUCKET \
WHERE id_bsrMETHOD = %s AND bucket_no = %s", \
(id_method, bucket_no, ))
write_message('Inserting new data.', verbose=5)
run_sql("INSERT into bsrMETHODDATABUCKET \
(id_bsrMETHOD, bucket_no, bucket_data, bucket_last_value, last_updated) \
VALUES (%s, %s, %s, %s, %s)", \
(id_method, bucket_no, serialized_bucket_data, bucket_last_value, date, ))
except Error, err:
        write_message("The error [%s] occurred when inserting new bibsort data " \
                      "into the bsrMETHODDATABUCKET table" %err, sys.stderr)
return False
write_message('Writing to bsrMETHODDATABUCKET for ' \
'bucket number %s completed.' %bucket_no, verbose=5)
return True
def split_into_buckets(sorted_data_list, data_size):
"""The sorted_data_list is split into equal buckets.
Returns a dictionary containing the buckets and
a dictionary containing the last record in each bucket"""
write_message("Starting splitting the data into %s buckets." \
%CFG_BIBSORT_BUCKETS, verbose=5)
bucket_dict = {}
bucket_last_rec_dict = {}
step = data_size/CFG_BIBSORT_BUCKETS
i = 0
for i in xrange(CFG_BIBSORT_BUCKETS - 1):
bucket_dict[i+1] = intbitset(sorted_data_list[i*step:i*step+step])
bucket_last_rec_dict[i+1] = sorted_data_list[i*step+step-1]
write_message("Bucket %s done." %(i+1), verbose=5)
#last bucket contains all the remaining data
bucket_dict[CFG_BIBSORT_BUCKETS] = intbitset(sorted_data_list[(i+1)*step:])
bucket_last_rec_dict[CFG_BIBSORT_BUCKETS] = sorted_data_list[-1]
write_message("Bucket %s done." %CFG_BIBSORT_BUCKETS, verbose=5)
write_message("Splitting completed.", verbose=5)
return bucket_dict, bucket_last_rec_dict
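# A quick sketch of the splitting above (hypothetical values): with
# CFG_BIBSORT_BUCKETS = 2 and sorted_data_list = [5, 9, 3, 7], step is
# 4/2 = 2, so bucket 1 gets intbitset([5, 9]) with last record 9, and the
# final bucket gets the remainder, intbitset([3, 7]), with last record 7.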
def sort_dict(dictionary, spacing=1, run_sorting_for_rnk=False, sorting_locale=None):
"""Sorting a dictionary. Returns a list of sorted recids
and also a dictionary containing the recid: weight
weight = index * spacing"""
#10Mil records dictionary -> 36.9s
write_message("Starting sorting the dictionary " \
"containing all the data..", verbose=5)
sorted_records_dict_with_id = {}
if sorting_locale:
import locale
        # locale.getlocale() does not support LC_ALL; query it via setlocale()
        orig_locale = locale.setlocale(locale.LC_ALL)
try:
locale.setlocale(locale.LC_ALL, sorting_locale)
except locale.Error:
try:
locale.setlocale(locale.LC_ALL, sorting_locale + '.UTF8')
except locale.Error:
                write_message("Setting locale to %s is not working, ignoring locale." % sorting_locale)
sorted_records_list = sorted(dictionary, key=dictionary.__getitem__, cmp=locale.strcoll, reverse=False)
locale.setlocale(locale.LC_ALL, orig_locale)
else:
sorted_records_list = sorted(dictionary, key=dictionary.__getitem__, reverse=False)
if run_sorting_for_rnk:
#for ranking, we can keep the actual values associated with the recids
return sorted_records_list, dictionary
else:
index = 1
for recid in sorted_records_list:
sorted_records_dict_with_id[recid] = index * spacing
index += 1
write_message("Dictionary sorted.", verbose=5)
return sorted_records_list, sorted_records_dict_with_id
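# A quick sketch of the weighting above (hypothetical values): sorting
# {1: 'b', 2: 'a'} with spacing=8 yields the list [2, 1] and the weight
# dictionary {2: 8, 1: 16}; the gaps of CFG_BIBSORT_WEIGHT_DISTANCE between
# consecutive weights leave room for later inserts without reweighting.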
def get_modified_or_inserted_recs(method_list):
"""Returns a list of recids that have been inserted or
modified since the last update of the bibsort methods in method_list
    method_list should already contain the methods that
    SHOULD be updated; if it contains new methods, an error will be thrown"""
if not method_list: #just to be on the safe side
return 0
try:
query = "SELECT min(d.last_updated) from bsrMETHODDATA d, bsrMETHOD m \
WHERE m.name in (%s) AND d.id_bsrMETHOD = m.id" % \
("%s," * len(method_list))[:-1]
last_updated = str(run_sql(query, tuple(method_list))[0][0])
except Error, err:
write_message("Error when trying to get the last_updated date " \
"from bsrMETHODDATA: [%s]" %err, sys.stderr)
return 0
recids = []
try:
results = run_sql("SELECT id from bibrec \
where modification_date >= %s", (last_updated, ))
if results:
recids = [result[0] for result in results]
except Error, err:
write_message("Error when trying to get the list of " \
"modified records: [%s]" %err, sys.stderr)
return 0
return recids
def get_rnk_methods(bibsort_methods):
"""Returns the list of bibsort methods (names) that are RNK methods"""
return [method for method in bibsort_methods if \
bibsort_methods[method]['definition'].startswith('RNK')]
def get_modified_non_rnk_methods(non_rnk_method_list):
    """Returns 2 lists of non-RNK methods:
    updated_ranking_methods = non-RNK methods that need to be updated
    inserted_ranking_methods = non-RNK methods that have no data yet,
    so rebalancing should run on them"""
updated_ranking_methods = []
inserted_ranking_methods = []
for method in non_rnk_method_list:
try:
dummy = str(run_sql('SELECT d.last_updated \
FROM bsrMETHODDATA d, bsrMETHOD m \
WHERE m.id = d.id_bsrMETHOD \
AND m.name=%s', (method, ))[0][0])
updated_ranking_methods.append(method)
except IndexError: #method is not in bsrMETHODDATA -> is new
inserted_ranking_methods.append(method)
return updated_ranking_methods, inserted_ranking_methods
def get_modified_rnk_methods(rnk_method_list, bibsort_methods):
"""Returns the list of RNK methods that have been recently modified,
so they will need to have their bibsort data updated"""
updated_ranking_methods = []
deleted_ranking_methods = []
for method in rnk_method_list:
method_name = bibsort_methods[method]['definition'].replace('RNK:', '').strip()
try:
last_updated_rnk = str(run_sql('SELECT last_updated \
FROM rnkMETHOD \
WHERE name = %s', (method_name, ))[0][0])
except IndexError:
write_message("The method %s could not be found in rnkMETHOD" \
%(method_name), stream=sys.stderr)
#this method does not exist in rnkMETHOD,
#it might have been a mistype or it might have been deleted
deleted_ranking_methods.append(method)
if method not in deleted_ranking_methods:
try:
last_updated_bsr = str(run_sql('SELECT d.last_updated \
FROM bsrMETHODDATA d, bsrMETHOD m \
WHERE m.id = d.id_bsrMETHOD \
AND m.name=%s', (method, ))[0][0])
if last_updated_rnk >= last_updated_bsr:
# rnk data has been updated after bibsort ran
updated_ranking_methods.append(method)
else:
write_message("The method %s has not been updated "\
"since the last run of bibsort." %method)
except IndexError:
write_message("The method %s could not be found in bsrMETHODDATA" \
%(method))
# that means that the bibsort never run on this method, so let's run it
updated_ranking_methods.append(method)
return updated_ranking_methods, deleted_ranking_methods
def delete_bibsort_data_for_method(method_id):
    """This method will delete all data associated with a method
    from the bibsort tables (except bsrMETHOD).
    Returns False in case some error occurred, True otherwise"""
try:
run_sql("DELETE FROM bsrMETHODDATA WHERE id_bsrMETHOD = %s", (method_id, ))
run_sql("DELETE FROM bsrMETHODDATABUCKET WHERE id_bsrMETHOD = %s", (method_id, ))
except:
return False
return True
def delete_all_data_for_method(method_id):
    """This method will delete all data associated with a method
    from the bibsort tables.
    Returns False in case some error occurred, True otherwise"""
method_name = 'method name'
try:
run_sql("DELETE FROM bsrMETHODDATA WHERE id_bsrMETHOD = %s", (method_id, ))
run_sql("DELETE FROM bsrMETHODDATABUCKET WHERE id_bsrMETHOD = %s", (method_id, ))
run_sql("DELETE FROM bsrMETHODNAME WHERE id_bsrMETHOD = %s", (method_id, ))
run_sql("DELETE FROM bsrMETHOD WHERE id = %s", (method_id, ))
        # verify the deletion: if the method row is gone, the [0][0] lookup
        # raises IndexError, which is caught below and reported as success
        method_name = run_sql("SELECT name from bsrMETHOD WHERE id = %s", (method_id, ))[0][0]
except Error:
return False
except IndexError:
return True
if method_name:# the method has not been deleted
return False
return True
def add_sorting_method(method_name, method_definition, method_treatment):
"""This method will add a new sorting method in the database
and update the config file"""
try:
run_sql("INSERT INTO bsrMETHOD(name, definition, washer) \
VALUES (%s, %s, %s)", (method_name, method_definition, method_treatment))
except Error:
return False
return True
def update_bibsort_tables(recids, method, update_timestamp = True):
"""Updates the data structures for sorting method: method
for the records in recids"""
res = run_sql("SELECT id, definition, washer \
from bsrMETHOD where name = %s", (method, ))
if res and res[0]:
method_id = res[0][0]
definition = res[0][1]
washer = res[0][2]
else:
write_message('No sorting method called %s could be found ' \
'in bsrMETHOD table.' %method, sys.stderr)
return False
res = run_sql("SELECT data_dict, data_dict_ordered, data_list_sorted \
FROM bsrMETHODDATA where id_bsrMETHOD = %s", (method_id, ))
if res and res[0]:
data_dict = deserialize_via_marshal(res[0][0])
data_dict_ordered = {}
data_list_sorted = []
else:
write_message('No data could be found for the sorting method %s.' \
%method)
return False #since this case should have been treated earlier
#get the values for the recids that need to be recalculated
field_data = get_field_data(recids, method, definition)
if not field_data:
write_message("Possible error: the method %s has no data for records %s." \
%(method, str(recids)))
else:
apply_washer(field_data, washer)
#if a recid is not in field_data that is because no value was found for it
#so it should be marked for deletion
recids_to_delete = list(recids.difference(intbitset(field_data.keys())))
recids_to_insert = []
recids_to_modify = {}
for recid in field_data:
if recid in data_dict:
if data_dict[recid] != field_data[recid]:
#we store the old value
recids_to_modify[recid] = data_dict[recid]
else: # recid is new, and needs to be inserted
recids_to_insert.append(recid)
#remove the recids that were not previously in bibsort
recids_to_delete = [recid for recid in recids_to_delete if recid in data_dict]
    #dicts to keep the ordered values for the recids - useful for bucket insertion
recids_current_ordered = {}
recids_old_ordered = {}
if recids_to_insert or recids_to_modify or recids_to_delete:
data_dict_ordered = deserialize_via_marshal(res[0][1])
data_list_sorted = deserialize_via_marshal(res[0][2])
if recids_to_modify:
write_message("%s records have been modified." \
%len(recids_to_modify), verbose=5)
for recid in recids_to_modify:
recids_old_ordered[recid] = data_dict_ordered[recid]
perform_modify_record(data_dict, data_dict_ordered, \
data_list_sorted, field_data[recid], recid)
if recids_to_insert:
write_message("%s records have been inserted." \
%len(recids_to_insert), verbose=5)
for recid in recids_to_insert:
perform_insert_record(data_dict, data_dict_ordered, \
data_list_sorted, field_data[recid], recid)
if recids_to_delete:
write_message("%s records have been deleted." \
%len(recids_to_delete), verbose=5)
for recid in recids_to_delete:
perform_delete_record(data_dict, data_dict_ordered, data_list_sorted, recid)
for recid in recids_to_modify:
recids_current_ordered[recid] = data_dict_ordered[recid]
for recid in recids_to_insert:
recids_current_ordered[recid] = data_dict_ordered[recid]
#write the modifications to db
executed = write_to_methoddata_table(method_id, data_dict, \
data_dict_ordered, data_list_sorted, update_timestamp)
if not executed:
return False
#update buckets
try:
perform_update_buckets(recids_current_ordered, recids_to_insert, recids_old_ordered, method_id, update_timestamp)
except Error, err:
        write_message("[%s] The bucket data for method %s has not been updated" \
                      %(err, method), sys.stderr)
return False
return True
def perform_update_buckets(recids_current_ordered, recids_to_insert, recids_old_ordered, method_id, update_timestamp = True):
"""Updates the buckets"""
bucket_insert = {}
bucket_delete = {}
write_message("Updating the buckets for method_id = %s" %method_id, verbose=5)
buckets = run_sql("SELECT bucket_no, bucket_last_value \
FROM bsrMETHODDATABUCKET \
WHERE id_bsrMETHOD = %s", (method_id, ))
if not buckets:
write_message("No bucket data found for method_id %s." \
%method_id, sys.stderr)
        raise Exception('No bucket data found for method_id %s' % method_id)
#sort the buckets to be sure we are iterating them in order(1 to max):
buckets_dict = dict(buckets)
for recid in recids_to_insert:
        for bucket_no in sorted(buckets_dict):
if recids_current_ordered[recid] <= buckets_dict[bucket_no]:
bucket_insert.setdefault(bucket_no, []).append(recid)
break
for recid in recids_old_ordered:
record_inserted = 0
record_deleted = 0
        for bucket_no in sorted(buckets_dict):
bucket_value = int(buckets_dict[bucket_no])
if record_inserted and record_deleted:
#both insertion and deletion have been registered
break
if recids_current_ordered[recid] <= bucket_value and \
recids_old_ordered[recid] <= bucket_value and \
not record_inserted and \
not record_deleted:
#both before and after the modif,
#recid should be in the same bucket -> nothing to do
break
if recids_current_ordered[recid] <= bucket_value and not record_inserted:
#recid should be, after the modif, here, so insert
bucket_insert.setdefault(bucket_no, []).append(recid)
record_inserted = 1
if recids_old_ordered[recid] <= bucket_value and not record_deleted:
#recid was here before modif, must be removed
bucket_delete.setdefault(bucket_no, []).append(recid)
record_deleted = 1
for bucket_no in buckets_dict:
if (bucket_no in bucket_insert) or (bucket_no in bucket_delete):
res = run_sql("SELECT bucket_data FROM bsrMETHODDATABUCKET \
where id_bsrMETHOD = %s AND bucket_no = %s", \
(method_id, bucket_no, ))
bucket_data = intbitset(res[0][0])
for recid in bucket_insert.get(bucket_no, []):
bucket_data.add(recid)
for recid in bucket_delete.get(bucket_no, []):
bucket_data.remove(recid)
if update_timestamp:
date = strftime("%Y-%m-%d %H:%M:%S", time.localtime())
run_sql("UPDATE bsrMETHODDATABUCKET \
SET bucket_data = %s, last_updated = %s \
WHERE id_bsrMETHOD = %s AND bucket_no = %s", \
(bucket_data.fastdump(), date, method_id, bucket_no, ))
else:
run_sql("UPDATE bsrMETHODDATABUCKET \
SET bucket_data = %s \
WHERE id_bsrMETHOD = %s AND bucket_no = %s", \
(bucket_data.fastdump(), method_id, bucket_no, ))
write_message("Updating bucket %s for method %s." %(bucket_no, method_id), verbose=5)
def perform_modify_record(data_dict, data_dict_ordered, data_list_sorted, value, recid, spacing=CFG_BIBSORT_WEIGHT_DISTANCE):
"""Modifies all the data structures with the new information
about the record"""
    #remove the recid from the old position, to make room for the new value
data_list_sorted.remove(recid)
# from now on, it is the same thing as insert
return perform_insert_record(data_dict, data_dict_ordered, data_list_sorted, value, recid, spacing)
def perform_insert_record(data_dict, data_dict_ordered, data_list_sorted, value, recid, spacing=CFG_BIBSORT_WEIGHT_DISTANCE):
"""Inserts a new record into all the data structures"""
#data_dict
data_dict[recid] = value
#data_dict_ordered & data_list_sorted
#calculate at which index the rec should be inserted in data_list_sorted
index_for_insert = binary_search(data_list_sorted, value, data_dict)
    #we have to calculate the weight of this record in data_dict_ordered:
    #it will be the midpoint between its neighbours in data_list_sorted
if index_for_insert == len(data_list_sorted):#insert at the end of the list
#append at the end of the list
data_list_sorted.append(recid)
#weight = highest weight + the distance
data_dict_ordered[recid] = data_dict_ordered[data_list_sorted[index_for_insert - 1]] + spacing
else:
        if index_for_insert == 0: #insert at the beginning of the list
left_neighbor_weight = 0
else:
left_neighbor_weight = data_dict_ordered[data_list_sorted[index_for_insert - 1]]
right_neighbor_weight = data_dict_ordered[data_list_sorted[index_for_insert]]
        #the recid's weight will be the midpoint between left and right
weight = (right_neighbor_weight - left_neighbor_weight)/2
if weight < 1: #there is no more space to insert, we have to create some space
data_list_sorted.insert(index_for_insert, recid)
data_dict_ordered[recid] = left_neighbor_weight + spacing
create_space_for_new_weight(index_for_insert, data_dict_ordered, data_list_sorted, spacing)
else:
data_list_sorted.insert(index_for_insert, recid)
data_dict_ordered[recid] = left_neighbor_weight + weight
write_message("Record %s done." %recid, verbose=5)
return index_for_insert
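# A quick sketch of the weight arithmetic above (hypothetical weights):
# inserting between neighbours weighted 16 and 24 gives the new record
# weight 16 + (24 - 16)/2 = 20; between neighbours weighted 16 and 17 the
# integer midpoint is 0, so the record gets 16 + spacing and
# create_space_for_new_weight() shifts every following weight up to
# restore the gaps.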
def perform_delete_record(data_dict, data_dict_ordered, data_list_sorted, recid):
"""Delete a record from all the data structures"""
#data_dict
del data_dict[recid]
#data_list_sorted
data_list_sorted.remove(recid)
#data_dict_ordered
del data_dict_ordered[recid]
write_message("Record %s done." %recid, verbose=5)
return 1
def create_space_for_new_weight(index_for_insert, data_dict_ordered, data_list_sorted, spacing):
    """In order to keep the records in data_dict_ordered ordered, when a new
    weight is inserted there needs to be some room for it
    (ex: recid3 needs to be inserted between recid1 with weight=10 and recid2 with weight=11).
    The purpose of this function is to increase the distance between recid1 and recid2
    (and thus all the weights after recid2) so that recid3 will have an integer weight"""
for i in range(index_for_insert+1, len(data_list_sorted)):
data_dict_ordered[data_list_sorted[i]] += 2 * spacing
def binary_search(sorted_list, value, data_dict):
"""Binary Search O(log n)"""
minimum = -1
maximum = len(sorted_list)
while maximum - minimum > 1:
med = (maximum+minimum)/2
recid1 = sorted_list[med]
value1 = data_dict[recid1]
if value1 > value:
maximum = med
elif value1 < value:
minimum = med
else:
return med
return minimum + 1
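# A quick sketch of the search above (hypothetical data): the list is
# searched by the *values* stored in data_dict, not by the recids:
#   data_dict = {1: 'a', 2: 'b', 3: 'c'}
#   binary_search([1, 2, 3], 'bb', data_dict)  # -> 2, the index at which
#   a record whose value is 'bb' should be inserted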
def run_bibsort_update(recids=None, method_list=None):
"""Updates bibsort tables for the methods in method_list
and for the records in recids.
If recids is None: recids = all records that have been modified
or inserted since last update
If method_list is None: method_list = all the methods available
in bsrMETHOD table"""
    write_message('Initial data for run_bibsort_update method: ' \
                  'number of recids = %s; method_list=%s' \
                  %(len(recids) if recids is not None else 'all', method_list), verbose=5)
write_message('Updating sorting data.')
bibsort_methods, errors = get_bibsort_methods_details(method_list)
if errors:
return False
method_list = bibsort_methods.keys()
if not method_list:
write_message('No methods found in bsrMETHOD table.. exiting.')
return True
#we could have 4 types of methods:
#(i) RNK methods -> they should be rebalanced, not updated
#(ii) RNK methods to delete -> we should delete their data
#(iii) non RNK methods to update
#(iv) non RNK methods that are new -> they should be rebalanced(sorted), not updated
#check which of the methods are RNK methods (they do not need modified recids)
rnk_methods = get_rnk_methods(bibsort_methods)
rnk_methods_updated, rnk_methods_deleted = get_modified_rnk_methods(rnk_methods, bibsort_methods)
#check which of the methods have no data, so they are actually new,
#so they need balancing(sorting) instead of updating
non_rnk_methods = [method for method in bibsort_methods.keys() if method not in rnk_methods]
non_rnk_methods_updated, non_rnk_methods_inserted = get_modified_non_rnk_methods(non_rnk_methods)
#(i) + (iv)
methods_to_balance = rnk_methods_updated + non_rnk_methods_inserted
if methods_to_balance: # several methods require rebalancing(sorting) and not updating
return run_bibsort_rebalance(methods_to_balance)
#(ii)
#remove the data for the ranking methods that have been deleted
for method in rnk_methods_deleted:
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Deleting data for method %s" %method)
write_message('Starting deleting the data for RNK method %s' %method, verbose=5)
executed_ok = delete_bibsort_data_for_method(bibsort_methods[method]['id'])
if not executed_ok:
write_message('Method %s could not be deleted correctly, aborting..' \
%method, sys.stderr)
return False
#(iii)
#methods to actually update
if non_rnk_methods_updated: # we want to update some 'normal'(not RNK) tables, so we need recids
update_timestamp = False
if not recids:
recids = get_modified_or_inserted_recs(non_rnk_methods_updated)
if recids == 0: #error signal
return False
if not recids:
write_message("No records inserted or modified in bibrec table " \
"since the last update of bsrMETHODDATA.")
return True
write_message("These records have been recently modified/inserted: %s" \
%str(recids), verbose=5)
update_timestamp = True
recids_i = intbitset(recids)
for method in non_rnk_methods_updated:
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Updating method %s" %method)
write_message('Starting updating method %s' %method, verbose=5)
executed_ok = update_bibsort_tables(recids_i, method, update_timestamp)
if not executed_ok:
write_message('Method %s could not be executed correctly, aborting..' \
%method, sys.stderr)
return False
return True
def run_bibsort_rebalance(method_list = None):
"""Rebalances all buckets for the methods in method_list"""
bibsort_methods, errors = get_bibsort_methods_details(method_list)
if errors:
return False
if not bibsort_methods:
write_message('No methods found.. exiting rebalancing.')
return True
#check if there are only ranking methods -> no need for recids
rnk_methods = get_rnk_methods(bibsort_methods)
non_rnk_method = [method for method in bibsort_methods.keys() if method not in rnk_methods]
write_message('Running rebalancing for methods: %s' %bibsort_methods.keys())
if non_rnk_method:# we have also 'normal' (no RNK) methods, so we need the recids
recids = get_all_recids(including_deleted=False)
write_message('Rebalancing will run for %s records.' \
%str(len(recids)), verbose=5)
task_sleep_now_if_required(can_stop_too=True)
else:
recids = intbitset([])
write_message('Rebalancing will run only for RNK methods')
for name in bibsort_methods:
task_update_progress('Rebalancing %s method.' %name)
write_message('Starting sorting the data for %s method ... ' \
%name.upper())
executed_ok = run_sorting_method(recids, name,
bibsort_methods[name]['id'],
bibsort_methods[name]['definition'],
bibsort_methods[name]['washer'])
if not executed_ok:
write_message('Method %s could not be executed correctly.' \
%name, sys.stderr)
return False
write_message('Done.')
task_sleep_now_if_required(can_stop_too=True)
task_update_progress('Rebalancing done.')
return True
def main():
"""tests"""
#print "Running bibsort_rebalance...."
#run_bibsort_rebalance() #rebalances everything
#print "Running bibsort_rebalance for title and author...."
#run_bibsort_rebalance(['title', 'author']) #rebalances only these methods
#print "Running bibsort_update...."
#run_bibsort_update() #update all the methods
#print "Running bibsort_update for title and author...."
#run_bibsort_update(method_list = ['title', 'author'])
#print "Running bibsort_update for records 1,2,3, title author...."
#run_bibsort_update(recids = [1, 2, 3], method_list = ['title', 'author'])
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/bibsword/client_formatter.py b/invenio/legacy/bibsword/client_formatter.py
index b49e4f087..47b007f94 100644
--- a/invenio/legacy/bibsword/client_formatter.py
+++ b/invenio/legacy/bibsword/client_formatter.py
@@ -1,1196 +1,1196 @@
##This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
'''
BibSWORD Client Formatter
'''
import zipfile
import os
from tempfile import mkstemp
from xml.dom import minidom
from invenio.config import CFG_TMPDIR
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.bibsword_config import CFG_MARC_REPORT_NUMBER, \
CFG_MARC_TITLE, \
CFG_MARC_AUTHOR_NAME, \
CFG_MARC_AUTHOR_AFFILIATION, \
CFG_MARC_CONTRIBUTOR_NAME, \
CFG_MARC_CONTRIBUTOR_AFFILIATION, \
CFG_MARC_ABSTRACT, \
CFG_MARC_ADDITIONAL_REPORT_NUMBER, \
CFG_MARC_DOI, \
CFG_MARC_JOURNAL_REF_CODE, \
CFG_MARC_JOURNAL_REF_TITLE, \
CFG_MARC_JOURNAL_REF_PAGE, \
CFG_MARC_JOURNAL_REF_YEAR, \
CFG_MARC_COMMENT, \
CFG_MARC_RECORD_SUBMIT_INFO, \
CFG_SUBMIT_ARXIV_INFO_MESSAGE, \
CFG_DOCTYPE_UPLOAD_COLLECTION, \
CFG_SUBMISSION_STATUS_SUBMITTED, \
CFG_SUBMISSION_STATUS_PUBLISHED, \
CFG_SUBMISSION_STATUS_ONHOLD, \
CFG_SUBMISSION_STATUS_REMOVED
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.modules.formatter.engine import BibFormatObject
#-------------------------------------------------------------------------------
# Formatting servicedocument file
#-------------------------------------------------------------------------------
def format_remote_server_infos(servicedocument):
'''
Get all information about the server's options, such as the SWORD
version, maxUploadSize and the available modes. This information is
found in the servicedocument of the given server.
@param servicedocument: xml servicedocument in a string format
@return: server_informations, a tuple containing the version, the
maxUploadSize and the available modes
'''
#contains information tuple {'version', 'maxUploadSize', 'verbose', 'noOp'}
server_informations = {'version' : '',
'maxUploadSize' : '',
'verbose' : '',
'noOp' : '',
'error' : '' }
# now the xml nodes are accessible programmatically
try:
parsed_xml_collections = minidom.parseString(servicedocument)
except IOError:
server_informations['error'] = \
'No servicedocument found for the remote server'
return server_informations
# access to the root of the xml file
xml_services = parsed_xml_collections.getElementsByTagName('service')
xml_service = xml_services[0]
# get value of the node <sword:version>
version_node = xml_service.getElementsByTagName('sword:version')[0]
server_informations['version'] = \
version_node.firstChild.nodeValue.encode('utf-8')
# get value of the node <sword:maxUploadSize>
max_upload_node = xml_service.getElementsByTagName('sword:maxUploadSize')[0]
server_informations['maxUploadSize'] = \
max_upload_node.firstChild.nodeValue.encode('utf-8')
# get value of the node <sword:verbose>
verbose_node = xml_service.getElementsByTagName('sword:verbose')[0]
server_informations['verbose'] = \
verbose_node.firstChild.nodeValue.encode('utf-8')
# get value of the node <sword:noOp>
no_op_node = xml_service.getElementsByTagName('sword:noOp')[0]
server_informations['noOp'] = \
no_op_node.firstChild.nodeValue.encode('utf-8')
return server_informations
def format_remote_collection(servicedocument):
'''
Parse the servicedocument and return the list of collections found
in it, as tuples ('id', 'name', 'url').
@param servicedocument: xml file returned by the remote server.
@return: the list of collections found in the service document
'''
collections = [] # contains list of collection tuple {'id', 'url', 'label'}
# get the collections root node
collection_nodes = parse_xml_servicedocument_file(servicedocument)
# i will be the id of the collection
i = 1
#---------------------------------------------------------------------------
# retrieval of the collections
#---------------------------------------------------------------------------
# loop over each collection node of the document
for collection_node in collection_nodes:
# dictionary that contains one collection
collection = {}
collection['id'] = str(i)
i = i + 1
# collection uri (where to deposit the media)
collection['url'] = \
collection_node.attributes['href'].value.encode('utf-8')
# collection name that is displayed to the user
xml_title = collection_node.getElementsByTagName('atom:title')
collection['label'] = xml_title[0].firstChild.nodeValue.encode('utf-8')
# collection added to the collections list
collections.append(collection)
return collections
def format_collection_informations(servicedocument, id_collection):
'''
This method parses the given servicedocument to find the given
collection node, then retrieves all information about that collection.
@param servicedocument: xml file returned by the remote server.
@param id_collection: position of the collection in the sd (1 = first)
@return: (collection_informations) tuple containing infos
'''
# contains information tuple {[accept], 'collectionPolicy', 'mediation',
# 'treatment', 'accept_packaging'}
collection_informations = {}
# get the collections root node
collection_nodes = parse_xml_servicedocument_file(servicedocument)
# retrieval of the selected collection
collection_node = collection_nodes[int(id_collection)-1]
# get value of the nodes <accept>
accept_nodes = collection_node.getElementsByTagName('accept')
accept = []
for accept_node in accept_nodes:
accept.append(accept_node.firstChild.nodeValue.encode('utf-8'))
collection_informations['accept'] = accept
# get value of the nodes <sword:collectionPolicy>
collection_policy = \
collection_node.getElementsByTagName('sword:collectionPolicy')[0]
collection_informations['collectionPolicy'] = \
collection_policy.firstChild.nodeValue.encode('utf-8')
# get value of the nodes <sword:mediation>
mediation = collection_node.getElementsByTagName('sword:mediation')[0]
collection_informations['mediation'] = \
mediation.firstChild.nodeValue.encode('utf-8')
# get value of the nodes <sword:treatment>
treatment = collection_node.getElementsByTagName('sword:treatment')[0]
collection_informations['treatment'] = \
treatment.firstChild.nodeValue.encode('utf-8')
# get value of the nodes <sword:acceptPackaging>
accept_packaging = \
collection_node.getElementsByTagName('sword:acceptPackaging')[0]
collection_informations['accept_packaging'] = \
accept_packaging.firstChild.nodeValue.encode('utf-8')
return collection_informations
def format_primary_categories(servicedocument, collection_id=0):
'''
This method parses the servicedocument to retrieve the primary
categories of the given collection. If no collection is given, it
takes the first one.
@param servicedocument: xml file returned by the remote server.
@param collection_id: id of the collection to search
@return: list of primary categories tuple ('id', 'url', 'label')
'''
categories = [] # contains list of category tuple {'id', 'url', 'label'}
# get the collections root node
collection_nodes = parse_xml_servicedocument_file(servicedocument)
# i will be the id of the category
i = 1
# retrieval of the selected collection
collection_node = collection_nodes[int(collection_id)-1]
#---------------------------------------------------------------------------
# retrieval of the categories
#---------------------------------------------------------------------------
# select all primary category nodes
primary_categories_node = \
collection_node.getElementsByTagName('arxiv:primary_categories')[0]
primary_category_nodes = \
primary_categories_node.getElementsByTagName('arxiv:primary_category')
# loop over each primary_category node
for primary_category_node in primary_category_nodes:
# dictionary that contains one category
category = {}
category['id'] = str(i)
i = i + 1
category['url'] = \
primary_category_node.attributes['term'].value.encode('utf-8')
category['label'] = \
primary_category_node.attributes['label'].value.encode('utf-8')
categories.append(category)
return categories
def format_secondary_categories(servicedocument, collection_id=0):
'''
This method parses the servicedocument to retrieve the optional
categories of the given collection. If no collection is given, it
takes the first one.
@param servicedocument: xml file returned by the remote server.
@param collection_id: id of the collection to search
@return: list of optional categories tuple ('id', 'url', 'label')
'''
categories = [] # contains list of category tuple {'id', 'url', 'label'}
# get the collections root node
collection_nodes = parse_xml_servicedocument_file(servicedocument)
# i will be the id of the category
i = 1
# retrieval of the selected collection
collection_id = int(collection_id) - 1
collection_node = collection_nodes[int(collection_id)]
#---------------------------------------------------------------------------
# retrieval of the categories
#---------------------------------------------------------------------------
# select all category nodes
categories_node = collection_node.getElementsByTagName('categories')[0]
category_nodes = categories_node.getElementsByTagName('category')
# loop over each category node
for category_node in category_nodes:
# dictionary that contains one category
category = {}
category['id'] = str(i)
i = i + 1
category['url'] = category_node.attributes['term'].value.encode('utf-8')
category['label'] = \
category_node.attributes['label'].value.encode('utf-8')
categories.append(category)
return categories
def parse_xml_servicedocument_file(servicedocument):
'''
This method parses a string containing a servicedocument to retrieve
the collection nodes. It is used by all functions that need to work
with collections.
@param servicedocument: xml file contained in a string
@return: (collection_nodes) root nodes of all collections
'''
# now the xml nodes are accessible programmatically
parsed_xml_collections = minidom.parseString(servicedocument)
# access to the root of the xml file
xml_services = parsed_xml_collections.getElementsByTagName('service')
xml_service = xml_services[0]
# there is only the global workspace in this xml document
xml_workspaces = xml_service.getElementsByTagName('workspace')
xml_workspace = xml_workspaces[0]
# contains all collections in the xml file
collection_nodes = xml_workspace.getElementsByTagName('collection')
return collection_nodes
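# A minimal standalone sketch of the traversal above (the toy servicedocument
# below is hypothetical, but mirrors the service -> workspace -> collection
# nesting that parse_xml_servicedocument_file() walks):

```python
from xml.dom import minidom

# Hypothetical toy servicedocument with the same nesting as a real one.
TOY_SD = ('<service xmlns:atom="http://www.w3.org/2005/Atom"><workspace>'
          '<collection href="http://example.org/deposit">'
          '<atom:title>Test collection</atom:title>'
          '</collection></workspace></service>')

# Same traversal as parse_xml_servicedocument_file().
service = minidom.parseString(TOY_SD).getElementsByTagName('service')[0]
workspace = service.getElementsByTagName('workspace')[0]
collection_nodes = workspace.getElementsByTagName('collection')
href = collection_nodes[0].attributes['href'].value
```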
#-------------------------------------------------------------------------------
# Formatting marcxml file
#-------------------------------------------------------------------------------
def get_report_number_from_macrxml(marcxml):
'''
Retrieve the report number stored in the marcxml file. It is looked up
in the tag given by CFG_MARC_REPORT_NUMBER.
@param marcxml: marcxml file where to look for the report number
@return: the report number as a string
'''
#get the reportnumber tag list
tag = CFG_MARC_REPORT_NUMBER
if tag == '':
return ''
#variable that contains the result of the parsing of the marcxml file
datafields = get_list_of_marcxml_datafields(marcxml)
for datafield in datafields:
report_number = get_subfield_value_from_datafield(datafield, tag)
if report_number != '':
return report_number
return ''
def get_medias_to_submit(media_paths):
'''
This method gets a list of media to submit. It formats a media tuple
containing name, size, type and file content for the given paths.
@param media_paths: list of paths to the media to upload
@return: media tuple
'''
# define the return value
media = {}
if len(media_paths) > 1:
media_paths = format_file_to_zip_archiv(media_paths)
else:
media_paths = media_paths[0]
if media_paths != '':
media['file'] = open(media_paths, "r").read()
media['size'] = len(media['file'])
media['name'] = media_paths.split('/')[-1].split(';')[0]
media['type'] = 'application/%s' % media['name'].split('.')[-1]
return media
def get_media_from_recid(recid):
'''
This method gets the latest files attached to the given record
@param recid: id of the record whose files to get
'''
medias = []
bibarchiv = BibRecDocs(recid)
bibdocs = bibarchiv.list_latest_files()
for bibdocfile in bibdocs:
bibfile = {'name': bibdocfile.get_full_name(),
'file': '',
'type': 'application/%s' % \
bibdocfile.get_superformat().split(".")[-1],
'path': bibdocfile.get_full_path(),
'collection': bibdocfile.get_type(),
'size': bibdocfile.get_size(),
'loaded': False,
'selected': ''}
if bibfile['collection'] == "Main":
bibfile['selected'] = 'checked=yes'
medias.append(bibfile)
return medias
def format_author_from_marcxml(marcxml):
'''
This method parses the marcxml file to retrieve the author of a document
@param marcxml: the xml file to parse
@return: tuple containing {'name', 'email' and 'affiliations'}
'''
#get the tag id for the given field
main_author = CFG_MARC_AUTHOR_NAME
main_author_affiliation = CFG_MARC_AUTHOR_AFFILIATION
#variable that contains the result of the parsing of the marcxml file
datafields = get_list_of_marcxml_datafields(marcxml)
#init the author tuple
author = {'name':'', 'email':'', 'affiliation':[]}
for datafield in datafields:
# retrieve the main author
if author['name'] == '':
name = get_subfield_value_from_datafield(datafield, main_author)
if name != '':
author['name'] = name
affiliation = get_subfield_value_from_datafield(datafield, main_author_affiliation)
if affiliation != '':
author['affiliation'].append(affiliation)
return author
def format_marcxml_file(marcxml, is_file=False):
'''
Parse the given marcxml file to retrieve the metadata needed for
forwarding the document to ArXiv.org
@param marcxml: marcxml file that contains metadata from Invenio
@return: (dictionary) key/value pairs needed for the push
'''
#init the return tuple
marcxml_values = { 'id' : '',
'title' : '',
'summary' : '',
'contributors' : [],
'journal_refs' : [],
'report_nos' : [],
'comment' : '',
'doi' : '' }
# check if the marcxml is not empty
if marcxml == '':
marcxml_values['error'] = "MARCXML string is empty !"
return marcxml_values
#get the tag id and code from tag table
main_report_number = CFG_MARC_REPORT_NUMBER
add_report_number = CFG_MARC_ADDITIONAL_REPORT_NUMBER
main_title = CFG_MARC_TITLE
main_summary = CFG_MARC_ABSTRACT
main_author = CFG_MARC_AUTHOR_NAME
main_author_affiliation = CFG_MARC_AUTHOR_AFFILIATION
add_author = CFG_MARC_CONTRIBUTOR_NAME
add_author_affiliation = CFG_MARC_CONTRIBUTOR_AFFILIATION
main_comment = CFG_MARC_COMMENT
doi = CFG_MARC_DOI
journal_ref_code = CFG_MARC_JOURNAL_REF_CODE
journal_ref_title = CFG_MARC_JOURNAL_REF_TITLE
journal_ref_page = CFG_MARC_JOURNAL_REF_PAGE
journal_ref_year = CFG_MARC_JOURNAL_REF_YEAR
#init tmp values
contributor = {'name' : '', 'email' : '', 'affiliation' : []}
try:
bfo = BibFormatObject(recID=None, xml_record=marcxml)
except Exception:
marcxml_values['error'] = "Unable to open marcxml file !"
return marcxml_values
marcxml_values = { 'id' : bfo.field(main_report_number),
'title' : bfo.field(main_title),
'summary' : bfo.field(main_summary),
'report_nos' : bfo.fields(add_report_number),
'contributors' : [],
'journal_refs' : [],
'comment' : bfo.field(main_comment),
'doi' : bfo.field(doi)}
authors = bfo.fields(main_author[:-1], repeatable_subfields_p=True)
for author in authors:
name = author.get(main_author[-1], [''])[0]
affiliation = author.get(main_author_affiliation[-1], [])
author = {'name': name, 'email': '', 'affiliation': affiliation}
marcxml_values['contributors'].append(author)
authors = bfo.fields(add_author[:-1], repeatable_subfields_p=True)
for author in authors:
name = author.get(add_author[-1], [''])[0]
affiliation = author.get(add_author_affiliation[-1], [])
author = {'name': name, 'email': '', 'affiliation': affiliation}
marcxml_values['contributors'].append(author)
journals = bfo.fields(journal_ref_title[:-1])
for journal in journals:
journal_title = journal.get(journal_ref_title[-1], '')
journal_page = journal.get(journal_ref_page[-1], '')
journal_code = journal.get(journal_ref_code[-1], '')
journal_year = journal.get(journal_ref_year[-1], '')
journal = "%s: %s (%s) pp. %s" % (journal_title, journal_code, journal_year, journal_page)
marcxml_values['journal_refs'].append(journal)
return marcxml_values
def get_subfield_value_from_datafield(datafield, field_tag):
'''
This function takes a datafield node from a marcxml document and
returns the subfield value matching the given tag id and code
@param datafield: xml node to be parsed
@param field_tag: tuple containing id and code to find
@return: value of the tag as a string
'''
# extract the tag number
tag = datafield.attributes["tag"]
tag_id = field_tag[0] + field_tag[1] + field_tag[2]
tag_code = field_tag[5]
# return the subfield value if the tag matches
if tag.value == tag_id:
subfields = datafield.getElementsByTagName('subfield')
for subfield in subfields:
if subfield.attributes['code'].value == tag_code:
return subfield.firstChild.nodeValue.encode('utf-8')
return ''
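# How the slicing above works on a 6-character MARC field spec (the value
# '088__a' is a hypothetical example, not taken from the configuration):

```python
# Hypothetical MARC field spec: tag '088', two blank indicators, subfield 'a'.
field_tag = '088__a'

# Same slicing as get_subfield_value_from_datafield() above.
tag_id = field_tag[0] + field_tag[1] + field_tag[2]  # 3-digit datafield tag
tag_code = field_tag[5]                              # subfield code
```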
def get_list_of_marcxml_datafields(marcxml, isfile=False):
'''
This method parses the marcxml file to retrieve the root of the
datafields, needed by all functions that format marcxml nodes.
@param marcxml: file or string that contains the marcxml file
@param isfile: boolean that tells whether a file or a string was given
@return: root of all datafields
'''
#variable that contains the result of the parsing of the marcxml file
if isfile:
try:
parsed_marcxml = minidom.parse(marcxml)
except IOError:
return 0
else:
parsed_marcxml = minidom.parseString(marcxml)
collections = parsed_marcxml.getElementsByTagName('collection')
# some marcxml files have no collection root but direct record entries
if len(collections) > 0:
collection = collections[0]
records = collection.getElementsByTagName('record')
else:
records = parsed_marcxml.getElementsByTagName('record')
record = records[0]
return record.getElementsByTagName('datafield')
def format_file_to_zip_archiv(paths):
'''
This method takes a list of files of different types and groups them
into a zip archive for sending
@param paths: list of paths to files of different types
@return: (zip archive) path to the zip file containing all fulltexts to submit
'''
(zip_fd, zip_path) = mkstemp(suffix='.zip', prefix='bibsword_media_',
dir=CFG_TMPDIR)
archiv = zipfile.ZipFile(zip_path, "w")
for path in paths:
if os.path.exists(path):
archiv.write(path, os.path.basename(path), zipfile.ZIP_DEFLATED)
archiv.close()
return zip_path
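# A self-contained sketch of the mkstemp + ZipFile pattern above, writing to
# the system temp dir instead of CFG_TMPDIR (an assumption for this sketch);
# unlike the function above, it also closes the descriptor mkstemp returns:

```python
import os
import zipfile
from tempfile import mkstemp

fd, zip_path = mkstemp(suffix='.zip', prefix='bibsword_media_')
os.close(fd)  # ZipFile reopens the path itself, so the raw fd is not needed
archive = zipfile.ZipFile(zip_path, 'w')
# each existing fulltext would be added here with
# archive.write(path, os.path.basename(path), zipfile.ZIP_DEFLATED)
archive.close()
```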
#-------------------------------------------------------------------------------
# getting info from media deposit response file
#-------------------------------------------------------------------------------
def format_link_from_result(result):
'''
This method parses the xml file returned after the submission of a media
and retrieves the URL contained in it
@param result: xml file returned by ArXiv
@return: (links) table of url
'''
if isinstance(result, list):
result = result[0]
# parse the xml to access each node
parsed_result = minidom.parseString(result)
# finding the links in the xml file
xml_entries = parsed_result.getElementsByTagName('entry')
xml_entry = xml_entries[0]
xml_contents = xml_entry.getElementsByTagName('content')
# getting the unique content node
content = xml_contents[0]
# declare the dictionary that contains the type and url of a link
link = {}
link['link'] = content.attributes['src'].value.encode('utf-8')
link['type'] = content.attributes['type'].value.encode('utf-8')
return link
def format_update_time_from_result(result):
'''
Parse any xml response to retrieve and format the value of the 'updated'
tag.
@param result: xml result of a deposit or a submit call to a server
@return: formatted date content of the <updated> node
'''
# parse the xml to access each node
parsed_result = minidom.parseString(result)
# finding the links in the xml file
xml_entries = parsed_result.getElementsByTagName('entry')
xml_entry = xml_entries[0]
xml_updated = xml_entry.getElementsByTagName('updated')
# getting the unique content node
updated = xml_updated[0]
return updated.firstChild.nodeValue.encode('utf-8')
def format_links_from_submission(submission):
'''
Parse the xml response of a metadata submission and retrieve all the
information about the links toward the media, the metadata and
the status
@param submission: xml response of a submission
@return: tuple { 'medias', 'metadata', 'status' }
'''
# parse the xml to access each node
parsed_result = minidom.parseString(submission)
# finding the links in the xml file
xml_entries = parsed_result.getElementsByTagName('entry')
xml_entry = xml_entries[0]
xml_links = xml_entry.getElementsByTagName('link')
# getting all content nodes
links = {'media':'', 'metadata':'', 'status':''}
for link in xml_links:
# declare the dictionary that contains the type and url of a link
if link.attributes['rel'].value == 'edit-media':
if links['media'] == '':
links['media'] = link.attributes['href'].value.encode('utf-8')
else:
links['media'] = links['media'] + ', ' + \
link.attributes['href'].value.encode('utf-8')
if link.attributes['rel'].value == 'edit':
links['metadata'] = link.attributes['href'].value.encode('utf-8')
if link.attributes['rel'].value == 'alternate':
links['status'] = link.attributes['href'].value.encode('utf-8')
return links
def format_id_from_submission(submission):
'''
Parse the submission file to retrieve the arxiv id returned
@param submission: xml file returned after the submission
@return: string containing the arxiv id
'''
# parse the xml to access each node
parsed_result = minidom.parseString(submission)
# finding the id in the xml file
xml_entries = parsed_result.getElementsByTagName('entry')
xml_entry = xml_entries[0]
xml_id = xml_entry.getElementsByTagName('id')[0]
remote_id = xml_id.firstChild.nodeValue.encode('utf-8')
(begin, sep, end) = remote_id.rpartition("/")
remote_id = 'arXiv:'
i = 0
for elt in end:
remote_id += elt
if i == 3:
remote_id += '.'
i = i + 1
return remote_id
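# A standalone sketch of the id-condensing loop above (condense_remote_id is
# a hypothetical helper name): keep the part after the last '/', prefix it
# with 'arXiv:' and insert a dot after the fourth character:

```python
def condense_remote_id(remote_id):
    # Keep only the trailing path segment, as format_id_from_submission() does.
    end = remote_id.rpartition("/")[2]
    out = 'arXiv:'
    for i, char in enumerate(end):
        out += char
        if i == 3:
            out += '.'  # dot inserted after the fourth character
    return out
```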
#-------------------------------------------------------------------------------
# write information in the marc file
#-------------------------------------------------------------------------------
def update_marcxml_with_remote_id(recid, remote_id, action="append"):
'''
Write a new entry in the given marc file. This entry is the remote record
id given by the server where the submission has been done
@param remote_id: the string containing the id to add to the marc file
@return: boolean, True if the update was done, False on problems
'''
field_tag = CFG_MARC_ADDITIONAL_REPORT_NUMBER
tag_id = "%s%s%s" % (field_tag[0], field_tag[1], field_tag[2])
tag_code = field_tag[5]
# concatenation of the string to append to the marc file
node = '''<record>
<controlfield tag="001">%(recid)s</controlfield>
<datafield tag="%(tagid)s" ind1=" " ind2=" ">
<subfield code="%(tagcode)s">%(remote_id)s</subfield>
</datafield>
</record>''' % {
'recid': recid,
'tagid': tag_id,
'tagcode': tag_code,
'remote_id': remote_id
}
# creation of the tmp file containing the xml node to append
(tmpfd, filename) = mkstemp(suffix='.xml', prefix='bibsword_append_remote_id_',
dir=CFG_TMPDIR)
tmpfile = os.fdopen(tmpfd, 'w')
tmpfile.write(node)
tmpfile.close()
# insert a task in bibsched to add the node in the marc file
if action == 'append':
result = \
task_low_level_submission('bibupload', 'BibSword', '-a', filename)
elif action == 'delete':
result = \
task_low_level_submission('bibupload', 'BibSword', '-d', filename)
return result
def update_marcxml_with_info(recid, username, current_date, remote_id,
action='append'):
'''
This function adds a field in the marc file to indicate that the
record has been submitted to a remote server
@param recid: id of the record to update
'''
# concatenation of the string to append to the marc file
node = '''<record>
<controlfield tag="001">%(recid)s</controlfield>
<datafield tag="%(tag)s" ind1=" " ind2=" ">
<subfield code="a">%(submit_info)s</subfield>
</datafield>
</record>''' % {
'recid': recid,
'tag': CFG_MARC_RECORD_SUBMIT_INFO,
'submit_info': CFG_SUBMIT_ARXIV_INFO_MESSAGE % (username, current_date, remote_id)
}
# creation of the tmp file containing the xml node to append
(tmpfd, filename) = mkstemp(suffix='.xml', prefix='bibsword_append_submit_info_',
dir=CFG_TMPDIR)
tmpfile = os.fdopen(tmpfd, 'w')
tmpfile.write(node)
tmpfile.close()
# insert a task in bibsched to add the node in the marc file
if action == 'append':
result = \
task_low_level_submission('bibupload', 'BibSword', '-a', filename)
elif action == 'delete':
result = \
task_low_level_submission('bibupload', 'BibSword', '-d', filename)
return result
def upload_fulltext(recid, path):
'''
This method saves the uploaded file to the associated record
@param recid: id of the record
@param path: uploaded document to store
'''
# upload the file to the record
bibarchiv = BibRecDocs(recid)
docname = path.split('/')[-1].split('.')[0]
doctype = path.split('.')[-1].split(';')[0]
bibarchiv.add_new_file(path, CFG_DOCTYPE_UPLOAD_COLLECTION, docname,
format=doctype)
return ''
#-------------------------------------------------------------------------------
# work with the remote submission status xml file
#-------------------------------------------------------------------------------
def format_submission_status(status_xml):
'''
This method parses the given atom xml status string and retrieves the
value of the tag <status>
@param status_xml: xml atom entry
@return: dictionary containing status, id and/or possible error
'''
result = {'status':'', 'id_submission':'', 'error':''}
parsed_status = minidom.parseString(status_xml)
deposit = parsed_status.getElementsByTagName('deposit')[0]
status_node = deposit.getElementsByTagName('status')[0]
if status_node.firstChild is not None:
status = status_node.firstChild.nodeValue.encode('utf-8')
else:
result['status'] = ''
return result
#status = "submitted"
if status == CFG_SUBMISSION_STATUS_SUBMITTED:
result['status'] = status
return result
#status = "published"
if status == CFG_SUBMISSION_STATUS_PUBLISHED:
result['status'] = status
arxiv_id_node = deposit.getElementsByTagName('arxiv_id')[0]
result['id_submission'] = \
arxiv_id_node.firstChild.nodeValue.encode('utf-8')
return result
#status = "onhold"
if status == CFG_SUBMISSION_STATUS_ONHOLD:
result['status'] = status
return result
#status = "unknown" (treated as removed)
if status == 'unknown':
result['status'] = CFG_SUBMISSION_STATUS_REMOVED
error_node = deposit.getElementsByTagName('error')[0]
result['error'] = error_node.firstChild.nodeValue.encode('utf-8')
return result
return result
#-------------------------------------------------------------------------------
# Classes for the generation of XML Atom entry containing submission metadata
#-------------------------------------------------------------------------------
class BibSwordFormat:
'''
This class provides the methods needed to format all mandatory xml atom
entry nodes. It is extended by subclasses that add optional nodes
to the standard SWORD format
'''
def __init__(self):
''' No init necessary for this class '''
def frmt_id(self, recid):
'''
This method checks if there is an id for the resource. If so, it
returns a formatted id node that may be inserted in the
xml metadata file
@param recid: the id of the resource
@return: (xml) xml node correctly formatted
'''
if recid != '':
return '''<id>%s</id>\n''' % recid
return ''
def frmt_title(self, title):
'''
This method checks if there is a title for the resource. If so,
it returns a formatted title node that may be inserted in the
xml metadata file
@param title: the title of the resource
@return: (xml) xml node correctly formatted
'''
if title != '':
return '''<title>%s</title>\n''' % title
return ''
def frmt_author(self, author_name, author_email):
'''
This method checks if there is a submitter for the resource. If so,
it returns a formatted author node containing the name and
the email of the author, to be inserted in the xml metadata file
@param author_name: the name of the submitter of the resource
@param author_email: the email where the remote server sends answers
@return: (xml) xml node correctly formatted
'''
author = ''
if author_name != '':
author += '''<author>\n'''
author += '''<name>%s</name>\n''' % author_name
if author_email != '':
author += '''<email>%s</email>\n''' % author_email
author += '''</author>\n'''
return author
def frmt_summary(self, summary):
'''
This method checks if there is a summary for the resource. If so,
it returns a formatted summary node that may be inserted in the
xml metadata file
@param summary: the summary of the resource
@return: (xml) xml node correctly formatted
'''
if summary != '':
return '''<summary>%s</summary>\n''' % summary
return ''
def frmt_categories(self, categories, scheme):
'''
This method checks if there are categories for the resource. If so,
it returns the category nodes formatted to be inserted in
the xml metadata file
@param categories: list of categories for one resource
@return: (xml) xml node(s) correctly formatted
'''
output = ''
for category in categories:
output += '''<category term="%s" scheme="%s" label="%s"/>\n''' % (category['url'], scheme, category['label'])
return output
def frmt_link(self, links):
'''
This method checks if there are links for the resource. If so,
it returns the link nodes formatted to be inserted in
the xml metadata file
@param links: list of links for the resource
@return: (xml) xml node(s) correctly formatted
'''
output = ''
if links != '':
output += '''<link href="%s" ''' % links['link']
output += '''type="%s" rel="related"/>\n''' % links['type']
return output
class ArXivFormat(BibSwordFormat):
'''
This class inherits from the class BibSwordFormat. It adds some
arXiv-specific mandatory nodes to the standard SWORD format.
'''
#---------------------------------------------------------------------------
# Formatting metadata file for submission
#---------------------------------------------------------------------------
def format_metadata(self, metadata):
'''
This method formats an atom file that fits the arxiv atom format
used for the submission of the metadata during the push-to-arxiv process.
@param metadata: tuple containing every needed information + some optional
@return: (xml file) arxiv atom file
'''
#-----------------------------------------------------------------------
# structure of the arxiv metadata submission atom entry
#-----------------------------------------------------------------------
output = '''<?xml version="1.0" encoding="utf-8"?>\n'''
output += '''<entry xmlns="http://www.w3.org/2005/Atom" '''
output += '''xmlns:arxiv="http://arxiv.org/schemas/atom">\n'''
#id
if 'id' in metadata:
output += BibSwordFormat.frmt_id(self, metadata['id'])
#title
if 'title' in metadata:
output += BibSwordFormat.frmt_title(self,
metadata['title'])
#author
if 'author_name' in metadata and 'author_email' in metadata:
output += BibSwordFormat.frmt_author(self, metadata['author_name'],
metadata['author_email'])
#contributors
if 'contributors' in metadata:
output += '' + self.frmt_contributors(metadata['contributors'])
#summary
if 'summary' in metadata:
output += BibSwordFormat.frmt_summary(self, metadata['summary'])
#categories
if 'categories' in metadata:
output += BibSwordFormat.frmt_categories(self, metadata['categories'],
'http://arxiv.org/terms/arXiv/')
#primary_category
if 'primary_url' in metadata and 'primary_label' in metadata:
output += self.frmt_primary_category(metadata['primary_url'],
metadata['primary_label'],
'http://arxiv.org/terms/arXiv/')
#comment
if 'comment' in metadata:
output += self.frmt_comment(metadata['comment'])
#journal references
if 'journal_refs' in metadata:
output += self.frmt_journal_ref(metadata['journal_refs'])
#report numbers
if 'report_nos' in metadata:
output += self.frmt_report_no(metadata['report_nos'])
#doi
if 'doi' in metadata:
output += self.frmt_doi(metadata['doi'])
#link
if 'links' in metadata:
output += BibSwordFormat.frmt_link(self, metadata['links'])
output += '''</entry>'''
return output
def frmt_contributors(self, contributors):
'''
This method formats each contributor as a <contributor> XML node,
with optional email and affiliation children.
@param contributors: the list of all contributors of the document
@return: (xml) the xml nodes for each contributor
'''
output = ''
for contributor in contributors:
output += '''<contributor>\n'''
output += '''<name>%s</name>\n''' % contributor['name']
if contributor['email'] != '':
output += '''<email>%s</email>\n''' % \
contributor['email']
if len(contributor['affiliation']) != 0:
for affiliation in contributor['affiliation']:
output += '''<arxiv:affiliation>%s'''\
'''</arxiv:affiliation>\n''' % affiliation
output += '''</contributor>\n'''
return output
def frmt_primary_category(self, primary_url, primary_label, scheme):
'''
This method formats the primary category as an
<arxiv:primary_category> node.
@param primary_url: url of the primary category deposit
@param primary_label: name of the primary category to display
@param scheme: url of the primary category schema
@return: (xml) xml node correctly formatted
'''
output = ''
if primary_url != '':
output += '''<arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom/" scheme="%s" label="%s" term="%s"/>\n''' % (scheme, primary_label, primary_url)
return output
def frmt_comment(self, comment):
'''
This method checks whether a comment is given. If so, it returns a
formatted comment node that may be inserted in the xml
metadata file
@param comment: the string comment
@return: (xml) xml node correctly formatted
'''
output = ''
if comment != '':
output = '''<arxiv:comment>%s</arxiv:comment>\n''' % comment
return output
def frmt_journal_ref(self, journal_refs):
'''
This method checks whether there are journal refs for the resource. If
so, it returns the journal_ref nodes formatted to be inserted in
the xml metadata file
@param journal_refs: list of journal_refs for one resource
@return: (xml) xml node(s) correctly formatted
'''
output = ''
for journal_ref in journal_refs:
output += '''<arxiv:journal_ref>%s</arxiv:journal_ref>\n''' % \
journal_ref
return output
def frmt_report_no(self, report_nos):
'''
This method checks whether there are report numbers for the resource. If
so, it returns the report_no nodes formatted to be inserted in
the xml metadata file
@param report_nos: list of report_nos for one resource
@return: (xml) xml node(s) correctly formatted
'''
output = ''
for report_no in report_nos:
output += '''<arxiv:report_no>%s</arxiv:report_no>\n''' % \
report_no
return output
def frmt_doi(self, doi):
'''This method checks whether a doi is given. If so, it returns a
formatted doi node that may be inserted in the xml
metadata file
@param doi: the string doi
@return: (xml) xml node correctly formatted
'''
output = ''
if doi != '':
output = '''<arxiv:doi>%s</arxiv:doi>\n''' % doi
return output
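The contributor formatting above can be illustrated with a small standalone sketch (outside the Invenio codebase, written for current Python, with hypothetical contributor data):

```python
# Standalone sketch mirroring frmt_contributors above: each contributor
# becomes a <contributor> node with optional <email> and
# <arxiv:affiliation> children. The data below is hypothetical.

def format_contributors(contributors):
    output = ''
    for contributor in contributors:
        output += '<contributor>\n'
        output += '<name>%s</name>\n' % contributor['name']
        if contributor['email'] != '':
            output += '<email>%s</email>\n' % contributor['email']
        for affiliation in contributor['affiliation']:
            output += '<arxiv:affiliation>%s</arxiv:affiliation>\n' % affiliation
        output += '</contributor>\n'
    return output

xml = format_contributors([{'name': 'Jane Doe',
                            'email': 'jane@example.org',
                            'affiliation': ['CERN']}])
```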
diff --git a/invenio/legacy/bibupload/engine.py b/invenio/legacy/bibupload/engine.py
index 9284189a4..82a56887c 100644
--- a/invenio/legacy/bibupload/engine.py
+++ b/invenio/legacy/bibupload/engine.py
@@ -1,2937 +1,2937 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibUpload: Receive MARC XML file and update the appropriate database
tables according to options.
"""
__revision__ = "$Id$"
import os
import re
import sys
import time
from datetime import datetime
from zlib import compress
import socket
import marshal
import copy
import tempfile
import urlparse
import urllib2
import urllib
from invenio.config import CFG_OAI_ID_FIELD, \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, \
CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG, \
CFG_BIBUPLOAD_STRONG_TAGS, \
CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS, \
CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE, \
CFG_BIBUPLOAD_DELETE_FORMATS, \
CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_SITE_RECORD, \
CFG_OAI_PROVENANCE_ALTERED_SUBFIELD, \
CFG_BIBUPLOAD_DISABLE_RECORD_REVISIONS, \
CFG_BIBUPLOAD_CONFLICTING_REVISION_TICKET_QUEUE
from invenio.utils.json import json, CFG_JSON_AVAILABLE
-from invenio.bibupload_config import CFG_BIBUPLOAD_CONTROLFIELD_TAGS, \
+from invenio.legacy.bibupload.config import CFG_BIBUPLOAD_CONTROLFIELD_TAGS, \
CFG_BIBUPLOAD_SPECIAL_TAGS, \
CFG_BIBUPLOAD_DELETE_CODE, \
CFG_BIBUPLOAD_DELETE_VALUE, \
CFG_BIBUPLOAD_OPT_MODES
from invenio.legacy.dbquery import run_sql, \
Error
from invenio.legacy.bibrecord import create_records, \
record_add_field, \
record_delete_field, \
record_xml_output, \
record_get_field_instances, \
record_get_field_value, \
record_get_field_values, \
field_get_subfield_values, \
field_get_subfield_instances, \
record_modify_subfield, \
record_delete_subfield_from, \
record_delete_fields, \
record_add_subfield_into, \
record_find_field, \
record_extract_oai_id, \
record_extract_dois, \
record_has_field,\
records_identical
from invenio.legacy.search_engine import get_record
from invenio.utils.date import convert_datestruct_to_datetext
from invenio.ext.logging import register_exception
-from invenio.bibcatalog import bibcatalog_system
+from invenio.legacy.bibcatalog.api import bibcatalog_system
from invenio.intbitset import intbitset
from invenio.utils.url import make_user_agent_string
from invenio.config import CFG_BIBDOCFILE_FILEDIR
-from invenio.bibtask import task_init, write_message, \
+from invenio.legacy.bibsched.bibtask import task_init, write_message, \
task_set_option, task_get_option, task_get_task_param, task_update_status, \
task_update_progress, task_sleep_now_if_required, fix_argv_paths
-from invenio.bibdocfile import BibRecDocs, file_strip_ext, normalize_format, \
+from invenio.legacy.bibdocfile.api import BibRecDocs, file_strip_ext, normalize_format, \
get_docname_from_url, check_valid_url, download_url, \
KEEP_OLD_VALUE, decompose_bibdocfile_url, InvenioBibDocFileError, \
bibdocfile_url_p, CFG_BIBDOCFILE_AVAILABLE_FLAGS, guess_format_from_url, \
BibRelation, MoreInfo
from invenio.legacy.search_engine import search_pattern
-from invenio.bibupload_revisionverifier import RevisionVerifier, \
+from invenio.legacy.bibupload.revisionverifier import RevisionVerifier, \
InvenioBibUploadConflictingRevisionsError, \
InvenioBibUploadInvalidRevisionError, \
InvenioBibUploadMissing005Error, \
InvenioBibUploadUnchangedRecordError
#Statistic variables
stat = {}
stat['nb_records_to_upload'] = 0
stat['nb_records_updated'] = 0
stat['nb_records_inserted'] = 0
stat['nb_errors'] = 0
stat['nb_holdingpen'] = 0
stat['exectime'] = time.localtime()
_WRITING_RIGHTS = None
CFG_BIBUPLOAD_ALLOWED_SPECIAL_TREATMENTS = ('oracle', )
CFG_HAS_BIBCATALOG = "UNKNOWN"
def check_bibcatalog():
"""
Return True if bibcatalog is available.
"""
global CFG_HAS_BIBCATALOG # pylint: disable=W0603
if CFG_HAS_BIBCATALOG != "UNKNOWN":
return CFG_HAS_BIBCATALOG
CFG_HAS_BIBCATALOG = True
if bibcatalog_system is not None:
bibcatalog_response = bibcatalog_system.check_system()
else:
bibcatalog_response = "No ticket system configured"
if bibcatalog_response != "":
write_message("BibCatalog error: %s\n" % (bibcatalog_response,))
CFG_HAS_BIBCATALOG = False
return CFG_HAS_BIBCATALOG
## Let's set a reasonable timeout for URL request (e.g. FFT)
socket.setdefaulttimeout(40)
def parse_identifier(identifier):
"""Parse the identifier and determine if it is temporary or fixed"""
id_str = str(identifier)
if not id_str.startswith("TMP:"):
return (False, identifier)
else:
return (True, id_str[4:])
def resolve_identifier(tmps, identifier):
"""Resolves an identifier. If the identifier is not temporary, this
function is an identity on the second argument. Otherwise, a resolved
value is returned or an exception raised"""
is_tmp, tmp_id = parse_identifier(identifier)
if is_tmp:
if not tmp_id in tmps:
raise StandardError("Temporary identifier %s not present in the dictionary" % (tmp_id, ))
if tmps[tmp_id] == -1:
# the identifier has been signalised but never assigned a value - probably error during processing
raise StandardError("Temporary identifier %s has been declared, but never assigned a value. Probably an error during processing of an appropriate FFT has happened. Please see the log" % (tmp_id, ))
return int(tmps[tmp_id])
else:
return int(identifier)
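The TMP-identifier convention above can be sketched standalone (Python 3; KeyError stands in for the legacy StandardError):

```python
# Standalone sketch of parse_identifier/resolve_identifier above:
# identifiers prefixed with "TMP:" are resolved through the tmps
# dictionary; anything else is returned as a plain int.

def parse_identifier(identifier):
    id_str = str(identifier)
    if not id_str.startswith("TMP:"):
        return (False, identifier)
    return (True, id_str[4:])

def resolve_identifier(tmps, identifier):
    is_tmp, tmp_id = parse_identifier(identifier)
    if not is_tmp:
        return int(identifier)
    if tmp_id not in tmps:
        # KeyError stands in for the legacy StandardError
        raise KeyError("Temporary identifier %s not present" % tmp_id)
    return int(tmps[tmp_id])
```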
_re_find_001 = re.compile('<controlfield\\s+tag=("001"|\'001\')\\s*>\\s*(\\d*)\\s*</controlfield>', re.S)
def bibupload_pending_recids():
"""This function embeds a bit of A.I. and is more a hack than an elegant
algorithm. It should be updated in case bibupload/bibsched are modified
in incompatible ways.
This function returns the intbitset of all the records that are being
(or are scheduled to be) touched by other bibuploads.
"""
options = run_sql("""SELECT arguments FROM schTASK WHERE status<>'DONE' AND
proc='bibupload' AND (status='RUNNING' OR status='CONTINUING' OR
status='WAITING' OR status='SCHEDULED' OR status='ABOUT TO STOP' OR
status='ABOUT TO SLEEP')""")
ret = intbitset()
xmls = []
if options:
for arguments in options:
arguments = marshal.loads(arguments[0])
for argument in arguments[1:]:
if argument.startswith('/'):
# XML files are recognizable because they're passed as
# absolute paths...
xmls.append(argument)
for xmlfile in xmls:
# Let's grep for the 001
try:
xml = open(xmlfile).read()
ret += [int(group[1]) for group in _re_find_001.findall(xml)]
except:
continue
return ret
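The `_re_find_001` pattern above can be exercised standalone against a minimal MARCXML buffer (the record id below is hypothetical):

```python
import re

# Same pattern as _re_find_001 above: it greps record ids out of
# <controlfield tag="001"> nodes in a MARCXML buffer.
_re_find_001 = re.compile(
    '<controlfield\\s+tag=("001"|\'001\')\\s*>\\s*(\\d*)\\s*</controlfield>',
    re.S)

xml = '<record><controlfield tag="001">1234</controlfield></record>'
recids = [int(group[1]) for group in _re_find_001.findall(xml)]
```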
### bibupload engine functions:
def bibupload(record, opt_mode=None, opt_notimechange=0, oai_rec_id="", pretend=False,
tmp_ids=None, tmp_vers=None):
"""Main function: process a record and fit it in the tables
bibfmt, bibrec, bibrec_bibxxx, bibxxx with proper record
metadata.
Return (error_code, recID) of the processed record.
"""
if tmp_ids is None:
tmp_ids = {}
if tmp_vers is None:
tmp_vers = {}
if opt_mode == 'reference':
## NOTE: reference mode has been deprecated in favour of 'correct'
opt_mode = 'correct'
assert(opt_mode in CFG_BIBUPLOAD_OPT_MODES)
error = None
affected_tags = {}
original_record = {}
rec_old = {}
now = datetime.now() # will hold record creation/modification date
record_had_altered_bit = False
is_opt_mode_delete = False
# Extraction of the Record Id from 001, SYSNO or OAIID or DOI tags:
rec_id = retrieve_rec_id(record, opt_mode, pretend=pretend)
if rec_id == -1:
msg = " Failed: either the record already exists and insert was " \
"requested or the record does not exist and " \
"replace/correct/append has been used"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, -1, msg)
elif rec_id > 0:
write_message(" -Retrieve record ID (found %s): DONE." % rec_id, verbose=2)
(unique_p, msg) = check_record_doi_is_unique(rec_id, record)
if not unique_p:
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
if not record.has_key('001'):
# Found record ID by means of SYSNO or OAIID or DOI, and the
# input MARCXML buffer does not have this 001 tag, so we
# should add it now:
error = record_add_field(record, '001', controlfield_value=rec_id)
if error is None:
msg = " Failed: Error during adding the 001 controlfield " \
"to the record"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
else:
error = None
write_message(" -Added tag 001: DONE.", verbose=2)
write_message(" -Check if the xml marc file is already in the database: DONE" , verbose=2)
record_deleted_p = False
if opt_mode == 'insert' or \
(opt_mode == 'replace_or_insert') and rec_id is None:
insert_mode_p = True
# Insert the record into the bibrec databases to have a recordId
rec_id = create_new_record(pretend=pretend)
write_message(" -Creation of a new record id (%d): DONE" % rec_id, verbose=2)
# we add the record Id control field to the record
error = record_add_field(record, '001', controlfield_value=rec_id)
if error is None:
msg = " Failed: Error during adding the 001 controlfield " \
"to the record"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
else:
error = None
if '005' not in record:
error = record_add_field(record, '005', controlfield_value=now.strftime("%Y%m%d%H%M%S.0"))
if error is None:
msg = " Failed: Error while adding the 005 controlfield to the record"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
else:
error = None
else:
write_message(" Note: 005 already existing upon inserting of new record. Keeping it.", verbose=2)
elif opt_mode != 'insert':
insert_mode_p = False
# Update Mode
# Retrieve the old record to update
rec_old = get_record(rec_id)
record_had_altered_bit = record_get_field_values(rec_old, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4], CFG_OAI_PROVENANCE_ALTERED_SUBFIELD)
# Also save a copy to restore previous situation in case of errors
original_record = get_record(rec_id)
if rec_old is None:
msg = " Failed during the creation of the old record!"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
else:
write_message(" -Retrieve the old record to update: DONE", verbose=2)
# flag to check whether the revisions have been verified and patch generated.
# If revision verification failed, then we need to manually identify the affected tags
# and process them
revision_verified = False
rev_verifier = RevisionVerifier()
#check for revision conflicts before updating record
if record_has_field(record, '005') and not CFG_BIBUPLOAD_DISABLE_RECORD_REVISIONS:
write_message(" -Upload Record has 005. Verifying Revision", verbose=2)
try:
rev_res = rev_verifier.verify_revision(record, original_record, opt_mode)
if rev_res:
opt_mode = rev_res[0]
record = rev_res[1]
affected_tags = rev_res[2]
revision_verified = True
write_message(lambda: " -Patch record generated. Changing opt_mode to correct.\nPatch:\n%s " % record_xml_output(record), verbose=2)
else:
write_message(" -No Patch Record.", verbose=2)
except InvenioBibUploadUnchangedRecordError, err:
msg = " -ISSUE: %s" % err
write_message(msg, verbose=1, stream=sys.stderr)
write_message(" Continuing anyway in case there are FFT or other tags")
except InvenioBibUploadConflictingRevisionsError, err:
msg = " -ERROR: Conflicting Revisions - %s" % err
write_message(msg, verbose=1, stream=sys.stderr)
submit_ticket_for_holding_pen(rec_id, err, "Conflicting Revisions. Inserting record into holding pen.")
insert_record_into_holding_pen(record, str(rec_id))
return (2, int(rec_id), msg)
except InvenioBibUploadInvalidRevisionError, err:
msg = " -ERROR: Invalid Revision - %s" % err
write_message(msg)
submit_ticket_for_holding_pen(rec_id, err, "Invalid Revisions. Inserting record into holding pen.")
insert_record_into_holding_pen(record, str(rec_id))
return (2, int(rec_id), msg)
except InvenioBibUploadMissing005Error, err:
msg = " -ERROR: Missing 005 - %s" % err
write_message(msg)
submit_ticket_for_holding_pen(rec_id, err, "Missing 005. Inserting record into holding pen.")
insert_record_into_holding_pen(record, str(rec_id))
return (2, int(rec_id), msg)
else:
write_message(" - No 005 Tag Present. Resuming normal flow.", verbose=2)
# dictionaries to temporarily hold original recs tag-fields
existing_tags = {}
retained_tags = {}
# in case of delete operation affected tags should be deleted in delete_bibrec_bibxxx
# but should not be updated again in STAGE 4
# utilising the below flag
is_opt_mode_delete = False
if not revision_verified:
# either 005 was not present or opt_mode was not correct/replace
# in this case we still need to find out affected tags to process
write_message(" - Missing 005 or opt_mode != replace/correct. Revision Verifier not called.", verbose=2)
# Identify affected tags
if opt_mode == 'correct' or opt_mode == 'replace' or opt_mode == 'replace_or_insert':
rec_diff = rev_verifier.compare_records(record, original_record, opt_mode)
affected_tags = rev_verifier.retrieve_affected_tags_with_ind(rec_diff)
elif opt_mode == 'delete':
# populate an intermediate dictionary
# used in upcoming step related to 'delete' mode
is_opt_mode_delete = True
for tag, fields in original_record.iteritems():
existing_tags[tag] = [tag + (field[1] != ' ' and field[1] or '_') + (field[2] != ' ' and field[2] or '_') for field in fields]
elif opt_mode == 'append':
for tag, fields in record.iteritems():
if tag not in CFG_BIBUPLOAD_CONTROLFIELD_TAGS:
affected_tags[tag]=[(field[1], field[2]) for field in fields]
# In Replace mode, take over old strong tags if applicable:
if opt_mode == 'replace' or \
opt_mode == 'replace_or_insert':
copy_strong_tags_from_old_record(record, rec_old)
# Delete tags to correct in the record
if opt_mode == 'correct':
delete_tags_to_correct(record, rec_old)
write_message(" -Delete the old tags to correct in the old record: DONE",
verbose=2)
# Delete tags specified if in delete mode
if opt_mode == 'delete':
record = delete_tags(record, rec_old)
for tag, fields in record.iteritems():
retained_tags[tag] = [tag + (field[1] != ' ' and field[1] or '_') + (field[2] != ' ' and field[2] or '_') for field in fields]
#identify the tags that have been deleted
for tag in existing_tags.keys():
if tag not in retained_tags:
for item in existing_tags[tag]:
tag_to_add = item[0:3]
ind1, ind2 = item[3], item[4]
if tag_to_add in affected_tags and (ind1, ind2) not in affected_tags[tag_to_add]:
affected_tags[tag_to_add].append((ind1, ind2))
else:
affected_tags[tag_to_add] = [(ind1, ind2)]
else:
deleted = list(set(existing_tags[tag]) - set(retained_tags[tag]))
for item in deleted:
tag_to_add = item[0:3]
ind1, ind2 = item[3], item[4]
if tag_to_add in affected_tags and (ind1, ind2) not in affected_tags[tag_to_add]:
affected_tags[tag_to_add].append((ind1, ind2))
else:
affected_tags[tag_to_add] = [(ind1, ind2)]
write_message(" -Delete specified tags in the old record: DONE", verbose=2)
# Append new tag to the old record and update the new record with the old_record modified
if opt_mode == 'append' or opt_mode == 'correct':
record = append_new_tag_to_old_record(record, rec_old)
write_message(" -Append new tags to the old record: DONE", verbose=2)
write_message(" -Affected Tags found after comparing upload and original records: %s"%(str(affected_tags)), verbose=2)
# 005 tag should be added every time the record is modified
# If an existing record is modified, its 005 tag should be overwritten with a new revision value
if record.has_key('005'):
record_delete_field(record, '005')
write_message(" Deleted the existing 005 tag.", verbose=2)
last_revision = run_sql("SELECT MAX(job_date) FROM hstRECORD WHERE id_bibrec=%s", (rec_id, ))[0][0]
if last_revision and last_revision.strftime("%Y%m%d%H%M%S.0") == now.strftime("%Y%m%d%H%M%S.0"):
## We are updating the same record within the same seconds! It's less than
## the minimal granularity. Let's pause for 1 more second to take a breath :-)
time.sleep(1)
now = datetime.now()
error = record_add_field(record, '005', controlfield_value=now.strftime("%Y%m%d%H%M%S.0"))
if error is None:
write_message(" Failed: Error while adding the 005 controlfield to the record", verbose=1, stream=sys.stderr)
return (1, int(rec_id))
else:
error=None
write_message(lambda: " -Added tag 005: DONE. "+ str(record_get_field_value(record, '005', '', '')), verbose=2)
# adding 005 to affected tags will delete the existing 005 entry
# and update with the latest timestamp.
if '005' not in affected_tags:
affected_tags['005'] = [(' ', ' ')]
write_message(" -Stage COMPLETED", verbose=2)
record_deleted_p = False
try:
if not record_is_valid(record):
msg = "ERROR: record is not valid"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, -1, msg)
# Have a look if we have FFT tags
write_message("Stage 2: Start (Process FFT tags if exist).", verbose=2)
record_had_FFT = False
if extract_tag_from_record(record, 'FFT') is not None:
record_had_FFT = True
if not writing_rights_p():
write_message(" Stage 2 failed: Error no rights to write fulltext files",
verbose=1, stream=sys.stderr)
task_update_status("ERROR")
sys.exit(1)
try:
record = elaborate_fft_tags(record, rec_id, opt_mode,
pretend=pretend, tmp_ids=tmp_ids,
tmp_vers=tmp_vers)
except Exception, e:
register_exception()
msg = " Stage 2 failed: Error while elaborating FFT tags: %s" % e
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
if record is None:
msg = " Stage 2 failed: Error while elaborating FFT tags"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
write_message(" -Stage COMPLETED", verbose=2)
else:
write_message(" -Stage NOT NEEDED", verbose=2)
# Have a look if we have FFT tags
write_message("Stage 2B: Start (Synchronize 8564 tags).", verbose=2)
if record_had_FFT or extract_tag_from_record(record, '856') is not None:
try:
record = synchronize_8564(rec_id, record, record_had_FFT, pretend=pretend)
# in case if FFT is in affected list make appropriate changes
if opt_mode != 'insert': # because for insert, all tags are affected
if ('4', ' ') not in affected_tags.get('856', []):
if '856' not in affected_tags:
affected_tags['856'] = [('4', ' ')]
elif ('4', ' ') not in affected_tags['856']:
affected_tags['856'].append(('4', ' '))
write_message(" -Modified field list updated with FFT details: %s" % str(affected_tags), verbose=2)
except Exception, e:
register_exception(alert_admin=True)
msg = " Stage 2B failed: Error while synchronizing 8564 tags: %s" % e
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
if record is None:
msg = " Stage 2B failed: Error while synchronizing 8564 tags"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
write_message(" -Stage COMPLETED", verbose=2)
else:
write_message(" -Stage NOT NEEDED", verbose=2)
write_message("Stage 3: Start (Apply fields deletion requests).", verbose=2)
write_message(lambda: " Record before deletion:\n%s" % record_xml_output(record), verbose=9)
# remove fields with __DELETE_FIELDS__
# NOTE: creating a temporary deep copy of record for iteration to avoid
# RuntimeError due to change in dictionary size during iteration
tmp_rec = copy.deepcopy(record)
for tag in tmp_rec:
for data_tuple in record[tag]:
if (CFG_BIBUPLOAD_DELETE_CODE, CFG_BIBUPLOAD_DELETE_VALUE) in data_tuple[0]:
# delete the tag with particular indicator pairs from original record
record_delete_field(record, tag, data_tuple[1], data_tuple[2])
write_message(lambda: " Record after cleaning up fields to be deleted:\n%s" % record_xml_output(record), verbose=9)
# Update of the BibFmt
write_message("Stage 4: Start (Update bibfmt).", verbose=2)
updates_exist = not records_identical(record, original_record)
if updates_exist:
# if record_had_altered_bit, this must be set to true, since the
# record has been altered.
if record_had_altered_bit:
oai_provenance_fields = record_get_field_instances(record, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4])
for oai_provenance_field in oai_provenance_fields:
for i, (code, dummy_value) in enumerate(oai_provenance_field[0]):
if code == CFG_OAI_PROVENANCE_ALTERED_SUBFIELD:
oai_provenance_field[0][i] = (code, 'true')
tmp_indicators = (CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4])
if tmp_indicators not in affected_tags.get(CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3], []):
if CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3] not in affected_tags:
affected_tags[CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3]] = [tmp_indicators]
else:
affected_tags[CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:3]].append(tmp_indicators)
write_message(lambda: " Updates exist:\n%s\n!=\n%s" % (record, original_record), verbose=9)
# format the single record as xml
rec_xml_new = record_xml_output(record)
# Update bibfmt with the format xm of this record
modification_date = time.strftime('%Y-%m-%d %H:%M:%S', time.strptime(record_get_field_value(record, '005'), '%Y%m%d%H%M%S.0'))
error = update_bibfmt_format(rec_id, rec_xml_new, 'xm', modification_date, pretend=pretend)
if error == 1:
msg = " Failed: error during update_bibfmt_format 'xm'"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
if CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE:
error = update_bibfmt_format(rec_id, marshal.dumps(record), 'recstruct', modification_date, pretend=pretend)
if error == 1:
msg = " Failed: error during update_bibfmt_format 'recstruct'"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
if not CFG_BIBUPLOAD_DISABLE_RECORD_REVISIONS:
# archive MARCXML format of this record for version history purposes:
error = archive_marcxml_for_history(rec_id, affected_fields=affected_tags, pretend=pretend)
if error == 1:
msg = " Failed to archive MARCXML for history"
write_message(msg, verbose=1, stream=sys.stderr)
return (1, int(rec_id), msg)
else:
write_message(" -Archived MARCXML for history: DONE", verbose=2)
# delete some formats like HB upon record change:
if updates_exist or record_had_FFT:
for format_to_delete in CFG_BIBUPLOAD_DELETE_FORMATS:
try:
delete_bibfmt_format(rec_id, format_to_delete, pretend=pretend)
except:
# OK, some formats like HB could not have been deleted, no big deal
pass
write_message(" -Stage COMPLETED", verbose=2)
## Let's assert that one and only one 005 tag is existing at this stage.
assert len(record['005']) == 1
# Update the database MetaData
write_message("Stage 5: Start (Update the database with the metadata).",
verbose=2)
if insert_mode_p:
update_database_with_metadata(record, rec_id, oai_rec_id, pretend=pretend)
elif opt_mode in ('replace', 'replace_or_insert',
'append', 'correct', 'delete') and updates_exist:
# now we clear all the rows from bibrec_bibxxx from the old
record_deleted_p = True
delete_bibrec_bibxxx(rec_old, rec_id, affected_tags, pretend=pretend)
# metadata update will insert tags that are available in affected_tags.
# but for delete, once the tags have been deleted from bibrec_bibxxx, they don't have to be inserted
# except for 005.
if is_opt_mode_delete:
tmp_affected_tags = copy.deepcopy(affected_tags)
for tag in tmp_affected_tags:
if tag != '005':
affected_tags.pop(tag)
write_message(" -Clean bibrec_bibxxx: DONE", verbose=2)
update_database_with_metadata(record, rec_id, oai_rec_id, affected_tags, pretend=pretend)
else:
write_message(" -Stage NOT NEEDED in mode %s" % opt_mode,
verbose=2)
write_message(" -Stage COMPLETED", verbose=2)
record_deleted_p = False
# Finally we update the bibrec table with the current date
write_message("Stage 6: Start (Update bibrec table with current date).",
verbose=2)
if opt_notimechange == 0 and (updates_exist or record_had_FFT):
bibrec_now = convert_datestruct_to_datetext(time.localtime())
write_message(" -Retrieved current localtime: DONE", verbose=2)
update_bibrec_date(bibrec_now, rec_id, insert_mode_p, pretend=pretend)
write_message(" -Stage COMPLETED", verbose=2)
else:
write_message(" -Stage NOT NEEDED", verbose=2)
# Increase statistics
if insert_mode_p:
stat['nb_records_inserted'] += 1
else:
stat['nb_records_updated'] += 1
# Upload of this record finished
write_message("Record "+str(rec_id)+" DONE", verbose=1)
return (0, int(rec_id), "")
finally:
if record_deleted_p:
## BibUpload has failed leaving the record deleted. We should
## restore the original record then.
update_database_with_metadata(original_record, rec_id, oai_rec_id, pretend=pretend)
write_message(" Restored original record", verbose=1, stream=sys.stderr)
def record_is_valid(record):
"""
Check if the record is valid. Currently this simply checks if the record
has exactly one rec_id.
@param record: the record
@type record: recstruct
@return: True if the record is valid
@rtype: bool
"""
rec_ids = record_get_field_values(record, tag="001")
if len(rec_ids) != 1:
write_message(" The record is not valid: it does not have exactly one rec_id: %s" % (rec_ids), stream=sys.stderr)
return False
return True
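The validity rule above reduces to a one-line check; a standalone sketch with a simplified dict-based stand-in for Invenio's recstruct:

```python
# Standalone sketch of record_is_valid above: a record is valid only if it
# carries exactly one 001 (record id) value. The dict layout is a
# simplified stand-in for Invenio's recstruct.

def record_is_valid(record):
    rec_ids = record.get('001', [])
    return len(rec_ids) == 1
```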
def find_record_ids_by_oai_id(oaiId):
"""
A method finding the record identifiers for a given OAI identifier.
Returns a list of identifiers matching the given OAI identifier.
"""
# Is this record already in invenio (matching by oaiid)
if oaiId:
recids = search_pattern(p=oaiId, f=CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, m='e')
# Is this record already in invenio (matching by reportnumber i.e.
# particularly 037. Idea: to avoid double insertions)
repnumber = oaiId.split(":")[-1]
if repnumber:
recids |= search_pattern(p = repnumber,
f = "reportnumber",
m = 'e' )
# Is this record already in invenio (matching by reportnumber i.e.
# particularly 037. Idea: to avoid double insertions)
repnumber = "arXiv:" + oaiId.split(":")[-1]
recids |= search_pattern(p = repnumber,
f = "reportnumber",
m = 'e' )
return recids
else:
return intbitset()
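The report-number fallback above can be sketched standalone (the OAI id below is hypothetical):

```python
# Standalone sketch of the matching fallback in find_record_ids_by_oai_id
# above: the last colon-separated token of the OAI id is tried both bare
# and with an "arXiv:" prefix to avoid double insertions.

def candidate_report_numbers(oai_id):
    repnumber = oai_id.split(":")[-1]
    return [repnumber, "arXiv:" + repnumber]

cands = candidate_report_numbers("oai:arXiv.org:1234.5678")
```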
def bibupload_post_phase(record, mode=None, rec_id="", pretend=False,
tmp_ids=None, tmp_vers=None):
def _elaborate_tag(record, tag, fun):
if extract_tag_from_record(record, tag) is not None:
try:
record = fun()
except Exception, e:
register_exception()
write_message(" Stage failed: Error while elaborating %s tags: %s" % (tag, e),
verbose=1, stream=sys.stderr)
return (1, int(rec_id)) # TODO: ?
if record is None:
write_message(" Stage failed: Error while elaborating %s tags" % (tag, ),
verbose=1, stream=sys.stderr)
return (1, int(rec_id))
write_message(" -Stage COMPLETED", verbose=2)
else:
write_message(" -Stage NOT NEEDED", verbose=2)
if tmp_ids is None:
tmp_ids = {}
if tmp_vers is None:
tmp_vers = {}
_elaborate_tag(record, "BDR", lambda: elaborate_brt_tags(record, rec_id = rec_id,
mode = mode,
pretend = pretend,
tmp_ids = tmp_ids,
tmp_vers = tmp_vers))
_elaborate_tag(record, "BDM", lambda: elaborate_mit_tags(record, rec_id = rec_id,
mode = mode,
pretend = pretend,
tmp_ids = tmp_ids,
tmp_vers = tmp_vers))
def submit_ticket_for_holding_pen(rec_id, err, msg):
"""
Submit a ticket via BibCatalog to report about a record that has been put
into the Holding Pen.
@param rec_id: the affected record
@param err: the corresponding Exception
@param msg: verbose message
"""
from invenio import bibtask
from invenio.legacy.webuser import get_email_from_username, get_uid_from_email
user = task_get_task_param("user")
uid = None
if user:
try:
uid = get_uid_from_email(get_email_from_username(user))
except Exception, err:
write_message("WARNING: can't reliably retrieve uid for user %s: %s" % (user, err), stream=sys.stderr)
if check_bibcatalog():
text = """
%(msg)s found for record %(rec_id)s: %(err)s
See: <%(siteurl)s/record/edit/#state=edit&recid=%(rec_id)s>
BibUpload task information:
task_id: %(task_id)s
task_specific_name: %(task_specific_name)s
user: %(user)s
task_params: %(task_params)s
task_options: %(task_options)s""" % {
"msg": msg,
"rec_id": rec_id,
"err": err,
"siteurl": CFG_SITE_SECURE_URL,
"task_id": task_get_task_param("task_id"),
"task_specific_name": task_get_task_param("task_specific_name"),
"user": user,
"task_params": bibtask._TASK_PARAMS,
"task_options": bibtask._OPTIONS}
bibcatalog_system.ticket_submit(subject="%s: %s by %s" % (msg, rec_id, user), recordid=rec_id, text=text, queue=CFG_BIBUPLOAD_CONFLICTING_REVISION_TICKET_QUEUE, owner=uid)
def insert_record_into_holding_pen(record, oai_id, pretend=False):
query = "INSERT INTO bibHOLDINGPEN (oai_id, changeset_date, changeset_xml, id_bibrec) VALUES (%s, NOW(), %s, %s)"
xml_record = record_xml_output(record)
bibrec_ids = find_record_ids_by_oai_id(oai_id) # here determining the identifier of the record
if len(bibrec_ids) > 0:
bibrec_id = bibrec_ids.pop()
else:
# id not found by using the oai_id, let's use a wider search based
# on any information we might have.
bibrec_id = retrieve_rec_id(record, 'holdingpen', pretend=pretend)
if bibrec_id is None:
bibrec_id = 0
if not pretend:
run_sql(query, (oai_id, xml_record, bibrec_id))
# record_id is logged as 0! ( We are not inserting into the main database)
log_record_uploading(oai_id, task_get_task_param('task_id', 0), 0, 'H', pretend=pretend)
stat['nb_holdingpen'] += 1
def print_out_bibupload_statistics():
"""Print the statistics of the process"""
out = "Task stats: %(nb_input)d input records, %(nb_updated)d updated, " \
"%(nb_inserted)d inserted, %(nb_errors)d errors, %(nb_holdingpen)d inserted to holding pen. " \
"Time %(nb_sec).2f sec." % { \
'nb_input': stat['nb_records_to_upload'],
'nb_updated': stat['nb_records_updated'],
'nb_inserted': stat['nb_records_inserted'],
'nb_errors': stat['nb_errors'],
'nb_holdingpen': stat['nb_holdingpen'],
'nb_sec': time.time() - time.mktime(stat['exectime']) }
write_message(out)
def open_marc_file(path):
"""Open a file and return the data"""
try:
# open the file containing the marc document
marc_file = open(path, 'r')
marc = marc_file.read()
marc_file.close()
except IOError, erro:
write_message("Error: %s" % erro, verbose=1, stream=sys.stderr)
write_message("Exiting.", sys.stderr)
if erro.errno == 2:
# No such file or directory
# Not scary
task_update_status("CERROR")
else:
task_update_status("ERROR")
sys.exit(1)
return marc
def xml_marc_to_records(xml_marc):
"""create the records"""
# Creation of the records from the xml Marc in argument
recs = create_records(xml_marc, 1, 1)
if recs == []:
write_message("Error: Cannot parse MARCXML file.", verbose=1, stream=sys.stderr)
write_message("Exiting.", sys.stderr)
task_update_status("ERROR")
sys.exit(1)
elif recs[0][0] is None:
write_message("Error: MARCXML file has wrong format: %s" % recs,
verbose=1, stream=sys.stderr)
write_message("Exiting.", sys.stderr)
task_update_status("CERROR")
sys.exit(1)
else:
recs = [rec[0] for rec in recs]
return recs
def find_record_format(rec_id, bibformat):
"""Look whether record REC_ID is formatted in FORMAT,
i.e. whether FORMAT exists in the bibfmt table for this record.
Return the number of times it is formatted: 0 if not, 1 if yes,
2 if found more than once (should never occur).
"""
out = 0
query = """SELECT COUNT(*) FROM bibfmt WHERE id_bibrec=%s AND format=%s"""
params = (rec_id, bibformat)
res = []
res = run_sql(query, params)
out = res[0][0]
return out
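# A minimal, self-contained sketch of the find_record_format lookup above,
# using an in-memory sqlite3 database as a stand-in for run_sql (the bibfmt
# table is reduced to the two columns the query touches; data is illustrative):

```python
import sqlite3

# Stand-in for the Invenio database layer: a reduced bibfmt table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bibfmt (id_bibrec INTEGER, format TEXT)")
conn.execute("INSERT INTO bibfmt VALUES (7, 'xm')")

def find_record_format(conn, rec_id, bibformat):
    """Return how many times rec_id is formatted in bibformat (normally 0 or 1)."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM bibfmt WHERE id_bibrec=? AND format=?",
        (rec_id, bibformat))
    return cur.fetchone()[0]
```

# The parameterized query mirrors the one in the function above, only with
# sqlite's ? placeholders instead of MySQL's %s.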
def find_record_from_recid(rec_id):
"""
Try to find record in the database from the REC_ID number.
Return record ID if found, None otherwise.
"""
res = run_sql("SELECT id FROM bibrec WHERE id=%s",
(rec_id,))
if res:
return res[0][0]
else:
return None
def find_record_from_sysno(sysno):
"""
Try to find record in the database from the external SYSNO number.
Return record ID if found, None otherwise.
"""
bibxxx = 'bib'+CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:2]+'x'
bibrec_bibxxx = 'bibrec_' + bibxxx
res = run_sql("""SELECT bb.id_bibrec FROM %(bibrec_bibxxx)s AS bb,
%(bibxxx)s AS b WHERE b.tag=%%s AND b.value=%%s
AND bb.id_bibxxx=b.id""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx},
(CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, sysno,))
if res:
return res[0][0]
else:
return None
def find_records_from_extoaiid(extoaiid, extoaisrc=None):
"""
Try to find records in the database from the external EXTOAIID number.
Return list of record ID if found, None otherwise.
"""
assert(CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:5] == CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[:5])
bibxxx = 'bib'+CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:2]+'x'
bibrec_bibxxx = 'bibrec_' + bibxxx
write_message(' Looking for extoaiid="%s" with extoaisrc="%s"' % (extoaiid, extoaisrc), verbose=9)
id_bibrecs = intbitset(run_sql("""SELECT bb.id_bibrec FROM %(bibrec_bibxxx)s AS bb,
%(bibxxx)s AS b WHERE b.tag=%%s AND b.value=%%s
AND bb.id_bibxxx=b.id""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx},
(CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, extoaiid,)))
write_message(' Partially found %s for extoaiid="%s"' % (id_bibrecs, extoaiid), verbose=9)
ret = intbitset()
for id_bibrec in id_bibrecs:
record = get_record(id_bibrec)
instances = record_get_field_instances(record, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3], CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4])
write_message(' recid %s -> instances "%s"' % (id_bibrec, instances), verbose=9)
for instance in instances:
this_extoaisrc = field_get_subfield_values(instance, CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5])
this_extoaisrc = this_extoaisrc and this_extoaisrc[0] or None
this_extoaiid = field_get_subfield_values(instance, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5])
this_extoaiid = this_extoaiid and this_extoaiid[0] or None
write_message(" this_extoaisrc -> %s, this_extoaiid -> %s" % (this_extoaisrc, this_extoaiid), verbose=9)
if this_extoaiid == extoaiid:
write_message(' recid %s -> provenance "%s"' % (id_bibrec, this_extoaisrc), verbose=9)
if this_extoaisrc == extoaisrc:
write_message('Found recid %s for extoaiid="%s" with provenance="%s"' % (id_bibrec, extoaiid, extoaisrc), verbose=9)
ret.add(id_bibrec)
break
if this_extoaisrc is None:
write_message('WARNING: Found recid %s for extoaiid="%s" that does not specify any provenance, while the input record does.' % (id_bibrec, extoaiid), stream=sys.stderr)
if extoaisrc is None:
write_message('WARNING: Found recid %s for extoaiid="%s" that specifies a provenance (%s), while the input record does not have one.' % (id_bibrec, extoaiid, this_extoaisrc), stream=sys.stderr)
return ret
def find_record_from_oaiid(oaiid):
"""
Try to find record in the database from the OAI ID number and OAI SRC.
Return record ID if found, None otherwise.
"""
bibxxx = 'bib'+CFG_OAI_ID_FIELD[0:2]+'x'
bibrec_bibxxx = 'bibrec_' + bibxxx
res = run_sql("""SELECT bb.id_bibrec FROM %(bibrec_bibxxx)s AS bb,
%(bibxxx)s AS b WHERE b.tag=%%s AND b.value=%%s
AND bb.id_bibxxx=b.id""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx},
(CFG_OAI_ID_FIELD, oaiid,))
if res:
return res[0][0]
else:
return None
def find_record_from_doi(doi):
"""
Try to find record in the database from the given DOI.
Return record ID if found, None otherwise.
"""
bibxxx = 'bib02x'
bibrec_bibxxx = 'bibrec_' + bibxxx
res = run_sql("""SELECT bb.id_bibrec, bb.field_number
FROM %(bibrec_bibxxx)s AS bb, %(bibxxx)s AS b
WHERE b.tag=%%s AND b.value=%%s
AND bb.id_bibxxx=b.id""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx},
('0247_a', doi,))
# For each of the result, make sure that it is really tagged as doi
for (id_bibrec, field_number) in res:
res = run_sql("""SELECT bb.id_bibrec
FROM %(bibrec_bibxxx)s AS bb, %(bibxxx)s AS b
WHERE b.tag=%%s AND b.value=%%s
AND bb.id_bibxxx=b.id and bb.field_number=%%s and bb.id_bibrec=%%s""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx},
('0247_2', "doi", field_number, id_bibrec))
if res and res[0][0] == id_bibrec:
return res[0][0]
return None
def extract_tag_from_record(record, tag_number):
""" Extract the tag_number for record."""
# first step verify if the record is not already in the database
if record:
return record.get(tag_number, None)
return None
def retrieve_rec_id(record, opt_mode, pretend=False, post_phase = False):
"""Retrieve the record Id from a record by using tag 001 or SYSNO or OAI ID or DOI
tag. opt_mod is the desired mode.
@param post_phase Tells if we are calling this method in the postprocessing phase. If true, we accept presence of 001 fields even in the insert mode
@type post_phase boolean
"""
rec_id = None
# 1st step: we look for the tag 001
tag_001 = extract_tag_from_record(record, '001')
if tag_001 is not None:
# We extract the record ID from the tag
rec_id = tag_001[0][3]
# if we are in insert mode => error
if opt_mode == 'insert' and not post_phase:
write_message(" Failed: tag 001 found in the xml" \
" submitted, you should use the option replace," \
" correct or append to replace an existing" \
" record. (-h for help)",
verbose=1, stream=sys.stderr)
return -1
else:
# we found the rec id and we are not in insert mode => continue
# we try to match rec_id against the database:
if find_record_from_recid(rec_id) is not None:
# okay, 001 corresponds to some known record
return int(rec_id)
elif opt_mode in ('replace', 'replace_or_insert'):
if task_get_option('force'):
# we found the rec_id but it's not in the system and we are
# requested to replace records. Therefore we create on the fly
# an empty record allocating the recid.
write_message(" Warning: tag 001 found in the xml with"
" value %(rec_id)s, but rec_id %(rec_id)s does"
" not exist. Since the mode replace was"
" requested the rec_id %(rec_id)s is allocated"
" on-the-fly." % {"rec_id": rec_id},
stream=sys.stderr)
return create_new_record(rec_id=rec_id, pretend=pretend)
else:
# Since --force was not used we are going to raise an error
write_message(" Failed: tag 001 found in the xml"
" submitted with value %(rec_id)s. The"
" corresponding record however does not"
" exists. If you want to really create"
" such record, please use the --force"
" parameter when calling bibupload." % {
"rec_id": rec_id}, stream=sys.stderr)
return -1
else:
# The record doesn't exist yet. We shall try to check
# the SYSNO or OAI or DOI id later.
write_message(" -Tag 001 value not found in database.",
verbose=9)
rec_id = None
else:
write_message(" -Tag 001 not found in the xml marc file.", verbose=9)
if rec_id is None:
# 2nd step: we look for the SYSNO
sysnos = record_get_field_values(record,
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] or "",
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] or "",
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6])
if sysnos:
sysno = sysnos[0] # there should be only one external SYSNO
write_message(" -Checking if SYSNO " + sysno + \
" exists in the database", verbose=9)
# try to find the corresponding rec id from the database
rec_id = find_record_from_sysno(sysno)
if rec_id is not None:
# rec_id found
pass
else:
# The record doesn't exist yet. We will try to check
# external and internal OAI ids later.
write_message(" -Tag SYSNO value not found in database.",
verbose=9)
rec_id = None
else:
write_message(" -Tag SYSNO not found in the xml marc file.",
verbose=9)
if rec_id is None:
# 3rd step: we look for the external OAIID
extoai_fields = record_get_field_instances(record,
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or "",
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or "")
if extoai_fields:
for field in extoai_fields:
extoaiid = field_get_subfield_values(field, CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6])
extoaisrc = field_get_subfield_values(field, CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6])
if extoaiid:
extoaiid = extoaiid[0]
if extoaisrc:
extoaisrc = extoaisrc[0]
else:
extoaisrc = None
write_message(" -Checking if EXTOAIID %s (%s) exists in the database" % (extoaiid, extoaisrc), verbose=9)
# try to find the corresponding rec id from the database
rec_ids = find_records_from_extoaiid(extoaiid, extoaisrc)
if rec_ids:
# rec_id found
rec_id = rec_ids.pop()
break
else:
# The record doesn't exist yet. We will try to check
# OAI id later.
write_message(" -Tag EXTOAIID value not found in database.",
verbose=9)
rec_id = None
else:
write_message(" -Tag EXTOAIID not found in the xml marc file.", verbose=9)
if rec_id is None:
# 4th step: we look for the OAI ID
oaiidvalues = record_get_field_values(record,
CFG_OAI_ID_FIELD[0:3],
CFG_OAI_ID_FIELD[3:4] != "_" and \
CFG_OAI_ID_FIELD[3:4] or "",
CFG_OAI_ID_FIELD[4:5] != "_" and \
CFG_OAI_ID_FIELD[4:5] or "",
CFG_OAI_ID_FIELD[5:6])
if oaiidvalues:
oaiid = oaiidvalues[0] # there should be only one OAI ID
write_message(" -Check if local OAI ID " + oaiid + \
" exist in the database", verbose=9)
# try to find the corresponding rec id from the database
rec_id = find_record_from_oaiid(oaiid)
if rec_id is not None:
# rec_id found
pass
else:
write_message(" -Tag OAI ID value not found in database.",
verbose=9)
rec_id = None
else:
write_message(" -Tag SYSNO not found in the xml marc file.",
verbose=9)
if rec_id is None:
# 5th step: we look for the DOI.
record_dois = record_extract_dois(record)
matching_recids = set()
if record_dois:
# try to find the corresponding rec id from the database
for record_doi in record_dois:
possible_recid = find_record_from_doi(record_doi)
if possible_recid:
matching_recids.add(possible_recid)
if len(matching_recids) > 1:
# Oops, this record refers to DOI existing in multiple records.
# Dunno which one to choose.
write_message(" Failed: Multiple records found in the" \
" database %s that match the DOI(s) in the input" \
" MARCXML %s" % (repr(matching_recids), repr(record_dois)),
verbose=1, stream=sys.stderr)
return -1
elif len(matching_recids) == 1:
rec_id = matching_recids.pop()
if opt_mode == 'insert':
write_message(" Failed: DOI tag matching record #%s found in the xml" \
" submitted, you should use the option replace," \
" correct or append to replace an existing" \
" record. (-h for help)" % rec_id,
verbose=1, stream=sys.stderr)
return -1
else:
write_message(" - Tag DOI value not found in database.",
verbose=9)
rec_id = None
else:
write_message(" -Tag DOI not found in the xml marc file.",
verbose=9)
# Now we should have detected rec_id from the SYSNO, OAIID or DOI
# tags. (None otherwise.)
if rec_id:
if opt_mode == 'insert':
write_message(" Failed: Record found in the database," \
" you should use the option replace," \
" correct or append to replace an existing" \
" record. (-h for help)",
verbose=1, stream=sys.stderr)
return -1
else:
if opt_mode != 'insert' and \
opt_mode != 'replace_or_insert':
write_message(" Failed: Record not found in the database."\
" Please insert the file before updating it."\
" (-h for help)", verbose=1, stream=sys.stderr)
return -1
return rec_id and int(rec_id) or None
def check_record_doi_is_unique(rec_id, record):
"""
Check that DOI found in 'record' does not exist in any other
record than 'recid'.
Return (boolean, msg) where 'boolean' would be True if the DOI is
unique.
"""
record_dois = record_extract_dois(record)
if record_dois:
matching_recids = set()
for record_doi in record_dois:
possible_recid = find_record_from_doi(record_doi)
if possible_recid:
matching_recids.add(possible_recid)
if len(matching_recids) > 1:
# Oops, this record refers to DOI existing in multiple records.
msg = " Failed: Multiple records found in the" \
" database %s that match the DOI(s) in the input" \
" MARCXML %s" % (repr(matching_recids), repr(record_dois))
return (False, msg)
elif len(matching_recids) == 1:
matching_recid = matching_recids.pop()
if str(matching_recid) != str(rec_id):
# Oops, this record refers to DOI existing in a different record.
msg = " Failed: DOI(s) %s found in this record (#%s)" \
" already exist(s) in another other record (#%s)" % \
(repr(record_dois), rec_id, matching_recid)
return (False, msg)
return (True, "")
### Insert functions
def create_new_record(rec_id=None, pretend=False):
"""
Create new record in the database
@param rec_id: if specified the new record will have this rec_id.
@type rec_id: int
@return: the allocated rec_id
@rtype: int
@note: None is returned in case of errors.
"""
if rec_id is not None:
try:
rec_id = int(rec_id)
except (ValueError, TypeError), error:
write_message(" Error during the creation_new_record function: %s "
% error, verbose=1, stream=sys.stderr)
return None
if run_sql("SELECT id FROM bibrec WHERE id=%s", (rec_id, )):
write_message(" Error during the creation_new_record function: the requested rec_id %s already exists." % rec_id)
return None
if pretend:
if rec_id:
return rec_id
else:
return run_sql("SELECT max(id)+1 FROM bibrec")[0][0]
if rec_id is not None:
return run_sql("INSERT INTO bibrec (id, creation_date, modification_date) VALUES (%s, NOW(), NOW())", (rec_id, ))
else:
return run_sql("INSERT INTO bibrec (creation_date, modification_date) VALUES (NOW(), NOW())")
def insert_bibfmt(id_bibrec, marc, bibformat, modification_date='1970-01-01 00:00:00', pretend=False):
"""Insert the format in the table bibfmt"""
# compress the marc value
pickled_marc = compress(marc)
try:
time.strptime(modification_date, "%Y-%m-%d %H:%M:%S")
except ValueError:
modification_date = '1970-01-01 00:00:00'
query = """INSERT LOW_PRIORITY INTO bibfmt (id_bibrec, format, last_updated, value)
VALUES (%s, %s, %s, %s)"""
if not pretend:
row_id = run_sql(query, (id_bibrec, bibformat, modification_date, pickled_marc))
return row_id
else:
return 1
def insert_record_bibxxx(tag, value, pretend=False):
"""Insert the record into bibxxx"""
# determine into which table one should insert the record
table_name = 'bib'+tag[0:2]+'x'
# check if the tag, value combination exists in the table
query = """SELECT id,value FROM %s """ % table_name
query += """ WHERE tag=%s AND value=%s"""
params = (tag, value)
res = None
res = run_sql(query, params)
# Note: compare now the found values one by one and look for
# string binary equality (e.g. to respect lowercase/uppercase
# match), regardless of the charset etc settings. Ideally we
# could use a BINARY operator in the above SELECT statement, but
# we would have to check compatibility on various MySQLdb versions
# etc; this approach checks all matched values in Python, not in
# MySQL, which is less cool, but more conservative, so it should
# work better on most setups.
if res:
for row in res:
row_id = row[0]
row_value = row[1]
if row_value == value:
return (table_name, row_id)
# We got here only when the tag, value combination was not found,
# so it is now necessary to insert the tag, value combination into
# bibxxx table as new.
query = """INSERT INTO %s """ % table_name
query += """ (tag, value) values (%s , %s)"""
params = (tag, value)
if not pretend:
row_id = run_sql(query, params)
else:
return (table_name, 1)
return (table_name, row_id)
def insert_record_bibrec_bibxxx(table_name, id_bibxxx,
field_number, id_bibrec, pretend=False):
"""Insert the record into bibrec_bibxxx"""
# determine into which table one should insert the record
full_table_name = 'bibrec_'+ table_name
# insert the proper row into the table
query = """INSERT INTO %s """ % full_table_name
query += """(id_bibrec,id_bibxxx, field_number) values (%s , %s, %s)"""
params = (id_bibrec, id_bibxxx, field_number)
if not pretend:
res = run_sql(query, params)
else:
return 1
return res
def synchronize_8564(rec_id, record, record_had_FFT, pretend=False):
"""
Synchronize 8564_ tags and BibDocFile tables.
This function directly manipulates the record parameter.
@type rec_id: positive integer
@param rec_id: the record identifier.
@param record: the record structure as created by bibrecord.create_record
@type record_had_FFT: boolean
@param record_had_FFT: True if the incoming bibuploaded-record used FFT
@return: the manipulated record (which is also modified as a side effect)
"""
def merge_marc_into_bibdocfile(field, pretend=False):
"""
Internal function that reads a single field and stores its content
in BibDocFile tables.
@param field: the 8564_ field containing a BibDocFile URL.
"""
write_message('Merging field: %s' % (field, ), verbose=9)
url = field_get_subfield_values(field, 'u')[:1] or field_get_subfield_values(field, 'q')[:1]
description = field_get_subfield_values(field, 'y')[:1]
comment = field_get_subfield_values(field, 'z')[:1]
if url:
recid, docname, docformat = decompose_bibdocfile_url(url[0])
if recid != rec_id:
write_message("INFO: URL %s is not pointing to a fulltext owned by this record (%s)" % (url, recid), stream=sys.stderr)
else:
try:
bibdoc = BibRecDocs(recid).get_bibdoc(docname)
if description and not pretend:
bibdoc.set_description(description[0], docformat)
if comment and not pretend:
bibdoc.set_comment(comment[0], docformat)
except InvenioBibDocFileError:
## Apparently the referenced docname doesn't exist anymore.
## Too bad. Let's skip it.
write_message("WARNING: docname %s does not seem to exist for record %s. Has it been renamed outside FFT?" % (docname, recid), stream=sys.stderr)
def merge_bibdocfile_into_marc(field, subfields):
"""
Internal function that reads BibDocFile table entries referenced by
the URL in the given 8564_ field and integrate the given information
directly with the provided subfields.
@param field: the 8564_ field containing a BibDocFile URL.
@param subfields: the subfields corresponding to the BibDocFile URL
generated after BibDocFile tables.
"""
write_message('Merging subfields %s into field %s' % (subfields, field), verbose=9)
subfields = dict(subfields) ## We make a copy not to have side-effects
subfield_to_delete = []
for subfield_position, (code, value) in enumerate(field_get_subfield_instances(field)):
## For each subfield instance already existing...
if code in subfields:
## ...We substitute it with what is in BibDocFile tables
record_modify_subfield(record, '856', code, subfields[code],
subfield_position, field_position_global=field[4])
del subfields[code]
else:
## ...We delete it otherwise
subfield_to_delete.append(subfield_position)
subfield_to_delete.sort()
for counter, position in enumerate(subfield_to_delete):
## FIXME: Very hackish algorithm. Since deleting a subfield
## alters the positions of the following subfields, we
## take note of this and adjust further positions
## by using a counter.
record_delete_subfield_from(record, '856', position - counter,
field_position_global=field[4])
subfields = subfields.items()
subfields.sort()
for code, value in subfields:
## Let's add non-previously existing subfields
record_add_subfield_into(record, '856', code, value,
field_position_global=field[4])
def get_bibdocfile_managed_info():
"""
Internal function, returns a dictionary of
BibDocFile URL -> wanna-be subfields.
This information is retrieved from internal BibDoc
structures rather than from input MARC XML files
@rtype: mapping
@return: BibDocFile URL -> wanna-be subfields dictionary
"""
ret = {}
bibrecdocs = BibRecDocs(rec_id)
latest_files = bibrecdocs.list_latest_files(list_hidden=False)
for afile in latest_files:
url = afile.get_url()
ret[url] = {'u': url}
description = afile.get_description()
comment = afile.get_comment()
subformat = afile.get_subformat()
if description:
ret[url]['y'] = description
if comment:
ret[url]['z'] = comment
if subformat:
ret[url]['x'] = subformat
return ret
write_message("Synchronizing MARC of recid '%s' with:\n%s" % (rec_id, record), verbose=9)
tags856s = record_get_field_instances(record, '856', '%', '%')
write_message("Original 856%% instances: %s" % tags856s, verbose=9)
tags8564s_to_add = get_bibdocfile_managed_info()
write_message("BibDocFile instances: %s" % tags8564s_to_add, verbose=9)
positions_tags8564s_to_remove = []
for local_position, field in enumerate(tags856s):
if field[1] == '4' and field[2] == ' ':
write_message('Analysing %s' % (field, ), verbose=9)
for url in field_get_subfield_values(field, 'u') + field_get_subfield_values(field, 'q'):
if url in tags8564s_to_add:
# there exists a link in the MARC of the record and the connection exists in BibDoc tables
if record_had_FFT:
merge_bibdocfile_into_marc(field, tags8564s_to_add[url])
else:
merge_marc_into_bibdocfile(field, pretend=pretend)
del tags8564s_to_add[url]
break
elif bibdocfile_url_p(url) and decompose_bibdocfile_url(url)[0] == rec_id:
# The link exists and is a potentially correct-looking link to a document;
# moreover, it refers to the current record id ... but it does not exist in
# internal BibDoc structures. This could have happened when a document was
# renamed or removed. In both cases we have to remove the link; a new one will be created.
positions_tags8564s_to_remove.append(local_position)
write_message("%s to be deleted and re-synchronized" % (field, ), verbose=9)
break
record_delete_fields(record, '856', positions_tags8564s_to_remove)
tags8564s_to_add = tags8564s_to_add.values()
tags8564s_to_add.sort()
for subfields in tags8564s_to_add:
subfields = subfields.items()
subfields.sort()
record_add_field(record, '856', '4', ' ', subfields=subfields)
write_message('Final record: %s' % record, verbose=9)
return record
def _get_subfield_value(field, subfield_code, default=None):
res = field_get_subfield_values(field, subfield_code)
if res:
return res[0]
else:
return default
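# _get_subfield_value reduces to a first-element-or-default pattern; a minimal
# equivalent (without the bibrecord dependency) behaves like this:

```python
def first_or_default(values, default=None):
    # Mirror of _get_subfield_value: subfield lookups return a list of
    # values; keep the first one when present, otherwise the default.
    return values[0] if values else default
```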
def elaborate_mit_tags(record, rec_id, mode, pretend = False, tmp_ids = {},
tmp_vers = {}):
"""
Uploading MoreInfo -> BDM tags
"""
tuple_list = extract_tag_from_record(record, 'BDM')
# Now gathering information from BDM tags - to be processed later
write_message("Processing BDM entries of the record ")
recordDocs = BibRecDocs(rec_id)
if tuple_list:
for mit in record_get_field_instances(record, 'BDM', ' ', ' '):
relation_id = _get_subfield_value(mit, "r")
bibdoc_id = _get_subfield_value(mit, "i")
# checking for a possibly temporary ID
if not (bibdoc_id is None):
bibdoc_id = resolve_identifier(tmp_ids, bibdoc_id)
bibdoc_ver = _get_subfield_value(mit, "v")
if not (bibdoc_ver is None):
bibdoc_ver = resolve_identifier(tmp_vers, bibdoc_ver)
bibdoc_name = _get_subfield_value(mit, "n")
bibdoc_fmt = _get_subfield_value(mit, "f")
moreinfo_str = _get_subfield_value(mit, "m")
if bibdoc_id == None:
if bibdoc_name == None:
raise StandardError("Incorrect relation. Neither name nor identifier of the first obejct has been specified")
else:
# retrieving the ID based on the document name (inside current record)
# The document is attached to current record.
try:
bibdoc_id = recordDocs.get_docid(bibdoc_name)
except:
raise StandardError("BibDoc of a name %s does not exist within a record" % (bibdoc_name, ))
else:
if bibdoc_name != None:
write_message("Warning: both name and id of the first document of a relation have been specified. Ignoring the name")
if (moreinfo_str is None or mode in ("replace", "correct")) and (not pretend):
MoreInfo(docid=bibdoc_id , version = bibdoc_ver,
docformat = bibdoc_fmt, relation = relation_id).delete()
if (not moreinfo_str is None) and (not pretend):
MoreInfo.create_from_serialised(moreinfo_str,
docid=bibdoc_id,
version = bibdoc_ver,
docformat = bibdoc_fmt,
relation = relation_id)
return record
def elaborate_brt_tags(record, rec_id, mode, pretend=False, tmp_ids = {}, tmp_vers = {}):
"""
Process BDR tags describing relations between existing objects
"""
tuple_list = extract_tag_from_record(record, 'BDR')
# Now gathering information from BDR tags - to be processed later
relations_to_create = []
write_message("Processing BDR entries of the record ")
recordDocs = BibRecDocs(rec_id) #TODO: check what happens if there is no record yet ! Will the class represent an empty set?
if tuple_list:
for brt in record_get_field_instances(record, 'BDR', ' ', ' '):
relation_id = _get_subfield_value(brt, "r")
bibdoc1_id = None
bibdoc1_name = None
bibdoc1_ver = None
bibdoc1_fmt = None
bibdoc2_id = None
bibdoc2_name = None
bibdoc2_ver = None
bibdoc2_fmt = None
if not relation_id:
bibdoc1_id = _get_subfield_value(brt, "i")
bibdoc1_name = _get_subfield_value(brt, "n")
if bibdoc1_id == None:
if bibdoc1_name == None:
raise StandardError("Incorrect relation. Neither name nor identifier of the first obejct has been specified")
else:
# retrieving the ID based on the document name (inside current record)
# The document is attached to current record.
try:
bibdoc1_id = recordDocs.get_docid(bibdoc1_name)
except:
raise StandardError("BibDoc of a name %s does not exist within a record" % \
(bibdoc1_name, ))
else:
# resolving temporary identifier
bibdoc1_id = resolve_identifier(tmp_ids, bibdoc1_id)
if bibdoc1_name != None:
write_message("Warning: both name and id of the first document of a relation have been specified. Ignoring the name")
bibdoc1_ver = _get_subfield_value(brt, "v")
if not (bibdoc1_ver is None):
bibdoc1_ver = resolve_identifier(tmp_vers, bibdoc1_ver)
bibdoc1_fmt = _get_subfield_value(brt, "f")
bibdoc2_id = _get_subfield_value(brt, "j")
bibdoc2_name = _get_subfield_value(brt, "o")
if bibdoc2_id == None:
if bibdoc2_name == None:
raise StandardError("Incorrect relation. Neither name nor identifier of the second obejct has been specified")
else:
# retrieving the ID based on the document name (inside current record)
# The document is attached to current record.
try:
bibdoc2_id = recordDocs.get_docid(bibdoc2_name)
except:
raise StandardError("BibDoc of a name %s does not exist within a record" % (bibdoc2_name, ))
else:
bibdoc2_id = resolve_identifier(tmp_ids, bibdoc2_id)
if bibdoc2_name != None:
write_message("Warning: both name and id of the first document of a relation have been specified. Ignoring the name")
bibdoc2_ver = _get_subfield_value(brt, "w")
if not (bibdoc2_ver is None):
bibdoc2_ver = resolve_identifier(tmp_vers, bibdoc2_ver)
bibdoc2_fmt = _get_subfield_value(brt, "g")
control_command = _get_subfield_value(brt, "d")
relation_type = _get_subfield_value(brt, "t")
if not relation_type and not relation_id:
raise StandardError("The relation type must be specified")
more_info = _get_subfield_value(brt, "m")
# the relation id might be specified in the case of updating
# MoreInfo table instead of other fields
rel_obj = None
if not relation_id:
rels = BibRelation.get_relations(rel_type = relation_type,
bibdoc1_id = bibdoc1_id,
bibdoc2_id = bibdoc2_id,
bibdoc1_ver = bibdoc1_ver,
bibdoc2_ver = bibdoc2_ver,
bibdoc1_fmt = bibdoc1_fmt,
bibdoc2_fmt = bibdoc2_fmt)
if len(rels) > 0:
rel_obj = rels[0]
relation_id = rel_obj.id
else:
rel_obj = BibRelation(rel_id=relation_id)
relations_to_create.append((relation_id, bibdoc1_id, bibdoc1_ver,
bibdoc1_fmt, bibdoc2_id, bibdoc2_ver,
bibdoc2_fmt, relation_type, more_info,
rel_obj, control_command))
record_delete_field(record, 'BDR', ' ', ' ')
if mode in ("insert", "replace_or_insert", "append", "correct", "replace"):
# now creating relations between objects based on the data
if not pretend:
for (relation_id, bibdoc1_id, bibdoc1_ver, bibdoc1_fmt,
bibdoc2_id, bibdoc2_ver, bibdoc2_fmt, rel_type,
more_info, rel_obj, control_command) in relations_to_create:
if rel_obj == None:
rel_obj = BibRelation.create(bibdoc1_id = bibdoc1_id,
bibdoc1_ver = bibdoc1_ver,
bibdoc1_fmt = bibdoc1_fmt,
bibdoc2_id = bibdoc2_id,
bibdoc2_ver = bibdoc2_ver,
bibdoc2_fmt = bibdoc2_fmt,
rel_type = rel_type)
relation_id = rel_obj.id
if mode in ("replace"):
# Clearing existing MoreInfo content
rel_obj.get_more_info().delete()
if more_info:
MoreInfo.create_from_serialised(more_info, relation = relation_id)
if control_command == "DELETE":
rel_obj.delete()
else:
write_message("BDR tag is not processed in the %s mode" % (mode, ))
return record
def elaborate_fft_tags(record, rec_id, mode, pretend=False,
tmp_ids = {}, tmp_vers = {}):
"""
Process FFT tags that should contain $a with file paths or URLs
to get the fulltext from. This function enriches record with
proper 8564 URL tags, downloads fulltext files and stores them
into var/data structure where appropriate.
CFG_BIBUPLOAD_WGET_SLEEP_TIME defines time to sleep in seconds in
between URL downloads.
Note: if an FFT tag contains multiple $a subfields, we upload them
into different 856 URL tags in the metadata. See regression test
case test_multiple_fft_insert_via_http().
"""
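# A hedged illustration (hypothetical values) of an FFT field as this
# function consumes it from the incoming MARCXML; subfield codes match
# the ones read below ($a location, $n docname, $t doctype, $d description,
# $r restriction):
#
#   <datafield tag="FFT" ind1=" " ind2=" ">
#     <subfield code="a">http://example.org/fulltext.pdf</subfield>
#     <subfield code="n">fulltext</subfield>
#     <subfield code="t">Main</subfield>
#     <subfield code="d">Published version</subfield>
#     <subfield code="r">restricted</subfield>
#   </datafield>
#
# After attaching the files, the function replaces such fields with the
# corresponding 8564 URL tags in the record.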
# Let's define some handy sub procedure.
def _add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, modification_date, pretend=False):
"""Adds a new format for a given bibdoc. Returns True when everything's fine."""
write_message('Add new format to %s url: %s, format: %s, docname: %s, doctype: %s, newname: %s, description: %s, comment: %s, flags: %s, modification_date: %s' % (repr(bibdoc), url, docformat, docname, doctype, newname, description, comment, flags, modification_date), verbose=9)
try:
if not url: # Not requesting a new url. Just updating comment & description
return _update_description_and_comment(bibdoc, docname, docformat, description, comment, flags, pretend=pretend)
try:
if not pretend:
bibdoc.add_file_new_format(url, description=description, comment=comment, flags=flags, modification_date=modification_date)
except StandardError, e:
write_message("('%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s') not inserted because format already exists (%s)." % (url, docformat, docname, doctype, newname, description, comment, flags, modification_date, e), stream=sys.stderr)
raise
except Exception, e:
write_message("Error in adding '%s' as a new format because of: %s" % (url, e), stream=sys.stderr)
raise
return True
def _add_new_version(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, modification_date, pretend=False):
"""Adds a new version for a given bibdoc. Returns True when everything's fine."""
write_message('Add new version to %s url: %s, format: %s, docname: %s, doctype: %s, newname: %s, description: %s, comment: %s, flags: %s' % (repr(bibdoc), url, docformat, docname, doctype, newname, description, comment, flags))
try:
if not url:
return _update_description_and_comment(bibdoc, docname, docformat, description, comment, flags, pretend=pretend)
try:
if not pretend:
bibdoc.add_file_new_version(url, description=description, comment=comment, flags=flags, modification_date=modification_date)
except StandardError, e:
write_message("('%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s') not inserted because '%s'." % (url, docformat, docname, doctype, newname, description, comment, flags, modification_date, e), stream=sys.stderr)
raise
except Exception, e:
write_message("Error in adding '%s' as a new version because of: %s" % (url, e), stream=sys.stderr)
raise
return True
def _update_description_and_comment(bibdoc, docname, docformat, description, comment, flags, pretend=False):
"""Directly update comments and descriptions."""
write_message('Just updating description and comment for %s with format %s with description %s, comment %s and flags %s' % (docname, docformat, description, comment, flags), verbose=9)
try:
if not pretend:
bibdoc.set_description(description, docformat)
bibdoc.set_comment(comment, docformat)
for flag in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
if flag in flags:
bibdoc.set_flag(flag, docformat)
else:
bibdoc.unset_flag(flag, docformat)
except StandardError, e:
write_message("('%s', '%s', '%s', '%s', '%s') description and comment not updated because '%s'." % (docname, docformat, description, comment, flags, e))
raise
return True
def _process_document_moreinfos(more_infos, docname, version, docformat, mode):
if mode not in ('correct', 'append', 'replace_or_insert', 'replace', 'insert'):
print "exited because the mode is incorrect"
return
brd = BibRecDocs(rec_id)
docid = None
try:
docid = brd.get_docid(docname)
except:
raise StandardError("MoreInfo: No document of a given name associated with the record")
if not version:
# We have to retrieve the most recent version ...
version = brd.get_bibdoc(docname).get_latest_version()
doc_moreinfo_s, version_moreinfo_s, version_format_moreinfo_s, format_moreinfo_s = more_infos
if mode in ("replace", "replace_or_insert"):
if doc_moreinfo_s: #only if specified, otherwise do not touch
MoreInfo(docid = docid).delete()
if format_moreinfo_s: #only if specified... otherwise do not touch
MoreInfo(docid = docid, docformat = docformat).delete()
if doc_moreinfo_s is not None:
MoreInfo.create_from_serialised(ser_str = doc_moreinfo_s, docid = docid)
if version_moreinfo_s is not None:
MoreInfo.create_from_serialised(ser_str = version_moreinfo_s,
docid = docid, version = version)
if version_format_moreinfo_s is not None:
MoreInfo.create_from_serialised(ser_str = version_format_moreinfo_s,
docid = docid, version = version,
docformat = docformat)
if format_moreinfo_s is not None:
MoreInfo.create_from_serialised(ser_str = format_moreinfo_s,
docid = docid, docformat = docformat)
if mode == 'delete':
raise StandardError('FFT tag specified but bibupload executed in --delete mode')
tuple_list = extract_tag_from_record(record, 'FFT')
if tuple_list: # FFT Tags analysis
write_message("FFTs: "+str(tuple_list), verbose=9)
docs = {} # docnames and their data
for fft in record_get_field_instances(record, 'FFT', ' ', ' '):
# First of all, we retrieve the potentially temporary identifiers...
# even if the rest fails, we should include them in the dictionary
version = _get_subfield_value(fft, 'v', '')
# checking if version is temporary... if so, filling a different variable
is_tmp_ver, bibdoc_tmpver = parse_identifier(version)
if is_tmp_ver:
version = None
else:
bibdoc_tmpver = None
if not version: #treating cases of empty string etc...
version = None
bibdoc_tmpid = field_get_subfield_values(fft, 'i')
if bibdoc_tmpid:
bibdoc_tmpid = bibdoc_tmpid[0]
else:
bibdoc_tmpid = None
is_tmp_id, bibdoc_tmpid = parse_identifier(bibdoc_tmpid)
if not is_tmp_id:
bibdoc_tmpid = None
# In the case of having temporary ids, we don't resolve them yet but signal that they have been used
# value -1 means that identifier has been declared but not assigned a value yet
if bibdoc_tmpid:
if bibdoc_tmpid in tmp_ids:
write_message("WARNING: the temporary identifier %s has been declared more than once. Ignoring the second occurrence" % (bibdoc_tmpid, ))
else:
tmp_ids[bibdoc_tmpid] = -1
if bibdoc_tmpver:
if bibdoc_tmpver in tmp_vers:
write_message("WARNING: the temporary version identifier %s has been declared more than once. Ignoring the second occurrence" % (bibdoc_tmpver, ))
else:
tmp_vers[bibdoc_tmpver] = -1
# Let's discover the type of the document
# This is a legacy field and no particular check will be
# enforced on it.
doctype = _get_subfield_value(fft, 't', 'Main') #Default is Main
# Let's discover the url.
url = field_get_subfield_values(fft, 'a')
if url:
url = url[0]
try:
check_valid_url(url)
except StandardError, e:
raise StandardError, "fft '%s' specifies in $a a location ('%s') with problems: %s" % (fft, url, e)
else:
url = ''
#TODO: a lot of this code could be compacted using a similar syntax; it should be more readable in the long run
# maybe the right-hand-side expressions look a bit cryptic, but the elaborate_fft function would be much clearer
if mode == 'correct' and doctype != 'FIX-MARC':
arg2 = ""
else:
arg2 = KEEP_OLD_VALUE
description = _get_subfield_value(fft, 'd', arg2)
# Let's discover the description
# description = field_get_subfield_values(fft, 'd')
# if description != []:
# description = description[0]
# else:
# if mode == 'correct' and doctype != 'FIX-MARC':
## If the user requires a correction and does not specify
## a description, it means they really want to
## modify the description.
# description = ''
# else:
# description = KEEP_OLD_VALUE
# Let's discover the desired docname to be created/altered
name = field_get_subfield_values(fft, 'n')
if name:
## Let's remove undesired extensions
name = file_strip_ext(name[0] + '.pdf')
else:
if url:
name = get_docname_from_url(url)
elif mode != 'correct' and doctype != 'FIX-MARC':
raise StandardError, "Warning: fft '%s' doesn't specify either a location in $a or a docname in $n" % str(fft)
else:
continue
# Let's discover the desired new docname in case we want to change it
newname = field_get_subfield_values(fft, 'm')
if newname:
newname = file_strip_ext(newname[0] + '.pdf')
else:
newname = name
# Let's discover the desired format
docformat = field_get_subfield_values(fft, 'f')
if docformat:
docformat = normalize_format(docformat[0])
else:
if url:
docformat = guess_format_from_url(url)
else:
docformat = ""
# Let's discover the icon
icon = field_get_subfield_values(fft, 'x')
if icon != []:
icon = icon[0]
if icon != KEEP_OLD_VALUE:
try:
check_valid_url(icon)
except StandardError, e:
raise StandardError, "fft '%s' specifies in $x an icon ('%s') with problems: %s" % (fft, icon, e)
else:
icon = ''
# Let's discover the comment
comment = field_get_subfield_values(fft, 'z')
if comment != []:
comment = comment[0]
else:
if mode == 'correct' and doctype != 'FIX-MARC':
## See comment on description
comment = ''
else:
comment = KEEP_OLD_VALUE
# Let's discover the restriction
restriction = field_get_subfield_values(fft, 'r')
if restriction != []:
restriction = restriction[0]
else:
if mode == 'correct' and doctype != 'FIX-MARC':
## See comment on description
restriction = ''
else:
restriction = KEEP_OLD_VALUE
document_moreinfo = _get_subfield_value(fft, 'w')
version_moreinfo = _get_subfield_value(fft, 'p')
version_format_moreinfo = _get_subfield_value(fft, 'b')
format_moreinfo = _get_subfield_value(fft, 'u')
# Let's discover the timestamp of the file (if any)
timestamp = field_get_subfield_values(fft, 's')
if timestamp:
try:
timestamp = datetime(*(time.strptime(timestamp[0], "%Y-%m-%d %H:%M:%S")[:6]))
except ValueError:
write_message('Warning: The timestamp is not in the expected format and will be ignored. The format should be YYYY-MM-DD HH:MM:SS')
timestamp = ''
else:
timestamp = ''
flags = field_get_subfield_values(fft, 'o')
for flag in flags:
if flag not in CFG_BIBDOCFILE_AVAILABLE_FLAGS:
raise StandardError, "fft '%s' specifies a non available flag: %s" % (fft, flag)
if name in docs: # new format considered
(doctype2, newname2, restriction2, version2, urls, dummybibdoc_moreinfos2, dummybibdoc_tmpid2, dummybibdoc_tmpver2 ) = docs[name]
if doctype2 != doctype:
raise StandardError, "fft '%s' specifies a different doctype from previous fft with docname '%s'" % (str(fft), name)
if newname2 != newname:
raise StandardError, "fft '%s' specifies a different newname from previous fft with docname '%s'" % (str(fft), name)
if restriction2 != restriction:
raise StandardError, "fft '%s' specifies a different restriction from previous fft with docname '%s'" % (str(fft), name)
if version2 != version:
raise StandardError, "fft '%s' specifies a different version than the previous fft with docname '%s'" % (str(fft), name)
for (dummyurl2, format2, dummydescription2, dummycomment2, dummyflags2, dummytimestamp2) in urls:
if docformat == format2:
raise StandardError, "fft '%s' specifies a second file '%s' with the same format '%s' from previous fft with docname '%s'" % (str(fft), url, docformat, name)
if url or docformat:
urls.append((url, docformat, description, comment, flags, timestamp))
if icon:
urls.append((icon, icon[len(file_strip_ext(icon)):] + ';icon', description, comment, flags, timestamp))
else:
if url or docformat:
docs[name] = (doctype, newname, restriction, version, [(url, docformat, description, comment, flags, timestamp)], [document_moreinfo, version_moreinfo, version_format_moreinfo, format_moreinfo], bibdoc_tmpid, bibdoc_tmpver)
if icon:
docs[name][4].append((icon, icon[len(file_strip_ext(icon)):] + ';icon', description, comment, flags, timestamp))
elif icon:
docs[name] = (doctype, newname, restriction, version, [(icon, icon[len(file_strip_ext(icon)):] + ';icon', description, comment, flags, timestamp)], [document_moreinfo, version_moreinfo, version_format_moreinfo, format_moreinfo], bibdoc_tmpid, bibdoc_tmpver)
else:
docs[name] = (doctype, newname, restriction, version, [], [document_moreinfo, version_moreinfo, version_format_moreinfo, format_moreinfo], bibdoc_tmpid, bibdoc_tmpver)
write_message('Result of FFT analysis:\n\tDocs: %s' % (docs,), verbose=9)
# Let's remove all FFT tags
record_delete_field(record, 'FFT', ' ', ' ')
# Preprocessed data elaboration
bibrecdocs = BibRecDocs(rec_id)
## Let's pre-download all the URLs to see if, in case of mode 'correct' or 'append'
## we can avoid creating a new revision.
for docname, (doctype, newname, restriction, version, urls, more_infos, bibdoc_tmpid, bibdoc_tmpver ) in docs.items():
downloaded_urls = []
try:
bibdoc = bibrecdocs.get_bibdoc(docname)
except InvenioBibDocFileError:
## A bibdoc with the given docname does not exist.
## So there is no chance we are going to revise an existing
## format with an identical file :-)
bibdoc = None
new_revision_needed = False
for url, docformat, description, comment, flags, timestamp in urls:
if url:
try:
downloaded_url = download_url(url, docformat)
write_message("%s saved into %s" % (url, downloaded_url), verbose=9)
except Exception, err:
write_message("Error in downloading '%s' because of: %s" % (url, err), stream=sys.stderr)
raise
if mode == 'correct' and bibdoc is not None and not new_revision_needed:
downloaded_urls.append((downloaded_url, docformat, description, comment, flags, timestamp))
if not bibrecdocs.check_file_exists(downloaded_url, docformat):
new_revision_needed = True
else:
write_message("WARNING: %s is already attached to bibdoc %s for recid %s" % (url, docname, rec_id), stream=sys.stderr)
elif mode == 'append' and bibdoc is not None:
if not bibrecdocs.check_file_exists(downloaded_url, docformat):
downloaded_urls.append((downloaded_url, docformat, description, comment, flags, timestamp))
else:
write_message("WARNING: %s is already attached to bibdoc %s for recid %s" % (url, docname, rec_id), stream=sys.stderr)
else:
downloaded_urls.append((downloaded_url, docformat, description, comment, flags, timestamp))
else:
downloaded_urls.append(('', docformat, description, comment, flags, timestamp))
if mode == 'correct' and bibdoc is not None and not new_revision_needed:
## Since we don't need a new revision (because none of the
## files being uploaded differ from the existing ones)
## we can simply remove the urls but keep the other information
write_message("No need to add a new revision for docname %s for recid %s" % (docname, rec_id), verbose=2)
docs[docname] = (doctype, newname, restriction, version, [('', docformat, description, comment, flags, timestamp) for (dummy, docformat, description, comment, flags, timestamp) in downloaded_urls], more_infos, bibdoc_tmpid, bibdoc_tmpver)
for downloaded_url, dummy, dummy, dummy, dummy, dummy in downloaded_urls:
## Let's free up some space :-)
if downloaded_url and os.path.exists(downloaded_url):
os.remove(downloaded_url)
else:
if downloaded_urls or mode != 'append':
docs[docname] = (doctype, newname, restriction, version, downloaded_urls, more_infos, bibdoc_tmpid, bibdoc_tmpver)
else:
## In case we are in append mode and there are no urls to append
## we discard the whole FFT
del docs[docname]
if mode == 'replace': # First we erase previous bibdocs
if not pretend:
for bibdoc in bibrecdocs.list_bibdocs():
bibdoc.delete()
bibrecdocs.build_bibdoc_list()
for docname, (doctype, newname, restriction, version, urls, more_infos, bibdoc_tmpid, bibdoc_tmpver) in docs.iteritems():
write_message("Elaborating olddocname: '%s', newdocname: '%s', doctype: '%s', restriction: '%s', urls: '%s', mode: '%s'" % (docname, newname, doctype, restriction, urls, mode), verbose=9)
if mode in ('insert', 'replace'): # new bibdocs, new docnames, new marc
if newname in bibrecdocs.get_bibdoc_names():
write_message("('%s', '%s') not inserted because docname already exists." % (newname, urls), stream=sys.stderr)
raise StandardError("('%s', '%s') not inserted because docname already exists." % (newname, urls))
try:
if not pretend:
bibdoc = bibrecdocs.add_bibdoc(doctype, newname)
bibdoc.set_status(restriction)
else:
bibdoc = None
except Exception, e:
write_message("('%s', '%s', '%s') not inserted because: '%s'." % (doctype, newname, urls, e), stream=sys.stderr)
raise
for (url, docformat, description, comment, flags, timestamp) in urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp, pretend=pretend))
elif mode == 'replace_or_insert': # to be thought as correct_or_insert
for bibdoc in bibrecdocs.list_bibdocs():
brd = BibRecDocs(rec_id)
dn = brd.get_docname(bibdoc.id)
if dn == docname:
if doctype not in ('PURGE', 'DELETE', 'EXPUNGE', 'REVERT', 'FIX-ALL', 'FIX-MARC', 'DELETE-FILE'):
if newname != docname:
try:
if not pretend:
bibrecdocs.change_name(newname = newname, docid = bibdoc.id)
## Let's refresh the list of bibdocs.
bibrecdocs.build_bibdoc_list()
except StandardError, e:
write_message(e, stream=sys.stderr)
raise
found_bibdoc = False
for bibdoc in bibrecdocs.list_bibdocs():
brd = BibRecDocs(rec_id)
dn = brd.get_docname(bibdoc.id)
if dn == newname:
found_bibdoc = True
if doctype == 'PURGE':
if not pretend:
bibdoc.purge()
elif doctype == 'DELETE':
if not pretend:
bibdoc.delete()
elif doctype == 'EXPUNGE':
if not pretend:
bibdoc.expunge()
elif doctype == 'FIX-ALL':
if not pretend:
bibrecdocs.fix(docname)
elif doctype == 'FIX-MARC':
pass
elif doctype == 'DELETE-FILE':
if urls:
for (url, docformat, description, comment, flags, timestamp) in urls:
if not pretend:
bibdoc.delete_file(docformat, version)
elif doctype == 'REVERT':
try:
if not pretend:
bibdoc.revert(version)
except Exception, e:
write_message('(%s, %s) not correctly reverted: %s' % (newname, version, e), stream=sys.stderr)
raise
else:
if restriction != KEEP_OLD_VALUE:
if not pretend:
bibdoc.set_status(restriction)
# Since the docname already exists, we first have to
# bump the version by pushing the first new file,
# then push the other files.
if urls:
(first_url, first_format, first_description, first_comment, first_flags, first_timestamp) = urls[0]
other_urls = urls[1:]
assert(_add_new_version(bibdoc, first_url, first_format, docname, doctype, newname, first_description, first_comment, first_flags, first_timestamp, pretend=pretend))
for (url, docformat, description, comment, flags, timestamp) in other_urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp, pretend=pretend))
## Let's refresh the list of bibdocs.
bibrecdocs.build_bibdoc_list()
if not found_bibdoc:
if not pretend:
bibdoc = bibrecdocs.add_bibdoc(doctype, newname)
bibdoc.set_status(restriction)
for (url, docformat, description, comment, flags, timestamp) in urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp))
elif mode == 'correct':
for bibdoc in bibrecdocs.list_bibdocs():
brd = BibRecDocs(rec_id)
dn = brd.get_docname(bibdoc.id)
if dn == docname:
if doctype not in ('PURGE', 'DELETE', 'EXPUNGE', 'REVERT', 'FIX-ALL', 'FIX-MARC', 'DELETE-FILE'):
if newname != docname:
try:
if not pretend:
bibrecdocs.change_name(docid = bibdoc.id, newname=newname)
## Let's refresh the list of bibdocs.
bibrecdocs.build_bibdoc_list()
except StandardError, e:
write_message('Error in renaming %s to %s: %s' % (docname, newname, e), stream=sys.stderr)
raise
found_bibdoc = False
for bibdoc in bibrecdocs.list_bibdocs():
brd = BibRecDocs(rec_id)
dn = brd.get_docname(bibdoc.id)
if dn == newname:
found_bibdoc = True
if doctype == 'PURGE':
if not pretend:
bibdoc.purge()
elif doctype == 'DELETE':
if not pretend:
bibdoc.delete()
elif doctype == 'EXPUNGE':
if not pretend:
bibdoc.expunge()
elif doctype == 'FIX-ALL':
if not pretend:
bibrecdocs.fix(newname)
elif doctype == 'FIX-MARC':
pass
elif doctype == 'DELETE-FILE':
if urls:
for (url, docformat, description, comment, flags, timestamp) in urls:
if not pretend:
bibdoc.delete_file(docformat, version)
elif doctype == 'REVERT':
try:
if not pretend:
bibdoc.revert(version)
except Exception, e:
write_message('(%s, %s) not correctly reverted: %s' % (newname, version, e), stream=sys.stderr)
raise
else:
if restriction != KEEP_OLD_VALUE:
if not pretend:
bibdoc.set_status(restriction)
if doctype and doctype!= KEEP_OLD_VALUE:
if not pretend:
bibdoc.change_doctype(doctype)
if urls:
(first_url, first_format, first_description, first_comment, first_flags, first_timestamp) = urls[0]
other_urls = urls[1:]
assert(_add_new_version(bibdoc, first_url, first_format, docname, doctype, newname, first_description, first_comment, first_flags, first_timestamp, pretend=pretend))
for (url, docformat, description, comment, flags, timestamp) in other_urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp, pretend=pretend))
## Let's refresh the list of bibdocs.
bibrecdocs.build_bibdoc_list()
if not found_bibdoc:
if doctype in ('PURGE', 'DELETE', 'EXPUNGE', 'FIX-ALL', 'FIX-MARC', 'DELETE-FILE', 'REVERT'):
write_message("('%s', '%s', '%s') not performed because docname '%s' didn't exist." % (doctype, newname, urls, docname), stream=sys.stderr)
raise StandardError
else:
if not pretend:
bibdoc = bibrecdocs.add_bibdoc(doctype, newname)
bibdoc.set_status(restriction)
for (url, docformat, description, comment, flags, timestamp) in urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp))
elif mode == 'append':
try:
found_bibdoc = False
for bibdoc in bibrecdocs.list_bibdocs():
brd = BibRecDocs(rec_id)
dn = brd.get_docname(bibdoc.id)
if dn == docname:
found_bibdoc = True
for (url, docformat, description, comment, flags, timestamp) in urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp, pretend=pretend))
if not found_bibdoc:
try:
if not pretend:
bibdoc = bibrecdocs.add_bibdoc(doctype, docname)
bibdoc.set_status(restriction)
for (url, docformat, description, comment, flags, timestamp) in urls:
assert(_add_new_format(bibdoc, url, docformat, docname, doctype, newname, description, comment, flags, timestamp))
except Exception, e:
register_exception()
write_message("('%s', '%s', '%s') not appended because: '%s'." % (doctype, newname, urls, e), stream=sys.stderr)
raise
except:
register_exception()
raise
if not pretend:
_process_document_moreinfos(more_infos, newname, version, urls and urls[0][1], mode)
# resolving temporary version and identifier
brd = BibRecDocs(rec_id)
if bibdoc_tmpid:
if bibdoc_tmpid in tmp_ids and tmp_ids[bibdoc_tmpid] != -1:
write_message("WARNING: the temporary identifier %s has been declared more than once. Ignoring the second occurrence" % (bibdoc_tmpid, ))
else:
tmp_ids[bibdoc_tmpid] = brd.get_docid(docname)
if bibdoc_tmpver:
if bibdoc_tmpver in tmp_vers and tmp_vers[bibdoc_tmpver] != -1:
write_message("WARNING: the temporary version identifier %s has been declared more than once. Ignoring the second occurrence" % (bibdoc_tmpver, ))
else:
if version is None:
tmp_vers[bibdoc_tmpver] = brd.get_bibdoc(docname).get_latest_version()
else:
tmp_vers[bibdoc_tmpver] = version
return record
### Update functions
def update_bibrec_date(now, bibrec_id, insert_mode_p, pretend=False):
"""Update the date of the record in bibrec table """
if insert_mode_p:
query = """UPDATE bibrec SET creation_date=%s, modification_date=%s WHERE id=%s"""
params = (now, now, bibrec_id)
else:
query = """UPDATE bibrec SET modification_date=%s WHERE id=%s"""
params = (now, bibrec_id)
if not pretend:
run_sql(query, params)
write_message(" -Update record creation/modification date: DONE" , verbose=2)
def update_bibfmt_format(id_bibrec, format_value, format_name, modification_date=None, pretend=False):
"""Update the format in the table bibfmt"""
if modification_date is None:
modification_date = time.strftime('%Y-%m-%d %H:%M:%S')
else:
try:
time.strptime(modification_date, "%Y-%m-%d %H:%M:%S")
except ValueError:
modification_date = '1970-01-01 00:00:00'
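# For example (illustrative values): '2013-02-05 14:30:00' parses with
# time.strptime and is kept as-is, while something like '05/02/2013'
# raises ValueError and falls back to the epoch placeholder
# '1970-01-01 00:00:00'.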
# We check if the format is already in bibfmt
nb_found = find_record_format(id_bibrec, format_name)
if nb_found == 1:
# we are going to update the format
# compress the format_value
pickled_format_value = compress(format_value)
# update the format:
query = """UPDATE LOW_PRIORITY bibfmt SET last_updated=%s, value=%s WHERE id_bibrec=%s AND format=%s"""
params = (modification_date, pickled_format_value, id_bibrec, format_name)
if not pretend:
row_id = run_sql(query, params)
if not pretend and row_id is None:
write_message(" Failed: Error during update_bibfmt_format function", verbose=1, stream=sys.stderr)
return 1
else:
write_message(" -Update the format %s in bibfmt: DONE" % format_name , verbose=2)
return 0
elif nb_found > 1:
write_message(" Failed: Same format %s found several times in bibfmt for the same record." % format_name, verbose=1, stream=sys.stderr)
return 1
else:
# Insert the format information in BibFMT
res = insert_bibfmt(id_bibrec, format_value, format_name, modification_date, pretend=pretend)
if res is None:
write_message(" Failed: Error during insert_bibfmt", verbose=1, stream=sys.stderr)
return 1
else:
write_message(" -Insert the format %s in bibfmt: DONE" % format_name , verbose=2)
return 0
def delete_bibfmt_format(id_bibrec, format_name, pretend=False):
"""
Delete format FORMAT_NAME from bibfmt table for record ID_BIBREC.
"""
if not pretend:
run_sql("DELETE LOW_PRIORITY FROM bibfmt WHERE id_bibrec=%s and format=%s", (id_bibrec, format_name))
return 0
def archive_marcxml_for_history(recID, affected_fields, pretend=False):
"""
Archive current MARCXML format of record RECID from BIBFMT table
into hstRECORD table. Useful to keep MARCXML history of records.
Return 0 if everything went fine. Return 1 otherwise.
"""
res = run_sql("SELECT id_bibrec, value, last_updated FROM bibfmt WHERE format='xm' AND id_bibrec=%s",
(recID,))
db_affected_fields = ""
if affected_fields:
tmp_affected_fields = {}
for field in affected_fields:
if field.isdigit(): #hack for tags from RevisionVerifier
for ind in affected_fields[field]:
tmp_affected_fields[(field + ind[0] + ind[1] + "%").replace(" ", "_")] = 1
else:
pass #future implementation for fields
tmp_affected_fields = tmp_affected_fields.keys()
tmp_affected_fields.sort()
db_affected_fields = ",".join(tmp_affected_fields)
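# A minimal sketch of the transformation above (hypothetical input):
# affected_fields = {'856': [('4', ' ')]} yields the single pattern
# '8564_%', i.e. tag + indicators with blanks mapped to '_' and a
# trailing SQL-style '%' wildcard, joined with ',' when several
# patterns are present.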
if res and not pretend:
run_sql("""INSERT INTO hstRECORD (id_bibrec, marcxml, job_id, job_name, job_person, job_date, job_details, affected_fields)
VALUES (%s,%s,%s,%s,%s,%s,%s,%s)""",
(res[0][0], res[0][1], task_get_task_param('task_id', 0), 'bibupload', task_get_task_param('user', 'UNKNOWN'), res[0][2],
'mode: ' + task_get_option('mode', 'UNKNOWN') + '; file: ' + task_get_option('file_path', 'UNKNOWN') + '.',
db_affected_fields))
return 0
def update_database_with_metadata(record, rec_id, oai_rec_id="oai", affected_tags=None, pretend=False):
"""Update the database tables with the record and the record id given in parameter"""
# extract only those tags that have been affected.
# the check happens at subfield level. This is to prevent the overhead
# associated with inserting an already existing field with a given indicator pair
write_message("update_database_with_metadata: record=%s, rec_id=%s, oai_rec_id=%s, affected_tags=%s" % (record, rec_id, oai_rec_id, affected_tags), verbose=9)
tmp_record = {}
if affected_tags:
for tag in record.keys():
if tag in affected_tags.keys():
write_message(" -Tag %s found to be modified. Setting up for update" % tag, verbose=9)
# initialize new list to hold affected field
new_data_tuple_list = []
for data_tuple in record[tag]:
ind1 = data_tuple[1]
ind2 = data_tuple[2]
if (ind1, ind2) in affected_tags[tag]:
write_message(" -Indicator pair (%s, %s) added to update list" % (ind1, ind2), verbose=9)
new_data_tuple_list.append(data_tuple)
tmp_record[tag] = new_data_tuple_list
write_message(lambda: " -Modified fields: \n%s" % record_xml_output(tmp_record), verbose=2)
else:
tmp_record = record
for tag in tmp_record.keys():
# check if tag is not a special one:
if tag not in CFG_BIBUPLOAD_SPECIAL_TAGS:
# for each tag there is a list of tuples representing datafields
tuple_list = tmp_record[tag]
# this list should contain the elements of a full tag [tag, ind1, ind2, subfield_code]
tag_list = []
tag_list.append(tag)
for single_tuple in tuple_list:
# these are the contents of a single tuple
subfield_list = single_tuple[0]
ind1 = single_tuple[1]
ind2 = single_tuple[2]
# append the ind's to the full tag
if ind1 == '' or ind1 == ' ':
tag_list.append('_')
else:
tag_list.append(ind1)
if ind2 == '' or ind2 == ' ':
tag_list.append('_')
else:
tag_list.append(ind2)
datafield_number = single_tuple[4]
if tag in CFG_BIBUPLOAD_SPECIAL_TAGS:
# nothing to do for special tags (FFT, BDR, BDM)
pass
elif tag in CFG_BIBUPLOAD_CONTROLFIELD_TAGS and tag != "001":
value = single_tuple[3]
# get the full tag
full_tag = ''.join(tag_list)
# update the tables
write_message(" insertion of the tag "+full_tag+" with the value "+value, verbose=9)
# insert the tag and value into bibxxx
(table_name, bibxxx_row_id) = insert_record_bibxxx(full_tag, value, pretend=pretend)
if table_name is None or bibxxx_row_id is None:
write_message(" Failed: during insert_record_bibxxx", verbose=1, stream=sys.stderr)
# connect bibxxx and bibrec with the table bibrec_bibxxx
res = insert_record_bibrec_bibxxx(table_name, bibxxx_row_id, datafield_number, rec_id, pretend=pretend)
if res is None:
write_message(" Failed: during insert_record_bibrec_bibxxx", verbose=1, stream=sys.stderr)
else:
# get the tag and value from the content of each subfield
for subfield in subfield_list:
subtag = subfield[0]
value = subfield[1]
tag_list.append(subtag)
# get the full tag
full_tag = ''.join(tag_list)
# update the tables
write_message(" insertion of the tag "+full_tag+" with the value "+value, verbose=9)
# insert the tag and value into bibxxx
(table_name, bibxxx_row_id) = insert_record_bibxxx(full_tag, value, pretend=pretend)
if table_name is None or bibxxx_row_id is None:
write_message(" Failed: during insert_record_bibxxx", verbose=1, stream=sys.stderr)
# connect bibxxx and bibrec with the table bibrec_bibxxx
res = insert_record_bibrec_bibxxx(table_name, bibxxx_row_id, datafield_number, rec_id, pretend=pretend)
if res is None:
write_message(" Failed: during insert_record_bibrec_bibxxx", verbose=1, stream=sys.stderr)
# remove the subtag from the list
tag_list.pop()
tag_list.pop()
tag_list.pop()
tag_list.pop()
write_message(" -Update the database with metadata: DONE", verbose=2)
log_record_uploading(oai_rec_id, task_get_task_param('task_id', 0), rec_id, 'P', pretend=pretend)
def append_new_tag_to_old_record(record, rec_old):
"""Append new tags to an old record."""
def _append_tag(tag):
if tag in CFG_BIBUPLOAD_CONTROLFIELD_TAGS:
if tag == '001':
pass
else:
# if it is a controlfield, just access the value
for single_tuple in record[tag]:
controlfield_value = single_tuple[3]
# add the field to the old record
newfield_number = record_add_field(rec_old, tag,
controlfield_value=controlfield_value)
if newfield_number is None:
write_message(" Error when adding the field "+tag, verbose=1, stream=sys.stderr)
else:
# For each tag there is a list of tuples representing datafields
for single_tuple in record[tag]:
# We retrieve the information of the tag
subfield_list = single_tuple[0]
ind1 = single_tuple[1]
ind2 = single_tuple[2]
if '%s%s%s' % (tag, ind1 == ' ' and '_' or ind1, ind2 == ' ' and '_' or ind2) in (CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[:5], CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[:5]):
## We don't want to append the external identifier
## if it is already existing.
if record_find_field(rec_old, tag, single_tuple)[0] is not None:
write_message(" Not adding tag: %s ind1=%s ind2=%s subfields=%s: it's already there" % (tag, ind1, ind2, subfield_list), verbose=9)
continue
# We add the datafield to the old record
write_message(" Adding tag: %s ind1=%s ind2=%s subfields=%s" % (tag, ind1, ind2, subfield_list), verbose=9)
newfield_number = record_add_field(rec_old, tag, ind1,
ind2, subfields=subfield_list)
if newfield_number is None:
write_message(" Error when adding the field "+tag, verbose=1, stream=sys.stderr)
# Go through each tag in the appended record
for tag in record:
_append_tag(tag)
return rec_old
def copy_strong_tags_from_old_record(record, rec_old):
"""
Look for strong tags in RECORD and REC_OLD. If no strong tags are
found in RECORD, then copy them over from REC_OLD. This function
modifies RECORD structure on the spot.
"""
for strong_tag in CFG_BIBUPLOAD_STRONG_TAGS:
if not record_get_field_instances(record, strong_tag, strong_tag[3:4] or '%', strong_tag[4:5] or '%'):
strong_tag_old_field_instances = record_get_field_instances(rec_old, strong_tag)
if strong_tag_old_field_instances:
for strong_tag_old_field_instance in strong_tag_old_field_instances:
sf_vals, fi_ind1, fi_ind2, controlfield, dummy = strong_tag_old_field_instance
record_add_field(record, strong_tag, fi_ind1, fi_ind2, controlfield, sf_vals)
return
### Delete functions
def delete_tags(record, rec_old):
"""
Returns a record structure with all the fields in rec_old minus the
fields in record.
@param record: The record containing tags to delete.
@type record: record structure
@param rec_old: The original record.
@type rec_old: record structure
@return: The modified record.
@rtype: record structure
"""
returned_record = copy.deepcopy(rec_old)
for tag, fields in record.iteritems():
if tag in ('001', ):
continue
for field in fields:
local_position = record_find_field(returned_record, tag, field)[1]
if local_position is not None:
record_delete_field(returned_record, tag, field_position_local=local_position)
return returned_record
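The deletion semantics above can be sketched with plain dicts mapping tags to lists of field tuples — a minimal stand-in for the Invenio record structure, with `record_find_field`/`record_delete_field` replaced by plain list lookups:

```python
import copy

def delete_tags_sketch(record, rec_old):
    # Return a copy of rec_old minus the fields listed in record;
    # the '001' control field is always preserved.
    result = copy.deepcopy(rec_old)
    for tag, fields in record.items():
        if tag == '001':
            continue
        for field in fields:
            if field in result.get(tag, []):
                result[tag].remove(field)
    return result

old = {'001': ['42'], '100': ['a', 'b'], '700': ['c']}
to_delete = {'001': ['42'], '100': ['a']}
trimmed = delete_tags_sketch(to_delete, old)
```

`trimmed` keeps '001' and '700' untouched and drops only the matching '100' field, mirroring the deepcopy-then-delete approach of `delete_tags`.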
def delete_tags_to_correct(record, rec_old):
"""
Delete tags from REC_OLD which also exist in RECORD. When
deleting, pay attention not only to tags, but also to indicators,
so that fields with the same tags but different indicators are not
deleted.
"""
## Some fields are controlled via provenance information.
## We should re-add saved fields at the end.
fields_to_readd = {}
for tag in CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS:
if tag[:3] in record:
tmp_field_instances = record_get_field_instances(record, tag[:3], tag[3], tag[4]) ## Let's discover the provenance that will be updated
provenances_to_update = []
for instance in tmp_field_instances:
for code, value in instance[0]:
if code == tag[5]:
if value not in provenances_to_update:
provenances_to_update.append(value)
break
else:
## The provenance is not specified.
## let's add the special empty provenance.
if '' not in provenances_to_update:
provenances_to_update.append('')
potential_fields_to_readd = record_get_field_instances(rec_old, tag[:3], tag[3], tag[4]) ## Let's take all the fields corresponding to the tag
## Let's set aside all the fields that would be updated but that,
## since they have a different provenance not mentioned in record,
## should be preserved.
fields = []
for sf_vals, ind1, ind2, dummy_cf, dummy_line in potential_fields_to_readd:
for code, value in sf_vals:
if code == tag[5]:
if value not in provenances_to_update:
fields.append(sf_vals)
break
else:
if '' not in provenances_to_update:
## Empty provenance, let's protect in any case
fields.append(sf_vals)
fields_to_readd[tag] = fields
# browse through all the tags from the MARCXML file:
for tag in record:
# check if the tag exists in the old record too:
if tag in rec_old and tag != '001':
# the tag does exist, so delete all record's tag+ind1+ind2 combinations from rec_old
for dummy_sf_vals, ind1, ind2, dummy_cf, dummyfield_number in record[tag]:
write_message(" Delete tag: " + tag + " ind1=" + ind1 + " ind2=" + ind2, verbose=9)
record_delete_field(rec_old, tag, ind1, ind2)
## Ok, we readd necessary fields!
for tag, fields in fields_to_readd.iteritems():
for sf_vals in fields:
write_message(" Adding tag: " + tag[:3] + " ind1=" + tag[3] + " ind2=" + tag[4] + " code=" + str(sf_vals), verbose=9)
record_add_field(rec_old, tag[:3], tag[3], tag[4], subfields=sf_vals)
def delete_bibrec_bibxxx(record, id_bibrec, affected_tags={}, pretend=False):
"""Delete the database record from the table bibxxx given in parameters"""
# we clear all the rows from bibrec_bibxxx from the old record
# clearing only those tags that have been modified.
write_message(lambda: "delete_bibrec_bibxxx(record=%s, id_bibrec=%s, affected_tags=%s)" % (record, id_bibrec, affected_tags), verbose=9)
for tag in affected_tags:
# sanity check with record keys just to make sure it's fine.
if tag not in CFG_BIBUPLOAD_SPECIAL_TAGS:
write_message("%s found in record"%tag, verbose=2)
# for each name construct the bibrec_bibxxx table name
table_name = 'bib'+tag[0:2]+'x'
bibrec_table = 'bibrec_'+table_name
# delete all the records with proper id_bibrec. Indicators matter for individual affected tags
tmp_ind_1 = ''
tmp_ind_2 = ''
# construct exact tag value using indicators
for ind_pair in affected_tags[tag]:
if ind_pair[0] == ' ':
tmp_ind_1 = '_'
else:
tmp_ind_1 = ind_pair[0]
if ind_pair[1] == ' ':
tmp_ind_2 = '_'
else:
tmp_ind_2 = ind_pair[1]
# need to escape in case of underscore so that MySQL treats it as a literal character
tag_val = tag+"\\"+tmp_ind_1+"\\"+tmp_ind_2 + '%'
query = """DELETE br.* FROM `%s` br,`%s` b where br.id_bibrec=%%s and br.id_bibxxx=b.id and b.tag like %%s""" % (bibrec_table, table_name)
params = (id_bibrec, tag_val)
write_message(query % params, verbose=9)
if not pretend:
run_sql(query, params)
else:
write_message("%s not found"%tag, verbose=2)
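The table-name and LIKE-pattern construction used in `delete_bibrec_bibxxx` can be isolated into a small helper (an illustrative sketch, not part of the Invenio API):

```python
def bibxxx_names(tag, ind1, ind2):
    # The first two digits of the MARC tag select the bibxxx table,
    # blank indicators are stored as '_', and the stored tag value is
    # matched with LIKE, so literal '_' must be backslash-escaped.
    table_name = 'bib' + tag[0:2] + 'x'
    tmp_ind_1 = '_' if ind1 == ' ' else ind1
    tmp_ind_2 = '_' if ind2 == ' ' else ind2
    tag_val = tag + '\\' + tmp_ind_1 + '\\' + tmp_ind_2 + '%'
    return table_name, 'bibrec_' + table_name, tag_val

names = bibxxx_names('100', ' ', ' ')
```

For tag `100` with blank indicators this produces the `bib10x`/`bibrec_bib10x` pair and the escaped pattern fed to the DELETE query above.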
def main():
"""Main function that constructs the bibtask."""
task_init(authorization_action='runbibupload',
authorization_msg="BibUpload Task Submission",
description="""Receive MARC XML file and update appropriate database
tables according to options.
Examples:
$ bibupload -i input.xml
""",
help_specific_usage=""" -a, --append\t\tnew fields are appended to the existing record
-c, --correct\t\tfields are replaced by the new ones in the existing record, except
\t\t\twhen overridden by CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS
-i, --insert\t\tinsert the new record in the database
-r, --replace\t\tthe existing record is entirely replaced by the new one,
\t\t\texcept for fields in CFG_BIBUPLOAD_STRONG_TAGS
-d, --delete\t\tspecified fields are deleted in existing record
-n, --notimechange\tdo not change record last modification date when updating
-o, --holdingpen\tInsert record into holding pen instead of the normal database
--pretend\t\tdo not really insert/append/correct/replace the input file
--force\t\twhen --replace, use provided 001 tag values, even if the matching
\t\t\trecord does not exist (thus allocating it on-the-fly)
--callback-url\tSend via a POST request a JSON-serialized answer (see admin guide), in
\t\t\torder to provide a feedback to an external service about the outcome of the operation.
--nonce\t\twhen used together with --callback-url, add the nonce value in the JSON message.
--special-treatment=MODE\tif "oracle" is specified, when used together with --callback-url,
\t\t\tPOST an application/x-www-form-urlencoded request where the JSON message is encoded
\t\t\tinside a form field called "results".
""",
version=__revision__,
specific_params=("ircazdnoS:",
[
"insert",
"replace",
"correct",
"append",
"reference",
"delete",
"notimechange",
"holdingpen",
"pretend",
"force",
"callback-url=",
"nonce=",
"special-treatment=",
"stage=",
]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
def task_submit_elaborate_specific_parameter(key, value, opts, args): # pylint: disable=W0613
""" Given the string key, checks its meaning, possibly using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, and False if it
doesn't know that key.
eg:
if key in ['-n', '--number']:
task_set_option('number', value)
return True
return False
"""
# No time change option
if key in ("-n", "--notimechange"):
task_set_option('notimechange', 1)
# Insert mode option
elif key in ("-i", "--insert"):
if task_get_option('mode') == 'replace':
# if also replace found, then set to replace_or_insert
task_set_option('mode', 'replace_or_insert')
else:
task_set_option('mode', 'insert')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
# Replace mode option
elif key in ("-r", "--replace"):
if task_get_option('mode') == 'insert':
# if also insert found, then set to replace_or_insert
task_set_option('mode', 'replace_or_insert')
else:
task_set_option('mode', 'replace')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
# Holding pen mode option
elif key in ("-o", "--holdingpen"):
write_message("Holding pen mode", verbose=3)
task_set_option('mode', 'holdingpen')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
# Correct mode option
elif key in ("-c", "--correct"):
task_set_option('mode', 'correct')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
# Append mode option
elif key in ("-a", "--append"):
task_set_option('mode', 'append')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
# Deprecated reference mode option (now correct)
elif key in ("-z", "--reference"):
task_set_option('mode', 'correct')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
elif key in ("-d", "--delete"):
task_set_option('mode', 'delete')
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
elif key in ("--pretend",):
task_set_option('pretend', True)
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
elif key in ("--force",):
task_set_option('force', True)
fix_argv_paths([args[0]])
task_set_option('file_path', os.path.abspath(args[0]))
elif key in ("--callback-url", ):
task_set_option('callback_url', value)
elif key in ("--nonce", ):
task_set_option('nonce', value)
elif key in ("--special-treatment", ):
if value.lower() in CFG_BIBUPLOAD_ALLOWED_SPECIAL_TREATMENTS:
if value.lower() == 'oracle':
task_set_option('oracle_friendly', True)
else:
print >> sys.stderr, """The specified value is not in the list of allowed special treatment codes: %s""" % CFG_BIBUPLOAD_ALLOWED_SPECIAL_TREATMENTS
return False
elif key in ("-S", "--stage"):
print >> sys.stderr, """WARNING: the --stage parameter is deprecated and ignored."""
else:
return False
return True
def task_submit_check_options():
""" Check options before submitting the task, for example to provide
default values. It must return False if there are errors in the
options.
"""
if task_get_option('mode') is None:
write_message("Please specify at least one update/insert mode!")
return False
if task_get_option('file_path') is None:
write_message("Missing filename! -h for help.")
return False
return True
def writing_rights_p():
"""Return True in case bibupload has the proper rights to write in the
fulltext file folder."""
if _WRITING_RIGHTS is not None:
return _WRITING_RIGHTS
try:
if not os.path.exists(CFG_BIBDOCFILE_FILEDIR):
os.makedirs(CFG_BIBDOCFILE_FILEDIR)
fd, filename = tempfile.mkstemp(suffix='.txt', prefix='test', dir=CFG_BIBDOCFILE_FILEDIR)
test = os.fdopen(fd, 'w')
test.write('TEST')
test.close()
if open(filename).read() != 'TEST':
raise IOError("Cannot successfully write and read back %s" % filename)
os.remove(filename)
except:
register_exception(alert_admin=True)
return False
return True
def post_results_to_callback_url(results, callback_url):
write_message("Sending feedback to %s" % callback_url)
if not CFG_JSON_AVAILABLE:
from warnings import warn
warn("--callback-url used but simplejson/json not available")
return
json_results = json.dumps(results)
write_message("Message to send: %s" % json_results, verbose=9)
## <scheme>://<netloc>/<path>?<query>#<fragment>
scheme, dummynetloc, dummypath, dummyquery, dummyfragment = urlparse.urlsplit(callback_url)
## See: http://stackoverflow.com/questions/111945/is-there-any-way-to-do-http-put-in-python
if scheme == 'http':
opener = urllib2.build_opener(urllib2.HTTPHandler)
elif scheme == 'https':
opener = urllib2.build_opener(urllib2.HTTPSHandler)
else:
raise ValueError("Scheme not handled %s for callback_url %s" % (scheme, callback_url))
if task_get_option('oracle_friendly'):
write_message("Oracle friendly mode requested", verbose=9)
request = urllib2.Request(callback_url, data=urllib.urlencode({'results': json_results}))
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
else:
request = urllib2.Request(callback_url, data=json_results)
request.add_header('Content-Type', 'application/json')
request.add_header('User-Agent', make_user_agent_string('BibUpload'))
write_message("Headers about to be sent: %s" % request.headers, verbose=9)
write_message("Data about to be sent: %s" % request.data, verbose=9)
res = opener.open(request)
msg = res.read()
write_message("Result of posting the feedback: %s %s" % (res.code, res.msg), verbose=9)
write_message("Returned message is: %s" % msg, verbose=9)
return res
def bibupload_records(records, opt_mode=None, opt_notimechange=0,
pretend=False, callback_url=None, results_for_callback=None):
"""Perform the task of uploading a set of records.
Returns a list of (error_code, recid) tuples for the individual records.
"""
#Dictionaries maintaining temporary identifiers
# Structure: identifier -> number
tmp_ids = {}
tmp_vers = {}
results = []
# The first phase -> assigning meaning to temporary identifiers
if opt_mode == 'reference':
## NOTE: reference mode has been deprecated in favour of 'correct'
opt_mode = 'correct'
record = None
for record in records:
record_id = record_extract_oai_id(record)
task_sleep_now_if_required(can_stop_too=True)
if opt_mode == "holdingpen":
#inserting into the holding pen
write_message("Inserting into holding pen", verbose=3)
insert_record_into_holding_pen(record, record_id, pretend=pretend)
else:
write_message("Inserting into main database", verbose=3)
error = bibupload(
record,
opt_mode = opt_mode,
opt_notimechange = opt_notimechange,
oai_rec_id = record_id,
pretend = pretend,
tmp_ids = tmp_ids,
tmp_vers = tmp_vers)
results.append(error)
if error[0] == 1:
if record:
write_message(lambda: record_xml_output(record),
stream=sys.stderr)
else:
write_message("Record could not be parsed",
stream=sys.stderr)
stat['nb_errors'] += 1
if callback_url:
results_for_callback['results'].append({'recid': error[1], 'success': False, 'error_message': error[2]})
elif error[0] == 2:
if record:
write_message(lambda: record_xml_output(record),
stream=sys.stderr)
else:
write_message("Record could not be parsed",
stream=sys.stderr)
stat['nb_holdingpen'] += 1
if callback_url:
results_for_callback['results'].append({'recid': error[1], 'success': False, 'error_message': error[2]})
elif error[0] == 0:
if callback_url:
from invenio.legacy.search_engine import print_record
results_for_callback['results'].append({'recid': error[1], 'success': True, "marcxml": print_record(error[1], 'xm'), 'url': "%s/%s/%s" % (CFG_SITE_URL, CFG_SITE_RECORD, error[1])})
else:
if callback_url:
results_for_callback['results'].append({'recid': error[1], 'success': False, 'error_message': error[2]})
# stat is a global variable
task_update_progress("Done %d out of %d." % \
(stat['nb_records_inserted'] + \
stat['nb_records_updated'],
stat['nb_records_to_upload']))
# Second phase -> Now we can process all entries where temporary identifiers might appear (BDR, BDM)
write_message("Identifiers table after processing: %s versions: %s" % (str(tmp_ids), str(tmp_vers)))
write_message("Uploading BDR and BDM fields")
if opt_mode != "holdingpen":
for record in records:
record_id = retrieve_rec_id(record, opt_mode, pretend=pretend, post_phase = True)
bibupload_post_phase(record,
rec_id = record_id,
mode = opt_mode,
pretend = pretend,
tmp_ids = tmp_ids,
tmp_vers = tmp_vers)
return results
def task_run_core():
""" Reimplement to add the body of the task."""
write_message("Input file '%s', input mode '%s'." %
(task_get_option('file_path'), task_get_option('mode')))
write_message("STAGE 0:", verbose=2)
if task_get_option('file_path') is not None:
write_message("start processing", verbose=3)
task_update_progress("Reading XML input")
recs = xml_marc_to_records(open_marc_file(task_get_option('file_path')))
stat['nb_records_to_upload'] = len(recs)
write_message(" -Open XML marc: DONE", verbose=2)
task_sleep_now_if_required(can_stop_too=True)
write_message("Entering records loop", verbose=3)
callback_url = task_get_option('callback_url')
results_for_callback = {'results': []}
if recs is not None:
# We process the records one by one
bibupload_records(records=recs, opt_mode=task_get_option('mode'),
opt_notimechange=task_get_option('notimechange'),
pretend=task_get_option('pretend'),
callback_url=callback_url,
results_for_callback=results_for_callback)
else:
write_message(" Error bibupload failed: No record found",
verbose=1, stream=sys.stderr)
callback_url = task_get_option("callback_url")
if callback_url:
nonce = task_get_option("nonce")
if nonce:
results_for_callback["nonce"] = nonce
post_results_to_callback_url(results_for_callback, callback_url)
if task_get_task_param('verbose') >= 1:
# Print out the statistics
print_out_bibupload_statistics()
# Check if there were errors
return not stat['nb_errors'] >= 1
def log_record_uploading(oai_rec_id, task_id, bibrec_id, insertion_db, pretend=False):
if oai_rec_id != "" and oai_rec_id is not None:
query = """UPDATE oaiHARVESTLOG SET date_inserted=NOW(), inserted_to_db=%s, id_bibrec=%s WHERE oai_id = %s AND bibupload_task_id = %s ORDER BY date_harvested LIMIT 1"""
if not pretend:
run_sql(query, (str(insertion_db), str(bibrec_id), str(oai_rec_id), str(task_id), ))
if __name__ == "__main__":
main()
diff --git a/invenio/legacy/bibupload/revisionverifier.py b/invenio/legacy/bibupload/revisionverifier.py
index 538d2f589..3942b25a2 100644
--- a/invenio/legacy/bibupload/revisionverifier.py
+++ b/invenio/legacy/bibupload/revisionverifier.py
@@ -1,456 +1,456 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
RevisionVerifier : Compares the Revision of Record to be uploaded
with the archived Revision and the latest Revision(if any) and
generates a record patch for modified fields alone. This is to
avoid replacing the whole record where changes are minimal
"""
__revision__ = "$Id$"
import zlib
import copy
from invenio.legacy.bibrecord import record_get_field_value, \
record_get_field_instances, \
record_add_field, \
record_delete_field, \
create_record
-from invenio.bibupload_config import CFG_BIBUPLOAD_CONTROLFIELD_TAGS, \
+from invenio.legacy.bibupload.config import CFG_BIBUPLOAD_CONTROLFIELD_TAGS, \
CFG_BIBUPLOAD_DELETE_CODE, \
CFG_BIBUPLOAD_DELETE_VALUE
-from invenio.bibedit_dblayer import get_marcxml_of_record_revision, \
+from invenio.legacy.bibedit.db_layer import get_marcxml_of_record_revision, \
get_record_revisions
class RevisionVerifier:
"""
Class RevisionVerifier contains methods for Revision comparison
for the given record.
"""
def __init__(self):
self.rec_id = ''
def group_tag_values_by_indicator(self, tag_value_list):
"""
Groups the field instances of tag based on indicator pairs
Returns a dictionary of format {(ind1,ind2):[subfield_tuple1, .., subfield_tuplen]}
"""
curr_tag_indicator = {}
for data_tuple in tag_value_list:
ind1 = data_tuple[1]
ind2 = data_tuple[2]
if (ind1, ind2) not in curr_tag_indicator:
curr_tag_indicator[(ind1, ind2)] = [data_tuple]
else:
curr_tag_indicator[(ind1, ind2)].append(data_tuple)
return curr_tag_indicator
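A self-contained sketch of this grouping, using the field-instance tuple shape `(subfield_list, ind1, ind2, controlfield_value, field_number)` noted in the code below (the sample data is invented for illustration):

```python
def group_by_indicator(instances):
    # Bucket field instances by their (ind1, ind2) indicator pair.
    grouped = {}
    for inst in instances:
        grouped.setdefault((inst[1], inst[2]), []).append(inst)
    return grouped

fields = [
    ([('a', 'Ellis, J')], '1', ' ', '', 2),
    ([('a', 'Smith, P')], '1', ' ', '', 3),
    ([('a', 'CERN')], ' ', ' ', '', 4),
]
grouped = group_by_indicator(fields)
```

This yields two buckets — ('1', ' ') with two instances and (' ', ' ') with one — which is the shape `compare_tags_by_ind` consumes.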
def compare_tags_by_ind(self, rec1_tag_val, rec2_tag_val):
"""
Groups the fields (of the given tag) based on the indicator pairs.
Returns a tuple of dictionaries, each denoting common/specific indicator pairs.
"""
# temporary dictionary to hold fields from record2
tmp = copy.deepcopy(rec2_tag_val)
common_ind_pair = {}
ind_pair_in_rec1_tag = {}
ind_pair_in_rec2_tag = {}
for ind_pair in rec1_tag_val:
# if indicator pair is common
if ind_pair in tmp:
# copying values from record1 tag as this could help
# at next stage in case of any subfield level modifications
# this could be directly used.
common_ind_pair[ind_pair] = rec1_tag_val[ind_pair]
del tmp[ind_pair]
else:
# indicator pair is present only in current tag field
ind_pair_in_rec1_tag[ind_pair] = rec1_tag_val[ind_pair]
for ind_pair in rec2_tag_val:
# indicator pair present only in record2 tag field
if ind_pair in tmp:
ind_pair_in_rec2_tag[ind_pair] = rec2_tag_val[ind_pair]
return (common_ind_pair, ind_pair_in_rec1_tag, ind_pair_in_rec2_tag)
def find_modified_tags(self, common_tags, record1, record2):
"""
For each tag common to Record1 and Record2, checks for modifications
at field-level, indicator-level and subfield-level.
Returns a dictionary of tags and corresponding fields from Record1
that have been found to be modified.
"""
result = {}
for tag in common_tags:
# retrieve tag instances of record1 and record2
rec1_tag_val = record_get_field_instances(record1, tag, '%', '%')
rec2_tag_val = record_get_field_instances(record2, tag, '%', '%')
if rec1_tag_val:
rec1_ind = self.group_tag_values_by_indicator(rec1_tag_val)
if rec2_tag_val:
rec2_ind = self.group_tag_values_by_indicator(rec2_tag_val)
# NOTE: At this point rec1_ind and rec2_ind will be dictionary
# Key ==> (ind1, ind2) tuple
# Val ==> list of data_tuple => [dt1,dt2]
# dt(n) => ([sfl],ind1,ind2,ctrlfield,fn)
# Generating 3 different dictionaries
# common/added/deleted ind pairs in record1 based on record2
(com_ind, add_ind, del_ind) = self.compare_tags_by_ind(rec1_ind, rec2_ind)
if add_ind:
for ind_pair in add_ind:
for data_tuple in add_ind[ind_pair]:
subfield_list = data_tuple[0]
record_add_field(result, tag, ind_pair[0], ind_pair[1], '', subfields=subfield_list)
# Indicators that are deleted from record1 w.r.t record2 will be added with special code
if del_ind:
for ind_pair in del_ind:
record_add_field(result, tag, ind_pair[0], ind_pair[1], '', [(CFG_BIBUPLOAD_DELETE_CODE, CFG_BIBUPLOAD_DELETE_VALUE)])
# Common modified fields. Identifying changes at subfield level
if com_ind:
for ind_pair in com_ind:
# NOTE: sf_rec1 and sf_rec2 are list of list of subfields
# A simple list comparison is sufficient in this scenario
# Any change in the order of fields or changes in subfields
# will cause the entire list of data_tuple for that ind_pair
# to be copied from record1(upload) to result.
if tag in CFG_BIBUPLOAD_CONTROLFIELD_TAGS:
cf_rec1 = [data_tuple[3] for data_tuple in rec1_ind[ind_pair]]
cf_rec2 = [data_tuple[3] for data_tuple in rec2_ind[ind_pair]]
if cf_rec1 != cf_rec2:
for data_tuple in com_ind[ind_pair]:
record_add_field(result, tag, controlfield_value=data_tuple[3])
else:
sf_rec1 = [data_tuple[0] for data_tuple in rec1_ind[ind_pair]]
sf_rec2 = [data_tuple[0] for data_tuple in rec2_ind[ind_pair]]
if sf_rec1 != sf_rec2:
# change at subfield level / re-ordered fields
for data_tuple in com_ind[ind_pair]:
# com_ind will have data_tuples of record1(upload) and not record2
subfield_list = data_tuple[0]
record_add_field(result, tag, ind_pair[0], ind_pair[1], '', subfields=subfield_list)
return result
def compare_records(self, record1, record2, opt_mode=None):
"""
Compares two records to identify added/modified/deleted tags.
The records are either the upload record, the existing record, or
the archived record.
Returns a tuple of dictionaries (for modified/added/deleted tags).
"""
def remove_control_tag(tag_list):
"""
Returns the list of keys without any control tags
"""
cleaned_list = [item for item in tag_list
if item not in CFG_BIBUPLOAD_CONTROLFIELD_TAGS]
return cleaned_list
def group_record_tags():
"""
Groups all the tags in a Record as Common/Added/Deleted tags.
Returns a Tuple of 3 lists for each category mentioned above.
"""
rec1_keys = record1.keys()
rec2_keys = record2.keys()
com_tag_lst = [key for key in rec1_keys if key in rec2_keys]
# tags in record2 not present in record1
del_tag_lst = [key for key in rec2_keys if key not in rec1_keys]
# additional tags in record1
add_tag_lst = [key for key in rec1_keys if key not in rec2_keys]
return (com_tag_lst, add_tag_lst, del_tag_lst)
# declaring dictionaries to hold the identified patch
mod_patch = {}
add_patch = {}
del_patch = {}
result = {}
(common_tags, added_tags, deleted_tags) = group_record_tags()
if common_tags:
mod_patch = self.find_modified_tags(common_tags, record1, record2)
if added_tags:
for tag in added_tags:
add_patch[tag] = record1[tag]
# if the record comes with correct, it should already have fields
# marked with the '0' delete code; otherwise the deleted tag list is used
if deleted_tags and \
(opt_mode == 'replace' or opt_mode == 'delete'):
for tag in deleted_tags:
del_patch[tag] = record2[tag]
# returning back a result dictionary with all available patches
if mod_patch:
result['MOD'] = mod_patch
if add_patch:
result['ADD'] = add_patch
if del_patch:
# for a tag that has been deleted in the upload record in replace
# mode, loop through all the fields of the tag and add additional
# subfield with code '0' and value '__DELETE_FIELDS__'
# NOTE Indicators taken into consideration while deleting fields
for tag in del_patch:
for data_tuple in del_patch[tag]:
ind1 = data_tuple[1]
ind2 = data_tuple[2]
record_delete_field(del_patch, tag, ind1, ind2)
record_add_field(del_patch, tag, ind1, ind2, "", [(CFG_BIBUPLOAD_DELETE_CODE, CFG_BIBUPLOAD_DELETE_VALUE)])
result['DEL'] = del_patch
return result
def detect_conflict(self, up_patch, up_date, orig_patch, orig_date):
"""
Compares the generated patches for Upload and Original Records for any common tags.
Raises Conflict Error in case of any common tags.
Returns the upload record patch in case of no conflicts.
"""
conflict_tags = []
# if tag is modified in upload rec but modified/deleted in current rec
if 'MOD' in up_patch:
for tag in up_patch['MOD']:
if 'MOD' in orig_patch and tag in orig_patch['MOD'] \
or 'DEL' in orig_patch and tag in orig_patch['DEL']:
conflict_tags.append(tag)
# if tag is added in upload rec but added in current revision
if 'ADD' in up_patch:
for tag in up_patch['ADD']:
if 'ADD' in orig_patch and tag in orig_patch['ADD']:
conflict_tags.append(tag)
# if tag is deleted in upload rec but modified/deleted in current rec
if 'DEL' in up_patch:
for tag in up_patch['DEL']:
if 'MOD' in orig_patch and tag in orig_patch['MOD'] \
or 'DEL' in orig_patch and tag in orig_patch['DEL']:
conflict_tags.append(tag)
if conflict_tags:
raise InvenioBibUploadConflictingRevisionsError(self.rec_id, \
conflict_tags, \
up_date, \
orig_date)
return up_patch
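The conflict rule above can be condensed into a table of which original-patch sections clash with each upload-patch section — a hedged sketch over plain dict patches, with `InvenioBibUploadConflictingRevisionsError` replaced by returning the set of conflicting tags:

```python
def conflicting_tags(up_patch, orig_patch):
    # MOD and DEL in the upload clash with MOD/DEL in the original;
    # ADD clashes only with ADD.
    clashes = {'MOD': ('MOD', 'DEL'), 'ADD': ('ADD',), 'DEL': ('MOD', 'DEL')}
    conflicts = set()
    for section, against in clashes.items():
        for tag in up_patch.get(section, {}):
            if any(tag in orig_patch.get(other, {}) for other in against):
                conflicts.add(tag)
    return conflicts

up = {'MOD': {'100': []}, 'ADD': {'700': []}}
orig = {'DEL': {'100': []}, 'ADD': {'700': []}}
found = conflicting_tags(up, orig)
```

Here '100' conflicts (modified in the upload, deleted in the original) and so does '700' (added on both sides); an empty result would let the upload patch through unchanged.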
def generate_final_patch(self, patch_dict, recid):
"""
Generates patch by merging modified patch and added patch
Returns the final merged patch containing modified and added fields
"""
def _add_to_record(record, patch):
for tag in patch:
for data_tuple in patch[tag]:
record_add_field(record, tag, data_tuple[1], data_tuple[2], '', subfields=data_tuple[0])
return record
final_patch = {}
#tag_list = []
# merge processed and added fields into one patch
if 'MOD' in patch_dict:
# tag_list = tag_list + patch_dict['MOD'].items()
final_patch = _add_to_record(final_patch, patch_dict['MOD'])
if 'ADD' in patch_dict:
#tag_list = tag_list + patch_dict['ADD'].items()
final_patch = _add_to_record(final_patch, patch_dict['ADD'])
if 'DEL' in patch_dict:
#tag_list = tag_list + patch_dict['DEL'].items()
final_patch = _add_to_record(final_patch, patch_dict['DEL'])
record_add_field(final_patch, '001', ' ', ' ', recid)
return final_patch
def retrieve_affected_tags_with_ind(self, patch):
"""
Generates a dictionary of all the tags added/modified/removed from
record1 w.r.t record2 (record1 is upload record and record2 the existing one)
Returns dictionary containing tag and corresponding ind pairs
"""
affected_tags = {}
# ==> Key will be either MOD/ADD/DEL and values will hold another dictionary
# containing tags and corresponding fields
for key in patch:
item = patch[key]
for tag in item:
#each tag will have LIST of TUPLES (data)
affected_tags[tag] = [(data_tuple[1], data_tuple[2]) for data_tuple in item[tag]]
return affected_tags
def verify_revision(self, verify_record, original_record, opt_mode=None):
"""
Compares the upload record with the same 005 record from archive.
Once the changes are identified, the latest revision of the record is fetched
from the system and the identified changes are applied on top of it.
Returns a record patch in case of non-conflicting additions/modifications/deletions.
Conflicting records raise an error and stop the bibupload process.
"""
upload_rev = ''
original_rev = ''
r_date = ''
record_patch = {}
# No need for revision check for other operations
if opt_mode not in ['replace', 'correct']:
return
if '001' in verify_record:
self.rec_id = record_get_field_value(verify_record, '001')
# Retrieving Revision tags for comparison
if '005' in verify_record:
upload_rev = record_get_field_value(verify_record, '005')
r_date = upload_rev.split('.')[0]
if r_date not in [k[1] for k in get_record_revisions(self.rec_id)]:
raise InvenioBibUploadInvalidRevisionError(self.rec_id, r_date)
else:
raise InvenioBibUploadMissing005Error(self.rec_id)
if '005' in original_record:
original_rev = record_get_field_value(original_record, '005')
else:
raise InvenioBibUploadMissing005Error(self.rec_id)
# Retrieving the archived version
marc_xml = get_marcxml_of_record_revision(self.rec_id, r_date)
res = create_record(zlib.decompress(marc_xml[0][0]))
archived_record = res[0]
# Comparing Upload and Archive record
curr_patch = self.compare_records(verify_record, archived_record, opt_mode)
# No changes in Upload Record compared to Archived Revision
# Raising Error to skip the bibupload for the record
if not curr_patch:
raise InvenioBibUploadUnchangedRecordError(self.rec_id, upload_rev)
if original_rev == upload_rev:
# Upload, Archive and Original Records have same Revisions.
affected_tags = self.retrieve_affected_tags_with_ind(curr_patch)
return ('correct', self.generate_final_patch(curr_patch, self.rec_id), affected_tags)
# Comparing Original and Archive record
orig_patch = self.compare_records(original_record, archived_record, opt_mode)
# Checking for conflicts
# If no original patch - Original Record same as Archived Record
if orig_patch:
curr_patch = self.detect_conflict(curr_patch, upload_rev, \
orig_patch, original_rev)
record_patch = self.generate_final_patch(curr_patch, self.rec_id)
affected_tags = self.retrieve_affected_tags_with_ind(curr_patch)
# Returning patch in case of no conflicting fields
return ('correct', record_patch, affected_tags)
class InvenioBibUploadUnchangedRecordError(Exception):
"""
Exception for unchanged upload records.
"""
def __init__(self, recid, current_rev):
self.cur_rev = current_rev
self.recid = recid
def __str__(self):
msg = 'UNCHANGED RECORD : Upload Record %s same as Rev-%s'
return repr(msg%(self.recid, self.cur_rev))
class InvenioBibUploadConflictingRevisionsError(Exception):
"""
Exception for conflicting records.
"""
def __init__(self, recid, tag_list, upload_rev, current_rev):
self.up_rev = upload_rev
self.cur_rev = current_rev
self.tags = tag_list
self.recid = recid
def __str__(self):
msg = 'CONFLICT : In Record %s between Rev-%s and Rev-%s for Tags : %s'
return repr(msg%(self.recid, self.up_rev, self.cur_rev, str(self.tags)))
class InvenioBibUploadInvalidRevisionError(Exception):
"""
Exception for incorrect revision of the upload records.
"""
def __init__(self, recid, upload_rev):
self.upload_rev = upload_rev
self.recid = recid
def __str__(self):
msg = 'INVALID REVISION : %s for Record %s not in Archive.'
return repr(msg%(self.upload_rev, self.recid))
class InvenioBibUploadMissing005Error(Exception):
"""
Exception for missing Revision field in Upload/Original records.
"""
def __init__(self, recid):
self.recid = recid
diff --git a/invenio/legacy/bibupload/scripts/bibupload.py b/invenio/legacy/bibupload/scripts/bibupload.py
index 34ce17356..1f9f7e8bd 100644
--- a/invenio/legacy/bibupload/scripts/bibupload.py
+++ b/invenio/legacy/bibupload/scripts/bibupload.py
@@ -1,54 +1,54 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
BibUpload: Receive MARC XML file and update the appropriate database tables according to options.
Usage: bibupload [options] input.xml
Examples:
$ bibupload -i input.xml
Options:
-a, --append new fields are appended to the existing record
-c, --correct fields are replaced by the new ones in the existing record
-f, --format takes only the FMT fields into account. Does not update
-i, --insert insert the new record in the database
-r, --replace the existing record is entirely replaced by the new one
-z, --reference update references (update only 999 fields)
-s, --stage=STAGE stage to start from in the algorithm (0: always done; 1: FMT tags;
2: FFT tags; 3: BibFmt; 4: Metadata update; 5: time update)
-n, --notimechange do not change record last modification date when updating
Scheduling options:
-u, --user=USER user name to store task, password needed
General options:
-h, --help print this help and exit
-v, --verbose=LEVEL verbose level (from 0 to 9, default 1)
-V --version print the script version
"""
from invenio.base.factory import with_app_context
@with_app_context()
def main():
- from invenio.bibupload import main as bibupload_main
+ from invenio.legacy.bibupload.engine import main as bibupload_main
return bibupload_main()
diff --git a/invenio/legacy/bibupload/utils.py b/invenio/legacy/bibupload/utils.py
index ca057cf82..6ec33a63b 100644
--- a/invenio/legacy/bibupload/utils.py
+++ b/invenio/legacy/bibupload/utils.py
@@ -1,84 +1,84 @@
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
import uuid
from tempfile import mkstemp
from invenio.legacy.bibrecord import record_xml_output
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.config import CFG_TMPSHAREDDIR
CFG_MAX_RECORDS = 500
def open_temp_file(prefix):
"""
Create a temporary file to write MARC XML in
"""
# Prepare to save results in a tmp file
(fd, filename) = mkstemp(
dir=CFG_TMPSHAREDDIR,
prefix=prefix + str(uuid.uuid4()),
suffix='.xml'
)
file_out = os.fdopen(fd, "w")
return (file_out, filename)
def close_temp_file(file_out, filename):
""" Close temporary file again """
file_out.close()
os.chmod(filename, 0644)
def bibupload_record(record=None, collection=None, file_prefix="bibuploadutils", mode="-c",
alias='bibuploadutils', opts=[]):
"""
General purpose function that will write a MARCXML file and call bibupload
on it.
"""
if collection is None and record is None:
return
(file_out, filename) = open_temp_file(file_prefix)
if collection is not None:
file_out.write("<collection>")
tot = 0
for rec in collection:
file_out.write(record_xml_output(rec))
tot += 1
if tot == CFG_MAX_RECORDS:
file_out.write("</collection>")
close_temp_file(file_out, filename)
task_low_level_submission(
'bibupload', alias, mode, filename, *opts
)
(file_out, filename) = open_temp_file(file_prefix)
file_out.write("<collection>")
tot = 0
file_out.write("</collection>")
elif record is not None:
tot = 1
file_out.write(record_xml_output(record))
close_temp_file(file_out, filename)
if tot > 0:
task_low_level_submission('bibupload', alias, mode, filename, *opts)
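The batching behaviour of ``bibupload_record`` above — flush every ``CFG_MAX_RECORDS`` records into its own ``<collection>`` file and submit it as a separate bibupload task — can be sketched in isolation. ``chunk_records`` below is a hypothetical helper written in Python 3 for illustration, not part of the module:

```python
CFG_MAX_RECORDS = 500  # same constant as in the module above

def chunk_records(records, chunk_size=CFG_MAX_RECORDS):
    """Yield lists of at most ``chunk_size`` records, mirroring how
    bibupload_record() closes one <collection> file and opens the next."""
    chunk = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the trailing partial batch
        yield chunk
```

With 1200 records this yields batches of 500, 500 and 200, i.e. three separate task submissions.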
diff --git a/invenio/legacy/ckeditor/connector.py b/invenio/legacy/ckeditor/connector.py
index 6f403a869..b1d39cfca 100644
--- a/invenio/legacy/ckeditor/connector.py
+++ b/invenio/legacy/ckeditor/connector.py
@@ -1,132 +1,132 @@
# -*- coding: utf-8 -*-
## Comments and reviews for records.
## This file is part of Invenio.
## Copyright (C) 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio implementation of the connector to CKEditor for file upload.
This is heavily borrowed from the FCKeditor 'upload.py' sample connector.
"""
import os
import re
-from invenio.bibdocfile import decompose_file, propose_next_docname
+from invenio.legacy.bibdocfile.api import decompose_file, propose_next_docname
allowed_extensions = {}
allowed_extensions['File'] = ['7z','aiff','asf','avi','bmp','csv','doc','fla','flv','gif','gz','gzip','jpeg','jpg','mid','mov','mp3','mp4','mpc','mpeg','mpg','ods','odt','pdf','png','ppt','pxd','qt','ram','rar','rm','rmi','rmvb','rtf','sdc','sitd','swf','sxc','sxw','tar','tgz','tif','tiff','txt','vsd','wav','wma','wmv','xls','xml','zip']
allowed_extensions['Image'] = ['bmp','gif','jpeg','jpg','png']
allowed_extensions['Flash'] = ['swf','flv']
allowed_extensions['Media'] = ['aiff','asf','avi','bmp','fla', 'flv','gif','jpeg','jpg','mid','mov','mp3','mp4','mpc','mpeg','mpg','png','qt','ram','rm','rmi','rmvb','swf','tif','tiff','wav','wma','wmv']
default_allowed_types = ['File', 'Image', 'Flash', 'Media']
def process_CKEditor_upload(form, uid, user_files_path, user_files_absolute_path,
recid=None, allowed_types=default_allowed_types):
"""
Process a file upload request.
@param form: the form as in req object.
@type form: dict
@param uid: the user ID of the user uploading the file.
@type uid: int
@param user_files_path: the base URL where the file can be
accessed from the web after upload.
Note that you have to implement your own handler to stream the files from the directory
C{user_files_absolute_path} if you set this value.
@type user_files_path: string
@param user_files_absolute_path: the base path on the server where
the files should be saved.
E.g. C{%(CFG_PREFIX)s/var/data/comments/%(recid)s/%(uid)s}
@type user_files_absolute_path: string
@param recid: the record ID for which we upload a file. Leave None if not relevant.
@type recid: int
@param allowed_types: types allowed for uploading. These
are supported by CKEditor: ['File', 'Image', 'Flash', 'Media']
@type allowed_types: list of strings
@return: (msg, uploaded_file_path, uploaded_file_name, uploaded_file_url, callback_function)
"""
msg = ''
filename = ''
formfile = None
uploaded_file_path = ''
user_files_path = ''
for key, formfields in form.items():
if key != 'upload':
continue
if hasattr(formfields, "filename") and formfields.filename:
# We have found our file
filename = formfields.filename
formfile = formfields.file
break
can_upload_file_p = False
if not form['type'] in allowed_types:
# Is the type sent through the form ok?
msg = 'You are not allowed to upload a file of this type'
else:
# Is user allowed to upload such file extension?
basedir, name, extension = decompose_file(filename)
extension = extension[1:] # strip leading dot
if extension in allowed_extensions.get(form['type'], []):
can_upload_file_p = True
if not can_upload_file_p:
msg = 'You are not allowed to upload a file of this type'
elif filename and formfile:
## Before saving the file to disk, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
# Remove \ / | : ? *
filename = re.sub('\\\\|\\/|\\||\\:|\\?|\\*|"|<|>|[\x00-\x1f\x7f-\x9f]', '_', filename)
filename = filename.strip()
if filename != "":
# Check that file does not already exist
n = 1
while os.path.exists(os.path.join(user_files_absolute_path, filename)):
basedir, name, extension = decompose_file(filename)
new_name = propose_next_docname(name)
filename = new_name + extension
# This may be dangerous if the file size is bigger than the available memory
fp = open(os.path.join(user_files_absolute_path, filename), "w")
fp.write(formfile.read())
fp.close()
uploaded_file_path = os.path.join(user_files_absolute_path, filename)
uploaded_file_name = filename
return (msg, uploaded_file_path, filename, user_files_path, form['CKEditorFuncNum'])
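The filename-washing step above can be exercised on its own, assuming the intent is to replace each forbidden character (including bare control characters) with an underscore. ``wash_filename`` is a hypothetical name for this sketch, which writes the forbidden characters as a single character class:

```python
import os
import re

def wash_filename(filename):
    """Sketch of the washing step above: drop UNIX and Windows path
    components, then replace \\ / | : ? * " < > and control
    characters with underscores."""
    filename = os.path.basename(filename.split('\\')[-1])
    filename = re.sub(r'[\\/|:?*"<>\x00-\x1f\x7f-\x9f]', '_', filename)
    return filename.strip()
```

For example, a Windows-style path such as ``C:\Users\evil?name.txt`` washes down to ``evil_name.txt``.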
def send_response(req, msg, fileurl, callback_function):
"""
Send a response to the CKEditor after a file upload.
@param req: the request object
@param msg: the message to send to the user
@param fileurl: the URL where the newly uploaded file can be found, if any
@param callback_function: a value returned when calling C{process_CKEditor_upload()}
"""
req.content_type = 'text/html'
req.send_http_header()
req.write('''<html><body><script type="text/javascript">window.parent.CKEDITOR.tools.callFunction(%(function_number)s, '%(url)s', '%(msg)s')</script></body></html>''' % \
{'function_number': callback_function,
'url': fileurl,
'msg': msg.replace("'", "\\'")})
diff --git a/invenio/legacy/docextract/task.py b/invenio/legacy/docextract/task.py
index 4c551d8f4..16123a331 100644
--- a/invenio/legacy/docextract/task.py
+++ b/invenio/legacy/docextract/task.py
@@ -1,205 +1,205 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Generic Framework for extracting metadata from records using bibsched"""
import traceback
from datetime import datetime
from itertools import chain
-from invenio.bibtask import task_get_option, write_message, \
+from invenio.legacy.bibsched.bibtask import task_get_option, write_message, \
task_sleep_now_if_required, \
task_update_progress
from invenio.legacy.dbquery import run_sql
from invenio.legacy.search_engine import get_record
from invenio.legacy.search_engine import get_collection_reclist
from invenio.refextract_api import get_pdf_doc
from invenio.legacy.bibrecord import record_get_field_instances, \
field_get_subfield_values
def task_run_core_wrapper(name, core_func, extra_vars=None):
def fun():
try:
return task_run_core(name, core_func, extra_vars)
except Exception:
# Remove extra '\n'
write_message(traceback.format_exc()[:-1])
raise
return fun
def fetch_last_updated(name):
select_sql = "SELECT last_recid, last_updated FROM xtrJOB" \
" WHERE name = %s LIMIT 1"
row = run_sql(select_sql, (name,))
if not row:
sql = "INSERT INTO xtrJOB (name, last_updated, last_recid) " \
"VALUES (%s, '1970-01-01', 0)"
run_sql(sql, (name,))
row = run_sql(select_sql, (name,))
# Fallback in case we receive None instead of a valid date
last_recid = row[0][0] or 0
last_date = row[0][1] or datetime(year=1, month=1, day=1)
return last_recid, last_date
def store_last_updated(recid, creation_date, name):
sql = "UPDATE xtrJOB SET last_recid = %s WHERE name=%s AND last_recid < %s"
run_sql(sql, (recid, name, recid))
sql = "UPDATE xtrJOB SET last_updated = %s " \
"WHERE name=%s AND last_updated < %s"
iso_date = creation_date.isoformat()
run_sql(sql, (iso_date, name, iso_date))
def fetch_concerned_records(name):
task_update_progress("Fetching record ids")
last_recid, last_date = fetch_last_updated(name)
if task_get_option('new'):
# Fetch all records inserted since last run
sql = "SELECT `id`, `creation_date` FROM `bibrec` " \
"WHERE `creation_date` >= %s " \
"AND `id` > %s " \
"ORDER BY `creation_date`"
records = run_sql(sql, (last_date.isoformat(), last_recid))
elif task_get_option('modified'):
# Fetch all records modified since last run
sql = "SELECT `id`, `modification_date` FROM `bibrec` " \
"WHERE `modification_date` >= %s " \
"AND `id` > %s " \
"ORDER BY `modification_date`"
records = run_sql(sql, (last_date.isoformat(), last_recid))
else:
given_recids = task_get_option('recids')
for collection in task_get_option('collections'):
given_recids.add(get_collection_reclist(collection))
if given_recids:
format_strings = ','.join(['%s'] * len(given_recids))
records = run_sql("SELECT `id`, NULL FROM `bibrec` " \
"WHERE `id` IN (%s) ORDER BY `id`" % format_strings,
list(given_recids))
else:
records = []
task_update_progress("Done fetching record ids")
return records
def fetch_concerned_arxiv_records(name):
task_update_progress("Fetching arxiv record ids")
dummy, last_date = fetch_last_updated(name)
# Fetch all records inserted since last run
sql = "SELECT `id`, `modification_date` FROM `bibrec` " \
"WHERE `modification_date` >= %s " \
"AND `creation_date` > NOW() - INTERVAL 7 DAY " \
"ORDER BY `modification_date`" \
"LIMIT 5000"
records = run_sql(sql, [last_date.isoformat()])
def check_arxiv(recid):
record = get_record(recid)
for report_tag in record_get_field_instances(record, "037"):
for category in field_get_subfield_values(report_tag, 'a'):
if category.startswith('arXiv'):
return True
return False
def check_pdf_date(recid):
doc = get_pdf_doc(recid)
if doc:
return doc.md > last_date
return False
records = [(r, mod_date) for r, mod_date in records if check_arxiv(r)]
records = [(r, mod_date) for r, mod_date in records if check_pdf_date(r)]
write_message("recids %s" % repr([(r, mod_date.isoformat()) \
for r, mod_date in records]))
task_update_progress("Done fetching arxiv record ids")
return records
def process_records(name, records, func, extra_vars):
count = 1
total = len(records)
for recid, date in records:
task_sleep_now_if_required(can_stop_too=True)
msg = "Extracting for %s (%d/%d)" % (recid, count, total)
task_update_progress(msg)
write_message(msg)
func(recid, **extra_vars)
if date:
store_last_updated(recid, date, name)
count += 1
def task_run_core(name, func, extra_vars=None):
"""Calls extract_references in refextract"""
if task_get_option('task_specific_name'):
name = "%s:%s" % (name, task_get_option('task_specific_name'))
write_message("Starting %s" % name)
if not extra_vars:
extra_vars = {}
records = fetch_concerned_records(name)
process_records(name, records, func, extra_vars)
if task_get_option('arxiv'):
extra_vars['_arxiv'] = True
arxiv_name = "%s:arxiv" % name
records = fetch_concerned_arxiv_records(arxiv_name)
process_records(arxiv_name, records, func, extra_vars)
write_message("Complete")
return True
def split_ids(value):
"""
Split ids given in the command line
Possible formats are:
* 1
* 1,2,3,4
* 1-5,20,30,40
Returns respectively
* set([1])
* set([1,2,3,4])
* set([1,2,3,4,5,20,30,40])
"""
def parse(el):
el = el.strip()
if not el:
ret = []
elif '-' in el:
start, end = el.split('-', 1)
ret = xrange(int(start), int(end) + 1)
else:
ret = [int(el)]
return ret
return chain(*(parse(c) for c in value.split(',') if c.strip()))
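The grammar accepted by ``split_ids`` can be exercised with a small self-contained re-implementation (the module version returns a lazy ``chain`` and relies on Python 2 ``xrange``; the sketch below is Python 3 and eager, returning a set directly):

```python
def split_ids(value):
    """Parse '1-5,20,30' style command-line id lists into a set of ints
    (Python 3 sketch of the function above)."""
    result = set()
    for el in value.split(','):
        el = el.strip()
        if not el:
            continue  # tolerate stray commas, as the original does
        if '-' in el:
            start, end = el.split('-', 1)
            result.update(range(int(start), int(end) + 1))
        else:
            result.add(int(el))
    return result
```

So ``split_ids('1-5,20,30,40')`` gives the documented ``set([1,2,3,4,5,20,30,40])``.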
diff --git a/invenio/legacy/docextract/utils.py b/invenio/legacy/docextract/utils.py
index f4ac313ac..0da08896e 100644
--- a/invenio/legacy/docextract/utils.py
+++ b/invenio/legacy/docextract/utils.py
@@ -1,45 +1,45 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
VERBOSITY = None
import sys
from datetime import datetime
-from invenio.bibtask import write_message as bibtask_write_message
+from invenio.legacy.bibsched.bibtask import write_message as bibtask_write_message
def setup_loggers(verbosity):
global VERBOSITY
if verbosity > 8:
print 'Setting up loggers: verbosity=%s' % verbosity
VERBOSITY = verbosity
def write_message(msg, stream=sys.stdout, verbose=1):
"""Write message and flush output stream (may be sys.stdout or sys.stderr).
Useful for debugging stuff."""
if VERBOSITY is None:
return bibtask_write_message(msg, stream, verbose)
elif msg and VERBOSITY >= verbose:
if VERBOSITY > 8:
print >>stream, datetime.now().strftime('[%H:%M:%S] '),
print >>stream, msg
diff --git a/invenio/legacy/inveniocfg.py b/invenio/legacy/inveniocfg.py
index b5fd28816..7053f9a0c 100644
--- a/invenio/legacy/inveniocfg.py
+++ b/invenio/legacy/inveniocfg.py
@@ -1,1362 +1,1362 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio configuration and administration CLI tool.
Usage: inveniocfg [options]
General options:
-h, --help print this help
-V, --version print version number
Options to finish your installation:
--create-secret-key generate random CFG_SITE_SECRET_KEY
--create-apache-conf create Apache configuration files
--create-tables create DB tables for Invenio
--load-bibfield-conf load the BibField configuration
--load-webstat-conf load the WebStat configuration
--drop-tables drop DB tables of Invenio
--check-openoffice check that the openoffice temporary directory is correctly set up
Options to set up and test a demo site:
--create-demo-site create demo site
--load-demo-records load demo records
--remove-demo-records remove demo records, keeping demo site
--drop-demo-site drop demo site configurations too
--run-unit-tests run unit test suite (needs demo site)
--run-regression-tests run regression test suite (needs demo site)
--run-web-tests run web tests in a browser (needs demo site, Firefox, Selenium IDE)
--run-flask-tests run Flask test suite
Options to update config files in situ:
--update-all perform all the update options
--update-config-py update config.py file from invenio.conf file
--update-dbquery-py update dbquery.py with DB credentials from invenio.conf
--update-dbexec update dbexec with DB credentials from invenio.conf
--update-bibconvert-tpl update bibconvert templates with CFG_SITE_URL from invenio.conf
--update-web-tests update web test cases with CFG_SITE_URL from invenio.conf
Options to update DB tables:
--reset-all perform all the reset options
--reset-sitename reset tables to take account of new CFG_SITE_NAME*
--reset-siteadminemail reset tables to take account of new CFG_SITE_ADMIN_EMAIL
--reset-fieldnames reset tables to take account of new I18N names from PO files
--reset-recstruct-cache reset record structure cache according to CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE
--reset-recjson-cache reset record json cache according to CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE
Options to upgrade your installation:
--upgrade apply all pending upgrades
--upgrade-check run pre-upgrade checks for all pending upgrades
--upgrade-show-pending show pending upgrades ready to be applied
--upgrade-show-applied show history of applied upgrades
--upgrade-create-standard-recipe create a new upgrade recipe (for developers)
--upgrade-create-release-recipe create a new release upgrade recipe (for developers)
Options to help with the work:
--list print names and values of all options from conf files
--get <some-opt> get value of a given option from conf files
--conf-dir </some/path> path to directory where invenio*.conf files are [optional]
--detect-system-details print system details such as Apache/Python/MySQL versions
"""
__revision__ = "$Id$"
from ConfigParser import ConfigParser
from optparse import OptionParser, OptionGroup, IndentedHelpFormatter, Option, \
OptionError
import os
import random
import re
import shutil
import socket
import string
import sys
from warnings import warn
def print_usage():
"""Print help."""
print __doc__
def get_version():
""" Get running version of Invenio """
from invenio.config import CFG_VERSION
return CFG_VERSION
def print_version():
"""Print version information."""
print get_version()
def convert_conf_option(option_name, option_value):
"""
Convert conf option into Python config.py line, converting
values to ints or strings as appropriate.
"""
## 1) convert option name to uppercase:
option_name = option_name.upper()
## 1a) adjust renamed variables:
if option_name in ['CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_DOCTYPES',
'CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_RESTRICTIONS',
'CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_MISC',
'CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT',
'CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS',
'CFG_WEBSUBMIT_DESIRED_CONVERSIONS']:
new_option_name = option_name.replace('WEBSUBMIT', 'BIBDOCFILE')
print >> sys.stderr, ("""WARNING: %s has been renamed to %s.
Please, update your invenio-local.conf file accordingly.""" % (option_name, new_option_name))
option_name = new_option_name
## 2) convert option value to int or string:
if option_name in ['CFG_BIBUPLOAD_REFERENCE_TAG',
'CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG',
'CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG',
'CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG',
'CFG_BIBUPLOAD_STRONG_TAGS',
'CFG_BIBFORMAT_HIDDEN_TAGS',]:
# some options are supposed be string even when they look like
# numeric
option_value = '"' + option_value + '"'
else:
try:
option_value = int(option_value)
except ValueError:
option_value = '"' + option_value + '"'
## 3a) special cases: chars regexps
if option_name in ['CFG_BIBINDEX_CHARS_ALPHANUMERIC_SEPARATORS',
'CFG_BIBINDEX_CHARS_PUNCTUATION']:
option_value = 'r"[' + option_value[1:-1] + ']"'
## 3abis) special cases: real regexps
if option_name in ['CFG_BIBINDEX_PERFORM_OCR_ON_DOCNAMES',
'CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS']:
option_value = 'r"' + option_value[1:-1] + '"'
## 3b) special cases: True, False, None
if option_value in ['"True"', '"False"', '"None"']:
option_value = option_value[1:-1]
## 3c) special cases: dicts and real pythonic lists
if option_name in ['CFG_WEBSEARCH_FIELDS_CONVERT',
'CFG_BATCHUPLOADER_WEB_ROBOT_RIGHTS',
'CFG_WEBSEARCH_FULLTEXT_SNIPPETS',
'CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS',
'CFG_SITE_EMERGENCY_EMAIL_ADDRESSES',
'CFG_BIBMATCH_FUZZY_WORDLIMITS',
'CFG_BIBMATCH_QUERY_TEMPLATES',
'CFG_WEBSEARCH_SYNONYM_KBRS',
'CFG_BIBINDEX_SYNONYM_KBRS',
'CFG_WEBCOMMENT_EMAIL_REPLIES_TO',
'CFG_WEBCOMMENT_RESTRICTION_DATAFIELD',
'CFG_WEBCOMMENT_ROUND_DATAFIELD',
'CFG_BIBUPLOAD_FFT_ALLOWED_EXTERNAL_URLS',
'CFG_BIBSCHED_NODE_TASKS',
'CFG_BIBEDIT_EXTEND_RECORD_WITH_COLLECTION_TEMPLATE',
'CFG_OAI_METADATA_FORMATS',
'CFG_BIBDOCFILE_DESIRED_CONVERSIONS',
'CFG_BIBDOCFILE_BEST_FORMATS_TO_EXTRACT_TEXT_FROM',
'CFG_WEB_API_KEY_ALLOWED_URL',
'CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_MISC',
'CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_DOCTYPES',
'CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_RESTRICTIONS',
'CFG_DEVEL_TEST_DATABASE_ENGINES',
'CFG_REFEXTRACT_KBS_OVERRIDE',
'CFG_OPENID_CONFIGURATIONS',
'CFG_OAUTH1_CONFIGURATIONS',
'CFG_OAUTH2_CONFIGURATIONS',
'CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES',]:
try:
option_value = option_value[1:-1]
except TypeError:
if option_name in ('CFG_WEBSEARCH_FULLTEXT_SNIPPETS',):
print >> sys.stderr, """WARNING: CFG_WEBSEARCH_FULLTEXT_SNIPPETS
has changed syntax: it can be customised to display different snippets for
different document types. See the corresponding documentation in invenio.conf.
You may want to customise your invenio-local.conf configuration accordingly."""
option_value = """{'': %s}""" % option_value
else:
print >> sys.stderr, "ERROR: type error in %s value %s." % \
(option_name, option_value)
sys.exit(1)
## 3cbis) very special cases: dicts with backward compatible string
if option_name in ['CFG_BIBINDEX_SPLASH_PAGES']:
if option_value.startswith('"{') and option_value.endswith('}"'):
option_value = option_value[1:-1]
else:
option_value = """{%s: ".*"}""" % option_value
## 3d) special cases: comma-separated lists
if option_name in ['CFG_SITE_LANGS',
'CFG_BIBDOCFILE_ADDITIONAL_KNOWN_FILE_EXTENSIONS',
'CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS',
'CFG_BIBUPLOAD_STRONG_TAGS',
'CFG_BIBFORMAT_HIDDEN_TAGS',
'CFG_BIBSCHED_GC_TASKS_TO_REMOVE',
'CFG_BIBSCHED_GC_TASKS_TO_ARCHIVE',
'CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS',
'CFG_BIBUPLOAD_CONTROLLED_PROVENANCE_TAGS',
'CFG_BIBUPLOAD_DELETE_FORMATS',
'CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES',
'CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST',
'CFG_WEBSEARCH_RSS_I18N_COLLECTIONS',
'CFG_BATCHUPLOADER_FILENAME_MATCHING_POLICY',
'CFG_BIBAUTHORID_EXTERNAL_CLAIMED_RECORDS_KEY',
'CFG_BIBCIRCULATION_ITEM_STATUS_OPTIONAL',
'CFG_PLOTEXTRACTOR_DISALLOWED_TEX',
'CFG_OAI_FRIENDS',
'CFG_WEBSTYLE_REVERSE_PROXY_IPS',
'CFG_BIBEDIT_AUTOCOMPLETE_INSTITUTIONS_FIELDS',
'CFG_BIBFORMAT_DISABLE_I18N_FOR_CACHED_FORMATS',
'CFG_BIBFORMAT_HIDDEN_FILE_FORMATS',
'CFG_FLASK_DISABLED_BLUEPRINTS',
'CFG_DEVEL_TOOLS',
'CFG_BIBFIELD_MASTER_FORMATS',
'CFG_OPENID_PROVIDERS',
'CFG_OAUTH1_PROVIDERS',
'CFG_OAUTH2_PROVIDERS',]:
out = "["
for elem in option_value[1:-1].split(","):
if elem:
elem = elem.strip()
if option_name in ['CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES']:
# 3d1) integer values
out += "%i, " % int(elem)
else:
# 3d2) string values
out += "'%s', " % elem
out += "]"
option_value = out
## 3e) special cases: multiline
if option_name == 'CFG_OAI_IDENTIFY_DESCRIPTION':
# make triple quotes
option_value = '""' + option_value + '""'
## 3f) ignore some options:
if option_name.startswith('CFG_SITE_NAME_INTL'):
# treated elsewhere
return
## 3g) special cases: float
if option_name in ['CFG_BIBDOCFILE_MD5_CHECK_PROBABILITY',
'CFG_BIBMATCH_LOCAL_SLEEPTIME',
'CFG_BIBMATCH_REMOTE_SLEEPTIME',
'CFG_PLOTEXTRACTOR_DOWNLOAD_TIMEOUT',
'CFG_BIBMATCH_FUZZY_MATCH_VALIDATION_LIMIT']:
option_value = float(option_value[1:-1])
## 3h) special cases: bibmatch validation list
if option_name in ['CFG_BIBMATCH_MATCH_VALIDATION_RULESETS']:
option_value = option_value[1:-1]
## 4a) dropped variables
if option_name in ['CFG_BATCHUPLOADER_WEB_ROBOT_AGENT']:
print >> sys.stderr, ("""ERROR: CFG_BATCHUPLOADER_WEB_ROBOT_AGENT has been dropped in favour of
CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS.
Please, update your invenio-local.conf file accordingly.""")
option_value = option_value[1:-1]
elif option_name in ['CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_DOCTYPES',
'CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_RESTRICTIONS',
'CFG_WEBSUBMIT_DOCUMENT_FILE_MANAGER_MISC',
'CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT',
'CFG_WEBSUBMIT_ADDITIONAL_KNOWN_FILE_EXTENSIONS',
'CFG_WEBSUBMIT_DESIRED_CONVERSIONS']:
new_option_name = option_name.replace('WEBSUBMIT', 'BIBDOCFILE')
print >> sys.stderr, ("""ERROR: %s has been renamed to %s.
Please, update your invenio-local.conf file accordingly.""" % (option_name, new_option_name))
option_name = new_option_name
elif option_name in ['CFG_WEBSTYLE_INSPECT_TEMPLATES']:
print >> sys.stderr, ("""ERROR: CFG_WEBSTYLE_INSPECT_TEMPLATES has been dropped in favour of
CFG_DEVEL_TOOLS.
Please, update your invenio-local.conf file accordingly.""")
return
## 5) finally, return output line:
return '%s = %s' % (option_name, option_value)
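The core value rule in ``convert_conf_option`` — try an integer, fall back to a double-quoted string, and unquote the literals ``True``/``False``/``None`` — can be isolated as follows. ``convert_value`` is a hypothetical helper for illustration only, and it skips the many tag- and option-specific special cases handled above:

```python
def convert_value(option_value):
    """Minimal sketch of steps 2 and 3b above: integers stay bare,
    everything else is double-quoted, except True/False/None which
    are emitted as Python literals."""
    try:
        return str(int(option_value))
    except ValueError:
        quoted = '"' + option_value + '"'
        if quoted in ['"True"', '"False"', '"None"']:
            return quoted[1:-1]
        return quoted
```

Under this rule ``42`` stays numeric, ``True`` becomes the boolean literal, and anything else lands in ``config.py`` as a quoted string.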
def update_config_py(conf):
print '>>> NOT NEEDED!!!'
print '>>> quitting ...'
return
import sys
print ">>> Going to update config.py..."
## location where config.py is:
configpyfile = conf.get("Invenio", "CFG_PYLIBDIR") + \
os.sep + 'invenio' + os.sep + 'config.py'
## backup current config.py file:
if os.path.exists(configpyfile):
shutil.copy(configpyfile, configpyfile + '.OLD')
## here we go:
fdesc = open(configpyfile, 'w')
## generate preamble:
fdesc.write("# -*- coding: utf-8 -*-\n")
fdesc.write("# DO NOT EDIT THIS FILE! IT WAS AUTOMATICALLY GENERATED\n")
fdesc.write("# FROM INVENIO.CONF BY EXECUTING:\n")
fdesc.write("# " + " ".join(sys.argv) + "\n")
## special treatment for CFG_SITE_NAME_INTL options:
fdesc.write("CFG_SITE_NAME_INTL = {}\n")
for lang in conf.get("Invenio", "CFG_SITE_LANGS").split(","):
fdesc.write("CFG_SITE_NAME_INTL['%s'] = \"%s\"\n" % (lang, conf.get("Invenio",
"CFG_SITE_NAME_INTL_" + lang)))
## special treatment for CFG_SITE_SECURE_URL that may be empty, in
## which case it should be put equal to CFG_SITE_URL:
if not conf.get("Invenio", "CFG_SITE_SECURE_URL"):
conf.set("Invenio", "CFG_SITE_SECURE_URL",
conf.get("Invenio", "CFG_SITE_URL"))
## process all the options normally:
sections = conf.sections()
sections.sort()
for section in sections:
options = conf.options(section)
options.sort()
for option in options:
if not option.upper().startswith('CFG_DATABASE_'):
# put all options except for db credentials into config.py
line_out = convert_conf_option(option, conf.get(section, option))
if line_out:
fdesc.write(line_out + "\n")
## special treatment for CFG_SITE_SECRET_KEY that cannot be empty
if not conf.get("Invenio", "CFG_SITE_SECRET_KEY"):
CFG_BINDIR = conf.get("Invenio", "CFG_BINDIR") + os.sep
print >> sys.stderr, """WARNING: CFG_SITE_SECRET_KEY can not be empty.
You may want to customise your invenio-local.conf configuration accordingly.
$ %sinveniomanage config create secret-key
$ %sinveniomanage config update
""" % (CFG_BINDIR, CFG_BINDIR)
## FIXME: special treatment for experimental variables
## CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES and CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE
## (not offering them in invenio.conf since they will be refactored)
fdesc.write("CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE = 0\n")
fdesc.write("CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES = [0, 1,]\n")
## generate postamble:
fdesc.write("\n")
fdesc.write("# END OF GENERATED FILE\n")
## we are done:
fdesc.close()
print "You may want to restart Apache now."
print ">>> config.py updated successfully."
def cli_cmd_update_config_py(conf):
"""
Update new config.py from conf options, keeping previous
config.py in a backup copy.
"""
update_config_py(conf)
# from invenio.base.scripts.config import main
# warn('inveniocfg --update-config-py is deprecated. Using instead: inveniomanage config update')
# sys_argv = sys.argv
# sys.argv = 'config_manager.py update'.split()
# main()
# sys.argv = sys_argv
def cli_cmd_update_dbquery_py(conf):
"""
Update lib/dbquery_config.py file with DB parameters read from conf file.
Note: this edits dbquery_config.py in situ, taking a backup first.
Use only when you know what you are doing.
"""
print ">>> Going to update dbquery.py..."
## location where dbquery.py is:
dbqueryconfigpyfile = conf.get("Invenio", "CFG_PYLIBDIR") + \
os.sep + 'invenio' + os.sep + 'dbquery_config.py'
## backup current dbquery.py file:
if os.path.exists(dbqueryconfigpyfile + 'c'):
shutil.copy(dbqueryconfigpyfile + 'c', dbqueryconfigpyfile + 'c.OLD')
out = ["%s = '%s'\n" % (item.upper(), value) \
for item, value in conf.items('Invenio') \
if item.upper().startswith('CFG_DATABASE_')]
fdesc = open(dbqueryconfigpyfile, 'w')
fdesc.write("# -*- coding: utf-8 -*-\n")
fdesc.writelines(out)
fdesc.close()
print "You may want to restart Apache now."
print ">>> dbquery.py updated successfully."
def cli_cmd_update_dbexec(conf):
"""
Update bin/dbexec file with DB parameters read from conf file.
Note: this edits dbexec in situ, taking a backup first.
Use only when you know what you are doing.
"""
print ">>> Going to update dbexec..."
## location where dbexec is:
dbexecfile = conf.get("Invenio", "CFG_BINDIR") + \
os.sep + 'dbexec'
## backup current dbexec file:
if os.path.exists(dbexecfile):
shutil.copy(dbexecfile, dbexecfile + '.OLD')
## replace db parameters via sed:
out = ''
for line in open(dbexecfile, 'r').readlines():
match = re.search(r'^CFG_DATABASE_(HOST|PORT|NAME|USER|PASS|SLAVE)(\s*=\s*)\'.*\'$', line)
if match:
dbparam = 'CFG_DATABASE_' + match.group(1)
out += "%s%s'%s'\n" % (dbparam, match.group(2),
conf.get("Invenio", dbparam))
else:
out += line
fdesc = open(dbexecfile, 'w')
fdesc.write(out)
fdesc.close()
print ">>> dbexec updated successfully."
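The sed-style rewrite above can be captured as a small pure function over the file's text. A hedged sketch (the helper name `substitute_db_params` is mine, not Invenio API; it mirrors the regexp used above):

```python
import re

def substitute_db_params(text, params):
    """Rewrite lines of the form CFG_DATABASE_<KEY> = '<value>' using
    values from PARAMS, leaving every other line untouched."""
    out = []
    pattern = re.compile(
        r"^CFG_DATABASE_(HOST|PORT|NAME|USER|PASS|SLAVE)(\s*=\s*)'.*'$")
    for line in text.splitlines():
        match = pattern.search(line)
        if match and match.group(1) in params:
            # keep the original spacing captured in group(2)
            out.append("CFG_DATABASE_%s%s'%s'" % (match.group(1),
                                                  match.group(2),
                                                  params[match.group(1)]))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Unlike the in-situ version, this takes text in and returns text out, which makes the substitution easy to test without touching the filesystem.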
def cli_cmd_update_bibconvert_tpl(conf):
"""
Update bibconvert/config/*.tpl files looking for 856
http://.../CFG_SITE_RECORD lines, replacing URL with CFG_SITE_URL taken
from conf file. Note: this edits tpl files in situ, taking a
backup first. Use only when you know what you are doing.
"""
from invenio.bibconvert_manager import main
warn('inveniocfg --update-bibconvert-tpl is deprecated. Using instead: inveniomanage bibconvert update')
sys_argv = sys.argv
sys.argv = 'bibconvert_manager.py update'.split()
main()
sys.argv = sys_argv
def cli_cmd_update_web_tests(conf):
"""
Update web test cases lib/webtest/test_*.html looking for
<td>http://.+?[</] strings and replacing them with CFG_SITE_URL
taken from conf file. Note: this edits test files in situ, taking
a backup first. Use only when you know what you are doing.
"""
print ">>> Going to update web tests..."
## location where test_*.html files are:
testdir = conf.get("Invenio", 'CFG_PREFIX') + os.sep + \
'lib' + os.sep + 'webtest' + os.sep + 'invenio'
## find all test_*.html files:
for testfilename in os.listdir(testdir):
if testfilename.startswith("test_") and \
testfilename.endswith(".html"):
## change test file:
testfile = testdir + os.sep + testfilename
shutil.copy(testfile, testfile + '.OLD')
out = ''
for line in open(testfile, 'r').readlines():
match = re.search(r'^(.*<td>)http://.+?([</].*)$', line)
if match:
out += "%s%s%s\n" % (match.group(1),
conf.get("Invenio", 'CFG_SITE_URL'),
match.group(2))
else:
match = re.search(r'^(.*<td>)/opt/invenio(.*)$', line)
if match:
out += "%s%s%s\n" % (match.group(1),
conf.get("Invenio", 'CFG_PREFIX'),
match.group(2))
else:
out += line
fdesc = open(testfile, 'w')
fdesc.write(out)
fdesc.close()
print ">>> web tests updated successfully."
def cli_cmd_reset_sitename(conf):
"""
Reset collection-related tables with new CFG_SITE_NAME and
CFG_SITE_NAME_INTL* read from conf files.
"""
print ">>> Going to reset CFG_SITE_NAME and CFG_SITE_NAME_INTL..."
from invenio.legacy.dbquery import run_sql, IntegrityError
# reset CFG_SITE_NAME:
sitename = conf.get("Invenio", "CFG_SITE_NAME")
try:
run_sql("""INSERT INTO collection (id, name, dbquery, reclist) VALUES
(1,%s,NULL,NULL)""", (sitename,))
except IntegrityError:
run_sql("""UPDATE collection SET name=%s WHERE id=1""", (sitename,))
# reset CFG_SITE_NAME_INTL:
for lang in conf.get("Invenio", "CFG_SITE_LANGS").split(","):
sitename_lang = conf.get("Invenio", "CFG_SITE_NAME_INTL_" + lang)
try:
run_sql("""INSERT INTO collectionname (id_collection, ln, type, value) VALUES
(%s,%s,%s,%s)""", (1, lang, 'ln', sitename_lang))
except IntegrityError:
run_sql("""UPDATE collectionname SET value=%s
WHERE ln=%s AND id_collection=1 AND type='ln'""",
(sitename_lang, lang))
print "You may want to restart Apache now."
print ">>> CFG_SITE_NAME and CFG_SITE_NAME_INTL* reset successfully."
def cli_cmd_reset_recstruct_cache(conf):
"""If CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE is changed, this function
will adapt the database to either store or not store the recstruct
format."""
from invenio.intbitset import intbitset
from invenio.legacy.dbquery import run_sql, serialize_via_marshal
from invenio.legacy.search_engine import get_record
from invenio.legacy.bibsched.scripts.bibsched import server_pid, pidfile
enable_recstruct_cache = conf.get("Invenio", "CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE")
enable_recstruct_cache = enable_recstruct_cache in ('True', '1')
pid = server_pid(ping_the_process=False)
if pid:
print >> sys.stderr, "ERROR: bibsched seems to run with pid %d, according to %s." % (pid, pidfile)
print >> sys.stderr, " Please stop bibsched before running this procedure."
sys.exit(1)
if enable_recstruct_cache:
print ">>> Searching records which need recstruct cache resetting; this may take a while..."
all_recids = intbitset(run_sql("SELECT id FROM bibrec"))
good_recids = intbitset(run_sql("SELECT bibrec.id FROM bibrec JOIN bibfmt ON bibrec.id = bibfmt.id_bibrec WHERE format='recstruct' AND modification_date < last_updated"))
recids = all_recids - good_recids
print ">>> Generating recstruct cache..."
tot = len(recids)
count = 0
for recid in recids:
value = serialize_via_marshal(get_record(recid))
run_sql("DELETE FROM bibfmt WHERE id_bibrec=%s AND format='recstruct'", (recid, ))
run_sql("INSERT INTO bibfmt(id_bibrec, format, last_updated, value) VALUES(%s, 'recstruct', NOW(), %s)", (recid, value))
count += 1
if count % 1000 == 0:
print " ... done records %s/%s" % (count, tot)
if count % 1000 != 0:
print " ... done records %s/%s" % (count, tot)
print ">>> recstruct cache generated successfully."
else:
print ">>> Cleaning recstruct cache..."
run_sql("DELETE FROM bibfmt WHERE format='recstruct'")
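The intbitset arithmetic above follows a generic incremental-cache pattern: regenerate only the records whose cached copy is missing or stale. A minimal sketch with plain Python sets (the function name is illustrative, not part of Invenio):

```python
def records_needing_refresh(all_ids, fresh_ids):
    """Return the ids whose cache entry must be (re)built: all records
    minus those whose cached copy is already up to date."""
    return set(all_ids) - set(fresh_ids)
```

In the code above `all_ids` corresponds to every row of `bibrec` and `fresh_ids` to the records whose `recstruct` row in `bibfmt` is newer than the record itself.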
def cli_cmd_reset_recjson_cache(conf):
"""If CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE is changed, this function
will adapt the database to either store or not store the recjson
format."""
from invenio.legacy.bibfield.bibfield_manager import main
warn('inveniocfg --reset-recjson-cache is deprecated. Using instead: inveniomanage bibfield reset')
sys_argv = sys.argv
sys.argv = 'bibfield_manager.py reset'.split()
main()
sys.argv = sys_argv
def cli_cmd_reset_siteadminemail(conf):
"""
Reset user-related tables with new CFG_SITE_ADMIN_EMAIL read from conf files.
"""
print ">>> Going to reset CFG_SITE_ADMIN_EMAIL..."
from invenio.legacy.dbquery import run_sql
siteadminemail = conf.get("Invenio", "CFG_SITE_ADMIN_EMAIL")
run_sql("DELETE FROM user WHERE id=1")
run_sql("""INSERT INTO user (id, email, password, note, nickname) VALUES
(1, %s, AES_ENCRYPT(email, ''), 1, 'admin')""",
(siteadminemail,))
print "You may want to restart Apache now."
print ">>> CFG_SITE_ADMIN_EMAIL reset successfully."
def cli_cmd_reset_fieldnames(conf):
"""
Reset I18N field names such as author, title, etc and other I18N
ranking method names such as word similarity. Their translations
are taken from the PO files.
"""
print ">>> Going to reset I18N field names..."
from invenio.base.i18n import gettext_set_language, language_list_long
from invenio.legacy.dbquery import run_sql, IntegrityError
## get field id and name list:
field_id_name_list = run_sql("SELECT id, name FROM field")
## get rankmethod id and name list:
rankmethod_id_name_list = run_sql("SELECT id, name FROM rnkMETHOD")
## update names for every language:
for lang, dummy in language_list_long():
_ = gettext_set_language(lang)
## this list is put here in order for PO system to pick names
## suitable for translation
field_name_names = {"any field": _("any field"),
"title": _("title"),
"author": _("author"),
"abstract": _("abstract"),
"keyword": _("keyword"),
"report number": _("report number"),
"subject": _("subject"),
"reference": _("reference"),
"fulltext": _("fulltext"),
"collection": _("collection"),
"division": _("division"),
"year": _("year"),
"journal": _("journal"),
"experiment": _("experiment"),
"record ID": _("record ID")}
## update I18N names for every language:
for (field_id, field_name) in field_id_name_list:
if field_name_names.has_key(field_name):
try:
run_sql("""INSERT INTO fieldname (id_field,ln,type,value) VALUES
(%s,%s,%s,%s)""", (field_id, lang, 'ln',
field_name_names[field_name]))
except IntegrityError:
run_sql("""UPDATE fieldname SET value=%s
WHERE id_field=%s AND ln=%s AND type=%s""",
(field_name_names[field_name], field_id, lang, 'ln',))
## ditto for rank methods:
rankmethod_name_names = {"wrd": _("word similarity"),
"demo_jif": _("journal impact factor"),
"citation": _("times cited"),
"citerank_citation_t": _("time-decay cite count"),
"citerank_pagerank_c": _("all-time-best cite rank"),
"citerank_pagerank_t": _("time-decay cite rank"),}
for (rankmethod_id, rankmethod_name) in rankmethod_id_name_list:
if rankmethod_name_names.has_key(rankmethod_name):
try:
run_sql("""INSERT INTO rnkMETHODNAME (id_rnkMETHOD,ln,type,value) VALUES
(%s,%s,%s,%s)""", (rankmethod_id, lang, 'ln',
rankmethod_name_names[rankmethod_name]))
except IntegrityError:
run_sql("""UPDATE rnkMETHODNAME SET value=%s
WHERE id_rnkMETHOD=%s AND ln=%s AND type=%s""",
(rankmethod_name_names[rankmethod_name], rankmethod_id, lang, 'ln',))
print ">>> I18N field names reset successfully."
def cli_check_openoffice(conf):
"""
If OpenOffice.org integration is enabled, checks whether the system is
properly configured.
"""
from invenio.legacy.bibsched.bibtask import check_running_process_user
from invenio.legacy.websubmit.file_converter import can_unoconv, get_file_converter_logger
logger = get_file_converter_logger()
for handler in logger.handlers:
logger.removeHandler(handler)
check_running_process_user()
print ">>> Checking if Libre/OpenOffice.org is correctly integrated...",
sys.stdout.flush()
if can_unoconv(True):
print "ok"
else:
sys.exit(1)
def test_db_connection():
"""
Test DB connection, and if fails, advise user how to set it up.
Useful to be called during table creation.
"""
print "Testing DB connection...",
from invenio.utils.text import wrap_text_in_a_box
from invenio.legacy.dbquery import run_sql, Error
## first, test connection to the DB server:
try:
run_sql("SHOW TABLES")
except Error, err:
from invenio.dbquery_config import CFG_DATABASE_HOST, \
CFG_DATABASE_PORT, CFG_DATABASE_NAME, CFG_DATABASE_USER, \
CFG_DATABASE_PASS
print wrap_text_in_a_box("""\
DATABASE CONNECTIVITY ERROR %(errno)d: %(errmsg)s.\n
Perhaps you need to set up database and connection rights?
If yes, then please login as MySQL admin user and run the
following commands now:
$ mysql -h %(dbhost)s -P %(dbport)s -u root -p mysql
mysql> CREATE DATABASE %(dbname)s DEFAULT CHARACTER SET utf8;
mysql> GRANT ALL PRIVILEGES ON %(dbname)s.*
TO %(dbuser)s@%(webhost)s IDENTIFIED BY '%(dbpass)s';
mysql> QUIT
The values printed above were detected from your
configuration. If they are not right, then please edit your
invenio-local.conf file and rerun 'inveniocfg --update-all' first.
If the problem is of different nature, then please inspect
the above error message and fix the problem before continuing.""" % \
{'errno': err.args[0],
'errmsg': err.args[1],
'dbname': CFG_DATABASE_NAME,
'dbhost': CFG_DATABASE_HOST,
'dbport': CFG_DATABASE_PORT,
'dbuser': CFG_DATABASE_USER,
'dbpass': CFG_DATABASE_PASS,
'webhost': CFG_DATABASE_HOST == 'localhost' and 'localhost' or os.popen('hostname -f', 'r').read().strip(),
})
sys.exit(1)
print "ok"
## second, test insert/select of a Unicode string to detect
## possible Python/MySQL/MySQLdb mis-setup:
print "Testing Python/MySQL/MySQLdb UTF-8 chain...",
try:
try:
beta_in_utf8 = "β" # Greek beta in UTF-8 is 0xCEB2
run_sql("CREATE TABLE test__invenio__utf8 (x char(1), y varbinary(2)) DEFAULT CHARACTER SET utf8 ENGINE=MyISAM;")
run_sql("INSERT INTO test__invenio__utf8 (x, y) VALUES (%s, %s)", (beta_in_utf8, beta_in_utf8))
res = run_sql("SELECT x,y,HEX(x),HEX(y),LENGTH(x),LENGTH(y),CHAR_LENGTH(x),CHAR_LENGTH(y) FROM test__invenio__utf8")
assert res[0] == ('\xce\xb2', '\xce\xb2', 'CEB2', 'CEB2', 2L, 2L, 1L, 2L)
run_sql("DROP TABLE test__invenio__utf8")
except Exception, err:
print wrap_text_in_a_box("""\
DATABASE RELATED ERROR %s\n
A problem was detected with the UTF-8 treatment in the chain
between the Python application, the MySQLdb connector, and
the MySQL database. You may perhaps have installed older
versions of some prerequisite packages?\n
Please check the INSTALL file and please fix this problem
before continuing.""" % err)
sys.exit(1)
finally:
run_sql("DROP TABLE IF EXISTS test__invenio__utf8")
print "ok"
def cli_cmd_create_secret_key(conf):
"""Generate and append CFG_SITE_SECRET_KEY to invenio-local.conf.
Useful for the installation process."""
from invenio.base.scripts.config import main
warn('inveniocfg --create-secret-key is deprecated. Using instead: inveniomanage config create secret-key')
sys_argv = sys.argv
sys.argv = 'config_manager.py create secret-key'.split()
main()
sys.argv = sys_argv
def cli_cmd_create_tables(conf):
"""Create and fill Invenio DB tables. Useful for the installation process."""
from invenio.base.scripts.database import main
warn('inveniocfg --create-tables is deprecated. Using instead: inveniomanage database create')
sys_argv = sys.argv
sys.argv = 'database_manager.py create'.split()
main()
sys.argv = sys_argv
def cli_cmd_load_webstat_conf(conf):
print ">>> Going to load WebStat config..."
from invenio.config import CFG_PREFIX
cmd = "%s/bin/webstatadmin --load-config" % CFG_PREFIX
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> WebStat config loaded successfully."
def cli_cmd_load_bibfield_config(conf):
from invenio.legacy.bibfield.bibfield_manager import main
warn('inveniocfg --load-bibfield-conf is deprecated. Using instead: inveniomanage bibfield config load')
sys_argv = sys.argv
sys.argv = 'bibfield_manager.py config load'.split()
main()
sys.argv = sys_argv
def cli_cmd_drop_tables(conf):
"""Drop Invenio DB tables. Useful for the uninstallation process."""
print ">>> Going to drop tables and related data on filesystem ..."
from invenio.base.scripts.database import main
warn('inveniocfg --drop-tables is deprecated. Using instead: inveniomanage database drop')
sys_argv = sys.argv
sys.argv = 'database_manager.py drop'.split()
if '--yes-i-know' in sys_argv:
    sys.argv.append('--yes-i-know')
main()
sys.argv = sys_argv
def cli_cmd_create_demo_site(conf):
"""Create demo site. Useful for testing purposes."""
print ">>> Going to create demo site..."
from invenio.config import CFG_PREFIX
from invenio.legacy.dbquery import run_sql
run_sql("TRUNCATE schTASK")
run_sql("TRUNCATE session")
run_sql("DELETE FROM user WHERE email=''")
for cmd in ["%s/bin/dbexec < %s/lib/sql/invenio/democfgdata.sql" % \
(CFG_PREFIX, CFG_PREFIX),]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
cli_cmd_reset_fieldnames(conf) # needed for I18N demo ranking method names
for cmd in ["%s/bin/webaccessadmin -u admin -c -r -D" % CFG_PREFIX,
"%s/bin/webcoll -u admin" % CFG_PREFIX,
"%s/bin/webcoll 1" % CFG_PREFIX,
"%s/bin/bibsort -u admin --load-config" % CFG_PREFIX,
"%s/bin/bibsort 2" % CFG_PREFIX, ]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> Demo site created successfully."
def cli_cmd_load_demo_records(conf):
"""Load demo records. Useful for testing purposes."""
from invenio.config import CFG_PREFIX
from invenio.legacy.dbquery import run_sql
print ">>> Going to load demo records..."
run_sql("TRUNCATE schTASK")
for cmd in ["%s/bin/bibupload -u admin -i %s/var/tmp/demobibdata.xml" % (CFG_PREFIX, CFG_PREFIX),
"%s/bin/bibupload 1" % CFG_PREFIX,
"%s/bin/bibdocfile --textify --with-ocr --recid 97" % CFG_PREFIX,
"%s/bin/bibdocfile --textify --all" % CFG_PREFIX,
"%s/bin/bibindex -u admin" % CFG_PREFIX,
"%s/bin/bibindex 2" % CFG_PREFIX,
"%s/bin/bibreformat -u admin -o HB" % CFG_PREFIX,
"%s/bin/bibreformat 3" % CFG_PREFIX,
"%s/bin/webcoll -u admin" % CFG_PREFIX,
"%s/bin/webcoll 4" % CFG_PREFIX,
"%s/bin/bibrank -u admin" % CFG_PREFIX,
"%s/bin/bibrank 5" % CFG_PREFIX,
"%s/bin/bibsort -u admin -R" % CFG_PREFIX,
"%s/bin/bibsort 6" % CFG_PREFIX,
"%s/bin/oairepositoryupdater -u admin" % CFG_PREFIX,
"%s/bin/oairepositoryupdater 7" % CFG_PREFIX,
"%s/bin/bibupload 8" % CFG_PREFIX,]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> Demo records loaded successfully."
def cli_cmd_remove_demo_records(conf):
"""Remove demo records. Useful when you are finished testing."""
print ">>> Going to remove demo records..."
from invenio.config import CFG_PREFIX
from invenio.legacy.dbquery import run_sql
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy
your records and documents!"""))
if os.path.exists(CFG_PREFIX + os.sep + 'var' + os.sep + 'data'):
shutil.rmtree(CFG_PREFIX + os.sep + 'var' + os.sep + 'data')
run_sql("TRUNCATE schTASK")
for cmd in ["%s/bin/dbexec < %s/lib/sql/invenio/tabbibclean.sql" % (CFG_PREFIX, CFG_PREFIX),
"%s/bin/webcoll -u admin" % CFG_PREFIX,
"%s/bin/webcoll 1" % CFG_PREFIX,]:
if os.system(cmd):
print "ERROR: failed execution of", cmd
sys.exit(1)
print ">>> Demo records removed successfully."
def cli_cmd_drop_demo_site(conf):
"""Drop demo site completely. Useful when you are finished testing."""
print ">>> Going to drop demo site..."
from invenio.utils.text import wrap_text_in_a_box, wait_for_user
wait_for_user(wrap_text_in_a_box("""WARNING: You are going to destroy
your site and documents!"""))
cli_cmd_drop_tables(conf)
cli_cmd_create_tables(conf)
cli_cmd_remove_demo_records(conf)
print ">>> Demo site dropped successfully."
def cli_cmd_run_unit_tests(conf):
"""Run unit tests, usually on the working demo site."""
from invenio.testsuite import build_and_run_unit_test_suite
if not build_and_run_unit_test_suite():
sys.exit(1)
def cli_cmd_run_js_unit_tests(conf):
"""Run JavaScript unit tests, usually on the working demo site."""
from invenio.testsuite import build_and_run_js_unit_test_suite
if not build_and_run_js_unit_test_suite():
sys.exit(1)
def cli_cmd_run_regression_tests(conf):
"""Run regression tests, usually on the working demo site."""
from invenio.testsuite import build_and_run_regression_test_suite
if not build_and_run_regression_test_suite():
sys.exit(1)
def cli_cmd_run_web_tests(conf):
"""Run web tests in a browser. Requires Firefox with Selenium."""
from invenio.testsuite import build_and_run_web_test_suite
if not build_and_run_web_test_suite():
sys.exit(1)
def cli_cmd_run_flask_tests(conf):
"""Run flask tests."""
from invenio.testsuite import build_and_run_flask_test_suite
build_and_run_flask_test_suite()
def _detect_ip_address():
"""Detect IP address of this computer. Useful for creating Apache
vhost conf snippet on RHEL like machines.
@return: IP address, or '*' if cannot detect
@rtype: string
@note: creates socket for real in order to detect real IP address,
not the loopback one.
"""
try:
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(('invenio-software.org', 0))
return s.getsockname()[0]
except socket.error:
return '*'
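The function relies on a standard trick: connecting a UDP socket sends no packets, but it makes the OS pick the outgoing interface, so getsockname() yields the non-loopback address. A self-contained sketch (the probe host and port here are arbitrary assumptions, not the ones used above):

```python
import socket

def detect_local_ip(probe_host='example.org', probe_port=53, fallback='*'):
    """Best-effort detection of this machine's outward-facing IP address.
    Returns FALLBACK when the network or DNS is unavailable."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a datagram socket only selects a route; no traffic is sent
        s.connect((probe_host, probe_port))
        return s.getsockname()[0]
    except socket.error:
        return fallback
    finally:
        s.close()
```

Catching `socket.error` (rather than everything) still covers DNS failures, since `socket.gaierror` is a subclass of it.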
def cli_cmd_create_apache_conf(conf):
"""
Create Apache conf files for this site, keeping previous
files in a backup copy.
"""
from invenio.apache_manager import main
warn('inveniocfg --create-apache-conf is deprecated. Using instead: inveniomanage apache create-config')
sys_argv = sys.argv
sys.argv = 'apache_manager.py create-config'.split()
main()
sys.argv = sys_argv
def cli_cmd_get(conf, varname):
"""
Return value of VARNAME read from CONF files. Useful for
third-party programs to access values of conf options such as
CFG_PREFIX. Return None if VARNAME is not found.
"""
from invenio.base.scripts.config import main
warn('inveniocfg --get="%(varname)s" is deprecated. '
'Using instead: inveniomanage config get "%(varname)s"' % {
'varname': varname
})
sys_argv = sys.argv
sys.argv = 'config_manager.py get'.split()
sys.argv.append(varname)
try:
main()
except SystemExit:
pass
sys.argv = sys_argv
def cli_cmd_list(conf):
"""
Print a list of all conf options and values from CONF.
"""
from invenio.base.scripts.config import main
warn('inveniocfg --list is deprecated. '
'Using instead: inveniomanage config list')
sys_argv = sys.argv
sys.argv = 'config_manager.py list'.split()
main()
sys.argv = sys_argv
def _grep_version_from_executable(path_to_exec, version_regexp):
"""
Try to detect a program version by digging into its binary
PATH_TO_EXEC and looking for VERSION_REGEXP. Return program
version as a string. Return empty string if not succeeded.
"""
from invenio.utils.shell import run_shell_command
exec_version = ""
if os.path.exists(path_to_exec):
dummy1, cmd2_out, dummy2 = run_shell_command("strings %s | grep %s",
(path_to_exec, version_regexp))
if cmd2_out:
for cmd2_out_line in cmd2_out.split("\n"):
if len(cmd2_out_line) > len(exec_version):
# the longest the better
exec_version = cmd2_out_line
return exec_version
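The same heuristic can be implemented without shelling out to `strings | grep`: scan printable-ASCII runs in the binary and keep the longest one matching the version regexp. A hedged pure-Python sketch (the function name is mine, not Invenio API):

```python
import re

def grep_version_from_binary(path, version_regexp):
    """Scan PATH for printable-ASCII runs (what `strings` extracts) and
    return the longest run matching VERSION_REGEXP, or '' on failure."""
    best = ""
    try:
        with open(path, 'rb') as fdesc:
            data = fdesc.read()
    except (IOError, OSError):
        return best
    for run in re.findall(rb'[\x20-\x7e]{4,}', data):
        text = run.decode('ascii')
        if re.search(version_regexp, text) and len(text) > len(best):
            best = text  # the longest the better, as above
    return best
```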
def cli_cmd_detect_system_details(conf):
"""
Detect and print system details such as Apache/Python/MySQL
versions etc. Useful for debugging problems on various OS.
"""
from invenio.base.manage import main
warn('inveniocfg --detect-system-details is deprecated. Using instead: inveniomanage detect-system-name')
sys_argv = sys.argv
sys.argv = 'inveniomanage detect-system-name'.split()
main()
sys.argv = sys_argv
def cli_cmd_upgrade(conf):
"""
Command for applying upgrades
"""
from invenio.modules.upgrader.manage import main
warn('inveniocfg --upgrade is deprecated. Using instead: inveniomanage upgrade run')
sys_argv = sys.argv
sys.argv = 'modules.upgrader.manage.py run'.split()
main()
sys.argv = sys_argv
def cli_cmd_upgrade_check(conf):
"""
Command for running pre-upgrade checks
"""
from invenio.modules.upgrader.manage import main
warn('inveniocfg --upgrade-check is deprecated. Using instead: inveniomanage upgrade check')
sys_argv = sys.argv
sys.argv = 'modules.upgrader.manage.py check'.split()
main()
sys.argv = sys_argv
def cli_cmd_upgrade_show_pending(conf):
"""
Command for showing upgrades ready to be applied
"""
from invenio.modules.upgrader.manage import main
warn('inveniocfg --upgrade-show-pending is deprecated. Using instead: inveniomanage upgrade show pending')
sys_argv = sys.argv
sys.argv = 'modules.upgrader.manage.py show pending'.split()
main()
sys.argv = sys_argv
def cli_cmd_upgrade_show_applied(conf):
"""
Command for showing all upgrades already applied.
"""
from invenio.modules.upgrader.manage import main
warn('inveniocfg --upgrade-show-applied is deprecated. Using instead: inveniomanage upgrade show applied')
sys_argv = sys.argv
sys.argv = 'modules.upgrader.manage.py show applied'.split()
main()
sys.argv = sys_argv
def prepare_option_parser():
"""Parse the command line options."""
class InvenioOption(Option):
"""
Option class that implements the action 'store_append_const' which will
1) append <const> to list in options.<dest>
2) take a value and store in options.<const>
Useful for e.g. appending a const to an actions list, while also taking
an option value and storing it.
This ensures that we can run actions in the order they are given on the
command-line.
Python 2.4 compatibility note: *append_const* action is not available in
Python 2.4, so it is implemented here, together with the new action
*store_append_const*.
"""
ACTIONS = Option.ACTIONS + ("store_append_const", "append_const")
STORE_ACTIONS = Option.STORE_ACTIONS + ("store_append_const", "append_const")
TYPED_ACTIONS = Option.TYPED_ACTIONS + ("store_append_const", )
ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("store_append_const", )
CONST_ACTIONS = getattr(Option, 'CONST_ACTIONS', ()) + ("store_append_const", "append_const")
def take_action(self, action, dest, opt, value, values, parser):
if action == "store_append_const":
# Combination of 'store' and 'append_const' actions
values.ensure_value(dest, []).append(self.const)
value_dest = self.const.replace('-', '_')
setattr(values, value_dest, value)
elif action == "append_const" and not hasattr(Option, 'CONST_ACTIONS'):
values.ensure_value(dest, []).append(self.const)
else:
Option.take_action(self, action, dest, opt, value, values, parser)
def _check_const(self):
if self.action not in self.CONST_ACTIONS and self.const is not None:
raise OptionError(
"'const' must not be supplied for action %r" % self.action,
self)
CHECK_METHODS = [
Option._check_action,
Option._check_type,
Option._check_choice,
Option._check_dest,
_check_const,
Option._check_nargs,
Option._check_callback,
]
parser = OptionParser(option_class=InvenioOption, description="Invenio configuration and administration CLI tool", formatter=IndentedHelpFormatter(max_help_position=31))
parser.add_option("-V", "--version", action="store_true", help="print version number")
finish_options = OptionGroup(parser, "Options to finish your installation")
finish_options.add_option("", "--create-secret-key", dest='actions', const='create-secret-key', action="append_const", help="generate random CFG_SITE_SECRET_KEY")
finish_options.add_option("", "--create-apache-conf", dest='actions', const='create-apache-conf', action="append_const", help="create Apache configuration files")
finish_options.add_option("", "--create-tables", dest='actions', const='create-tables', action="append_const", help="create DB tables for Invenio")
finish_options.add_option("", "--load-bibfield-conf", dest='actions', const='load-bibfield-conf', action="append_const", help="load bibfield configuration file")
finish_options.add_option("", "--load-webstat-conf", dest='actions', const='load-webstat-conf', action="append_const", help="load the WebStat configuration")
finish_options.add_option("", "--drop-tables", dest='actions', const='drop-tables', action="append_const", help="drop DB tables of Invenio")
finish_options.add_option("", "--check-openoffice", dest='actions', const='check-openoffice', action="append_const", help="check that the OpenOffice.org temporary directory is correctly set up")
parser.add_option_group(finish_options)
demotest_options = OptionGroup(parser, "Options to set up and test a demo site")
demotest_options.add_option("", "--create-demo-site", dest='actions', const='create-demo-site', action="append_const", help="create demo site")
demotest_options.add_option("", "--load-demo-records", dest='actions', const='load-demo-records', action="append_const", help="load demo records")
demotest_options.add_option("", "--remove-demo-records", dest='actions', const='remove-demo-records', action="append_const", help="remove demo records, keeping demo site")
demotest_options.add_option("", "--drop-demo-site", dest='actions', const='drop-demo-site', action="append_const", help="drop demo site configurations too")
demotest_options.add_option("", "--run-unit-tests", dest='actions', const='run-unit-tests', action="append_const", help="run unit test suite (needs demo site)")
demotest_options.add_option("", "--run-js-unit-tests", dest='actions', const='run-js-unit-tests', action="append_const", help="run JS unit test suite (needs demo site)")
demotest_options.add_option("", "--run-regression-tests", dest='actions', const='run-regression-tests', action="append_const", help="run regression test suite (needs demo site)")
demotest_options.add_option("", "--run-web-tests", dest='actions', const='run-web-tests', action="append_const", help="run web tests in a browser (needs demo site, Firefox, Selenium IDE)")
demotest_options.add_option("", "--run-flask-tests", dest='actions', const='run-flask-tests', action="append_const", help="run Flask test suite")
parser.add_option_group(demotest_options)
config_options = OptionGroup(parser, "Options to update config files in situ")
config_options.add_option("", "--update-all", dest='actions', const='update-all', action="append_const", help="perform all the update options")
config_options.add_option("", "--update-config-py", dest='actions', const='update-config-py', action="append_const", help="update config.py file from invenio.conf file")
config_options.add_option("", "--update-dbquery-py", dest='actions', const='update-dbquery-py', action="append_const", help="update dbquery.py with DB credentials from invenio.conf")
config_options.add_option("", "--update-dbexec", dest='actions', const='update-dbexec', action="append_const", help="update dbexec with DB credentials from invenio.conf")
config_options.add_option("", "--update-bibconvert-tpl", dest='actions', const='update-bibconvert-tpl', action="append_const", help="update bibconvert templates with CFG_SITE_URL from invenio.conf")
config_options.add_option("", "--update-web-tests", dest='actions', const='update-web-tests', action="append_const", help="update web test cases with CFG_SITE_URL from invenio.conf")
parser.add_option_group(config_options)
reset_options = OptionGroup(parser, "Options to update DB tables")
reset_options.add_option("", "--reset-all", dest='actions', const='reset-all', action="append_const", help="perform all the reset options")
reset_options.add_option("", "--reset-sitename", dest='actions', const='reset-sitename', action="append_const", help="reset tables to take account of new CFG_SITE_NAME*")
reset_options.add_option("", "--reset-siteadminemail", dest='actions', const='reset-siteadminemail', action="append_const", help="reset tables to take account of new CFG_SITE_ADMIN_EMAIL")
reset_options.add_option("", "--reset-fieldnames", dest='actions', const='reset-fieldnames', action="append_const", help="reset tables to take account of new I18N names from PO files")
reset_options.add_option("", "--reset-recstruct-cache", dest='actions', const='reset-recstruct-cache', action="append_const", help="reset record structure cache according to CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE")
reset_options.add_option("", "--reset-recjson-cache", dest='actions', const='reset-recjson-cache', action="append_const", help="reset record json structure cache according to CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE")
parser.add_option_group(reset_options)
upgrade_options = OptionGroup(parser, "Options to upgrade your installation")
upgrade_options.add_option("", "--upgrade", dest='actions', const='upgrade', action="append_const", help="apply all pending upgrades")
upgrade_options.add_option("", "--upgrade-check", dest='actions', const='upgrade-check', action="append_const", help="run pre-upgrade checks for pending upgrades")
upgrade_options.add_option("", "--upgrade-show-pending", dest='actions', const='upgrade-show-pending', action="append_const", help="show pending upgrades")
upgrade_options.add_option("", "--upgrade-show-applied", dest='actions', const='upgrade-show-applied', action="append_const", help="show history of applied upgrades")
upgrade_options.add_option("", "--upgrade-create-standard-recipe", dest='actions', metavar='REPOSITORY[,DIR]', const='upgrade-create-standard-recipe', action="append_const", help="use: inveniomanage upgrade create recipe")
upgrade_options.add_option("", "--upgrade-create-release-recipe", dest='actions', metavar='REPOSITORY[,DIR]', const='upgrade-create-release-recipe', action="append_const", help="use: inveniomanage upgrade create release")
parser.add_option_group(upgrade_options)
helper_options = OptionGroup(parser, "Options to help the work")
helper_options.add_option("", "--list", dest='actions', const='list', action="append_const", help="print names and values of all options from conf files")
helper_options.add_option("", "--get", dest='actions', const='get', action="store_append_const", metavar="OPTION", help="get value of a given option from conf files")
helper_options.add_option("", "--conf-dir", action="store", metavar="PATH", help="path to directory where invenio*.conf files are [optional]")
helper_options.add_option("", "--detect-system-details", dest='actions', const='detect-system-details', action="append_const", help="print system details such as Apache/Python/MySQL versions")
parser.add_option_group(helper_options)
parser.add_option('--yes-i-know', action='store_true', dest='yes-i-know', help='use with care!')
return parser
def prepare_conf(options):
""" Read configuration files """
conf = ConfigParser()
confdir = getattr(options, 'conf_dir', None)
if confdir is None:
## try to detect path to conf dir (relative to this bin dir):
confdir = re.sub(r'/bin$', '/etc', sys.path[0])
if confdir and not os.path.exists(confdir):
raise Exception("ERROR: bad --conf-dir option value - directory does not exist.")
## read conf files:
for conffile in [confdir + os.sep + 'invenio.conf',
confdir + os.sep + 'invenio-autotools.conf',
confdir + os.sep + 'invenio-local.conf', ]:
if os.path.exists(conffile):
conf.read(conffile)
else:
if not conffile.endswith("invenio-local.conf"):
# invenio-local.conf is optional, otherwise stop
raise Exception("ERROR: Badly guessed conf file location %s (Please use --conf-dir option.)" % conffile)
return conf
def main(*cmd_args):
"""Main entry point."""
# Allow easier testing
if not cmd_args:
cmd_args = sys.argv[1:]
# Parse arguments
parser = prepare_option_parser()
(options, dummy_args) = parser.parse_args(list(cmd_args))
if getattr(options, 'version', False):
from invenio.base import manage
warn('inveniocfg --version is deprecated. Using instead: inveniomanage version')
sys_argv = sys.argv
sys.argv = 'inveniomanage.py version'.split()
manage.main()
sys.argv = sys_argv
else:
# Read configuration
try:
conf = prepare_conf(options)
except Exception, e:
print e
sys.exit(1)
## Decide what to do
actions = getattr(options, 'actions', None)
if not actions:
print """ERROR: Please specify a command. See '--help'."""
sys.exit(1)
if len(actions) > 1:
print """ERROR: Please specify only one command. See '--help'."""
sys.exit(1)
for action in actions:
if action == 'get':
cli_cmd_get(conf, getattr(options, 'get', None))
elif action == 'list':
cli_cmd_list(conf)
elif action == 'detect-system-details':
cli_cmd_detect_system_details(conf)
elif action == 'create-secret-key':
cli_cmd_create_secret_key(conf)
elif action == 'create-tables':
cli_cmd_create_tables(conf)
elif action == 'load-webstat-conf':
cli_cmd_load_webstat_conf(conf)
elif action == 'drop-tables':
cli_cmd_drop_tables(conf)
elif action == 'check-openoffice':
cli_check_openoffice(conf)
elif action == 'load-bibfield-conf':
cli_cmd_load_bibfield_config(conf)
elif action == 'create-demo-site':
cli_cmd_create_demo_site(conf)
elif action == 'load-demo-records':
cli_cmd_load_demo_records(conf)
elif action == 'remove-demo-records':
cli_cmd_remove_demo_records(conf)
elif action == 'drop-demo-site':
cli_cmd_drop_demo_site(conf)
elif action == 'run-unit-tests':
cli_cmd_run_unit_tests(conf)
elif action == 'run-js-unit-tests':
cli_cmd_run_js_unit_tests(conf)
elif action == 'run-regression-tests':
cli_cmd_run_regression_tests(conf)
elif action == 'run-web-tests':
cli_cmd_run_web_tests(conf)
elif action == 'run-flask-tests':
cli_cmd_run_flask_tests(conf)
elif action == 'update-all':
for f in [cli_cmd_update_config_py,
cli_cmd_update_dbquery_py,
cli_cmd_update_dbexec,
cli_cmd_update_bibconvert_tpl,
cli_cmd_update_web_tests]:
try:
f(conf)
except:
pass  # best-effort: continue with the remaining updaters
elif action == 'update-config-py':
cli_cmd_update_config_py(conf)
elif action == 'update-dbquery-py':
cli_cmd_update_dbquery_py(conf)
elif action == 'update-dbexec':
cli_cmd_update_dbexec(conf)
elif action == 'update-bibconvert-tpl':
cli_cmd_update_bibconvert_tpl(conf)
elif action == 'update-web-tests':
cli_cmd_update_web_tests(conf)
elif action == 'reset-all':
cli_cmd_reset_sitename(conf)
cli_cmd_reset_siteadminemail(conf)
cli_cmd_reset_fieldnames(conf)
cli_cmd_reset_recstruct_cache(conf)
elif action == 'reset-sitename':
cli_cmd_reset_sitename(conf)
elif action == 'reset-siteadminemail':
cli_cmd_reset_siteadminemail(conf)
elif action == 'reset-fieldnames':
cli_cmd_reset_fieldnames(conf)
elif action == 'reset-recstruct-cache':
cli_cmd_reset_recstruct_cache(conf)
elif action == 'reset-recjson-cache':
cli_cmd_reset_recjson_cache(conf)
elif action == 'create-apache-conf':
cli_cmd_create_apache_conf(conf)
elif action == 'upgrade':
cli_cmd_upgrade(conf)
elif action == 'upgrade-check':
cli_cmd_upgrade_check(conf)
elif action == 'upgrade-show-pending':
cli_cmd_upgrade_show_pending(conf)
elif action == 'upgrade-show-applied':
cli_cmd_upgrade_show_applied(conf)
elif action == 'upgrade-create-standard-recipe':
print >> sys.stderr, 'ERROR: inveniocfg --upgrade-create-standard-recipe is not supported anymore. Use instead: inveniomanage upgrade create recipe'
sys.exit(1)
elif action == 'upgrade-create-release-recipe':
print >> sys.stderr, 'ERROR: inveniocfg --upgrade-create-release-recipe is not supported anymore. Use instead: inveniomanage upgrade create release'
sys.exit(1)
else:
print "ERROR: Unknown command", action
sys.exit(1)
if __name__ == '__main__':
main()
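The dispatcher above relies on optparse's `append_const` action: every command-style flag appends its name to a shared `options.actions` list, which is what lets `main()` detect both "no command" and "more than one command". A minimal standalone sketch of that pattern (hypothetical option names, not taken from inveniocfg):

```python
from optparse import OptionParser

# Each command flag appends its constant to the shared 'actions' list;
# with no flags given, 'actions' stays None, matching the "no command" check.
parser = OptionParser()
parser.add_option("--list", dest="actions", const="list",
                  action="append_const", help="list options")
parser.add_option("--upgrade", dest="actions", const="upgrade",
                  action="append_const", help="apply pending upgrades")

options, _ = parser.parse_args(["--upgrade", "--list"])
print(options.actions)  # ['upgrade', 'list']
```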
diff --git a/invenio/legacy/miscutil/dbdump.py b/invenio/legacy/miscutil/dbdump.py
index 6fc168201..c06b8bce3 100644
--- a/invenio/legacy/miscutil/dbdump.py
+++ b/invenio/legacy/miscutil/dbdump.py
@@ -1,179 +1,179 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio DB dumper.
Usage: /opt/invenio/bin/dbdump [options]
Command options:
-o, --output=DIR Output directory. [default=/opt/invenio/var/log]
-n, --number=NUM Keep up to NUM previous dump files. [default=5]
Scheduling options:
-u, --user=USER User name to submit the task as, password needed.
-t, --runtime=TIME Time to execute the task (now), e.g. +15s, 5m, 3h, 2002-10-27 13:57:26.
-s, --sleeptime=SLEEP Sleeping frequency after which to repeat task (no), e.g.: 30m, 2h, 1d.
-L, --limit=LIMIT Time limit when it is allowed to execute the task, e.g. Sunday 01:00-05:00.
The time limit syntax is [Wee[kday]] [hh[:mm][-hh[:mm]]].
-P, --priority=PRI Task priority (0=default, 1=higher, etc).
-N, --name=NAME Task specific name (advanced option).
General options:
-h, --help Print this help.
-V, --version Print version information.
-v, --verbose=LEVEL Verbose level (0=min, 1=default, 9=max).
--profile=STATS Print profile information. STATS is a comma-separated
list of desired output stats (calls, cumulative,
file, line, module, name, nfl, pcalls, stdname, time).
"""
__revision__ = "$Id$"
import os
import sys
from invenio.config import CFG_LOGDIR, CFG_PATH_MYSQL, CFG_PATH_GZIP
from invenio.legacy.dbquery import CFG_DATABASE_HOST, \
CFG_DATABASE_USER, \
CFG_DATABASE_PASS, \
CFG_DATABASE_NAME
-from invenio.bibtask import task_init, write_message, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, write_message, task_set_option, \
task_get_option, task_update_progress, \
task_get_task_param
from invenio.utils.shell import run_shell_command, escape_shell_arg
def _delete_old_dumps(dirname, filename, number_to_keep):
"""
Look for files in DIRNAME directory starting with FILENAME
pattern. Keep the newest NUMBER_TO_KEEP files and delete the
rest (when sorted alphabetically, which here equals sorted by
date). Useful to prune old dump files.
"""
files = [x for x in os.listdir(dirname) if x.startswith(filename)]
files.sort()
for afile in files[:-number_to_keep]:
write_message("... deleting %s" % (dirname + os.sep + afile))
os.remove(dirname + os.sep + afile)
def _dump_database(dirname, filename):
"""
Dump Invenio database into SQL file called FILENAME living in
DIRNAME.
"""
write_message("... writing %s" % (dirname + os.sep + filename))
cmd = CFG_PATH_MYSQL + 'dump'
if not os.path.exists(cmd):
msg = "ERROR: cannot find %s." % cmd
write_message(msg, stream=sys.stderr)
raise StandardError(msg)
cmd += " --skip-opt --add-drop-table --add-locks --create-options " \
" --quick --extended-insert --set-charset --disable-keys " \
" --host=%s --user=%s --password=%s %s | %s -c " % \
(escape_shell_arg(CFG_DATABASE_HOST),
escape_shell_arg(CFG_DATABASE_USER),
escape_shell_arg(CFG_DATABASE_PASS),
escape_shell_arg(CFG_DATABASE_NAME),
CFG_PATH_GZIP)
dummy1, dummy2, dummy3 = run_shell_command(cmd, None, dirname + os.sep + filename)
if dummy1:
msg = "ERROR: mysqldump exit code is %s." % repr(dummy1)
write_message(msg, stream=sys.stderr)
raise StandardError(msg)
if dummy2:
msg = "ERROR: mysqldump stdout is %s." % repr(dummy2)
write_message(msg, stream=sys.stderr)
raise StandardError(msg)
if dummy3:
msg = "ERROR: mysqldump stderr is %s." % repr(dummy3)
write_message(msg, stream=sys.stderr)
raise StandardError(msg)
def _dbdump_elaborate_submit_param(key, value, dummyopts, dummyargs):
"""
Elaborate task submission parameter. See bibtask's
task_submit_elaborate_specific_parameter_fnc for help.
"""
if key in ('-n', '--number'):
try:
task_set_option('number', int(value))
except ValueError:
raise StandardError("ERROR: Number '%s' is not integer." % value)
elif key in ('-o', '--output'):
if os.path.isdir(value):
task_set_option('output', value)
else:
raise StandardError("ERROR: Output '%s' is not a directory." % \
value)
else:
return False
return True
def _dbdump_run_task_core():
"""
Run DB dumper core stuff.
Note: do not use task_can_sleep() stuff here because we don't want
other tasks to interrupt us while we are dumping the DB content.
"""
# read params:
task_update_progress("Reading parameters")
write_message("Reading parameters started")
output_dir = task_get_option('output', CFG_LOGDIR)
output_num = task_get_option('number', 5)
output_fil_prefix = CFG_DATABASE_NAME + '-dbdump-'
output_fil_suffix = task_get_task_param('task_starting_time').replace(' ', '_') + '.sql.gz'
output_fil = output_fil_prefix + output_fil_suffix
write_message("Reading parameters ended")
# make dump:
task_update_progress("Dumping database")
write_message("Database dump started")
_dump_database(output_dir, output_fil)
write_message("Database dump ended")
# prune old dump files:
task_update_progress("Pruning old dump files")
write_message("Pruning old dump files started")
_delete_old_dumps(output_dir, output_fil_prefix, output_num)
write_message("Pruning old dump files ended")
# we are done:
task_update_progress("Done.")
return True
def main():
"""Main that construct all the bibtask."""
task_init(authorization_action='rundbdump',
authorization_msg="DB Dump Task Submission",
help_specific_usage="""\
-o, --output=DIR Output directory. [default=%s]
-n, --number=NUM Keep up to NUM previous dump files. [default=5]
""" % CFG_LOGDIR,
version=__revision__,
specific_params=("n:o:",
["number=", "output="]),
task_submit_elaborate_specific_parameter_fnc=_dbdump_elaborate_submit_param,
task_run_fnc=_dbdump_run_task_core)
if __name__ == '__main__':
main()
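`_delete_old_dumps` above keeps only the newest NUMBER_TO_KEEP dumps by exploiting the fact that the timestamped filenames sort alphabetically in date order. The selection logic can be sketched in isolation as a pure function (hypothetical helper name and sample filenames):

```python
def select_dumps_to_delete(filenames, prefix, number_to_keep):
    """Return the dump files to delete, oldest first, keeping the
    NUMBER_TO_KEEP newest; timestamped names sort alphabetically by date."""
    matching = sorted(f for f in filenames if f.startswith(prefix))
    return matching[:-number_to_keep] if number_to_keep else matching

files = ["invenio-dbdump-2013-01-01.sql.gz",
         "invenio-dbdump-2013-02-01.sql.gz",
         "invenio-dbdump-2013-03-01.sql.gz",
         "unrelated.log"]
print(select_dumps_to_delete(files, "invenio-dbdump-", 2))
# ['invenio-dbdump-2013-01-01.sql.gz']
```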
diff --git a/invenio/legacy/miscutil/solrutils_bibindex_indexer.py b/invenio/legacy/miscutil/solrutils_bibindex_indexer.py
index 0f549a9e9..8c2d7e309 100644
--- a/invenio/legacy/miscutil/solrutils_bibindex_indexer.py
+++ b/invenio/legacy/miscutil/solrutils_bibindex_indexer.py
@@ -1,76 +1,76 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Solr utilities.
"""
from invenio.config import CFG_SOLR_URL
-from invenio.solrutils_config import CFG_SOLR_INVALID_CHAR_RANGES
+from invenio.legacy.miscutil.solrutils_config import CFG_SOLR_INVALID_CHAR_RANGES
from invenio.ext.logging import register_exception
if CFG_SOLR_URL:
import solr
SOLR_CONNECTION = solr.SolrConnection(CFG_SOLR_URL) # pylint: disable=E1101
def replace_invalid_solr_characters(utext):
"""Replace characters within invalid Solr code point ranges in UTEXT."""
def replace(x):
o = ord(x)
for r in CFG_SOLR_INVALID_CHAR_RANGES:
if r[0] <= o <= r[1]:
return r[2]
return x
utext_elements = map(replace, utext)
return ''.join(utext_elements)
def solr_add_fulltext(recid, text):
"""
Helper function that dispatches TEXT to Solr for given record ID.
Returns True/False upon success/failure.
"""
if recid:
try:
utext = unicode(text, 'utf-8')
utext = replace_invalid_solr_characters(utext)
SOLR_CONNECTION.add(id=recid, abstract="", author="", fulltext=utext, keyword="", title="")
return True
except (UnicodeDecodeError, UnicodeEncodeError):
# forget about bad UTF-8 files
pass
except:
# In case anything else happens
register_exception(alert_admin=True)
return False
def solr_commit():
try:
# Commits might cause an exception, most likely a
# timeout while hitting a background merge
# Changes will then be committed later by the
# calling (periodical) task
# Also, autocommits can be used in the solrconfig
SOLR_CONNECTION.commit()
except:
register_exception(alert_admin=True)
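`replace_invalid_solr_characters` maps every code point that falls into one of the configured invalid ranges to a substitute character. Assuming `CFG_SOLR_INVALID_CHAR_RANGES` holds `(low, high, replacement)` triples (the sample ranges below are illustrative, not the actual config), the mapping can be sketched as:

```python
# Hypothetical ranges: low control characters are replaced with a space.
INVALID_CHAR_RANGES = [(0x00, 0x08, u' '), (0x0B, 0x1F, u' ')]

def replace_invalid_chars(utext, ranges=INVALID_CHAR_RANGES):
    """Substitute every character whose code point lies inside an
    invalid (low, high, replacement) range; pass the rest through."""
    def replace(ch):
        o = ord(ch)
        for low, high, repl in ranges:
            if low <= o <= high:
                return repl
        return ch
    return u''.join(replace(ch) for ch in utext)

print(replace_invalid_chars(u'abc\x01def'))  # abc def
```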
diff --git a/invenio/legacy/miscutil/solrutils_bibrank_indexer.py b/invenio/legacy/miscutil/solrutils_bibrank_indexer.py
index c78b0196a..f45c94738 100644
--- a/invenio/legacy/miscutil/solrutils_bibrank_indexer.py
+++ b/invenio/legacy/miscutil/solrutils_bibrank_indexer.py
@@ -1,184 +1,184 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Solr utilities.
"""
import time
from invenio.config import CFG_SOLR_URL
-from invenio.bibtask import write_message, task_get_option, task_update_progress, \
+from invenio.legacy.bibsched.bibtask import write_message, task_get_option, task_update_progress, \
task_sleep_now_if_required
from invenio.legacy.dbquery import run_sql
from invenio.legacy.search_engine import record_exists
-from invenio.bibdocfile import BibRecDocs
-from invenio.solrutils_bibindex_indexer import replace_invalid_solr_characters
-from invenio.bibindex_engine import create_range_list
+from invenio.legacy.bibdocfile.api import BibRecDocs
+from invenio.legacy.miscutil.solrutils_bibindex_indexer import replace_invalid_solr_characters
+from invenio.legacy.bibindex.engine import create_range_list
from invenio.ext.logging import register_exception
from invenio.legacy.bibrank.bridge_utils import get_tags, get_field_content_in_utf8
if CFG_SOLR_URL:
import solr
SOLR_CONNECTION = solr.SolrConnection(CFG_SOLR_URL) # pylint: disable=E1101
def solr_add_ranges(id_ranges):
sub_range_length = task_get_option("flush")
id_ranges_to_index = []
for id_range in id_ranges:
lower_recid = id_range[0]
upper_recid = id_range[1]
i_low = lower_recid
while i_low <= upper_recid:
i_up = min(i_low + sub_range_length - 1, upper_recid)
id_ranges_to_index.append((i_low, i_up))
i_low += sub_range_length
tags_to_index = get_tags()
# Indexes latest records first by reversing
# This allows the ranker to return better results during long indexing
# runs as the ranker cuts the hitset using latest records
id_ranges_to_index.reverse()
next_commit_counter = 0
for id_range_to_index in id_ranges_to_index:
lower_recid = id_range_to_index[0]
upper_recid = id_range_to_index[1]
status_msg = "Solr ranking indexer called for %s-%s" % (lower_recid, upper_recid)
write_message(status_msg)
task_update_progress(status_msg)
next_commit_counter = solr_add_range(lower_recid, upper_recid, tags_to_index, next_commit_counter)
solr_commit_if_necessary(next_commit_counter, final_commit=True)
def solr_commit_if_necessary(next_commit_counter, final_commit=False, recid=None):
# Commit when the counter is full, or on a final commit if anything is pending
if next_commit_counter == task_get_option("flush") - 1 or (final_commit and next_commit_counter > 0):
recid_info = ''
if recid:
recid_info = ' for recid=%s' % recid
status_msg = 'Solr ranking indexer COMMITTING' + recid_info
write_message(status_msg)
task_update_progress(status_msg)
try:
# Commits might cause an exception, most likely a
# timeout while hitting a background merge
# Changes will then be committed later by the
# calling (periodical) task
# Also, autocommits can be used in the solrconfig
SOLR_CONNECTION.commit()
except:
register_exception(alert_admin=True)
next_commit_counter = 0
task_sleep_now_if_required(can_stop_too=True)
else:
next_commit_counter = next_commit_counter + 1
return next_commit_counter
def solr_add_range(lower_recid, upper_recid, tags_to_index, next_commit_counter):
"""
Adds the relevant field values of all records from the lower recid to the upper one to Solr.
It preserves the fulltext information.
"""
for recid in range(lower_recid, upper_recid + 1):
if record_exists(recid):
abstract = get_field_content_in_utf8(recid, 'abstract', tags_to_index)
author = get_field_content_in_utf8(recid, 'author', tags_to_index)
keyword = get_field_content_in_utf8(recid, 'keyword', tags_to_index)
title = get_field_content_in_utf8(recid, 'title', tags_to_index)
try:
bibrecdocs = BibRecDocs(recid)
fulltext = unicode(bibrecdocs.get_text(), 'utf-8')
except:
fulltext = ''
solr_add(recid, abstract, author, fulltext, keyword, title)
next_commit_counter = solr_commit_if_necessary(next_commit_counter, recid=recid)
return next_commit_counter
def solr_add(recid, abstract, author, fulltext, keyword, title):
"""
Helper function that adds word similarity ranking relevant indexes to Solr.
"""
try:
SOLR_CONNECTION.add(id=recid,
abstract=replace_invalid_solr_characters(abstract),
author=replace_invalid_solr_characters(author),
fulltext=replace_invalid_solr_characters(fulltext),
keyword=replace_invalid_solr_characters(keyword),
title=replace_invalid_solr_characters(title))
except:
register_exception(alert_admin=True)
def word_similarity_solr(run):
return word_index(run)
def get_recIDs_by_date(dates=""):
"""Returns recIDs modified between DATES[0] and DATES[1].
If DATES is not set, then returns records modified since the last run of
the ranking method.
"""
if not dates:
write_message("Using the last update time for the rank method")
res = run_sql('SELECT last_updated FROM rnkMETHOD WHERE name="wrd"')
if not res:
return
if not res[0][0]:
dates = ("0000-00-00",'')
else:
dates = (res[0][0],'')
if dates[1]:
res = run_sql('SELECT id FROM bibrec WHERE modification_date >= %s AND modification_date <= %s ORDER BY id ASC', (dates[0], dates[1]))
else:
res = run_sql('SELECT id FROM bibrec WHERE modification_date >= %s ORDER BY id ASC', (dates[0],))
return create_range_list([row[0] for row in res])
def word_index(run): # pylint: disable=W0613
"""
Runs the indexing task.
"""
# Explicitly set ids
id_option = task_get_option("id")
if len(id_option):
solr_add_ranges([(id_elem[0], id_elem[1]) for id_elem in id_option])
# Indexes modified ids since last run
else:
starting_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
id_ranges = get_recIDs_by_date()
if id_ranges:
solr_add_ranges([(id_range[0], id_range[1]) for id_range in id_ranges])
run_sql('UPDATE rnkMETHOD SET last_updated=%s WHERE name="wrd"', (starting_time, ))
else:
write_message("No new records. Solr index is up to date")
write_message("Solr ranking indexer completed")
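`solr_add_ranges` above splits each `(lower, upper)` recid range into sub-ranges of at most `flush` records before indexing them newest-first. That splitting step, factored out into a standalone sketch (hypothetical helper name):

```python
def split_into_subranges(id_ranges, flush):
    """Break each inclusive (low, high) id range into chunks of at
    most FLUSH ids, preserving order within each range."""
    subranges = []
    for low, high in id_ranges:
        i = low
        while i <= high:
            subranges.append((i, min(i + flush - 1, high)))
            i += flush
    return subranges

print(split_into_subranges([(1, 7)], 3))  # [(1, 3), (4, 6), (7, 7)]
```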
diff --git a/invenio/legacy/miscutil/sql/tabfill.sql b/invenio/legacy/miscutil/sql/tabfill.sql
index 3151b5c07..7e59c9cab 100644
--- a/invenio/legacy/miscutil/sql/tabfill.sql
+++ b/invenio/legacy/miscutil/sql/tabfill.sql
@@ -1,858 +1,858 @@
-- This file is part of Invenio.
-- Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 CERN.
--
-- Invenio is free software; you can redistribute it and/or
-- modify it under the terms of the GNU General Public License as
-- published by the Free Software Foundation; either version 2 of the
-- License, or (at your option) any later version.
--
-- Invenio is distributed in the hope that it will be useful, but
-- WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-- General Public License for more details.
--
-- You should have received a copy of the GNU General Public License
-- along with Invenio; if not, write to the Free Software Foundation, Inc.,
-- 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-- Fill Invenio configuration tables with defaults suitable for any site.
INSERT INTO rnkMETHOD (id,name,last_updated) VALUES (1,'wrd','0000-00-00 00:00:00');
INSERT INTO collection_rnkMETHOD (id_collection,id_rnkMETHOD,score) VALUES (1,1,100);
INSERT INTO rnkCITATIONDATA VALUES (1,'citationdict',NULL,'0000-00-00');
INSERT INTO rnkCITATIONDATA VALUES (2,'reversedict',NULL,'0000-00-00');
INSERT INTO rnkCITATIONDATA VALUES (3,'selfcitdict',NULL,'0000-00-00');
INSERT INTO rnkCITATIONDATA VALUES (4,'selfcitedbydict',NULL,'0000-00-00');
INSERT INTO field VALUES (1,'any field','anyfield');
INSERT INTO field VALUES (2,'title','title');
INSERT INTO field VALUES (3,'author','author');
INSERT INTO field VALUES (4,'abstract','abstract');
INSERT INTO field VALUES (5,'keyword','keyword');
INSERT INTO field VALUES (6,'report number','reportnumber');
INSERT INTO field VALUES (7,'subject','subject');
INSERT INTO field VALUES (8,'reference','reference');
INSERT INTO field VALUES (9,'fulltext','fulltext');
INSERT INTO field VALUES (10,'collection','collection');
INSERT INTO field VALUES (11,'division','division');
INSERT INTO field VALUES (12,'year','year');
INSERT INTO field VALUES (13,'experiment','experiment');
INSERT INTO field VALUES (14,'record ID','recid');
INSERT INTO field VALUES (15,'isbn','isbn');
INSERT INTO field VALUES (16,'issn','issn');
INSERT INTO field VALUES (17,'coden','coden');
-- INSERT INTO field VALUES (18,'doi','doi');
INSERT INTO field VALUES (19,'journal','journal');
INSERT INTO field VALUES (20,'collaboration','collaboration');
INSERT INTO field VALUES (21,'affiliation','affiliation');
INSERT INTO field VALUES (22,'exact author','exactauthor');
INSERT INTO field VALUES (23,'date created','datecreated');
INSERT INTO field VALUES (24,'date modified','datemodified');
INSERT INTO field VALUES (25,'refers to','refersto');
INSERT INTO field VALUES (26,'cited by','citedby');
INSERT INTO field VALUES (27,'caption','caption');
INSERT INTO field VALUES (28,'first author','firstauthor');
INSERT INTO field VALUES (29,'exact first author','exactfirstauthor');
INSERT INTO field VALUES (30,'author count','authorcount');
INSERT INTO field VALUES (31,'reference to','rawref');
INSERT INTO field VALUES (32,'exact title','exacttitle');
INSERT INTO field VALUES (33,'authority author','authorityauthor');
INSERT INTO field VALUES (34,'authority institution','authorityinstitution');
INSERT INTO field VALUES (35,'authority journal','authorityjournal');
INSERT INTO field VALUES (36,'authority subject','authoritysubject');
INSERT INTO field VALUES (37,'item count','itemcount');
INSERT INTO field VALUES (38,'file type','filetype');
INSERT INTO field VALUES (39,'miscellaneous', 'miscellaneous');
INSERT INTO field VALUES (40,'tag','tag');
INSERT INTO field_tag VALUES (10,11,100);
INSERT INTO field_tag VALUES (11,14,100);
INSERT INTO field_tag VALUES (12,15,10);
INSERT INTO field_tag VALUES (13,116,10);
INSERT INTO field_tag VALUES (2,3,100);
INSERT INTO field_tag VALUES (2,4,90);
INSERT INTO field_tag VALUES (3,1,100);
INSERT INTO field_tag VALUES (3,2,90);
INSERT INTO field_tag VALUES (4,5,100);
INSERT INTO field_tag VALUES (5,6,100);
INSERT INTO field_tag VALUES (6,7,30);
INSERT INTO field_tag VALUES (6,8,10);
INSERT INTO field_tag VALUES (6,9,20);
INSERT INTO field_tag VALUES (7,12,100);
INSERT INTO field_tag VALUES (7,13,90);
INSERT INTO field_tag VALUES (8,10,100);
INSERT INTO field_tag VALUES (9,115,100);
INSERT INTO field_tag VALUES (14,117,100);
INSERT INTO field_tag VALUES (15,118,100);
INSERT INTO field_tag VALUES (16,119,100);
INSERT INTO field_tag VALUES (17,120,100);
-- INSERT INTO field_tag VALUES (18,121,100);
INSERT INTO field_tag VALUES (19,131,100);
INSERT INTO field_tag VALUES (20,132,100);
INSERT INTO field_tag VALUES (21,133,100);
INSERT INTO field_tag VALUES (21,134,90);
INSERT INTO field_tag VALUES (22,1,100);
INSERT INTO field_tag VALUES (22,2,90);
INSERT INTO field_tag VALUES (27,135,100);
INSERT INTO field_tag VALUES (28,1,100);
INSERT INTO field_tag VALUES (29,1,100);
INSERT INTO field_tag VALUES (30,1,100);
INSERT INTO field_tag VALUES (30,2,90);
INSERT INTO field_tag VALUES (32,3,100);
INSERT INTO field_tag VALUES (32,4,90);
-- authority fields
INSERT INTO field_tag VALUES (33,1,100);
INSERT INTO field_tag VALUES (33,146,100);
INSERT INTO field_tag VALUES (33,140,100);
INSERT INTO field_tag VALUES (34,148,100);
INSERT INTO field_tag VALUES (34,149,100);
INSERT INTO field_tag VALUES (34,150,100);
INSERT INTO field_tag VALUES (35,151,100);
INSERT INTO field_tag VALUES (35,152,100);
INSERT INTO field_tag VALUES (35,153,100);
INSERT INTO field_tag VALUES (36,154,100);
INSERT INTO field_tag VALUES (36,155,100);
INSERT INTO field_tag VALUES (36,156,100);
-- misc fields
INSERT INTO field_tag VALUES (39,17,10);
INSERT INTO field_tag VALUES (39,18,10);
INSERT INTO field_tag VALUES (39,157,10);
INSERT INTO field_tag VALUES (39,158,10);
INSERT INTO field_tag VALUES (39,159,10);
INSERT INTO field_tag VALUES (39,160,10);
INSERT INTO field_tag VALUES (39,161,10);
INSERT INTO field_tag VALUES (39,162,10);
INSERT INTO field_tag VALUES (39,163,10);
INSERT INTO field_tag VALUES (39,164,10);
INSERT INTO field_tag VALUES (39,20,10);
INSERT INTO field_tag VALUES (39,21,10);
INSERT INTO field_tag VALUES (39,22,10);
INSERT INTO field_tag VALUES (39,23,10);
INSERT INTO field_tag VALUES (39,165,10);
INSERT INTO field_tag VALUES (39,166,10);
INSERT INTO field_tag VALUES (39,167,10);
INSERT INTO field_tag VALUES (39,168,10);
INSERT INTO field_tag VALUES (39,169,10);
INSERT INTO field_tag VALUES (39,170,10);
INSERT INTO field_tag VALUES (39,25,10);
INSERT INTO field_tag VALUES (39,27,10);
INSERT INTO field_tag VALUES (39,28,10);
INSERT INTO field_tag VALUES (39,29,10);
INSERT INTO field_tag VALUES (39,30,10);
INSERT INTO field_tag VALUES (39,31,10);
INSERT INTO field_tag VALUES (39,32,10);
INSERT INTO field_tag VALUES (39,33,10);
INSERT INTO field_tag VALUES (39,34,10);
INSERT INTO field_tag VALUES (39,35,10);
INSERT INTO field_tag VALUES (39,36,10);
INSERT INTO field_tag VALUES (39,37,10);
INSERT INTO field_tag VALUES (39,38,10);
INSERT INTO field_tag VALUES (39,39,10);
INSERT INTO field_tag VALUES (39,171,10);
INSERT INTO field_tag VALUES (39,172,10);
INSERT INTO field_tag VALUES (39,173,10);
INSERT INTO field_tag VALUES (39,174,10);
INSERT INTO field_tag VALUES (39,175,10);
INSERT INTO field_tag VALUES (39,41,10);
INSERT INTO field_tag VALUES (39,42,10);
INSERT INTO field_tag VALUES (39,43,10);
INSERT INTO field_tag VALUES (39,44,10);
INSERT INTO field_tag VALUES (39,45,10);
INSERT INTO field_tag VALUES (39,46,10);
INSERT INTO field_tag VALUES (39,47,10);
INSERT INTO field_tag VALUES (39,48,10);
INSERT INTO field_tag VALUES (39,49,10);
INSERT INTO field_tag VALUES (39,50,10);
INSERT INTO field_tag VALUES (39,51,10);
INSERT INTO field_tag VALUES (39,52,10);
INSERT INTO field_tag VALUES (39,53,10);
INSERT INTO field_tag VALUES (39,54,10);
INSERT INTO field_tag VALUES (39,55,10);
INSERT INTO field_tag VALUES (39,56,10);
INSERT INTO field_tag VALUES (39,57,10);
INSERT INTO field_tag VALUES (39,58,10);
INSERT INTO field_tag VALUES (39,59,10);
INSERT INTO field_tag VALUES (39,60,10);
INSERT INTO field_tag VALUES (39,61,10);
INSERT INTO field_tag VALUES (39,62,10);
INSERT INTO field_tag VALUES (39,63,10);
INSERT INTO field_tag VALUES (39,64,10);
INSERT INTO field_tag VALUES (39,65,10);
INSERT INTO field_tag VALUES (39,66,10);
INSERT INTO field_tag VALUES (39,67,10);
INSERT INTO field_tag VALUES (39,176,10);
INSERT INTO field_tag VALUES (39,177,10);
INSERT INTO field_tag VALUES (39,178,10);
INSERT INTO field_tag VALUES (39,179,10);
INSERT INTO field_tag VALUES (39,180,10);
INSERT INTO field_tag VALUES (39,69,10);
INSERT INTO field_tag VALUES (39,70,10);
INSERT INTO field_tag VALUES (39,71,10);
INSERT INTO field_tag VALUES (39,72,10);
INSERT INTO field_tag VALUES (39,73,10);
INSERT INTO field_tag VALUES (39,74,10);
INSERT INTO field_tag VALUES (39,75,10);
INSERT INTO field_tag VALUES (39,76,10);
INSERT INTO field_tag VALUES (39,77,10);
INSERT INTO field_tag VALUES (39,78,10);
INSERT INTO field_tag VALUES (39,79,10);
INSERT INTO field_tag VALUES (39,80,10);
INSERT INTO field_tag VALUES (39,181,10);
INSERT INTO field_tag VALUES (39,182,10);
INSERT INTO field_tag VALUES (39,183,10);
INSERT INTO field_tag VALUES (39,184,10);
INSERT INTO field_tag VALUES (39,185,10);
INSERT INTO field_tag VALUES (39,186,10);
INSERT INTO field_tag VALUES (39,82,10);
INSERT INTO field_tag VALUES (39,83,10);
INSERT INTO field_tag VALUES (39,84,10);
INSERT INTO field_tag VALUES (39,85,10);
INSERT INTO field_tag VALUES (39,187,10);
INSERT INTO field_tag VALUES (39,88,10);
INSERT INTO field_tag VALUES (39,89,10);
INSERT INTO field_tag VALUES (39,90,10);
INSERT INTO field_tag VALUES (39,91,10);
INSERT INTO field_tag VALUES (39,92,10);
INSERT INTO field_tag VALUES (39,93,10);
INSERT INTO field_tag VALUES (39,94,10);
INSERT INTO field_tag VALUES (39,95,10);
INSERT INTO field_tag VALUES (39,96,10);
INSERT INTO field_tag VALUES (39,97,10);
INSERT INTO field_tag VALUES (39,98,10);
INSERT INTO field_tag VALUES (39,99,10);
INSERT INTO field_tag VALUES (39,100,10);
INSERT INTO field_tag VALUES (39,102,10);
INSERT INTO field_tag VALUES (39,103,10);
INSERT INTO field_tag VALUES (39,104,10);
INSERT INTO field_tag VALUES (39,105,10);
INSERT INTO field_tag VALUES (39,188,10);
INSERT INTO field_tag VALUES (39,189,10);
INSERT INTO field_tag VALUES (39,190,10);
INSERT INTO field_tag VALUES (39,191,10);
INSERT INTO field_tag VALUES (39,192,10);
INSERT INTO field_tag VALUES (39,193,10);
INSERT INTO field_tag VALUES (39,194,10);
INSERT INTO field_tag VALUES (39,195,10);
INSERT INTO field_tag VALUES (39,196,10);
INSERT INTO field_tag VALUES (39,107,10);
INSERT INTO field_tag VALUES (39,108,10);
INSERT INTO field_tag VALUES (39,109,10);
INSERT INTO field_tag VALUES (39,110,10);
INSERT INTO field_tag VALUES (39,111,10);
INSERT INTO field_tag VALUES (39,112,10);
INSERT INTO field_tag VALUES (39,113,10);
INSERT INTO field_tag VALUES (39,197,10);
INSERT INTO field_tag VALUES (39,198,10);
INSERT INTO field_tag VALUES (39,199,10);
INSERT INTO field_tag VALUES (39,200,10);
INSERT INTO field_tag VALUES (39,201,10);
INSERT INTO field_tag VALUES (39,202,10);
INSERT INTO field_tag VALUES (39,203,10);
INSERT INTO field_tag VALUES (39,204,10);
INSERT INTO field_tag VALUES (39,205,10);
INSERT INTO field_tag VALUES (39,206,10);
INSERT INTO field_tag VALUES (39,207,10);
INSERT INTO field_tag VALUES (39,208,10);
INSERT INTO field_tag VALUES (39,209,10);
INSERT INTO field_tag VALUES (39,210,10);
INSERT INTO field_tag VALUES (39,211,10);
INSERT INTO field_tag VALUES (39,212,10);
INSERT INTO field_tag VALUES (39,213,10);
INSERT INTO field_tag VALUES (39,214,10);
INSERT INTO field_tag VALUES (39,215,10);
INSERT INTO field_tag VALUES (39,122,10);
INSERT INTO field_tag VALUES (39,123,10);
INSERT INTO field_tag VALUES (39,124,10);
INSERT INTO field_tag VALUES (39,125,10);
INSERT INTO field_tag VALUES (39,126,10);
INSERT INTO field_tag VALUES (39,127,10);
INSERT INTO field_tag VALUES (39,128,10);
INSERT INTO field_tag VALUES (39,129,10);
INSERT INTO field_tag VALUES (39,130,10);
INSERT INTO field_tag VALUES (39,1,10);
INSERT INTO field_tag VALUES (39,2,10);
-- misc authority fields
INSERT INTO field_tag VALUES (39,216,10);
INSERT INTO field_tag VALUES (39,217,10);
INSERT INTO field_tag VALUES (39,218,10);
INSERT INTO field_tag VALUES (39,219,10);
INSERT INTO field_tag VALUES (39,220,10);
INSERT INTO field_tag VALUES (39,221,10);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (1,'HTML brief','hb', 'HTML brief output format, used for search results pages.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (2,'HTML detailed','hd', 'HTML detailed output format, used for Detailed record pages.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (3,'MARC','hm', 'HTML MARC.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (4,'Dublin Core','xd', 'XML Dublin Core.', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (5,'MARCXML','xm', 'XML MARC.', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (6,'portfolio','hp', 'HTML portfolio-style output format for photos.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (7,'photo captions only','hc', 'HTML caption-only output format for photos.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (8,'BibTeX','hx', 'BibTeX.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (9,'EndNote','xe', 'XML EndNote.', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (10,'NLM','xn', 'XML NLM.', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (11,'Excel','excel', 'Excel csv output', 'application/ms-excel', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (12,'HTML similarity','hs', 'Very short HTML output for similarity box (<i>people also viewed..</i>).', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (13,'RSS','xr', 'RSS.', 'text/xml', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (14,'OAI DC','xoaidc', 'OAI DC.', 'text/xml', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (15,'File mini-panel', 'hdfile', 'Used to show fulltext files in mini-panel of detailed record pages.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (16,'Actions mini-panel', 'hdact', 'Used to display actions in mini-panel of detailed record pages.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (17,'References tab', 'hdref', 'Display record references in References tab.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (18,'HTML citesummary','hcs', 'HTML cite summary format, used for search results pages.', 'text/html', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (19,'RefWorks','xw', 'RefWorks.', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (20,'MODS', 'xo', 'Metadata Object Description Schema', 'application/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (21,'HTML author claiming', 'ha', 'Very brief HTML output format for author/paper claiming facility.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (22,'Podcast', 'xp', 'Sample format suitable for multimedia feeds, such as podcasts', 'application/rss+xml', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (23,'WebAuthorProfile affiliations helper','wapaff', 'cPickled dicts', 'text', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (24,'EndNote (8-X)','xe8x', 'XML EndNote (8-X).', 'text/xml', 1);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (25,'HTML citesummary extended','hcs2', 'HTML cite summary format, including self-citations counts.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (26,'DataCite','dcite', 'DataCite XML format.', 'text/xml', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (27,'Mobile brief','mobb', 'Mobile brief format.', 'text/html', 0);
INSERT INTO format (id,name,code,description,content_type,visibility) VALUES (28,'Mobile detailed','mobd', 'Mobile detailed format.', 'text/html', 0);
INSERT INTO tag VALUES (1,'first author name','100__a');
INSERT INTO tag VALUES (2,'additional author name','700__a');
INSERT INTO tag VALUES (3,'main title','245__%');
INSERT INTO tag VALUES (4,'additional title','246__%');
INSERT INTO tag VALUES (5,'abstract','520__%');
INSERT INTO tag VALUES (6,'keyword','6531_a');
INSERT INTO tag VALUES (7,'primary report number','037__a');
INSERT INTO tag VALUES (8,'additional report number','088__a');
INSERT INTO tag VALUES (9,'added report number','909C0r');
INSERT INTO tag VALUES (10,'reference','999C5%');
INSERT INTO tag VALUES (11,'collection identifier','980__%');
INSERT INTO tag VALUES (12,'main subject','65017a');
INSERT INTO tag VALUES (13,'additional subject','65027a');
INSERT INTO tag VALUES (14,'division','909C0p');
INSERT INTO tag VALUES (15,'year','909C0y');
INSERT INTO tag VALUES (16,'00x','00%');
INSERT INTO tag VALUES (17,'01x','01%');
INSERT INTO tag VALUES (18,'02x','02%');
INSERT INTO tag VALUES (19,'03x','03%');
INSERT INTO tag VALUES (20,'lang','04%');
INSERT INTO tag VALUES (21,'05x','05%');
INSERT INTO tag VALUES (22,'06x','06%');
INSERT INTO tag VALUES (23,'07x','07%');
INSERT INTO tag VALUES (24,'08x','08%');
INSERT INTO tag VALUES (25,'09x','09%');
INSERT INTO tag VALUES (26,'10x','10%');
INSERT INTO tag VALUES (27,'11x','11%');
INSERT INTO tag VALUES (28,'12x','12%');
INSERT INTO tag VALUES (29,'13x','13%');
INSERT INTO tag VALUES (30,'14x','14%');
INSERT INTO tag VALUES (31,'15x','15%');
INSERT INTO tag VALUES (32,'16x','16%');
INSERT INTO tag VALUES (33,'17x','17%');
INSERT INTO tag VALUES (34,'18x','18%');
INSERT INTO tag VALUES (35,'19x','19%');
INSERT INTO tag VALUES (36,'20x','20%');
INSERT INTO tag VALUES (37,'21x','21%');
INSERT INTO tag VALUES (38,'22x','22%');
INSERT INTO tag VALUES (39,'23x','23%');
INSERT INTO tag VALUES (40,'24x','24%');
INSERT INTO tag VALUES (41,'25x','25%');
INSERT INTO tag VALUES (42,'internal','26%');
INSERT INTO tag VALUES (43,'27x','27%');
INSERT INTO tag VALUES (44,'28x','28%');
INSERT INTO tag VALUES (45,'29x','29%');
INSERT INTO tag VALUES (46,'pages','30%');
INSERT INTO tag VALUES (47,'31x','31%');
INSERT INTO tag VALUES (48,'32x','32%');
INSERT INTO tag VALUES (49,'33x','33%');
INSERT INTO tag VALUES (50,'34x','34%');
INSERT INTO tag VALUES (51,'35x','35%');
INSERT INTO tag VALUES (52,'36x','36%');
INSERT INTO tag VALUES (53,'37x','37%');
INSERT INTO tag VALUES (54,'38x','38%');
INSERT INTO tag VALUES (55,'39x','39%');
INSERT INTO tag VALUES (56,'40x','40%');
INSERT INTO tag VALUES (57,'41x','41%');
INSERT INTO tag VALUES (58,'42x','42%');
INSERT INTO tag VALUES (59,'43x','43%');
INSERT INTO tag VALUES (60,'44x','44%');
INSERT INTO tag VALUES (61,'45x','45%');
INSERT INTO tag VALUES (62,'46x','46%');
INSERT INTO tag VALUES (63,'47x','47%');
INSERT INTO tag VALUES (64,'48x','48%');
INSERT INTO tag VALUES (65,'series','49%');
INSERT INTO tag VALUES (66,'50x','50%');
INSERT INTO tag VALUES (67,'51x','51%');
INSERT INTO tag VALUES (68,'52x','52%');
INSERT INTO tag VALUES (69,'53x','53%');
INSERT INTO tag VALUES (70,'54x','54%');
INSERT INTO tag VALUES (71,'55x','55%');
INSERT INTO tag VALUES (72,'56x','56%');
INSERT INTO tag VALUES (73,'57x','57%');
INSERT INTO tag VALUES (74,'58x','58%');
INSERT INTO tag VALUES (75,'summary','59%');
INSERT INTO tag VALUES (76,'60x','60%');
INSERT INTO tag VALUES (77,'61x','61%');
INSERT INTO tag VALUES (78,'62x','62%');
INSERT INTO tag VALUES (79,'63x','63%');
INSERT INTO tag VALUES (80,'64x','64%');
INSERT INTO tag VALUES (81,'65x','65%');
INSERT INTO tag VALUES (82,'66x','66%');
INSERT INTO tag VALUES (83,'67x','67%');
INSERT INTO tag VALUES (84,'68x','68%');
INSERT INTO tag VALUES (85,'subject','69%');
INSERT INTO tag VALUES (86,'70x','70%');
INSERT INTO tag VALUES (87,'71x','71%');
INSERT INTO tag VALUES (88,'author-ad','72%');
INSERT INTO tag VALUES (89,'73x','73%');
INSERT INTO tag VALUES (90,'74x','74%');
INSERT INTO tag VALUES (91,'75x','75%');
INSERT INTO tag VALUES (92,'76x','76%');
INSERT INTO tag VALUES (93,'77x','77%');
INSERT INTO tag VALUES (94,'78x','78%');
INSERT INTO tag VALUES (95,'79x','79%');
INSERT INTO tag VALUES (96,'80x','80%');
INSERT INTO tag VALUES (97,'81x','81%');
INSERT INTO tag VALUES (98,'82x','82%');
INSERT INTO tag VALUES (99,'83x','83%');
INSERT INTO tag VALUES (100,'84x','84%');
INSERT INTO tag VALUES (101,'electr','85%');
INSERT INTO tag VALUES (102,'86x','86%');
INSERT INTO tag VALUES (103,'87x','87%');
INSERT INTO tag VALUES (104,'88x','88%');
INSERT INTO tag VALUES (105,'89x','89%');
INSERT INTO tag VALUES (106,'publication','90%');
INSERT INTO tag VALUES (107,'pub-conf-cit','91%');
INSERT INTO tag VALUES (108,'92x','92%');
INSERT INTO tag VALUES (109,'93x','93%');
INSERT INTO tag VALUES (110,'94x','94%');
INSERT INTO tag VALUES (111,'95x','95%');
INSERT INTO tag VALUES (112,'catinfo','96%');
INSERT INTO tag VALUES (113,'97x','97%');
INSERT INTO tag VALUES (114,'98x','98%');
INSERT INTO tag VALUES (115,'url','8564_u');
INSERT INTO tag VALUES (116,'experiment','909C0e');
INSERT INTO tag VALUES (117,'record ID','001');
INSERT INTO tag VALUES (118,'isbn','020__a');
INSERT INTO tag VALUES (119,'issn','022__a');
INSERT INTO tag VALUES (120,'coden','030__a');
INSERT INTO tag VALUES (121,'doi','909C4a');
INSERT INTO tag VALUES (122,'850x','850%');
INSERT INTO tag VALUES (123,'851x','851%');
INSERT INTO tag VALUES (124,'852x','852%');
INSERT INTO tag VALUES (125,'853x','853%');
INSERT INTO tag VALUES (126,'854x','854%');
INSERT INTO tag VALUES (127,'855x','855%');
INSERT INTO tag VALUES (128,'857x','857%');
INSERT INTO tag VALUES (129,'858x','858%');
INSERT INTO tag VALUES (130,'859x','859%');
INSERT INTO tag VALUES (131,'journal','909C4%');
INSERT INTO tag VALUES (132,'collaboration','710__g');
INSERT INTO tag VALUES (133,'first author affiliation','100__u');
INSERT INTO tag VALUES (134,'additional author affiliation','700__u');
INSERT INTO tag VALUES (135,'caption','8564_y');
INSERT INTO tag VALUES (136,'journal page','909C4c');
INSERT INTO tag VALUES (137,'journal title','909C4p');
INSERT INTO tag VALUES (138,'journal volume','909C4v');
INSERT INTO tag VALUES (139,'journal year','909C4y');
INSERT INTO tag VALUES (140,'comment','500__a');
INSERT INTO tag VALUES (141,'title','245__a');
INSERT INTO tag VALUES (142,'main abstract','245__a');
INSERT INTO tag VALUES (143,'internal notes','595__a');
INSERT INTO tag VALUES (144,'other relationship entry', '787%');
-- INSERT INTO tag VALUES (145,'authority: main personal name','100__a'); -- already exists under a different name ('first author name')
INSERT INTO tag VALUES (146,'authority: alternative personal name','400__a');
-- INSERT INTO tag VALUES (147,'authority: personal name from other record','500__a'); -- already exists under a different name ('comment')
INSERT INTO tag VALUES (148,'authority: organization main name','110__a');
INSERT INTO tag VALUES (149,'organization alternative name','410__a');
INSERT INTO tag VALUES (150,'organization main from other record','510__a');
INSERT INTO tag VALUES (151,'authority: uniform title','130__a');
INSERT INTO tag VALUES (152,'authority: uniform title alternatives','430__a');
INSERT INTO tag VALUES (153,'authority: uniform title from other record','530__a');
INSERT INTO tag VALUES (154,'authority: subject from other record','150__a');
INSERT INTO tag VALUES (155,'authority: subject alternative name','450__a');
INSERT INTO tag VALUES (156,'authority: subject main name','550__a');
-- tags for misc index
INSERT INTO tag VALUES (157,'031x','031%');
INSERT INTO tag VALUES (158,'032x','032%');
INSERT INTO tag VALUES (159,'033x','033%');
INSERT INTO tag VALUES (160,'034x','034%');
INSERT INTO tag VALUES (161,'035x','035%');
INSERT INTO tag VALUES (162,'036x','036%');
INSERT INTO tag VALUES (163,'037x','037%');
INSERT INTO tag VALUES (164,'038x','038%');
INSERT INTO tag VALUES (165,'080x','080%');
INSERT INTO tag VALUES (166,'082x','082%');
INSERT INTO tag VALUES (167,'083x','083%');
INSERT INTO tag VALUES (168,'084x','084%');
INSERT INTO tag VALUES (169,'085x','085%');
INSERT INTO tag VALUES (170,'086x','086%');
INSERT INTO tag VALUES (171,'240x','240%');
INSERT INTO tag VALUES (172,'242x','242%');
INSERT INTO tag VALUES (173,'243x','243%');
INSERT INTO tag VALUES (174,'244x','244%');
INSERT INTO tag VALUES (175,'247x','247%');
INSERT INTO tag VALUES (176,'521x','521%');
INSERT INTO tag VALUES (177,'522x','522%');
INSERT INTO tag VALUES (178,'524x','524%');
INSERT INTO tag VALUES (179,'525x','525%');
INSERT INTO tag VALUES (180,'526x','526%');
INSERT INTO tag VALUES (181,'650x','650%');
INSERT INTO tag VALUES (182,'651x','651%');
INSERT INTO tag VALUES (183,'6531_v','6531_v');
INSERT INTO tag VALUES (184,'6531_y','6531_y');
INSERT INTO tag VALUES (185,'6531_9','6531_9');
INSERT INTO tag VALUES (186,'654x','654%');
INSERT INTO tag VALUES (187,'655x','655%');
INSERT INTO tag VALUES (188,'656x','656%');
INSERT INTO tag VALUES (189,'657x','657%');
INSERT INTO tag VALUES (190,'658x','658%');
INSERT INTO tag VALUES (191,'711x','711%');
INSERT INTO tag VALUES (192,'900x','900%');
INSERT INTO tag VALUES (193,'901x','901%');
INSERT INTO tag VALUES (194,'902x','902%');
INSERT INTO tag VALUES (195,'903x','903%');
INSERT INTO tag VALUES (196,'904x','904%');
INSERT INTO tag VALUES (197,'905x','905%');
INSERT INTO tag VALUES (198,'906x','906%');
INSERT INTO tag VALUES (199,'907x','907%');
INSERT INTO tag VALUES (200,'908x','908%');
INSERT INTO tag VALUES (201,'909C1x','909C1%');
INSERT INTO tag VALUES (202,'909C5x','909C5%');
INSERT INTO tag VALUES (203,'909CSx','909CS%');
INSERT INTO tag VALUES (204,'909COx','909CO%');
INSERT INTO tag VALUES (205,'909CKx','909CK%');
INSERT INTO tag VALUES (206,'909CPx','909CP%');
INSERT INTO tag VALUES (207,'981x','981%');
INSERT INTO tag VALUES (208,'982x','982%');
INSERT INTO tag VALUES (209,'983x','983%');
INSERT INTO tag VALUES (210,'984x','984%');
INSERT INTO tag VALUES (211,'985x','985%');
INSERT INTO tag VALUES (212,'986x','986%');
INSERT INTO tag VALUES (213,'987x','987%');
INSERT INTO tag VALUES (214,'988x','988%');
INSERT INTO tag VALUES (215,'989x','989%');
-- authority controlled tags
INSERT INTO tag VALUES (216,'author control','100__0');
INSERT INTO tag VALUES (217,'institution control','110__0');
INSERT INTO tag VALUES (218,'journal control','130__0');
INSERT INTO tag VALUES (219,'subject control','150__0');
INSERT INTO tag VALUES (220,'additional institution control', '260__0');
INSERT INTO tag VALUES (221,'additional author control', '700__0');
INSERT INTO idxINDEX VALUES (1,'global','This index contains words/phrases from global fields.','0000-00-00 00:00:00', '', 'native', 'INDEX-SYNONYM-TITLE,exact','No','No','No','BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (2,'collection','This index contains words/phrases from collection identifiers fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (3,'abstract','This index contains words/phrases from abstract fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (4,'author','This index contains fuzzy words/phrases from author fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexAuthorTokenizer');
INSERT INTO idxINDEX VALUES (5,'keyword','This index contains words/phrases from keyword fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (6,'reference','This index contains words/phrases from references fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (7,'reportnumber','This index contains words/phrases from report numbers fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (8,'title','This index contains words/phrases from title fields.','0000-00-00 00:00:00', '', 'native','INDEX-SYNONYM-TITLE,exact','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (9,'fulltext','This index contains words/phrases from fulltext fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexFulltextTokenizer');
INSERT INTO idxINDEX VALUES (10,'year','This index contains words/phrases from year fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexYearTokenizer');
INSERT INTO idxINDEX VALUES (11,'journal','This index contains words/phrases from journal publication information fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexJournalTokenizer');
INSERT INTO idxINDEX VALUES (12,'collaboration','This index contains words/phrases from collaboration name fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (13,'affiliation','This index contains words/phrases from institutional affiliation fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (14,'exactauthor','This index contains exact words/phrases from author fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexExactAuthorTokenizer');
INSERT INTO idxINDEX VALUES (15,'caption','This index contains exact words/phrases from figure captions.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (16,'firstauthor','This index contains fuzzy words/phrases from first author field.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexAuthorTokenizer');
INSERT INTO idxINDEX VALUES (17,'exactfirstauthor','This index contains exact words/phrases from first author field.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexExactAuthorTokenizer');
INSERT INTO idxINDEX VALUES (18,'authorcount','This index contains number of authors of the record.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexAuthorCountTokenizer');
INSERT INTO idxINDEX VALUES (19,'exacttitle','This index contains exact words/phrases from title fields.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (20,'authorityauthor','This index contains words/phrases from author authority records.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexAuthorTokenizer');
INSERT INTO idxINDEX VALUES (21,'authorityinstitution','This index contains words/phrases from institution authority records.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (22,'authorityjournal','This index contains words/phrases from journal authority records.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (23,'authoritysubject','This index contains words/phrases from subject authority records.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX VALUES (24,'itemcount','This index contains number of copies of items in the library.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexItemCountTokenizer');
INSERT INTO idxINDEX VALUES (25,'filetype','This index contains extensions of files connected to records.','0000-00-00 00:00:00', '', 'native', '','No','No','No', 'BibIndexFiletypeTokenizer');
INSERT INTO idxINDEX VALUES (26,'miscellaneous','This index contains words/phrases from miscellaneous fields','0000-00-00 00:00:00', '', 'native','','No','No','No', 'BibIndexDefaultTokenizer');
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (1,1);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (2,10);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (3,4);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (4,3);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (5,5);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (6,8);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (7,6);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (8,2);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (9,9);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (10,12);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (11,19);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (12,20);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (13,21);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (14,22);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (15,27);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (16,28);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (17,29);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (18,30);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (19,32);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (20,33);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (21,34);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (22,35);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (23,36);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (24,37);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (25,38);
INSERT INTO idxINDEX_field (id_idxINDEX, id_field) VALUES (26,39);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 2);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 3);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 5);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 7);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 8);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 10);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 11);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 12);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 13);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 19);
INSERT INTO idxINDEX_idxINDEX (id_virtual, id_normal) VALUES (1, 26);
INSERT INTO sbmACTION VALUES ('Submit New Record','SBI','running','1998-08-17','2001-08-08','','Submit New Record');
INSERT INTO sbmACTION VALUES ('Modify Record','MBI','modify','1998-08-17','2001-11-07','','Modify Record');
INSERT INTO sbmACTION VALUES ('Submit New File','SRV','revise','0000-00-00','2001-11-07','','Submit New File');
INSERT INTO sbmACTION VALUES ('Approve Record','APP','approve','2001-11-08','2002-06-11','','Approve Record');
INSERT INTO sbmALLFUNCDESCR VALUES ('Ask_For_Record_Details_Confirmation','');
INSERT INTO sbmALLFUNCDESCR VALUES ('CaseEDS','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Create_Modify_Interface',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Create_Recid',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Finish_Submission','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Get_Info','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Get_Recid', 'This function gets the recid for a document with a given report-number (as stored in the global variable rn).');
INSERT INTO sbmALLFUNCDESCR VALUES ('Get_Report_Number',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Get_Sysno',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Insert_Modify_Record','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Insert_Record',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Is_Original_Submitter','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Is_Referee','This function checks whether the logged user is a referee for the current document');
INSERT INTO sbmALLFUNCDESCR VALUES ('Mail_Approval_Request_to_Referee',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Mail_Approval_Withdrawn_to_Referee',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Mail_Submitter',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Make_Modify_Record',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Make_Record','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_From_Pending','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_to_Done',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_to_Pending',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success_Approval_Request',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success_APP','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success_DEL','Prepare a message for the user informing them that their record was successfully deleted.');
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success_MBI',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Print_Success_SRV',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Register_Approval_Request',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Register_Referee_Decision',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Withdraw_Approval_Request',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Report_Number_Generation',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Second_Report_Number_Generation','Generate a secondary report number for a document.');
INSERT INTO sbmALLFUNCDESCR VALUES ('Send_Approval_Request',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Send_APP_Mail','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Send_Delete_Mail','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Send_Modify_Mail',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Send_SRV_Mail',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Set_Embargo','Set an embargo on all the documents of a given record.');
INSERT INTO sbmALLFUNCDESCR VALUES ('Stamp_Replace_Single_File_Approval','Stamp a single file when a document is approved.');
INSERT INTO sbmALLFUNCDESCR VALUES ('Stamp_Uploaded_Files','Stamp some of the files that were uploaded during a submission.');
INSERT INTO sbmALLFUNCDESCR VALUES ('Test_Status','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Update_Approval_DB',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('User_is_Record_Owner_or_Curator','Check if user is owner or special editor of a record');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_Files_to_Storage','Attach files received from chosen file input element(s)');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_Revised_Files_to_Storage','Revise files initially uploaded with "Move_Files_to_Storage"');
INSERT INTO sbmALLFUNCDESCR VALUES ('Make_Dummy_MARC_XML_Record','');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_CKEditor_Files_to_Storage','Transfer files attached to the record with the CKEditor');
INSERT INTO sbmALLFUNCDESCR VALUES ('Create_Upload_Files_Interface','Display generic interface to add/revise/delete files. To be used before function "Move_Uploaded_Files_to_Storage"');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_Uploaded_Files_to_Storage','Attach files uploaded with "Create_Upload_Files_Interface"');
INSERT INTO sbmALLFUNCDESCR VALUES ('Move_Photos_to_Storage','Attach/edit the pictures uploaded with the "create_photos_manager_interface()" function');
INSERT INTO sbmALLFUNCDESCR VALUES ('Link_Records','Link two records together via MARC');
INSERT INTO sbmALLFUNCDESCR VALUES ('Video_Processing',NULL);
INSERT INTO sbmALLFUNCDESCR VALUES ('Set_RN_From_Sysno', 'Set the value of global rn variable to the report number identified by sysno (recid)');
INSERT INTO sbmALLFUNCDESCR VALUES ('Notify_URL','Access URL, possibly to post content');
INSERT INTO sbmFIELDDESC VALUES ('Upload_Photos',NULL,'','R',NULL,NULL,NULL,NULL,NULL,'\"\"\"\r\nThis is an example of element that creates a photos upload interface.\r\nClone it, customize it and integrate it into your submission. Then add function \r\n\'Move_Photos_to_Storage\' to your submission functions list, in order for files \r\nuploaded with this interface to be attached to the record. More information in \r\nthe WebSubmit admin guide.\r\n\"\"\"\r\n\r\nfrom invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile\r\nfrom invenio.websubmit_functions.Move_Photos_to_Storage import \\\r\n read_param_file, \\\r\n create_photos_manager_interface, \\\r\n get_session_id\r\n\r\n# Retrieve session id\r\ntry:\r\n # User info is defined only in MBI/MPI actions...\r\n session_id = get_session_id(None, uid, user_info) \r\nexcept:\r\n session_id = get_session_id(req, uid, {})\r\n\r\n# Retrieve context\r\nindir = curdir.split(\'/\')[-3]\r\ndoctype = curdir.split(\'/\')[-2]\r\naccess = curdir.split(\'/\')[-1]\r\n\r\n# Get the record ID, if any\r\nsysno = ParamFromFile(\"%s/%s\" % (curdir,\'SN\')).strip()\r\n\r\n\"\"\"\r\nModify below the configuration of the photos manager interface.\r\nNote: `can_reorder_photos\' parameter is not yet fully taken into consideration\r\n\r\nDocumentation of the function is available at <http://localhost/admin/websubmit/websubmitadmin.py/functionedit?funcname=Move_Photos_to_Storage>\r\n\"\"\"\r\ntext += create_photos_manager_interface(sysno, session_id, uid,\r\n doctype, indir, curdir, access,\r\n can_delete_photos=True,\r\n can_reorder_photos=True,\r\n can_upload_photos=True,\r\n editor_width=700,\r\n editor_height=400,\r\n initial_slider_value=100,\r\n max_slider_value=200,\r\n min_slider_value=80)','0000-00-00','0000-00-00',NULL,NULL,0);
INSERT INTO sbmCHECKS VALUES ('AUCheck','function AUCheck(txt) {\r\n var res=1;\r\n tmp=txt.indexOf(\"\\015\");\r\n while (tmp != -1) {\r\n left=txt.substring(0,tmp);\r\n right=txt.substring(tmp+2,txt.length);\r\n txt=left + \"\\012\" + right;\r\n tmp=txt.indexOf(\"\\015\");\r\n }\r\n tmp=txt.indexOf(\"\\012\");\r\n if (tmp==-1){\r\n line=txt;\r\n txt=\'\';}\r\n else{\r\n line=txt.substring(0,tmp);\r\n txt=txt.substring(tmp+1,txt.length);}\r\n while (line != \"\"){\r\n coma=line.indexOf(\",\");\r\n left=line.substring(0,coma);\r\n right=line.substring(coma+1,line.length);\r\n coma2=right.indexOf(\",\");\r\n space=right.indexOf(\" \");\r\n if ((coma==-1)||(left==\"\")||(right==\"\")||(space!=0)||(coma2!=-1)){\r\n res=0;\r\n error_log=line;\r\n }\r\n tmp=txt.indexOf(\"\\012\");\r\n if (tmp==-1){\r\n line=txt;\r\n txt=\'\';}\r\n else{\r\n line=txt.substring(0,tmp-1);\r\n txt=txt.substring(tmp+1,txt.length);}\r\n }\r\n if (res == 0){\r\n alert(\"This author name cannot be managed \\: \\012\\012\" + error_log + \" \\012\\012It is not in the required format!\\012Put one author per line and a comma (,) between the name and the firstname initial letters. \\012The name is going first, followed by the firstname initial letters.\\012Do not forget the whitespace after the comma!!!\\012\\012Example \\: Put\\012\\012Le Meur, J Y \\012Baron, T \\012\\012for\\012\\012Le Meur Jean-Yves & Baron Thomas.\");\r\n return 0;\r\n } \r\n return 1; \r\n}','1998-08-18','0000-00-00','','');
INSERT INTO sbmCHECKS VALUES ('DatCheckNew','function DatCheckNew(txt) {\r\n var res=1;\r\n if (txt.length != 10){res=0;}\r\n if (txt.indexOf(\"/\") != 2){res=0;}\r\n if (txt.lastIndexOf(\"/\") != 5){res=0;}\r\n tmp=parseInt(txt.substring(0,2),10);\r\n if ((tmp > 31)||(tmp < 1)||(isNaN(tmp))){res=0;}\r\n tmp=parseInt(txt.substring(3,5),10);\r\n if ((tmp > 12)||(tmp < 1)||(isNaN(tmp))){res=0;}\r\n tmp=parseInt(txt.substring(6,10),10);\r\n if ((tmp < 1)||(isNaN(tmp))){res=0;}\r\n if (txt.length == 0){res=1;}\r\n if (res == 0){\r\n alert(\"Please enter a correct Date \\012Format: dd/mm/yyyy\");\r\n return 0;\r\n }\r\n return 1; \r\n}','0000-00-00','0000-00-00','','');
-INSERT INTO sbmFIELDDESC VALUES ('Upload_Files',NULL,'','R',NULL,NULL,NULL,NULL,NULL,'\"\"\"\r\nThis is an example of element that creates a file upload interface.\r\nClone it, customize it and integrate it into your submission. Then add function \r\n\'Move_Uploaded_Files_to_Storage\' to your submission functions list, in order for files \r\nuploaded with this interface to be attached to the record. More information in \r\nthe WebSubmit admin guide.\r\n\"\"\"\r\nimport os\r\nfrom invenio.bibdocfile_managedocfiles import create_file_upload_interface\r\nfrom invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile\r\n\r\nindir = ParamFromFile(os.path.join(curdir, \'indir\'))\r\ndoctype = ParamFromFile(os.path.join(curdir, \'doctype\'))\r\naccess = ParamFromFile(os.path.join(curdir, \'access\'))\r\ntry:\r\n sysno = int(ParamFromFile(os.path.join(curdir, \'SN\')).strip())\r\nexcept:\r\n sysno = -1\r\nln = ParamFromFile(os.path.join(curdir, \'ln\'))\r\n\r\n\"\"\"\r\nRun the following to get the list of parameters of function \'create_file_upload_interface\':\r\necho -e \'from invenio.bibdocfile_managedocfiles import create_file_upload_interface as f\\nprint f.__doc__\' | python\r\n\"\"\"\r\ntext = create_file_upload_interface(recid=sysno,\r\n print_outside_form_tag=False,\r\n include_headers=True,\r\n ln=ln,\r\n doctypes_and_desc=[(\'main\',\'Main document\'),\r\n (\'additional\',\'Figure, schema, etc.\')],\r\n can_revise_doctypes=[\'*\'],\r\n can_describe_doctypes=[\'main\'],\r\n can_delete_doctypes=[\'additional\'],\r\n can_rename_doctypes=[\'main\'],\r\n sbm_indir=indir, sbm_doctype=doctype, sbm_access=access)[1]\r\n','0000-00-00','0000-00-00',NULL,NULL,0);
+INSERT INTO sbmFIELDDESC VALUES ('Upload_Files',NULL,'','R',NULL,NULL,NULL,NULL,NULL,'\"\"\"\r\nThis is an example of element that creates a file upload interface.\r\nClone it, customize it and integrate it into your submission. Then add function \r\n\'Move_Uploaded_Files_to_Storage\' to your submission functions list, in order for files \r\nuploaded with this interface to be attached to the record. More information in \r\nthe WebSubmit admin guide.\r\n\"\"\"\r\nimport os\r\nfrom invenio.legacy.bibdocfile.managedocfiles import create_file_upload_interface\r\nfrom invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile\r\n\r\nindir = ParamFromFile(os.path.join(curdir, \'indir\'))\r\ndoctype = ParamFromFile(os.path.join(curdir, \'doctype\'))\r\naccess = ParamFromFile(os.path.join(curdir, \'access\'))\r\ntry:\r\n sysno = int(ParamFromFile(os.path.join(curdir, \'SN\')).strip())\r\nexcept:\r\n sysno = -1\r\nln = ParamFromFile(os.path.join(curdir, \'ln\'))\r\n\r\n\"\"\"\r\nRun the following to get the list of parameters of function \'create_file_upload_interface\':\r\necho -e \'from invenio.legacy.bibdocfile.managedocfiles import create_file_upload_interface as f\\nprint f.__doc__\' | python\r\n\"\"\"\r\ntext = create_file_upload_interface(recid=sysno,\r\n print_outside_form_tag=False,\r\n include_headers=True,\r\n ln=ln,\r\n doctypes_and_desc=[(\'main\',\'Main document\'),\r\n (\'additional\',\'Figure, schema, etc.\')],\r\n can_revise_doctypes=[\'*\'],\r\n can_describe_doctypes=[\'main\'],\r\n can_delete_doctypes=[\'additional\'],\r\n can_rename_doctypes=[\'main\'],\r\n sbm_indir=indir, sbm_doctype=doctype, sbm_access=access)[1]\r\n','0000-00-00','0000-00-00',NULL,NULL,0);
INSERT INTO sbmFORMATEXTENSION VALUES ('WORD','.doc');
INSERT INTO sbmFORMATEXTENSION VALUES ('PostScript','.ps');
INSERT INTO sbmFORMATEXTENSION VALUES ('PDF','.pdf');
INSERT INTO sbmFORMATEXTENSION VALUES ('JPEG','.jpg');
INSERT INTO sbmFORMATEXTENSION VALUES ('JPEG','.jpeg');
INSERT INTO sbmFORMATEXTENSION VALUES ('GIF','.gif');
INSERT INTO sbmFORMATEXTENSION VALUES ('PPT','.ppt');
INSERT INTO sbmFORMATEXTENSION VALUES ('HTML','.htm');
INSERT INTO sbmFORMATEXTENSION VALUES ('HTML','.html');
INSERT INTO sbmFORMATEXTENSION VALUES ('Latex','.tex');
INSERT INTO sbmFORMATEXTENSION VALUES ('Compressed PostScript','.ps.gz');
INSERT INTO sbmFORMATEXTENSION VALUES ('Tarred Tex (.tar)','.tar');
INSERT INTO sbmFORMATEXTENSION VALUES ('Text','.txt');
INSERT INTO sbmFUNDESC VALUES ('Get_Recid','record_search_pattern');
INSERT INTO sbmFUNDESC VALUES ('Get_Report_Number','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Send_Modify_Mail','addressesMBI');
INSERT INTO sbmFUNDESC VALUES ('Send_Modify_Mail','sourceDoc');
INSERT INTO sbmFUNDESC VALUES ('Register_Approval_Request','categ_file_appreq');
INSERT INTO sbmFUNDESC VALUES ('Register_Approval_Request','categ_rnseek_appreq');
INSERT INTO sbmFUNDESC VALUES ('Register_Approval_Request','note_file_appreq');
INSERT INTO sbmFUNDESC VALUES ('Register_Referee_Decision','decision_file');
INSERT INTO sbmFUNDESC VALUES ('Withdraw_Approval_Request','categ_file_withd');
INSERT INTO sbmFUNDESC VALUES ('Withdraw_Approval_Request','categ_rnseek_withd');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','autorngen');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','rnin');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','counterpath');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','rnformat');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','yeargen');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','nblength');
INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','initialvalue');
INSERT INTO sbmFUNDESC VALUES ('Mail_Approval_Request_to_Referee','categ_file_appreq');
INSERT INTO sbmFUNDESC VALUES ('Mail_Approval_Request_to_Referee','categ_rnseek_appreq');
INSERT INTO sbmFUNDESC VALUES ('Mail_Approval_Request_to_Referee','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Mail_Approval_Withdrawn_to_Referee','categ_file_withd');
INSERT INTO sbmFUNDESC VALUES ('Mail_Approval_Withdrawn_to_Referee','categ_rnseek_withd');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','authorfile');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','status');
INSERT INTO sbmFUNDESC VALUES ('Send_Approval_Request','authorfile');
INSERT INTO sbmFUNDESC VALUES ('Create_Modify_Interface','fieldnameMBI');
INSERT INTO sbmFUNDESC VALUES ('Send_Modify_Mail','fieldnameMBI');
INSERT INTO sbmFUNDESC VALUES ('Update_Approval_DB','categformatDAM');
INSERT INTO sbmFUNDESC VALUES ('Update_Approval_DB','decision_file');
INSERT INTO sbmFUNDESC VALUES ('Send_SRV_Mail','categformatDAM');
INSERT INTO sbmFUNDESC VALUES ('Send_SRV_Mail','addressesSRV');
INSERT INTO sbmFUNDESC VALUES ('Send_Approval_Request','directory');
INSERT INTO sbmFUNDESC VALUES ('Send_Approval_Request','categformatDAM');
INSERT INTO sbmFUNDESC VALUES ('Send_Approval_Request','addressesDAM');
INSERT INTO sbmFUNDESC VALUES ('Send_Approval_Request','titleFile');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','titleFile');
INSERT INTO sbmFUNDESC VALUES ('Send_Modify_Mail','emailFile');
INSERT INTO sbmFUNDESC VALUES ('Get_Info','authorFile');
INSERT INTO sbmFUNDESC VALUES ('Get_Info','emailFile');
INSERT INTO sbmFUNDESC VALUES ('Get_Info','titleFile');
INSERT INTO sbmFUNDESC VALUES ('Make_Modify_Record','modifyTemplate');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','addressesAPP');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','categformatAPP');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','newrnin');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','decision_file');
INSERT INTO sbmFUNDESC VALUES ('Send_APP_Mail','comments_file');
INSERT INTO sbmFUNDESC VALUES ('CaseEDS','casevariable');
INSERT INTO sbmFUNDESC VALUES ('CaseEDS','casevalues');
INSERT INTO sbmFUNDESC VALUES ('CaseEDS','casesteps');
INSERT INTO sbmFUNDESC VALUES ('CaseEDS','casedefault');
INSERT INTO sbmFUNDESC VALUES ('Send_SRV_Mail','noteFile');
INSERT INTO sbmFUNDESC VALUES ('Send_SRV_Mail','emailFile');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','emailFile');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Mail_Submitter','newrnin');
INSERT INTO sbmFUNDESC VALUES ('Make_Record','sourceTemplate');
INSERT INTO sbmFUNDESC VALUES ('Make_Record','createTemplate');
INSERT INTO sbmFUNDESC VALUES ('Print_Success','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Print_Success','newrnin');
INSERT INTO sbmFUNDESC VALUES ('Print_Success','status');
INSERT INTO sbmFUNDESC VALUES ('Make_Modify_Record','sourceTemplate');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','documenttype');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','iconsize');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','paths_and_suffixes');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','rename');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','paths_and_restrictions');
INSERT INTO sbmFUNDESC VALUES ('Move_Files_to_Storage','paths_and_doctypes');
INSERT INTO sbmFUNDESC VALUES ('Move_Revised_Files_to_Storage','elementNameToDoctype');
INSERT INTO sbmFUNDESC VALUES ('Move_Revised_Files_to_Storage','createIconDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Move_Revised_Files_to_Storage','createRelatedFormats');
INSERT INTO sbmFUNDESC VALUES ('Move_Revised_Files_to_Storage','iconsize');
INSERT INTO sbmFUNDESC VALUES ('Move_Revised_Files_to_Storage','keepPreviousVersionDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Set_Embargo','date_file');
INSERT INTO sbmFUNDESC VALUES ('Set_Embargo','date_format');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','files_to_be_stamped');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','latex_template');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','latex_template_vars');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','stamp');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','layer');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Uploaded_Files','switch_file');
INSERT INTO sbmFUNDESC VALUES ('Make_Dummy_MARC_XML_Record','dummyrec_source_tpl');
INSERT INTO sbmFUNDESC VALUES ('Make_Dummy_MARC_XML_Record','dummyrec_create_tpl');
INSERT INTO sbmFUNDESC VALUES ('Print_Success_APP','decision_file');
INSERT INTO sbmFUNDESC VALUES ('Print_Success_APP','newrnin');
INSERT INTO sbmFUNDESC VALUES ('Send_Delete_Mail','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Send_Delete_Mail','record_managers');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_rn_file');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_rn_format');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_rn_yeargen');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_rncateg_file');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_counterpath');
INSERT INTO sbmFUNDESC VALUES ('Second_Report_Number_Generation','2nd_nb_length');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','file_to_be_stamped');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','latex_template');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','latex_template_vars');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','new_file_name');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','stamp');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','layer');
INSERT INTO sbmFUNDESC VALUES ('Stamp_Replace_Single_File_Approval','switch_file');
INSERT INTO sbmFUNDESC VALUES ('Move_CKEditor_Files_to_Storage','input_fields');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','maxsize');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','minsize');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','doctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','restrictions');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canDeleteDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canReviseDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canDescribeDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canCommentDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canKeepDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canAddFormatDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canRestrictDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canRenameDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','canNameNewFiles');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','createRelatedFormats');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','keepDefault');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','showLinks');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','fileLabel');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','filenameLabel');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','descriptionLabel');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','commentLabel');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','restrictionLabel');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','startDoc');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','endDoc');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','defaultFilenameDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Create_Upload_Files_Interface','maxFilesDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Move_Uploaded_Files_to_Storage','iconsize');
INSERT INTO sbmFUNDESC VALUES ('Move_Uploaded_Files_to_Storage','createIconDoctypes');
INSERT INTO sbmFUNDESC VALUES ('Move_Uploaded_Files_to_Storage','forceFileRevision');
INSERT INTO sbmFUNDESC VALUES ('Move_Photos_to_Storage','iconsize');
INSERT INTO sbmFUNDESC VALUES ('Move_Photos_to_Storage','iconformat');
INSERT INTO sbmFUNDESC VALUES ('User_is_Record_Owner_or_Curator','curator_role');
INSERT INTO sbmFUNDESC VALUES ('User_is_Record_Owner_or_Curator','curator_flag');
INSERT INTO sbmFUNDESC VALUES ('Link_Records','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Link_Records','edsrn2');
INSERT INTO sbmFUNDESC VALUES ('Link_Records','directRelationship');
INSERT INTO sbmFUNDESC VALUES ('Link_Records','reverseRelationship');
INSERT INTO sbmFUNDESC VALUES ('Link_Records','keep_original_edsrn2');
INSERT INTO sbmFUNDESC VALUES ('Video_Processing','aspect');
INSERT INTO sbmFUNDESC VALUES ('Video_Processing','batch_template');
INSERT INTO sbmFUNDESC VALUES ('Video_Processing','title');
INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','edsrn');
INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','rep_tags');
INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','record_search_pattern');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','url');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','data');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','admin_emails');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','content_type');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','attempt_times');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','attempt_sleeptime');
INSERT INTO sbmFUNDESC VALUES ('Notify_URL','user');
INSERT INTO sbmGFILERESULT VALUES ('HTML','HTML document');
INSERT INTO sbmGFILERESULT VALUES ('WORD','data');
INSERT INTO sbmGFILERESULT VALUES ('PDF','PDF document');
INSERT INTO sbmGFILERESULT VALUES ('PostScript','PostScript document');
INSERT INTO sbmGFILERESULT VALUES ('PostScript','data ');
INSERT INTO sbmGFILERESULT VALUES ('PostScript','HP Printer Job Language data');
INSERT INTO sbmGFILERESULT VALUES ('jpg','JPEG image');
INSERT INTO sbmGFILERESULT VALUES ('Compressed PostScript','gzip compressed data');
INSERT INTO sbmGFILERESULT VALUES ('Tarred Tex (.tar)','tar archive');
INSERT INTO sbmGFILERESULT VALUES ('JPEG','JPEG image');
INSERT INTO sbmGFILERESULT VALUES ('GIF','GIF');
INSERT INTO swrREMOTESERVER VALUES (1, 'arXiv', 'arxiv.org', 'CDS_Invenio', 'sword_invenio', 'admin', 'SWORD at arXiv', 'http://arxiv.org/abs', 'https://arxiv.org/sword-app/servicedocument', '', 0);
-- end of file
diff --git a/invenio/legacy/miscutil/xapianutils_bibindex_indexer.py b/invenio/legacy/miscutil/xapianutils_bibindex_indexer.py
index 753977465..e27e987a4 100644
--- a/invenio/legacy/miscutil/xapianutils_bibindex_indexer.py
+++ b/invenio/legacy/miscutil/xapianutils_bibindex_indexer.py
@@ -1,71 +1,71 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Xapian utilities.
"""
import os
from invenio.config import CFG_CACHEDIR, CFG_XAPIAN_ENABLED
-from invenio.xapianutils_config import XAPIAN_DIR, XAPIAN_DIR_NAME
+from invenio.legacy.miscutil.xapianutils_config import XAPIAN_DIR, XAPIAN_DIR_NAME
if CFG_XAPIAN_ENABLED:
    import xapian

DATABASES = dict()


def xapian_ensure_db_dir(name):
    path = CFG_CACHEDIR + "/" + name
    if not os.path.exists(path):
        os.makedirs(path)


def xapian_add(recid, field, value):
    """
    Helper function that adds word similarity ranking relevant indexes to Xapian.
    """
    # FIXME: remove as soon as the fulltext indexing is moved in BibIndex to the id_range part
    if not DATABASES:
        xapian_init_databases()

    content_string = value
    doc = xapian.Document()
    doc.set_data(content_string)
    (database, indexer) = DATABASES[field]
    indexer.set_document(doc)
    indexer.index_text(content_string)
    database.replace_document(recid, doc)


def xapian_init_databases():
    """
    Initializes all database objects.
    """
    xapian_ensure_db_dir(XAPIAN_DIR_NAME)

    field = 'fulltext'
    xapian_ensure_db_dir(XAPIAN_DIR_NAME + "/" + field)
    database = xapian.WritableDatabase(XAPIAN_DIR + "/" + field, xapian.DB_CREATE_OR_OPEN)
    indexer = xapian.TermGenerator()
    stemmer = xapian.Stem("english")
    indexer.set_stemmer(stemmer)
    DATABASES[field] = (database, indexer)
diff --git a/invenio/legacy/miscutil/xapianutils_bibindex_searcher.py b/invenio/legacy/miscutil/xapianutils_bibindex_searcher.py
index 575484502..d3a20ca88 100644
--- a/invenio/legacy/miscutil/xapianutils_bibindex_searcher.py
+++ b/invenio/legacy/miscutil/xapianutils_bibindex_searcher.py
@@ -1,71 +1,71 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Xapian utilities.
"""
from invenio.config import CFG_XAPIAN_ENABLED
from invenio.intbitset import intbitset
-from invenio.xapianutils_config import XAPIAN_DIR
+from invenio.legacy.miscutil.xapianutils_config import XAPIAN_DIR
if CFG_XAPIAN_ENABLED:
    import xapian

DATABASES = dict()


def xapian_get_bitset(index, query):
    """
    Queries a Xapian index.
    Returns: an intbitset containing all record ids
    """
    if not DATABASES:
        xapian_init_databases()
    result = intbitset()
    database = DATABASES[index]
    enquire = xapian.Enquire(database)
    query_string = query
    qp = xapian.QueryParser()
    stemmer = xapian.Stem("english")
    qp.set_stemmer(stemmer)
    qp.set_database(database)
    qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
    pattern = qp.parse_query(query_string, xapian.QueryParser.FLAG_PHRASE)
    enquire.set_query(pattern)
    matches = enquire.get_mset(0, database.get_lastdocid())
    for match in matches:
        result.add(match.docid)
    return result


def xapian_init_databases():
    """
    Initializes all database objects.
    """
    field = 'fulltext'
    database = xapian.Database(XAPIAN_DIR + "/" + field)
    DATABASES[field] = database
diff --git a/invenio/legacy/miscutil/xapianutils_bibrank_indexer.py b/invenio/legacy/miscutil/xapianutils_bibrank_indexer.py
index 92ace9e46..81bf40780 100644
--- a/invenio/legacy/miscutil/xapianutils_bibrank_indexer.py
+++ b/invenio/legacy/miscutil/xapianutils_bibrank_indexer.py
@@ -1,142 +1,142 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Xapian utilities.
"""
import os
from invenio.config import CFG_CACHEDIR, CFG_XAPIAN_ENABLED
-from invenio.bibtask import write_message, task_get_option
+from invenio.legacy.bibsched.bibtask import write_message, task_get_option
from invenio.legacy.dbquery import run_sql
from invenio.legacy.search_engine import get_fieldvalues
-from invenio.xapianutils_config import DATABASES, XAPIAN_DIR, XAPIAN_DIR_NAME, INDEXES
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.miscutil.xapianutils_config import DATABASES, XAPIAN_DIR, XAPIAN_DIR_NAME, INDEXES
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.legacy.bibrank.bridge_config import CFG_MARC_ABSTRACT, \
    CFG_MARC_AUTHOR_NAME, \
    CFG_MARC_ADDITIONAL_AUTHOR_NAME, \
    CFG_MARC_KEYWORD, \
    CFG_MARC_TITLE

if CFG_XAPIAN_ENABLED:
    import xapian


def xapian_ensure_db_dir(name):
    path = CFG_CACHEDIR + "/" + name
    if not os.path.exists(path):
        os.makedirs(path)


def xapian_add_all(lower_recid, upper_recid):
    """
    Adds the corresponding field values of all records from the lower recid to
    the upper one to Xapian. It preserves the fulltext information.
    """
    xapian_init_databases()
    for recid in range(lower_recid, upper_recid + 1):
        try:
            abstract = unicode(get_fieldvalues(recid, CFG_MARC_ABSTRACT)[0], 'utf-8')
        except:
            abstract = ""
        xapian_add(recid, "abstract", abstract)
        try:
            first_author = get_fieldvalues(recid, CFG_MARC_AUTHOR_NAME)[0]
            additional_authors = reduce(lambda x, y: x + " " + y, get_fieldvalues(recid, CFG_MARC_ADDITIONAL_AUTHOR_NAME), '')
            author = unicode(first_author + " " + additional_authors, 'utf-8')
        except:
            author = ""
        xapian_add(recid, "author", author)
        try:
            bibrecdocs = BibRecDocs(recid)
            fulltext = unicode(bibrecdocs.get_text(), 'utf-8')
        except:
            fulltext = ""
        xapian_add(recid, "fulltext", fulltext)
        try:
            keyword = unicode(reduce(lambda x, y: x + " " + y, get_fieldvalues(recid, CFG_MARC_KEYWORD), ''), 'utf-8')
        except:
            keyword = ""
        xapian_add(recid, "keyword", keyword)
        try:
            title = unicode(get_fieldvalues(recid, CFG_MARC_TITLE)[0], 'utf-8')
        except:
            title = ""
        xapian_add(recid, "title", title)


def xapian_add(recid, field, value):
    """
    Helper function that adds word similarity ranking relevant indexes to Xapian.
    """
    content_string = value
    doc = xapian.Document()
    doc.set_data(content_string)
    (database, indexer) = DATABASES[field]
    indexer.set_document(doc)
    indexer.index_text(content_string)
    database.replace_document(recid, doc)


def xapian_init_databases():
    """
    Initializes all database objects.
    """
    xapian_ensure_db_dir(XAPIAN_DIR_NAME)
    for field in INDEXES:
        xapian_ensure_db_dir(XAPIAN_DIR_NAME + "/" + field)
        database = xapian.WritableDatabase(XAPIAN_DIR + "/" + field, xapian.DB_CREATE_OR_OPEN)
        indexer = xapian.TermGenerator()
        stemmer = xapian.Stem("english")
        indexer.set_stemmer(stemmer)
        DATABASES[field] = (database, indexer)


def word_similarity_xapian(run):
    return word_index(run)


def word_index(run):  # pylint: disable=W0613
    """
    Runs the indexing task.
    """
    id_option = task_get_option("id")
    if len(id_option):
        for id_elem in id_option:
            lower_recid = id_elem[0]
            upper_recid = id_elem[1]
            write_message("Xapian ranking indexer called for %s-%s" % (lower_recid, upper_recid))
            xapian_add_all(lower_recid, upper_recid)
        write_message("Xapian ranking indexer completed")
    else:
        max_recid = 0
        res = run_sql("SELECT max(id) FROM bibrec")
        if res and res[0][0]:
            max_recid = int(res[0][0])
        write_message("Xapian ranking indexer called for %s-%s" % (1, max_recid))
        xapian_add_all(1, max_recid)
        write_message("Xapian ranking indexer completed")
diff --git a/invenio/legacy/miscutil/xapianutils_bibrank_searcher.py b/invenio/legacy/miscutil/xapianutils_bibrank_searcher.py
index ec2348fdd..b51e56817 100644
--- a/invenio/legacy/miscutil/xapianutils_bibrank_searcher.py
+++ b/invenio/legacy/miscutil/xapianutils_bibrank_searcher.py
@@ -1,207 +1,207 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Xapian utilities.
"""
from invenio.config import CFG_XAPIAN_ENABLED
from invenio.intbitset import intbitset
-from invenio.xapianutils_config import INDEXES, XAPIAN_DIR
+from invenio.legacy.miscutil.xapianutils_config import INDEXES, XAPIAN_DIR
if CFG_XAPIAN_ENABLED:
import xapian
class MatchDecider(xapian.MatchDecider):
def __init__(self, ids):
xapian.MatchDecider.__init__(self)
self.ids = ids
def __call__(self, document):
return document.get_docid() in self.ids
DATABASES = dict()
def xapian_get_ranked_index(index, pattern, params, hitset, ranked_result_amount):
"""
Queries a Xapian index.
Returns: a list of ranked record ids [(recid, score), ...) contained in hitset
and an intbitset of record ids contained in hitset.
"""
result = []
matched_recs = intbitset()
database = DATABASES[index]
enquire = xapian.Enquire(database)
qp = xapian.QueryParser()
stemmer = xapian.Stem("english")
qp.set_stemmer(stemmer)
qp.set_database(database)
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
# Avoids phrase search to increase performance
if "avoid_phrase_search_threshold" in params and len(hitset) >= params["avoid_phrase_search_threshold"] and pattern.startswith('"'):
pattern = pattern[1:-1]
query_string = ' AND '.join(pattern.split(' '))
pattern = qp.parse_query(query_string)
else:
query_string = pattern
pattern = qp.parse_query(query_string, xapian.QueryParser.FLAG_PHRASE)
enquire.set_query(pattern)
matches = enquire.get_mset(0, ranked_result_amount, None, MatchDecider(hitset))
weight = params["weight"]
for match in matches:
recid = match.docid
if recid in hitset:
score = int(match.percent) * weight
result.append((recid, score))
matched_recs.add(recid)
return (result, matched_recs)
def xapian_init_databases():
"""
Initializes all database objects.
"""
for field in INDEXES:
database = xapian.Database(XAPIAN_DIR + "/" + field)
DATABASES[field] = database
def get_greatest_ranked_records(raw_reclist):
"""
Returns unique records having selecting the ones with the greatest records
in case of duplicates.
"""
unique_records = dict()
for (recid, score) in raw_reclist:
if not recid in unique_records:
unique_records[recid] = score
else:
current_score = unique_records[recid]
if score > current_score:
unique_records[recid] = score
result = []
for recid in unique_records.keys():
result.append((recid, unique_records[recid]))
return result
def word_similarity_xapian(pattern, hitset, params, verbose, field, ranked_result_amount):
"""
Ranking a records containing specified words and returns a sorted list.
input:
hitset - a list of hits for the query found by search_engine
verbose - verbose value
field - field to search (selected in GUI)
ranked_result_amount - amount of results to be ranked
output:
recset - a list of sorted records: [[23,34], [344,24], [1,01]]
prefix - what to show before the rank value
postfix - what to show after the rank value
voutput - contains extra information, content dependent on verbose value
"""
voutput = ""
search_units = []
if pattern:
xapian_init_databases()
pattern = " ".join(map(str, pattern))
from invenio.legacy.search_engine import create_basic_search_units
search_units = create_basic_search_units(None, pattern, field)
if verbose > 0:
voutput += "Hitset: %s<br/>" % hitset
voutput += "Pattern: %s<br/>" % pattern
voutput += "Search units: %s<br/>" % search_units
all_ranked_results = []
included_hits = intbitset()
excluded_hits = intbitset()
for (operator, pattern, field, unit_type) in search_units: #@UnusedVariable
# Field might not exist
if field not in params["fields"].keys():
field = params["default_field"]
if unit_type == "a":
# Eliminates leading and trailing %
if pattern[0] == "%":
pattern = pattern[1:-1]
pattern = "\"" + pattern + "\""
(ranked_result_part, matched_recs) = xapian_get_ranked_index(field, pattern, params["fields"][field], hitset, ranked_result_amount)
if verbose > 0:
voutput += "Index %s: %s<br/>" % (field, ranked_result_part)
voutput += "Index records %s: %s<br/>" % (field, matched_recs)
# Excludes - results
if operator == "-":
excluded_hits = excluded_hits.union(matched_recs)
# + and | are interpreted as OR
else:
included_hits = included_hits.union(matched_recs)
all_ranked_results.extend(ranked_result_part)
ranked_result = []
if hitset:
# Removes the excluded records
result_hits = included_hits.difference(excluded_hits)
# Avoids duplicate results and normalises scores
ranked_result = get_greatest_ranked_records(all_ranked_results)
ranked_result = get_normalized_ranking_scores(ranked_result)
# Considers not ranked records
not_ranked = hitset.difference(result_hits)
if not_ranked:
lrecIDs = list(not_ranked)
ranked_result = zip(lrecIDs, [0] * len(lrecIDs)) + ranked_result
if verbose > 0:
voutput += "All matched records: %s<br/>" % result_hits
voutput += "All ranked records: %s<br/>" % ranked_result
voutput += "All not ranked records: %s<br/>" % not_ranked
ranked_result.sort(lambda x, y: cmp(x[1], y[1]))
return (ranked_result, params["prefix"], params["postfix"], voutput)
return (ranked_result, "", "", voutput)
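The include/exclude bookkeeping in the loop above can be sketched with plain Python sets standing in for intbitset (a simplified illustration; the `search_units` data here is made up, not real parser output):

```python
# Plain-set sketch of the include/exclude logic: '-' units subtract hits,
# while '+' and '|' units are both treated as OR (union).
included, excluded = set(), set()
search_units = [("+", {1, 2, 3}), ("|", {3, 4}), ("-", {2})]
for operator, matched_recs in search_units:
    if operator == "-":
        excluded |= matched_recs
    else:
        included |= matched_recs
result_hits = included - excluded
print(sorted(result_hits))  # → [1, 3, 4]
```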
def get_normalized_ranking_scores(ranked_result):
max_score = 0
for res in ranked_result:
if res[1] > max_score:
max_score = res[1]
normalized_ranked_result = []
for res in ranked_result:
normalized_score = int(100.0 / max_score * res[1]) if max_score else 0  # avoid division by zero when all scores are 0
normalized_ranked_result.append((res[0], normalized_score))
return normalized_ranked_result
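A standalone Python 3 sketch of this normalization (`normalize_scores` is an illustrative name, not the real API; it includes a guard for the all-zero case, which an unguarded division would fail on):

```python
def normalize_scores(ranked):
    """Scale (recid, score) pairs so the highest score becomes 100."""
    max_score = max((score for _, score in ranked), default=0)
    if not max_score:
        # All scores are zero: nothing to scale against
        return [(recid, 0) for recid, _ in ranked]
    return [(recid, int(100.0 / max_score * score)) for recid, score in ranked]

print(normalize_scores([(1, 4), (2, 8), (3, 2)]))  # → [(1, 50), (2, 100), (3, 25)]
```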
diff --git a/invenio/legacy/oaiharvest/daemon.py b/invenio/legacy/oaiharvest/daemon.py
index 1081a21d0..7beb83f54 100644
--- a/invenio/legacy/oaiharvest/daemon.py
+++ b/invenio/legacy/oaiharvest/daemon.py
@@ -1,1622 +1,1622 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
OAI Harvest daemon - harvest records from OAI repositories.
If started via CLI with --verb parameters, performs a manual single-shot
harvest. Otherwise starts a BibSched task for periodic harvesting of the
repositories defined in the OAI Harvest admin interface.
"""
__revision__ = "$Id$"
import os
import sys
import getopt
import getpass
import re
import time
import calendar
import shutil
import tempfile
import urlparse
import random
from invenio.config import \
CFG_BINDIR, \
CFG_TMPDIR, \
CFG_ETCDIR, \
CFG_INSPIRE_SITE, \
CFG_CERN_SITE, \
CFG_PLOTEXTRACTOR_DOWNLOAD_TIMEOUT, \
CFG_SITE_URL, \
CFG_OAI_FAILED_HARVESTING_STOP_QUEUE, \
CFG_OAI_FAILED_HARVESTING_EMAILS_ADMIN
from invenio.oai_harvest_config import InvenioOAIHarvestWarning
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import \
+from invenio.legacy.bibsched.bibtask import \
task_get_task_param, \
task_get_option, \
task_set_option, \
write_message, \
task_init, \
task_sleep_now_if_required, \
task_update_progress, \
task_low_level_submission
from invenio.legacy.bibrecord import record_extract_oai_id, create_records, \
create_record, record_add_fields, \
record_delete_fields, record_xml_output, \
record_get_field_instances, \
record_modify_subfield, \
record_has_field, field_xml_output
from invenio import oai_harvest_getter
from invenio.ext.logging import register_exception
from invenio.plotextractor_getter import harvest_single, make_single_directory
from invenio.plotextractor_converter import untar
from invenio.plotextractor import process_single, get_defaults
from invenio.utils.shell import run_shell_command, Timeout
from invenio.utils.text import translate_latex2unicode
-from invenio.bibedit_utils import record_find_matching_fields
-from invenio.bibcatalog import bibcatalog_system
+from invenio.legacy.bibedit.utils import record_find_matching_fields
+from invenio.legacy.bibcatalog.api import bibcatalog_system
import invenio.template
oaiharvest_templates = invenio.template.load('oai_harvest')
from invenio.ext.legacy.handler_flask import with_app_context
## precompile some often-used regexp for speed reasons:
REGEXP_OAI_ID = re.compile("<identifier.*?>(.*?)<\/identifier>", re.DOTALL)
REGEXP_RECORD = re.compile("<record.*?>(.*?)</record>", re.DOTALL)
REGEXP_REFS = re.compile("<record.*?>.*?<controlfield .*?>.*?</controlfield>(.*?)</record>", re.DOTALL)
REGEXP_AUTHLIST = re.compile("<collaborationauthorlist.*?</collaborationauthorlist>", re.DOTALL)
CFG_OAI_AUTHORLIST_POSTMODE_STYLESHEET = "%s/bibconvert/config/%s" % (CFG_ETCDIR, "authorlist2marcxml.xsl")
def get_nb_records_in_file(filename):
"""
Return the number of records in FILENAME, which is either a harvested
or a converted file. Useful for statistics.
"""
try:
nb = open(filename, 'r').read().count("</record>")
except IOError:
nb = 0 # file does not exist or could not be read
except StandardError:
nb = -1
return nb
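The counting trick above — one closing `</record>` tag per record — can be exercised in isolation (`count_records` is a hypothetical stand-in for get_nb_records_in_file):

```python
import os
import tempfile

def count_records(path):
    """Count closing </record> tags in a harvested/converted file."""
    try:
        with open(path, 'r') as fh:
            return fh.read().count("</record>")
    except IOError:
        return 0  # a missing file counts as zero records

fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, 'w') as fh:
    fh.write("<collection><record>a</record><record>b</record></collection>")
print(count_records(path))  # → 2
os.unlink(path)
print(count_records(path))  # → 0
```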
def task_run_core():
"""Run the harvesting task. The row argument is the oaiharvest task
queue row, containing if, arguments, etc.
Return 1 in case of success and 0 in case of failure.
"""
reposlist = []
datelist = []
dateflag = 0
filepath_prefix = "%s/oaiharvest_%s" % (CFG_TMPDIR, str(task_get_task_param("task_id")))
### go ahead: build up the reposlist
if task_get_option("repository") is not None:
### user requests harvesting from selected repositories
write_message("harvesting from selected repositories")
for reposname in task_get_option("repository"):
row = get_row_from_reposname(reposname)
if row == []:
write_message("source name %s is not valid" % (reposname,))
continue
else:
reposlist.append(get_row_from_reposname(reposname))
else:
### user requests harvesting from all repositories
write_message("harvesting from all repositories in the database")
reposlist = get_all_rows_from_db()
### go ahead: check if user requested from-until harvesting
if task_get_option("dates"):
### for each repos simply perform a from-until date harvesting...
### no need to update anything
dateflag = 1
for element in task_get_option("dates"):
datelist.append(element)
error_happened_p = 0 # 0: no error, 1: "recoverable" error (don't stop queue), 2: error (admin intervention needed)
j = 0
for repos in reposlist:
j += 1
task_sleep_now_if_required()
# Extract values from database row (in exact order):
# | id | baseurl | metadataprefix | arguments | comment
# | bibconvertcfgfile | name | lastrun | frequency
# | postprocess | setspecs | bibfilterprogram
source_id = repos[0][0]
baseurl = str(repos[0][1])
metadataprefix = str(repos[0][2])
bibconvert_cfgfile = str(repos[0][5])
reponame = str(repos[0][6])
lastrun = repos[0][7]
frequency = repos[0][8]
postmode = repos[0][9]
setspecs = str(repos[0][10])
bibfilterprogram = str(repos[0][11])
write_message("running in postmode %s" % (postmode,))
downloaded_material_dict = {}
harvested_files_list = []
# Harvest phase
harvestpath = "%s_%d_%s_" % (filepath_prefix, j, time.strftime("%Y%m%d%H%M%S"))
if dateflag == 1:
task_update_progress("Harvesting %s from %s to %s (%i/%i)" % \
(reponame, \
str(datelist[0]),
str(datelist[1]),
j, \
len(reposlist)))
exit_code, file_list = oai_harvest_get(prefix=metadataprefix,
baseurl=baseurl,
harvestpath=harvestpath,
fro=str(datelist[0]),
until=str(datelist[1]),
setspecs=setspecs)
if exit_code == 1 :
write_message("source %s was harvested from %s to %s" % \
(reponame, str(datelist[0]), str(datelist[1])))
harvested_files_list = file_list
else:
write_message("an error occurred while harvesting from source %s for the dates chosen:\n%s\n" % \
(reponame, file_list))
if error_happened_p < 1:
error_happened_p = 1
continue
elif dateflag != 1 and lastrun is None and frequency != 0:
write_message("source %s was never harvested before - harvesting whole repository" % \
(reponame,))
task_update_progress("Harvesting %s (%i/%i)" % \
(reponame,
j, \
len(reposlist)))
exit_code, file_list = oai_harvest_get(prefix=metadataprefix,
baseurl=baseurl,
harvestpath=harvestpath,
setspecs=setspecs)
if exit_code == 1 :
update_lastrun(source_id)
harvested_files_list = file_list
else :
write_message("an error occurred while harvesting from source %s:\n%s\n" % \
(reponame, file_list))
if error_happened_p < 1:
error_happened_p = 1
continue
elif dateflag != 1 and frequency != 0:
### check that update is actually needed,
### i.e. lastrun+frequency>today
timenow = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
lastrundate = re.sub(r'\.[0-9]+$', '',
str(lastrun)) # remove trailing .00
timeinsec = int(frequency) * 60 * 60
updatedue = add_timestamp_and_timelag(lastrundate, timeinsec)
proceed = compare_timestamps_with_tolerance(updatedue, timenow)
if proceed == 0 or proceed == -1 : #update needed!
write_message("source %s is going to be updated" % (reponame,))
fromdate = str(lastrun)
fromdate = fromdate.split()[0] # get rid of time of the day for the moment
task_update_progress("Harvesting %s (%i/%i)" % \
(reponame,
j, \
len(reposlist)))
exit_code, file_list = oai_harvest_get(prefix=metadataprefix,
baseurl=baseurl,
harvestpath=harvestpath,
fro=fromdate,
setspecs=setspecs)
if exit_code == 1 :
update_lastrun(source_id)
harvested_files_list = file_list
else :
write_message("an error occurred while harvesting from source %s:\n%s\n" % \
(reponame, file_list))
if error_happened_p < 1:
error_happened_p = 1
continue
else:
write_message("source %s does not need updating" % (reponame,))
continue
elif dateflag != 1 and frequency == 0:
write_message("source %s has frequency set to 'Never' so it will not be updated" % \
(reponame,))
continue
# Harvesting done, now convert/extract/filter/upload as requested
if len(harvested_files_list) < 1:
write_message("No records harvested for %s" % (reponame,))
continue
# Retrieve all OAI IDs and set active list
harvested_identifier_list = collect_identifiers(harvested_files_list)
active_files_list = harvested_files_list
if len(active_files_list) != len(harvested_identifier_list):
# Harvested files and their identifiers are 'out of sync'; abort harvest
write_message("Harvested files are missing identifiers for %s" % (reponame,))
continue
write_message("post-harvest processes started")
# Convert phase
if 'c' in postmode:
updated_files_list = []
i = 0
write_message("conversion step started")
for active_file in active_files_list:
i += 1
task_sleep_now_if_required()
task_update_progress("Converting material harvested from %s (%i/%i)" % \
(reponame, \
i, \
len(active_files_list)))
updated_file = "%s.converted" % (active_file.split('.')[0],)
updated_files_list.append(updated_file)
(exitcode, err_msg) = call_bibconvert(config=bibconvert_cfgfile,
harvestpath=active_file,
convertpath=updated_file)
if exitcode == 0:
write_message("harvested file %s was successfully converted" % \
(active_file,))
else:
write_message("an error occurred while converting %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for updated_file in updated_files_list:
write_message("File %s contains %i records." % \
(updated_file,
get_nb_records_in_file(updated_file)))
active_files_list = updated_files_list
write_message("conversion step ended")
# plotextract phase
if 'p' in postmode:
write_message("plotextraction step started")
# Download tarball for each harvested/converted record, then run plotextractor.
# Update converted xml files with generated xml or add it for upload
updated_files_list = []
i = 0
for active_file in active_files_list:
identifiers = harvested_identifier_list[i]
i += 1
task_sleep_now_if_required()
task_update_progress("Extracting plots from harvested material from %s (%i/%i)" % \
(reponame, i, len(active_files_list)))
updated_file = "%s.plotextracted" % (active_file.split('.')[0],)
updated_files_list.append(updated_file)
(exitcode, err_msg) = call_plotextractor(active_file,
updated_file,
identifiers,
downloaded_material_dict,
source_id)
if exitcode == 0:
if err_msg != "":
write_message("plots from %s was extracted, but with some errors:\n%s" % \
(active_file, err_msg))
else:
write_message("plots from %s was successfully extracted" % \
(active_file,))
else:
write_message("an error occurred while extracting plots from %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for updated_file in updated_files_list:
write_message("File %s contains %i records." % \
(updated_file,
get_nb_records_in_file(updated_file)))
active_files_list = updated_files_list
write_message("plotextraction step ended")
# refextract phase
if 'r' in postmode:
updated_files_list = []
i = 0
write_message("refextraction step started")
for active_file in active_files_list:
identifiers = harvested_identifier_list[i]
i += 1
task_sleep_now_if_required()
task_update_progress("Extracting references from material harvested from %s (%i/%i)" % \
(reponame, i, len(active_files_list)))
updated_file = "%s.refextracted" % (active_file.split('.')[0],)
updated_files_list.append(updated_file)
(exitcode, err_msg) = call_refextract(active_file,
updated_file,
identifiers,
downloaded_material_dict,
source_id)
if exitcode == 0:
if err_msg != "":
write_message("references from %s was extracted, but with some errors:\n%s" % \
(active_file, err_msg))
else:
write_message("references from %s was successfully extracted" % \
(active_file,))
else:
write_message("an error occurred while extracting references from %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for updated_file in updated_files_list:
write_message("File %s contains %i records." % \
(updated_file,
get_nb_records_in_file(updated_file)))
active_files_list = updated_files_list
write_message("refextraction step ended")
# authorlist phase
if 'a' in postmode:
write_message("authorlist extraction step started")
# Initialize BibCatalog connection as default user, if possible
if bibcatalog_system is not None:
bibcatalog_response = bibcatalog_system.check_system()
else:
bibcatalog_response = "No ticket system configured"
if bibcatalog_response != "":
write_message("BibCatalog error: %s\n" % (bibcatalog_response,))
updated_files_list = []
i = 0
for active_file in active_files_list:
identifiers = harvested_identifier_list[i]
i += 1
task_sleep_now_if_required()
task_update_progress("Extracting any authorlists from material harvested from %s (%i/%i)" % \
(reponame, i, len(active_files_list)))
updated_file = "%s.authextracted" % (active_file.split('.')[0],)
updated_files_list.append(updated_file)
(exitcode, err_msg) = call_authorlist_extract(active_file,
updated_file,
identifiers,
downloaded_material_dict,
source_id)
if exitcode == 0:
if err_msg != "":
write_message("authorlists from %s was extracted, but with some errors:\n%s" % \
(active_file, err_msg))
else:
write_message("any authorlists from %s was successfully extracted" % \
(active_file,))
else:
write_message("an error occurred while extracting authorlists from %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for updated_file in updated_files_list:
write_message("File %s contains %i records." % \
(updated_file,
get_nb_records_in_file(updated_file)))
active_files_list = updated_files_list
write_message("authorlist extraction step ended")
# fulltext phase
if 't' in postmode:
write_message("full-text attachment step started")
# Attaching fulltext
updated_files_list = []
i = 0
for active_file in active_files_list:
identifiers = harvested_identifier_list[i]
i += 1
task_sleep_now_if_required()
task_update_progress("Attaching fulltext to records harvested from %s (%i/%i)" % \
(reponame, i, len(active_files_list)))
updated_file = "%s.fulltext" % (active_file.split('.')[0],)
updated_files_list.append(updated_file)
(exitcode, err_msg) = call_fulltext(active_file,
updated_file,
identifiers,
downloaded_material_dict,
source_id)
if exitcode == 0:
write_message("fulltext from %s was successfully attached" % \
(active_file,))
else:
write_message("an error occurred while attaching fulltext to %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for updated_file in updated_files_list:
write_message("File %s contains %i records." % \
(updated_file,
get_nb_records_in_file(updated_file)))
active_files_list = updated_files_list
write_message("full-text attachment step ended")
# Filter-phase
if 'f' in postmode:
write_message("filtering step started")
# first call bibfilter:
res = 0
i = 0
for active_file in active_files_list:
i += 1
task_sleep_now_if_required()
task_update_progress("Filtering material harvested from %s (%i/%i)" % \
(reponame, \
i, \
len(active_files_list)))
(exitcode, err_msg) = call_bibfilter(bibfilterprogram, active_file)
if exitcode == 0:
write_message("%s was successfully bibfiltered" % \
(active_file,))
else:
write_message("an error occurred while bibfiltering %s:\n%s" % \
(active_file, err_msg))
error_happened_p = 2
continue
# print stats:
for active_file in active_files_list:
write_message("File %s contains %i records." % \
(active_file + ".insert.xml",
get_nb_records_in_file(active_file + ".insert.xml")))
write_message("File %s contains %i records." % \
(active_file + ".correct.xml",
get_nb_records_in_file(active_file + ".correct.xml")))
write_message("File %s contains %i records." % \
(active_file + ".append.xml",
get_nb_records_in_file(active_file + ".append.xml")))
write_message("File %s contains %i records." % \
(active_file + ".holdingpen.xml",
get_nb_records_in_file(active_file + ".holdingpen.xml")))
write_message("filtering step ended")
# Upload files
if "u" in postmode:
write_message("upload step started")
if 'f' in postmode:
upload_modes = [('.insert.xml', '-i'),
('.correct.xml', '-c'),
('.append.xml', '-a'),
('.holdingpen.xml', '-o')]
else:
upload_modes = [('', '-ir')]
i = 0
last_upload_task_id = -1
# Get a random sequence ID that will allow for the tasks to be
# run in order, regardless if parallel task execution is activated
sequence_id = random.randrange(1, 4294967296)
for active_file in active_files_list:
task_sleep_now_if_required()
i += 1
task_update_progress("Uploading records harvested from %s (%i/%i)" % \
(reponame, \
i, \
len(active_files_list)))
for suffix, mode in upload_modes:
upload_filename = active_file + suffix
if get_nb_records_in_file(upload_filename) == 0:
continue
last_upload_task_id = call_bibupload(upload_filename, \
[mode], \
source_id, \
sequence_id)
if not last_upload_task_id:
error_happened_p = 2
write_message("an error occurred while uploading %s from %s" % \
(upload_filename, reponame))
break
else:
write_message("material harvested from source %s was successfully uploaded" % \
(reponame,))
if len(active_files_list) == 0:
write_message("nothing to upload")
write_message("upload step ended")
if CFG_INSPIRE_SITE:
# Launch BibIndex,Webcoll update task to show uploaded content quickly
bibindex_params = ['-w', 'reportnumber,collection', \
'-P', '6', \
'-I', str(sequence_id), \
'--post-process', 'bst_run_bibtask[taskname="webcoll", user="oaiharvest", P="6", c="HEP"]']
task_low_level_submission("bibindex", "oaiharvest", *tuple(bibindex_params))
write_message("post-harvest processes ended")
if error_happened_p:
if CFG_OAI_FAILED_HARVESTING_STOP_QUEUE == 0 or \
not task_get_task_param("sleeptime") or \
error_happened_p > 1:
# Admin want BibSched to stop, or the task is not set to
# run at a later date: we must stop the queue.
write_message("An error occurred. Task is configured to stop")
return False
else:
# An error happened, but it can be recovered at next run
# (task is re-scheduled) and admin set BibSched to
# continue even after failure.
write_message("An error occurred, but task is configured to continue")
if CFG_OAI_FAILED_HARVESTING_EMAILS_ADMIN:
try:
raise InvenioOAIHarvestWarning("OAIHarvest (task #%s) failed at fully harvesting source(s) %s. BibSched has NOT been stopped, and OAIHarvest will try to recover at next run" % (task_get_task_param("task_id"), ", ".join([repo[0][6] for repo in reposlist]),))
except InvenioOAIHarvestWarning, e:
register_exception(stream='warning', alert_admin=True)
return True
else:
return True
def collect_identifiers(harvested_file_list):
"""Collects all OAI PMH identifiers from each file in the list
and adds them to a list of identifiers per file.
@param harvested_file_list: list of filepaths to harvested files
@return list of lists, containing each file's identifier list"""
result = []
for harvested_file in harvested_file_list:
try:
fd_active = open(harvested_file)
except IOError:
write_message("Error opening harvested file '%s'. Skipping.." % (harvested_file,))
continue
data = fd_active.read()
fd_active.close()
result.append(REGEXP_OAI_ID.findall(data))
return result
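The REGEXP_OAI_ID pattern this function relies on can be tried on a minimal fragment (`oai_id_re` mirrors the pattern defined earlier in this file; the sample identifiers are made up):

```python
import re

# Same pattern as REGEXP_OAI_ID above: non-greedy match of the tag body,
# DOTALL so identifiers spanning line breaks are still captured.
oai_id_re = re.compile(r"<identifier.*?>(.*?)</identifier>", re.DOTALL)

data = (
    "<record><header><identifier>oai:arXiv.org:1001.0001</identifier>"
    "</header></record>"
    "<record><header><identifier>oai:arXiv.org:1001.0002</identifier>"
    "</header></record>"
)
print(oai_id_re.findall(data))
# → ['oai:arXiv.org:1001.0001', 'oai:arXiv.org:1001.0002']
```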
def remove_duplicates(harvested_file_list):
"""
Go through a list of harvested files and remove any duplicate records.
"""
harvested_identifiers = []
for harvested_file in harvested_file_list:
# Firstly, rename original file to temporary name
try:
os.rename(harvested_file, "%s~" % (harvested_file,))
except OSError:
write_message("Error renaming harvested file '%s'. Skipping.." % (harvested_file,))
continue
# Secondly, open files for writing and reading
try:
updated_harvested_file = open(harvested_file, 'w')
original_harvested_file = open("%s~" % (harvested_file,))
except IOError:
write_message("Error opening harvested file '%s'. Skipping.." % (harvested_file,))
continue
data = original_harvested_file.read()
original_harvested_file.close()
# Get and write OAI-PMH XML header data to updated file
header_index_end = data.find("<ListRecords>") + len("<ListRecords>")
updated_harvested_file.write("%s\n" % (data[:header_index_end],))
# By checking the OAI ID we write all records not written previously (in any file)
harvested_records = REGEXP_RECORD.findall(data)
for record in harvested_records:
oai_identifier = REGEXP_OAI_ID.search(record)
if oai_identifier is not None and oai_identifier.group(1) not in harvested_identifiers:
updated_harvested_file.write("<record>%s</record>\n" % (record,))
harvested_identifiers.append(oai_identifier.group(1))
updated_harvested_file.write("</ListRecords>\n</OAI-PMH>\n")
updated_harvested_file.close()
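The core of the deduplication — keep a record only if its OAI identifier has not been seen in any earlier file — can be sketched without the file renaming and OAI-PMH header handling (`dedupe` and the sample XML are illustrative):

```python
import re

record_re = re.compile(r"<record.*?>(.*?)</record>", re.DOTALL)
oai_id_re = re.compile(r"<identifier.*?>(.*?)</identifier>", re.DOTALL)

def dedupe(xml, seen_ids):
    """Return records whose OAI identifier has not been seen before.

    seen_ids accumulates identifiers across calls, mirroring how the
    daemon carries harvested_identifiers across files.
    """
    kept = []
    for body in record_re.findall(xml):
        match = oai_id_re.search(body)
        if match is not None and match.group(1) not in seen_ids:
            seen_ids.append(match.group(1))
            kept.append("<record>%s</record>" % body)
    return kept

xml = ("<record><identifier>oai:x:1</identifier></record>"
       "<record><identifier>oai:x:1</identifier></record>"
       "<record><identifier>oai:x:2</identifier></record>")
print(len(dedupe(xml, [])))  # → 2
```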
def add_timestamp_and_timelag(timestamp,
timelag):
""" Adds a time lag in seconds to a given date (timestamp).
Returns the resulting date. """
# remove any trailing .00 in timestamp:
timestamp = re.sub(r'\.[0-9]+$', '', timestamp)
# first convert timestamp to Unix epoch seconds:
timestamp_seconds = calendar.timegm(time.strptime(timestamp,
"%Y-%m-%d %H:%M:%S"))
# now add them:
result_seconds = timestamp_seconds + timelag
result = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(result_seconds))
return result
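A minimal, runnable version of the same computation (`add_lag` is an illustrative name; it mirrors the timegm/gmtime round trip above, which is how the daemon decides whether lastrun + frequency is already due):

```python
import calendar
import time

def add_lag(timestamp, lag_seconds):
    """Add lag_seconds to a 'YYYY-MM-DD HH:MM:SS' timestamp."""
    # Convert to Unix epoch seconds, add the lag, format back
    seconds = calendar.timegm(time.strptime(timestamp, "%Y-%m-%d %H:%M:%S"))
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(seconds + lag_seconds))

# A repository harvested daily (frequency of 24h expressed in seconds):
print(add_lag("2013-05-01 12:00:00", 24 * 60 * 60))  # → 2013-05-02 12:00:00
```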
def update_lastrun(index):
""" A method that updates the lastrun of a repository
successfully harvested """
try:
today = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
sql = 'UPDATE oaiHARVEST SET lastrun=%s WHERE id=%s'
run_sql(sql, (today, index))
return 1
except StandardError, e:
return (0, e)
def oai_harvest_get(prefix, baseurl, harvestpath,
fro=None, until=None, setspecs=None,
user=None, password=None, cert_file=None,
key_file=None, method="POST"):
"""
Retrieve OAI records from given repository, with given arguments
"""
try:
(addressing_scheme, network_location, path, dummy1, \
dummy2, dummy3) = urlparse.urlparse(baseurl)
secure = (addressing_scheme == "https")
http_param_dict = {'verb': "ListRecords",
'metadataPrefix': prefix}
if fro:
http_param_dict['from'] = fro
if until:
http_param_dict['until'] = until
sets = None
if setspecs:
sets = [oai_set.strip() for oai_set in setspecs.split(' ')]
harvested_files = oai_harvest_getter.harvest(network_location, path, http_param_dict, method, harvestpath,
sets, secure, user, password, cert_file, key_file)
remove_duplicates(harvested_files)
return (1, harvested_files)
except (StandardError, oai_harvest_getter.InvenioOAIRequestError), e:
return (0, e)
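The HTTP parameter assembly above can be isolated into a small helper (`build_listrecords_params` is a hypothetical name; the sets and authentication handling are omitted):

```python
def build_listrecords_params(prefix, fro=None, until=None):
    """Assemble OAI-PMH ListRecords request parameters.

    'from' is a Python keyword, hence the 'fro' spelling used
    throughout the daemon; it becomes the 'from' request parameter.
    """
    params = {'verb': "ListRecords", 'metadataPrefix': prefix}
    if fro:
        params['from'] = fro
    if until:
        params['until'] = until
    return params

print(build_listrecords_params("arXiv", fro="2013-01-01", until="2013-02-01"))
```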
def call_bibconvert(config, harvestpath, convertpath):
""" Call BibConvert to convert file given at 'harvestpath' with
conversion template 'config', and save the result in file at
'convertpath'.
Returns status exit code of the conversion, as well as error
messages, if any
"""
exitcode, dummy, cmd_stderr = \
run_shell_command(cmd="%s/bibconvert -c %s < %s", \
args=(CFG_BINDIR, config, harvestpath), filename_out=convertpath)
return (exitcode, cmd_stderr)
def call_plotextractor(active_file, extracted_file, harvested_identifier_list, \
downloaded_files, source_id):
"""
Function that generates proper MARCXML containing harvested plots for
each record.
@param active_file: path to the currently processed file
@param extracted_file: path to the file where the final results will be saved
@param harvested_identifier_list: list of OAI identifiers for this active_file
@param downloaded_files: dict of identifier -> dict mappings for downloaded material.
@param source_id: the repository identifier
@type source_id: integer
@return: exitcode and any error messages as: (exitcode, err_msg)
"""
all_err_msg = []
exitcode = 0
# Read in active file
recs_fd = open(active_file, 'r')
records = recs_fd.read()
recs_fd.close()
# Find all record
record_xmls = REGEXP_RECORD.findall(records)
updated_xml = ['<?xml version="1.0" encoding="UTF-8"?>']
updated_xml.append('<collection>')
i = 0
for record_xml in record_xmls:
current_exitcode = 0
identifier = harvested_identifier_list[i]
i += 1
if identifier not in downloaded_files:
downloaded_files[identifier] = {}
updated_xml.append("<record>")
updated_xml.append(record_xml)
if not oaiharvest_templates.tmpl_should_process_record_with_mode(record_xml, 'p', source_id):
# We skip this record
updated_xml.append("</record>")
continue
if "tarball" not in downloaded_files[identifier]:
current_exitcode, err_msg, tarball, dummy = \
plotextractor_harvest(identifier, active_file, selection=["tarball"])
if current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append(err_msg)
else:
downloaded_files[identifier]["tarball"] = tarball
if current_exitcode == 0:
plotextracted_xml_path = process_single(downloaded_files[identifier]["tarball"])
if plotextracted_xml_path is not None:
# We store the path to the directory the tarball contents live
downloaded_files[identifier]["tarball-extracted"] = os.path.split(plotextracted_xml_path)[0]
# Read and grab MARCXML from plotextractor run
plotsxml_fd = open(plotextracted_xml_path, 'r')
plotextracted_xml = plotsxml_fd.read()
plotsxml_fd.close()
re_list = REGEXP_RECORD.findall(plotextracted_xml)
if re_list != []:
updated_xml.append(re_list[0])
updated_xml.append("</record>")
updated_xml.append('</collection>')
# Write to file
file_fd = open(extracted_file, 'w')
file_fd.write("\n".join(updated_xml))
file_fd.close()
if len(all_err_msg) > 0:
return exitcode, "\n".join(all_err_msg)
return exitcode, ""
def call_refextract(active_file, extracted_file, harvested_identifier_list,
downloaded_files, source_id):
"""
Function that calls refextract to extract references and attach them to
harvested records. It will download the fulltext-pdf for each identifier
if necessary.
@param active_file: path to the currently processed file
@param extracted_file: path to the file where the final results will be saved
@param harvested_identifier_list: list of OAI identifiers for this active_file
@param downloaded_files: dict of identifier -> dict mappings for downloaded material.
@param source_id: the repository identifier
@type source_id: integer
@return: exitcode and any error messages as: (exitcode, all_err_msg)
"""
all_err_msg = []
exitcode = 0
flag = ""
if CFG_INSPIRE_SITE == 1:
flag = "--inspire"
# Read in active file
recs_fd = open(active_file, 'r')
records = recs_fd.read()
recs_fd.close()
# Find all record
record_xmls = REGEXP_RECORD.findall(records)
updated_xml = ['<?xml version="1.0" encoding="UTF-8"?>']
updated_xml.append('<collection>')
i = 0
for record_xml in record_xmls:
current_exitcode = 0
identifier = harvested_identifier_list[i]
i += 1
if identifier not in downloaded_files:
downloaded_files[identifier] = {}
updated_xml.append("<record>")
updated_xml.append(record_xml)
if not oaiharvest_templates.tmpl_should_process_record_with_mode(record_xml, 'r', source_id):
# We skip this record
updated_xml.append("</record>")
continue
if "pdf" not in downloaded_files[identifier]:
current_exitcode, err_msg, dummy, pdf = \
plotextractor_harvest(identifier, active_file, selection=["pdf"])
if current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append(err_msg)
else:
downloaded_files[identifier]["pdf"] = pdf
if current_exitcode == 0:
current_exitcode, cmd_stdout, err_msg = run_shell_command(cmd="%s/refextract %s -f '%s'" % \
(CFG_BINDIR, flag, downloaded_files[identifier]["pdf"]))
if err_msg != "" or current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append("Error extracting references from id: %s\nError:%s" % \
(identifier, err_msg))
else:
references_xml = REGEXP_REFS.search(cmd_stdout)
if references_xml:
updated_xml.append(references_xml.group(1))
updated_xml.append("</record>")
updated_xml.append('</collection>')
# Write to file
file_fd = open(extracted_file, 'w')
file_fd.write("\n".join(updated_xml))
file_fd.close()
if len(all_err_msg) > 0:
return exitcode, "\n".join(all_err_msg)
return exitcode, ""
def call_authorlist_extract(active_file, extracted_file, harvested_identifier_list,
downloaded_files, source_id):
"""
Function that will look in harvested tarball for any authorlists. If found
it will extract and convert the authors using an XSLT stylesheet.
@param active_file: path to the currently processed file
@type active_file: string
@param extracted_file: path to the file where the final results will be saved
@type extracted_file: string
@param harvested_identifier_list: list of OAI identifiers for this active_file
@type harvested_identifier_list: list
@param downloaded_files: dict of identifier -> dict mappings for downloaded material.
@type downloaded_files: dict
@param source_id: the repository identifier
@type source_id: integer
@return: exitcode and any error messages as: (exitcode, all_err_msg)
@rtype: tuple
"""
all_err_msg = []
exitcode = 0
# Read in active file
recs_fd = open(active_file, 'r')
records = recs_fd.read()
recs_fd.close()
# Find all records
record_xmls = REGEXP_RECORD.findall(records)
updated_xml = ['<?xml version="1.0" encoding="UTF-8"?>']
updated_xml.append('<collection>')
i = 0
for record_xml in record_xmls:
current_exitcode = 0
identifier = harvested_identifier_list[i]
i += 1
if not oaiharvest_templates.tmpl_should_process_record_with_mode(record_xml, 'a', source_id):
# We skip this record
updated_xml.append("<record>")
updated_xml.append(record_xml)
updated_xml.append("</record>")
continue
# Grab BibRec instance of current record for later amending
existing_record, status_code, dummy1 = create_record("<record>%s</record>" % (record_xml,))
if status_code == 0:
all_err_msg.append("Error parsing record, skipping authorlist extraction of: %s\n" % \
(identifier,))
updated_xml.append("<record>%s</record>" % (record_xml,))
continue
if identifier not in downloaded_files:
downloaded_files[identifier] = {}
if "tarball" not in downloaded_files[identifier]:
current_exitcode, err_msg, tarball, dummy = \
plotextractor_harvest(identifier, active_file, selection=["tarball"])
if current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append(err_msg)
else:
downloaded_files[identifier]["tarball"] = tarball
if current_exitcode == 0:
current_exitcode, err_msg, authorlist_xml_path = authorlist_extract(downloaded_files[identifier]["tarball"], \
identifier, downloaded_files)
if current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append("Error extracting authors from id: %s\nError:%s" % \
(identifier, err_msg))
elif authorlist_xml_path is not None:
## Authorlist found
# Read and create BibRec
xml_fd = open(authorlist_xml_path, 'r')
author_xml = xml_fd.read()
xml_fd.close()
authorlist_record = create_records(author_xml)
if len(authorlist_record) == 1:
if authorlist_record[0][0] is None:
all_err_msg.append("Error parsing authorlist record for id: %s" % \
(identifier,))
continue
authorlist_record = authorlist_record[0][0]
# Convert any LaTeX symbols in authornames
translate_fieldvalues_from_latex(authorlist_record, '100', code='a')
translate_fieldvalues_from_latex(authorlist_record, '700', code='a')
# Look for any UNDEFINED fields in authorlist
key = "UNDEFINED"
matching_fields = record_find_matching_fields(key, authorlist_record, tag='100') \
+ record_find_matching_fields(key, authorlist_record, tag='700')
if len(matching_fields) > 0 and bibcatalog_system != None:
# UNDEFINED found. Create ticket in author queue
ticketid = create_authorlist_ticket(matching_fields, identifier)
if ticketid:
write_message("authorlist RT ticket %d submitted for %s" % (ticketid, identifier))
else:
all_err_msg.append("Error while submitting RT ticket for %s" % (identifier,))
# Replace 100,700 fields of original record with extracted fields
record_delete_fields(existing_record, '100')
record_delete_fields(existing_record, '700')
first_author = record_get_field_instances(authorlist_record, '100')
additional_authors = record_get_field_instances(authorlist_record, '700')
record_add_fields(existing_record, '100', first_author)
record_add_fields(existing_record, '700', additional_authors)
updated_xml.append(record_xml_output(existing_record))
updated_xml.append('</collection>')
# Write to file
file_fd = open(extracted_file, 'w')
file_fd.write("\n".join(updated_xml))
file_fd.close()
if len(all_err_msg) > 0:
return exitcode, "\n".join(all_err_msg)
return exitcode, ""
def call_fulltext(active_file, extracted_file, harvested_identifier_list,
downloaded_files, source_id):
"""
Function that attaches an FFT tag for the full-text PDF to harvested records.
It will download the fulltext-pdf for each identifier if necessary.
@param active_file: path to the currently processed file
@param extracted_file: path to the file where the final results will be saved
@param harvested_identifier_list: list of OAI identifiers for this active_file
@param downloaded_files: dict of identifier -> dict mappings for downloaded material.
@return: exitcode and any error messages as: (exitcode, err_msg)
"""
all_err_msg = []
exitcode = 0
# Read in active file
recs_fd = open(active_file, 'r')
records = recs_fd.read()
recs_fd.close()
# Set doctype FIXME: Remove when parameters are introduced to post-process steps
if CFG_INSPIRE_SITE == 1:
doctype = "arXiv"
elif CFG_CERN_SITE == 1:
doctype = ""
else:
doctype = ""
# Find all records
record_xmls = REGEXP_RECORD.findall(records)
updated_xml = ['<?xml version="1.0" encoding="UTF-8"?>']
updated_xml.append('<collection>')
i = 0
for record_xml in record_xmls:
current_exitcode = 0
identifier = harvested_identifier_list[i]
i += 1
if identifier not in downloaded_files:
downloaded_files[identifier] = {}
updated_xml.append("<record>")
updated_xml.append(record_xml)
if not oaiharvest_templates.tmpl_should_process_record_with_mode(record_xml, 'p', source_id):
# We skip this record
updated_xml.append("</record>")
continue
if "pdf" not in downloaded_files[identifier]:
current_exitcode, err_msg, dummy, pdf = \
plotextractor_harvest(identifier, active_file, selection=["pdf"])
if current_exitcode != 0:
exitcode = current_exitcode
all_err_msg.append(err_msg)
else:
downloaded_files[identifier]["pdf"] = pdf
if current_exitcode == 0:
fulltext_xml = """ <datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(url)s</subfield>
<subfield code="t">%(doctype)s</subfield>
</datafield>""" % {'url': downloaded_files[identifier]["pdf"],
'doctype': doctype}
updated_xml.append(fulltext_xml)
updated_xml.append("</record>")
updated_xml.append('</collection>')
# Write to file
file_fd = open(extracted_file, 'w')
file_fd.write("\n".join(updated_xml))
file_fd.close()
if len(all_err_msg) > 0:
return exitcode, "\n".join(all_err_msg)
return exitcode, ""
def authorlist_extract(tarball_path, identifier, downloaded_files):
"""
Try to extract the tarball given, if not already extracted, and look for
any XML files that could be authorlists. If any are found, use an XSLT stylesheet
to transform the authorlist into MARCXML author fields, and return the full path
of the resulting conversion.
@param tarball_path: path to the tarball to check
@type tarball_path: string
@param identifier: OAI Identifier to the current record
@type identifier: string
@param downloaded_files: dict of identifier -> dict mappings for downloaded material.
@type downloaded_files: dict
@return: path to converted authorlist together with exitcode and any error messages as:
(exitcode, err_msg, authorlist_path)
@rtype: tuple
"""
all_err_msg = []
exitcode = 0
if "tarball-extracted" not in downloaded_files[identifier]:
# tarball has not been extracted
tar_dir, dummy = get_defaults(tarball=tarball_path, sdir=CFG_TMPDIR, refno_url="")
try:
dummy = untar(tarball_path, tar_dir)
except Timeout:
all_err_msg.append("Timeout during tarball extraction of %s" % (tarball_path,))
exitcode = 1
return exitcode, "\n".join(all_err_msg), None
downloaded_files[identifier]["tarball-extracted"] = tar_dir
# tarball is now surely extracted, so we try to fetch all XML in the folder
xml_files_list = find_matching_files(downloaded_files[identifier]["tarball-extracted"], \
["xml"])
# Try to convert authorlist candidates, returning on first success
for xml_file in xml_files_list:
xml_file_fd = open(xml_file, "r")
xml_content = xml_file_fd.read()
xml_file_fd.close()
match = REGEXP_AUTHLIST.findall(xml_content)
if match != []:
tempfile_fd, temp_authorlist_path = tempfile.mkstemp(suffix=".xml", prefix="authorlist_temp", dir=CFG_TMPDIR)
os.write(tempfile_fd, match[0])
os.close(tempfile_fd)
# Generate file to store conversion results
newfile_fd, authorlist_resultxml_path = tempfile.mkstemp(suffix=".xml", prefix="authorlist_MARCXML", \
dir=downloaded_files[identifier]["tarball-extracted"])
os.close(newfile_fd)
exitcode, cmd_stderr = call_bibconvert(config=CFG_OAI_AUTHORLIST_POSTMODE_STYLESHEET, \
harvestpath=temp_authorlist_path, \
convertpath=authorlist_resultxml_path)
if cmd_stderr == "" and exitcode == 0:
# Success!
return 0, "", authorlist_resultxml_path
# No valid authorlist found
return 0, "", None
def plotextractor_harvest(identifier, active_file, selection=["pdf", "tarball"]):
"""
Function that calls the plotextractor library to download the selected material,
i.e. tarball or pdf, for the passed identifier. Returns paths to the respective files.
@param identifier: OAI identifier of the record to harvest
@param active_file: path to the currently processed file
@param selection: list of materials to harvest
@return: exitcode, errormessages and paths to harvested tarball and fulltexts
(exitcode, err_msg, tarball, pdf)
"""
all_err_msg = []
exitcode = 0
active_dir, active_name = os.path.split(active_file)
# turn oaiharvest_23_1_20110214161632_converted -> oaiharvest_23_1_material
# to keep harvested material in the same folder structure
active_name = "_".join(active_name.split('_')[:-2]) + "_material"
extract_path = make_single_directory(active_dir, active_name)
tarball, pdf = harvest_single(identifier, extract_path, selection)
time.sleep(CFG_PLOTEXTRACTOR_DOWNLOAD_TIMEOUT)
if tarball is None and "tarball" in selection:
all_err_msg.append("Error harvesting tarball from id: %s %s" % \
(identifier, extract_path))
exitcode = 1
if pdf is None and "pdf" in selection:
all_err_msg.append("Error harvesting full-text from id: %s %s" % \
(identifier, extract_path))
exitcode = 1
return exitcode, "\n".join(all_err_msg), tarball, pdf
def find_matching_files(basedir, filetypes):
"""
This function tries to find all files matching the given filetypes by looking at
all the files and filenames in the given directory, including subdirectories.
@param basedir: full path to base directory to search in
@type basedir: string
@param filetypes: list of filetypes, extensions
@type filetypes: list
@return: list of full paths to the matching files
@rtype: list
"""
files_list = []
for dirpath, dummy0, filenames in os.walk(basedir):
for filename in filenames:
full_path = os.path.join(dirpath, filename)
dummy1, cmd_out, dummy2 = run_shell_command('file %s', (full_path,))
for filetype in filetypes:
if cmd_out.lower().find(filetype) > -1:
files_list.append(full_path)
elif filename.split('.')[-1].lower() == filetype:
files_list.append(full_path)
return files_list
def translate_fieldvalues_from_latex(record, tag, code='', encoding='utf-8'):
"""
Given a record and field tag, this function will modify the record by
translating the subfield values of found fields from LaTeX to chosen
encoding for all the subfields with given code (or all if no code is given).
@param record: record to modify, in BibRec style structure
@type record: dict
@param tag: tag of fields to modify
@type tag: string
@param code: restrict the translation to a given subfield code
@type code: string
@param encoding: character encoding for the new value. Defaults to UTF-8.
@type encoding: string
"""
field_list = record_get_field_instances(record, tag)
for field in field_list:
subfields = field[0]
subfield_index = 0
for subfield_code, subfield_value in subfields:
if code == '' or subfield_code == code:
newvalue = translate_latex2unicode(subfield_value).encode(encoding)
record_modify_subfield(record, tag, subfield_code, newvalue, \
subfield_index, field_position_global=field[4])
subfield_index += 1
def create_authorlist_ticket(matching_fields, identifier):
"""
This function will submit a ticket generated by UNDEFINED affiliations
in extracted authors from collaboration authorlists.
@param matching_fields: list of (tag, field_instances) for UNDEFINED nodes
@type matching_fields: list
@param identifier: OAI identifier of record
@type identifier: string
@return: return the ID of the created ticket, or None on failure
@rtype: int or None
"""
if bibcatalog_system is None:
return None
subject = "[OAI Harvest] UNDEFINED affiliations for record %s" % (identifier,)
text = """
Harvested record with identifier %(ident)s has had its authorlist extracted and contains some UNDEFINED affiliations.
To see the record, go here: %(baseurl)s/search?p=%(ident)s
If the record is not there yet, try again later. It may take some time for it to load into the system.
List of unidentified fields:
%(fields)s
""" % {
'ident' : identifier,
'baseurl' : CFG_SITE_URL,
'fields' : "\n".join([field_xml_output(field, tag) for tag, field_instances in matching_fields \
for field in field_instances])
}
queue = "Authors"
ticketid = bibcatalog_system.ticket_submit(subject=subject, queue=queue)
if bibcatalog_system.ticket_comment(None, ticketid, text) is None:
write_message("Error: commenting on ticket %s failed." % (str(ticketid),))
return ticketid
def create_oaiharvest_log(task_id, oai_src_id, marcxmlfile):
"""
Function which creates the harvesting logs
@param task_id: bibupload task id
"""
file_fd = open(marcxmlfile, "r")
xml_content = file_fd.read(-1)
file_fd.close()
create_oaiharvest_log_str(task_id, oai_src_id, xml_content)
def create_oaiharvest_log_str(task_id, oai_src_id, xml_content):
"""
Function which creates the harvesting logs
@param task_id: bibupload task id
"""
try:
records = create_records(xml_content)
for record in records:
oai_id = record_extract_oai_id(record[0])
query = "INSERT INTO oaiHARVESTLOG (id_oaiHARVEST, oai_id, date_harvested, bibupload_task_id) VALUES (%s, %s, NOW(), %s)"
run_sql(query, (str(oai_src_id), str(oai_id), str(task_id)))
except Exception, msg:
print "Logging exception : %s " % (str(msg),)
def call_bibupload(marcxmlfile, mode=None, oai_src_id= -1, sequence_id=None):
"""
Creates a bibupload task for the task scheduler in the given mode
on the given file. Returns the generated task id and logs the event
in oaiHARVESTLOGS, also adding any given oai source identifier.
@param marcxmlfile: base-marcxmlfilename to upload
@param mode: mode to upload in
@param oai_src_id: id of current source config
@param sequence_id: sequence-number, if relevant
@return: task_id if successful, otherwise None.
"""
if mode is None:
mode = ["-r", "-i"]
if os.path.exists(marcxmlfile):
try:
args = mode
# Add job with priority 6 (above normal bibedit tasks) and file to upload to arguments
#FIXME: allow per-harvest arguments
args.extend(["-P", "6", marcxmlfile])
if sequence_id:
args.extend(['-I', str(sequence_id)])
task_id = task_low_level_submission("bibupload", "oaiharvest", *tuple(args))
create_oaiharvest_log(task_id, oai_src_id, marcxmlfile)
except Exception, msg:
write_message("An exception occurred while submitting the oaiharvest task: %s" % (str(msg),))
return None
return task_id
else:
write_message("marcxmlfile %s does not exist" % (marcxmlfile,))
return None
def call_bibfilter(bibfilterprogram, marcxmlfile):
"""
Call bibfilter program BIBFILTERPROGRAM on MARCXMLFILE, which is usually
run before uploading records.
The bibfilter should produce up to four files called MARCXMLFILE.insert.xml,
MARCXMLFILE.correct.xml, MARCXMLFILE.append.xml and MARCXMLFILE.holdingpen.xml.
The first file contains parts of MARCXML to be uploaded in insert mode,
the second file is uploaded in correct mode, third in append mode and the last file
contains MARCXML to be uploaded into the holding pen.
@param bibfilterprogram: path to bibfilter script to run
@param marcxmlfile: base-marcxmlfilename
@return: exitcode and any error messages as: (exitcode, err_msg)
"""
all_err_msg = []
exitcode = 0
if bibfilterprogram:
if not os.path.isfile(bibfilterprogram):
all_err_msg.append("bibfilterprogram %s is not a file" %
(bibfilterprogram,))
exitcode = 1
elif not os.path.isfile(marcxmlfile):
all_err_msg.append("marcxmlfile %s is not a file" % (marcxmlfile,))
exitcode = 1
else:
exitcode, dummy, cmd_stderr = run_shell_command(cmd="%s '%s'", \
args=(bibfilterprogram, \
marcxmlfile))
if exitcode != 0 or cmd_stderr != "":
all_err_msg.append("Error while running filtering script on %s\nError:%s" % \
(marcxmlfile, cmd_stderr))
else:
try:
all_err_msg.append("no bibfilterprogram defined, copying %s only" %
(marcxmlfile,))
shutil.copy(marcxmlfile, marcxmlfile + ".insert.xml")
except:
all_err_msg.append("cannot copy %s into %s.insert.xml" % (marcxmlfile, marcxmlfile))
exitcode = 1
return exitcode, "\n".join(all_err_msg)
def get_row_from_reposname(reposname):
""" Returns all information about a row (OAI source)
from the source name """
try:
sql = """SELECT id, baseurl, metadataprefix, arguments,
comment, bibconvertcfgfile, name, lastrun,
frequency, postprocess, setspecs,
bibfilterprogram
FROM oaiHARVEST WHERE name=%s"""
res = run_sql(sql, (reposname,))
reposdata = []
for element in res:
reposdata.append(element)
return reposdata
except StandardError, e:
return (0, e)
def get_all_rows_from_db():
""" This method retrieves the full database of repositories and returns
a list containing (in exact order):
| id | baseurl | metadataprefix | arguments | comment
| bibconvertcfgfile | name | lastrun | frequency
| postprocess | setspecs | bibfilterprogram
"""
try:
reposlist = []
sql = """SELECT id FROM oaiHARVEST"""
idlist = run_sql(sql)
for index in idlist:
sql = """SELECT id, baseurl, metadataprefix, arguments,
comment, bibconvertcfgfile, name, lastrun,
frequency, postprocess, setspecs,
bibfilterprogram
FROM oaiHARVEST WHERE id=%s"""
reposelements = run_sql(sql, (index[0],))
repos = []
for element in reposelements:
repos.append(element)
reposlist.append(repos)
return reposlist
except StandardError, e:
return (0, e)
def compare_timestamps_with_tolerance(timestamp1,
timestamp2,
tolerance=0):
"""Compare two timestamps TIMESTAMP1 and TIMESTAMP2, of the form
'2005-03-31 17:37:26'. Optionally receives a TOLERANCE argument
(in seconds). Return -1 if TIMESTAMP1 is less than TIMESTAMP2
minus TOLERANCE, 0 if they are equal within TOLERANCE limit,
and 1 if TIMESTAMP1 is greater than TIMESTAMP2 plus TOLERANCE.
"""
# remove any trailing fractional seconds in the timestamps:
timestamp1 = re.sub(r'\.[0-9]+$', '', timestamp1)
timestamp2 = re.sub(r'\.[0-9]+$', '', timestamp2)
# first convert timestamps to Unix epoch seconds:
timestamp1_seconds = calendar.timegm(time.strptime(timestamp1,
"%Y-%m-%d %H:%M:%S"))
timestamp2_seconds = calendar.timegm(time.strptime(timestamp2,
"%Y-%m-%d %H:%M:%S"))
# now compare them:
if timestamp1_seconds < timestamp2_seconds - tolerance:
return -1
elif timestamp1_seconds > timestamp2_seconds + tolerance:
return 1
else:
return 0
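The comparison above can be illustrated with a small self-contained sketch that mirrors its logic (a re-derivation for illustration, not a call into the module; names are hypothetical):

```python
import calendar
import time

def to_epoch(ts):
    # '2005-03-31 17:37:26' -> Unix epoch seconds (interpreted as UTC)
    return calendar.timegm(time.strptime(ts, "%Y-%m-%d %H:%M:%S"))

def compare_with_tolerance(ts1, ts2, tolerance=0):
    s1, s2 = to_epoch(ts1), to_epoch(ts2)
    if s1 < s2 - tolerance:
        return -1
    elif s1 > s2 + tolerance:
        return 1
    return 0  # equal within the tolerance window
```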
def get_dates(dates):
""" A method to validate and process the dates input by the user
at the command line """
twodates = []
if dates:
datestring = dates.split(":")
if len(datestring) == 2:
for date in datestring:
### perform some checks on the date format
datechunks = date.split("-")
if len(datechunks) == 3:
try:
if int(datechunks[0]) and int(datechunks[1]) and \
int(datechunks[2]):
twodates.append(date)
except StandardError:
write_message("Dates have invalid format, not "
"'yyyy-mm-dd:yyyy-mm-dd'")
twodates = None
return twodates
else:
write_message("Dates have invalid format, not "
"'yyyy-mm-dd:yyyy-mm-dd'")
twodates = None
return twodates
## final check: date1 must be smaller than date2
date1 = str(twodates[0]) + " 01:00:00"
date2 = str(twodates[1]) + " 01:00:00"
if compare_timestamps_with_tolerance(date1, date2) != -1:
write_message("First date must be before second date.")
twodates = None
return twodates
else:
write_message("Dates have invalid format, not "
"'yyyy-mm-dd:yyyy-mm-dd'")
twodates = None
else:
twodates = None
return twodates
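A compact sketch of the same 'yyyy-mm-dd:yyyy-mm-dd' validation, assuming well-formed ISO dates so plain string comparison suffices for the ordering check (the helper name is illustrative):

```python
import re

# one colon separating two ISO-style dates
DATE_RANGE = re.compile(r'^\d{4}-\d{2}-\d{2}:\d{4}-\d{2}-\d{2}$')

def split_date_range(dates):
    """Return [date1, date2] for a valid range string, else None."""
    if dates and DATE_RANGE.match(dates):
        first, second = dates.split(':')
        if first < second:  # ISO dates order correctly as strings
            return [first, second]
    return None
```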
def get_repository_names(repositories):
""" A method to validate and process the repository names input by the
user at the command line """
repository_names = []
if repositories:
names = repositories.split(",")
for name in names:
### take into account both single word names and multiple word
### names (which get wrapped in "" or '')
name = name.strip()
if name.startswith("'"):
name = name.strip("'")
elif name.startswith('"'):
name = name.strip('"')
repository_names.append(name)
else:
repository_names = None
return repository_names
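The comma-splitting and quote-stripping convention above can be sketched as a standalone helper (the name is illustrative):

```python
def parse_repository_names(repositories):
    """Split a comma-separated list, stripping surrounding quotes."""
    names = []
    for name in repositories.split(","):
        name = name.strip()
        # multi-word names may be wrapped in single or double quotes
        if name[:1] in ("'", '"'):
            name = name.strip(name[0])
        names.append(name)
    return names
```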
def usage(exitcode=0, msg=""):
"Print out info. Only used when run in 'manual' harvesting mode"
sys.stderr.write("*Manual single-shot harvesting mode*\n")
if msg:
sys.stderr.write(msg + "\n")
sys.exit(exitcode)
@with_app_context
def main():
"""Starts the tool.
If the command line arguments are those of the 'manual' mode, then
starts a manual one-time harvesting. Else trigger a BibSched task
for automated harvesting based on the OAIHarvest admin settings.
"""
# Let's try to parse the arguments as used in manual harvesting:
try:
opts, args = getopt.getopt(sys.argv[1:], "o:v:m:p:i:s:f:u:r:x:c:k:w:l:",
["output=",
"verb=",
"method=",
"metadataPrefix=",
"identifier=",
"set=",
"from=",
"until=",
"resumptionToken=",
"certificate=",
"key=",
"user=",
"password="]
)
# So everything went smoothly: start harvesting in manual mode
if len([opt for opt, opt_value in opts if opt in ['-v', '--verb']]) > 0:
# verb parameter is given
http_param_dict = {}
method = "POST"
output = ""
user = None
password = None
cert_file = None
key_file = None
sets = []
# get options and arguments
for opt, opt_value in opts:
if opt in ["-v", "--verb"]:
http_param_dict['verb'] = opt_value
elif opt in ["-m", '--method']:
if opt_value == "GET" or opt_value == "POST":
method = opt_value
elif opt in ["-p", "--metadataPrefix"]:
http_param_dict['metadataPrefix'] = opt_value
elif opt in ["-i", "--identifier"]:
http_param_dict['identifier'] = opt_value
elif opt in ["-s", "--set"]:
sets = opt_value.split()
elif opt in ["-f", "--from"]:
http_param_dict['from'] = opt_value
elif opt in ["-u", "--until"]:
http_param_dict['until'] = opt_value
elif opt in ["-r", "--resumptionToken"]:
http_param_dict['resumptionToken'] = opt_value
elif opt in ["-o", "--output"]:
output = opt_value
elif opt in ["-c", "--certificate"]:
cert_file = opt_value
elif opt in ["-k", "--key"]:
key_file = opt_value
elif opt in ["-l", "--user"]:
user = opt_value
elif opt in ["-w", "--password"]:
password = opt_value
elif opt in ["-V", "--version"]:
print __revision__
sys.exit(0)
else:
usage(1, "Option %s is not allowed" % opt)
if len(args) > 0:
base_url = args[-1]
if not base_url.lower().startswith('http'):
base_url = 'http://' + base_url
(addressing_scheme, network_location, path, dummy1, \
dummy2, dummy3) = urlparse.urlparse(base_url)
secure = (addressing_scheme == "https")
if (cert_file and not key_file) or \
(key_file and not cert_file):
# Both are needed if one specified
usage(1, "You must specify both certificate and key files")
if password and not user:
# User must be specified when password is given
usage(1, "You must specify a username")
elif user and not password:
if not secure:
sys.stderr.write("*WARNING* Your password will be sent in clear!\n")
try:
password = getpass.getpass()
except KeyboardInterrupt, error:
sys.stderr.write("\n%s\n" % (error,))
sys.exit(0)
oai_harvest_getter.harvest(network_location, path,
http_param_dict, method,
output, sets, secure, user,
password, cert_file,
key_file)
sys.stderr.write("Harvesting completed at: %s\n\n" %
time.strftime("%Y-%m-%d %H:%M:%S --> ", time.localtime()))
return
else:
usage(1, "You must specify the URL to harvest")
else:
# verb is not given. We will continue with periodic
# harvesting. But first check if URL parameter is given:
# if it is, then warn directly now
if len(args) > 1 or \
(len(args) == 1 and not args[0].isdigit()):
usage(1, "You must specify the --verb parameter")
except getopt.error, e:
# So could it be that we are using different arguments? Try to
# start the BibSched task (automated harvesting) and see if it
# validates
pass
# BibSched mode - periodical harvesting
# Note that the 'help' is common to both manual and automated
# mode.
task_set_option("repository", None)
task_set_option("dates", None)
task_init(authorization_action='runoaiharvest',
authorization_msg="oaiharvest Task Submission",
description="""
Harvest records from OAI sources.
Manual vs automatic harvesting:
- Manual harvesting retrieves records from the specified URL,
with the specified OAI arguments. Harvested records are displayed
on the standard output or saved to a file, but are not integrated
into the repository. This mode is useful to 'play' with OAI
repositories or to build special harvesting scripts.
- Automatic harvesting relies on the settings defined in the OAI
Harvest admin interface to periodically retrieve the repositories
and sets to harvest. It also takes care of harvesting only new or
modified records. Records harvested using this mode are converted
and integrated into the repository, according to the settings
defined in the OAI Harvest admin interface.
Examples:
Manual (single-shot) harvesting mode:
Save to /tmp/z.xml records from CDS added/modified between 2004-04-01
and 2004-04-02, in MARCXML:
$ oaiharvest -vListRecords -f2004-04-01 -u2004-04-02 -pmarcxml -o/tmp/z.xml http://cds.cern.ch/oai2d
Automatic (periodical) harvesting mode:
Schedule daily harvesting of all repositories defined in OAIHarvest admin:
$ oaiharvest -s 24h
Schedule daily harvesting of repository 'arxiv', defined in OAIHarvest admin:
$ oaiharvest -r arxiv -s 24h
Harvest in 10 minutes from 'pubmed' repository records added/modified
between 2005-05-05 and 2005-05-10:
$ oaiharvest -r pubmed -d 2005-05-05:2005-05-10 -t 10m
""",
help_specific_usage='Manual single-shot harvesting mode:\n'
' -o, --output specify output file\n'
' -v, --verb OAI verb to be executed\n'
' -m, --method http method (default POST)\n'
' -p, --metadataPrefix metadata format\n'
' -i, --identifier OAI identifier\n'
' -s, --set OAI set(s). Whitespace-separated list\n'
' -r, --resumptionToken Resume previous harvest\n'
' -f, --from from date (datestamp)\n'
' -u, --until until date (datestamp)\n'
' -c, --certificate path to public certificate (in case of certificate-based harvesting)\n'
' -k, --key path to private key (in case of certificate-based harvesting)\n'
' -l, --user username (in case of password-protected harvesting)\n'
' -w, --password password (in case of password-protected harvesting)\n'
'Automatic periodical harvesting mode:\n'
' -r, --repository="repo A"[,"repo B"] \t which repositories to harvest (default=all)\n'
' -d, --dates=yyyy-mm-dd:yyyy-mm-dd \t reharvest given dates only\n',
version=__revision__,
specific_params=("r:d:", ["repository=", "dates=", ]),
task_submit_elaborate_specific_parameter_fnc=
task_submit_elaborate_specific_parameter,
task_run_fnc=task_run_core)
def task_submit_elaborate_specific_parameter(key, value, opts, args):
"""Elaborate specific cli parameters for oaiharvest."""
if key in ("-r", "--repository"):
task_set_option('repository', get_repository_names(value))
elif key in ("-d", "--dates"):
task_set_option('dates', get_dates(value))
if value is not None and task_get_option("dates") is None:
raise StandardError, "Date format not valid."
else:
return False
return True
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/oairepository/updater.py b/invenio/legacy/oairepository/updater.py
index ee1b5a960..07173935f 100644
--- a/invenio/legacy/oairepository/updater.py
+++ b/invenio/legacy/oairepository/updater.py
@@ -1,523 +1,523 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""OAI Repository administration tool -
Updates the metadata of the records to include OAI identifiers and
OAI SetSpec according to the settings defined in OAI Repository
admin interface
"""
import os
import sys
import time
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from tempfile import mkstemp
from pprint import pformat
from invenio.config import \
CFG_OAI_ID_FIELD, \
CFG_OAI_ID_PREFIX, \
CFG_OAI_SET_FIELD, \
CFG_OAI_PREVIOUS_SET_FIELD, \
CFG_SITE_NAME, \
CFG_TMPDIR
from invenio.oai_repository_config import CFG_OAI_REPOSITORY_MARCXML_SIZE, \
CFG_OAI_REPOSITORY_GLOBAL_SET_SPEC
from invenio.legacy.search_engine import perform_request_search, get_record, search_unit_in_bibxxx
from invenio.intbitset import intbitset
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import \
+from invenio.legacy.bibsched.bibtask import \
task_get_option, \
task_set_option, \
write_message, \
task_update_progress, \
task_init, \
task_sleep_now_if_required, \
task_low_level_submission
from invenio.legacy.bibrecord import \
record_get_field_value, \
record_get_field_values, \
record_add_field, \
record_xml_output
def get_set_definitions(set_spec):
"""
Retrieve set definitions from oaiREPOSITORY table.
The set definitions are the search patterns that define the records
which are in the set
"""
set_definitions = []
query = "select setName, setDefinition from oaiREPOSITORY where setSpec=%s"
res = run_sql(query, (set_spec, ))
for (set_name, set_definition) in res:
params = parse_set_definition(set_definition)
params['setSpec'] = set_spec
params['setName'] = set_name
set_definitions.append(params)
return set_definitions
def parse_set_definition(set_definition):
"""
Returns the parameters for the given set definition.
The returned structure is a dictionary with keys being
c, p1, f1, m1, p2, f2, m2, p3, f3, m3 and corresponding values
@param set_definition: a string as returned by the database for column 'setDefinition'
@return: a dictionary
"""
params = {'c':'',
'p1':'', 'f1':'', 'm1':'',
'p2':'', 'f2':'', 'm2':'',
'p3':'', 'f3':'', 'm3':'',
'op1':'a', 'op2':'a'}
definitions = set_definition.split(';')
for definition in definitions:
arguments = definition.split('=')
if len(arguments) == 2:
params[arguments[0]] = arguments[1]
return params
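For illustration, the `key=value;key=value` convention that the parser above decodes, applied to a hypothetical setDefinition string:

```python
def parse_definition(set_definition):
    # same defaults as parse_set_definition above
    params = {'c': '',
              'p1': '', 'f1': '', 'm1': '',
              'p2': '', 'f2': '', 'm2': '',
              'p3': '', 'f3': '', 'm3': '',
              'op1': 'a', 'op2': 'a'}
    for definition in set_definition.split(';'):
        arguments = definition.split('=')
        if len(arguments) == 2:
            params[arguments[0]] = arguments[1]
    return params

params = parse_definition("c=Preprints;p1=ellis;f1=author")
```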
def all_set_specs():
"""
Returns the list of (distinct) setSpecs defined in the settings.
This also includes the "empty" setSpec if any setting uses it.
Note: the same setSpec can appear several times in the settings,
given that a setSpec might be defined by several search
queries. Here we return distinct values
"""
query = "SELECT DISTINCT setSpec FROM oaiREPOSITORY"
res = run_sql(query)
return [row[0] for row in res]
def get_recids_for_set_spec(set_spec):
"""
Returns the list (as intbitset) of recids belonging to 'set'
Parameters:
set_spec - *str* the set_spec for which we would like to get the
recids
"""
recids = intbitset()
for set_def in get_set_definitions(set_spec):
new_recids = perform_request_search(c=[coll.strip() \
for coll in set_def['c'].split(',')],
p1=set_def['p1'],
f1=set_def['f1'],
m1=set_def['m1'],
op1=set_def['op1'],
p2=set_def['p2'],
f2=set_def['f2'],
m2=set_def['m2'],
op2=set_def['op2'],
p3=set_def['p3'],
f3=set_def['f3'],
m3=set_def['m3'],
ap=0)
recids |= intbitset(new_recids)
return recids
def get_set_name_for_set_spec(set_spec):
"""
Returns the OAI setName of a setSpec.
Note that the OAI Repository admin lets the user add several set
definitions with the same setSpec, and possibly with different
setNames... -> Returns the first (non-empty) one found.
Parameters:
set_spec - *str* the set_spec for which we would like to get the
setName
"""
query = "select setName from oaiREPOSITORY where setSpec=%s and setName!=''"
res = run_sql(query, (set_spec, ))
if len(res) > 0:
return res[0][0]
else:
return ""
def print_repository_status(local_write_message=write_message,
verbose=0):
"""
Prints the repository status to the standard output.
Parameters:
write_message - *function* the function used to write the output
verbose - *int* the verbosity of the output
- 0: print repository size
- 1: print quick status of each set (numbers
can be wrong if the repository is in some
inconsistent state, i.e. a record is in an
OAI setSpec but has no OAI ID)
- 2: print detailed status of repository, with
number of records that needs to be
synchronized according to the sets
definitions. Precise, but ~slow...
"""
repository_size_s = "%d" % repository_size()
repository_recids_after_update = intbitset()
local_write_message(CFG_SITE_NAME)
local_write_message(" OAI Repository Status")
set_spec_max_length = 19 # How many max char do we display for
set_name_max_length = 20 # setName and setSpec?
if verbose == 0:
# Just print repository size
local_write_message(" Total(**)" + " " * 29 +
" " * (9 - len(repository_size_s)) + repository_size_s)
return
elif verbose == 1:
# We display few information: show longer set name and spec
set_spec_max_length = 30
set_name_max_length = 30
local_write_message("=" * 80)
header = " setSpec" + " " * (set_spec_max_length - 7) + \
" setName" + " " * (set_name_max_length - 5) + " Volume"
if verbose > 1:
header += " " * 5 + "After update(*):"
local_write_message(header)
if verbose > 1:
local_write_message(" " * 57 + "Additions Deletions")
local_write_message("-" * 80)
for set_spec in all_set_specs():
if verbose <= 1:
# Get the records that are in this set. This is an
# incomplete check, as it can happen that some records are
# in this set (according to the metadata) but have no OAI
# ID (so they are not exported). This can happen if the
# repository has some records coming from external
# sources, or if it has never been synchronized with this
# tool.
current_recids = get_recids_for_set_spec(set_spec)
nb_current_recids = len(current_recids)
else:
# Get the records that are *currently* exported for this
# setSpec
current_recids = search_unit_in_bibxxx(p=set_spec, f=CFG_OAI_SET_FIELD, type='e')
nb_current_recids = len(current_recids)
# Get the records that *should* be in this set according to
# the admin defined settings, and compute how many should be
# added or removed
should_recids = get_recids_for_set_spec(set_spec)
repository_recids_after_update |= should_recids
nb_add_recids = len(should_recids - current_recids)
nb_remove_recids = len(current_recids - should_recids)
nb_should_recids = len(should_recids)
# Adapt setName and setSpec strings lengths
set_spec_str = set_spec
if len(set_spec_str) > set_spec_max_length :
set_spec_str = "%s.." % set_spec_str[:set_spec_max_length]
set_name_str = get_set_name_for_set_spec(set_spec)
if len(set_name_str) > set_name_max_length :
set_name_str = "%s.." % set_name_str[:set_name_max_length]
row = " " + set_spec_str + \
" " * ((set_spec_max_length + 2) - len(set_spec_str)) + set_name_str + \
" " * ((set_name_max_length + 2) - len(set_name_str)) + \
" " * (7 - len(str(nb_current_recids))) + str(nb_current_recids)
if verbose > 1:
row += \
" " * max(9 - len(str(nb_add_recids)), 0) + '+' + str(nb_add_recids) + \
" " * max(7 - len(str(nb_remove_recids)), 0) + '-' + str(nb_remove_recids) + " = " +\
" " * max(7 - len(str(nb_should_recids)), 0) + str(nb_should_recids)
local_write_message(row)
local_write_message("=" * 80)
footer = " Total(**)" + " " * (set_spec_max_length + set_name_max_length - 7) + \
" " * (9 - len(repository_size_s)) + repository_size_s
if verbose > 1:
footer += ' ' * (28 - len(str(len(repository_recids_after_update)))) + str(len(repository_recids_after_update))
local_write_message(footer)
if verbose > 1:
local_write_message(' *The "after update" columns show the repository after you run this tool.')
else:
local_write_message(' *"Volume" is indicative if repository is out of sync. Use --detailed-report.')
local_write_message('**The "total" is not the sum of the above numbers, but the union of the records.')
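The column layout above is built by hand with `" " * (width - len(s))` padding. A minimal standalone sketch of the same fixed-width row using `str.ljust`/`str.rjust`; the widths and sample values below are invented for illustration, not the module's real ones:

```python
# Illustrative sketch of the fixed-width report rows built above; the
# widths and sample values are invented, not the module's real ones.
set_spec_max_length, set_name_max_length = 10, 20
set_spec, set_name, volume = "cern:exp", "CERN Experiments", 1234
row = (" " + set_spec.ljust(set_spec_max_length + 2)
       + set_name.ljust(set_name_max_length + 2)
       + str(volume).rjust(7))
```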
def repository_size():
"""Read repository size"""
return len(search_unit_in_bibxxx(p="*", f=CFG_OAI_SET_FIELD, type="e"))
### MAIN ###
def oairepositoryupdater_task():
"""Main business logic code of oairepositoryupdater"""
no_upload = task_get_option("no_upload")
report = task_get_option("report")
if report > 1:
print_repository_status(verbose=report)
return True
initial_snapshot = {}
for set_spec in all_set_specs():
initial_snapshot[set_spec] = get_set_definitions(set_spec)
write_message("Initial set snapshot: %s" % pformat(initial_snapshot), verbose=2)
task_update_progress("Fetching records to process")
recids_with_oaiid = search_unit_in_bibxxx(p='*', f=CFG_OAI_ID_FIELD, type='e')
write_message("%s recids have an OAI ID" % len(recids_with_oaiid), verbose=2)
all_current_recids = search_unit_in_bibxxx(p='*', f=CFG_OAI_SET_FIELD, type='e')
no_more_exported_recids = intbitset(all_current_recids)
write_message("%s recids are currently exported" % (len(all_current_recids)), verbose=2)
all_affected_recids = intbitset()
all_should_recids = intbitset()
recids_for_set = {}
for set_spec in all_set_specs():
if not set_spec:
set_spec = CFG_OAI_REPOSITORY_GLOBAL_SET_SPEC
should_recids = get_recids_for_set_spec(set_spec)
recids_for_set[set_spec] = should_recids
no_more_exported_recids -= should_recids
all_should_recids |= should_recids
current_recids = search_unit_in_bibxxx(p=set_spec, f=CFG_OAI_SET_FIELD, type='e')
write_message("%s recids should be in %s. Currently %s are in %s" % (len(should_recids), set_spec, len(current_recids), set_spec), verbose=2)
to_add = should_recids - current_recids
write_message("%s recids should be added to %s" % (len(to_add), set_spec), verbose=2)
to_remove = current_recids - should_recids
write_message("%s recids should be removed from %s" % (len(to_remove), set_spec), verbose=2)
affected_recids = to_add | to_remove
write_message("%s recids should be hence updated for %s" % (len(affected_recids), set_spec), verbose=2)
all_affected_recids |= affected_recids
missing_oaiid = all_should_recids - recids_with_oaiid
write_message("%s recids are missing an OAI ID" % len(missing_oaiid))
write_message("%s recids should no longer be exported" % len(no_more_exported_recids))
## Let's add records with missing OAI ID
all_affected_recids |= missing_oaiid | no_more_exported_recids
write_message("%s recids should be updated" % (len(all_affected_recids)), verbose=2)
if not all_affected_recids:
write_message("Nothing to do!")
return True
# Prepare to save results in a tmp file
(fd, filename) = mkstemp(dir=CFG_TMPDIR,
prefix='oairepository_' + \
time.strftime("%Y%m%d_%H%M%S_",
time.localtime()))
oai_out = os.fdopen(fd, "w")
oai_out.write("<collection>")
tot = 0
# Iterate over the recids
for i, recid in enumerate(all_affected_recids):
task_sleep_now_if_required(can_stop_too=True)
task_update_progress("Done %s out of %s records." % \
(i, len(all_affected_recids)))
write_message("Elaborating recid %s" % recid, verbose=3)
record = get_record(recid)
if not record:
write_message("Record %s seems empty. Let's skip it." % recid, verbose=3)
continue
new_record = {}
# Check if an OAI identifier is already in the record or
# not.
assign_oai_id_entry = False
oai_id_entry = record_get_field_value(record, tag=CFG_OAI_ID_FIELD[:3], ind1=CFG_OAI_ID_FIELD[3], ind2=CFG_OAI_ID_FIELD[4], code=CFG_OAI_ID_FIELD[5])
if not oai_id_entry:
assign_oai_id_entry = True
oai_id_entry = "oai:%s:%s" % (CFG_OAI_ID_PREFIX, recid)
write_message("Setting new oai_id %s for record %s" % (oai_id_entry, recid), verbose=3)
else:
write_message("Already existing oai_id %s for record %s" % (oai_id_entry, recid), verbose=3)
# Get the sets to which this record already belongs according
# to the metadata
current_oai_sets = set(record_get_field_values(record, tag=CFG_OAI_SET_FIELD[:3], ind1=CFG_OAI_SET_FIELD[3], ind2=CFG_OAI_SET_FIELD[4], code=CFG_OAI_SET_FIELD[5]))
write_message("Record %s currently belongs to these oai_sets: %s" % (recid, ", ".join(current_oai_sets)), verbose=3)
current_previous_oai_sets = set(record_get_field_values(record, tag=CFG_OAI_PREVIOUS_SET_FIELD[:3], ind1=CFG_OAI_PREVIOUS_SET_FIELD[3], ind2=CFG_OAI_PREVIOUS_SET_FIELD[4], code=CFG_OAI_PREVIOUS_SET_FIELD[5]))
write_message("Record %s currently doesn't belong anymore to these oai_sets: %s" % (recid, ", ".join(current_previous_oai_sets)), verbose=3)
# Get the sets that should be in this record according to
# settings
updated_oai_sets = set(_set for _set, _recids in recids_for_set.iteritems()
if recid in _recids)
write_message("Record %s now belongs to these oai_sets: %s" % (recid, ", ".join(updated_oai_sets)), verbose=3)
updated_previous_oai_sets = set(_set for _set in (current_previous_oai_sets - updated_oai_sets) |
(current_oai_sets - updated_oai_sets))
write_message("Record %s now doesn't belong anymore to these oai_sets: %s" % (recid, ", ".join(updated_previous_oai_sets)), verbose=3)
# Ok, we have the old sets and the new sets. If they are equal
# and the OAI ID does not need to be added, then there is nothing
# to change. Otherwise apply the new sets.
if current_oai_sets == updated_oai_sets and not assign_oai_id_entry:
write_message("Nothing has changed for record %s, let's move on!" % recid, verbose=3)
continue # Jump to next recid
write_message("Something has changed for record %s, let's update it!" % recid, verbose=3)
subfields = [(CFG_OAI_ID_FIELD[5], oai_id_entry)]
for oai_set in updated_oai_sets:
subfields.append((CFG_OAI_SET_FIELD[5], oai_set))
for oai_set in updated_previous_oai_sets:
subfields.append((CFG_OAI_PREVIOUS_SET_FIELD[5], oai_set))
record_add_field(new_record, tag="001", controlfield_value=str(recid))
record_add_field(new_record, tag=CFG_OAI_ID_FIELD[:3], ind1=CFG_OAI_ID_FIELD[3], ind2=CFG_OAI_ID_FIELD[4], subfields=subfields)
oai_out.write(record_xml_output(new_record))
tot += 1
if tot == CFG_OAI_REPOSITORY_MARCXML_SIZE:
oai_out.write("</collection>")
oai_out.close()
write_message("Wrote to file %s" % filename)
if not no_upload:
if task_get_option("notimechange"):
task_low_level_submission('bibupload', 'oairepository', '-c', filename, '-n')
else:
task_low_level_submission('bibupload', 'oairepository', '-c', filename)
# Prepare to save results in a tmp file
(fd, filename) = mkstemp(dir=CFG_TMPDIR,
prefix='oairepository_' + \
time.strftime("%Y%m%d_%H%M%S_",
time.localtime()))
oai_out = os.fdopen(fd, "w")
oai_out.write("<collection>")
tot = 0
task_sleep_now_if_required(can_stop_too=True)
oai_out.write("</collection>")
oai_out.close()
write_message("Wrote to file %s" % filename)
if tot > 0:
if not no_upload:
task_sleep_now_if_required(can_stop_too=True)
if task_get_option("notimechange"):
task_low_level_submission('bibupload', 'oairepository', '-c', filename, '-n')
else:
task_low_level_submission('bibupload', 'oairepository', '-c', filename)
else:
os.remove(filename)
return True
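The per-set bookkeeping in oairepositoryupdater_task() boils down to set differences and unions on recid sets. A minimal sketch with plain Python sets (the module uses intbitset, which supports the same operators; the values are invented):

```python
# Sketch of the per-setSpec bookkeeping above, with plain sets instead
# of intbitset (same - and | operators; values are invented).
should_recids = {1, 2, 3, 5}    # what the admin-defined settings say
current_recids = {2, 3, 4}      # what is currently exported
to_add = should_recids - current_recids
to_remove = current_recids - should_recids
affected_recids = to_add | to_remove
```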
#########################
def main():
"""Main function that constructs the bibtask."""
# if there is any -r or --report option (or other similar options)
# in the arguments, just print the status and exit (do not run
# through BibSched...)
if (CFG_OAI_ID_FIELD[:5] != CFG_OAI_SET_FIELD[:5]) or \
(CFG_OAI_ID_FIELD[:5] != CFG_OAI_PREVIOUS_SET_FIELD[:5]):
print >> sys.stderr, """\
ERROR: since Invenio 1.0 the OAI ID and the OAI Set must be stored in the same
field. Please revise your configuration for the variables
CFG_OAI_ID_FIELD (currently set to %s)
CFG_OAI_SET_FIELD (currently set to %s)
CFG_OAI_PREVIOUS_SET_FIELD (currently set to %s)""" % (
CFG_OAI_ID_FIELD,
CFG_OAI_SET_FIELD,
CFG_OAI_PREVIOUS_SET_FIELD
)
sys.exit(1)
mode = -1
if '-d' in sys.argv[1:] or '--detailed-report' in sys.argv[1:]:
mode = 2
elif '-r' in sys.argv[1:] or '--report' in sys.argv[1:]:
mode = 1
if mode != -1:
def local_write_message(*args):
"""Overload BibTask function so that it does not need to
run in BibSched environment"""
sys.stdout.write(args[0] + '\n')
print_repository_status(local_write_message=local_write_message, verbose=mode)
return
task_init(authorization_action='runoairepository',
authorization_msg="OAI Archive Task Submission",
description="Examples:\n"
" Expose records according to sets defined in OAI Repository admin interface\n"
" $ oairepositoryupdater \n"
" Expose records according to sets defined in OAI Repository admin interface and update them every day\n"
" $ oairepositoryupdater -s24\n"
" Print OAI repository status\n"
" $ oairepositoryupdater -r\n"
" Print OAI repository detailed status\n"
" $ oairepositoryupdater -d\n\n",
help_specific_usage="Options:\n"
" -r --report\t\tOAI repository status\n"
" -d --detailed-report\t\tOAI repository detailed status\n"
" -n --no-process\tDo not upload the modifications\n"
" --notimechange\tDo not update record modification_date\n"
"NOTE: --notimechange should be used with care, basically only the first time a new set is added.",
specific_params=("rdn", [
"report",
"detailed-report",
"no-process",
"notimechange"]),
task_submit_elaborate_specific_parameter_fnc=
task_submit_elaborate_specific_parameter,
task_run_fnc=oairepositoryupdater_task)
def task_submit_elaborate_specific_parameter(key, _value, _opts, _args):
"""Elaborate specific CLI parameters of oairepositoryupdater"""
if key in ("-r", "--report"):
task_set_option("report", 1)
elif key in ("-d", "--detailed-report"):
task_set_option("report", 2)
elif key in ("-n", "--no-process"):
task_set_option("no_upload", 1)
elif key in ("--notimechange",):
task_set_option("notimechange", 1)
else:
return False
return True
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/refextract/api.py b/invenio/legacy/refextract/api.py
index 34520aa79..6ee210177 100644
--- a/invenio/legacy/refextract/api.py
+++ b/invenio/legacy/refextract/api.py
@@ -1,273 +1,273 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""This is where all the public API calls are accessible
This is the only file containing public calls; everything that is
not present here should be considered private to the invenio modules.
"""
import os
from urllib import urlretrieve
from tempfile import mkstemp
from invenio.refextract_engine import parse_references, \
get_plaintext_document_body, \
parse_reference_line, \
get_kbs
from invenio.refextract_text import extract_references_from_fulltext
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import \
CFG_JOURNAL_PUBINFO_STANDARD_FORM, \
CFG_JOURNAL_TAG
from invenio.legacy.bibrecord import get_fieldvalues
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
from invenio.legacy.search_engine import get_record
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.legacy.bibrecord import record_delete_fields, record_xml_output, \
create_record, record_get_field_instances, record_add_fields, \
record_has_field
from invenio.refextract_find import get_reference_section_beginning, \
find_numeration_in_body
from invenio.refextract_text import rebuild_reference_lines
from invenio.refextract_config import CFG_REFEXTRACT_FILENAME
from invenio.config import CFG_TMPSHAREDDIR
class FullTextNotAvailable(Exception):
"""Raised when we cannot access the document text"""
class RecordHasReferences(Exception):
"""Raised when
* we asked to update references for a record
* we explicitly asked not to overwrite references for this record
(via the appropriate function argument)
* the record has references, thus we cannot update them
"""
def extract_references_from_url_xml(url):
"""Extract references from the pdf specified in the url
The single parameter is the URL of the pdf.
It raises FullTextNotAvailable if the url gives a 404
The result is given in marcxml.
"""
filename, dummy = urlretrieve(url)
try:
try:
marcxml = extract_references_from_file_xml(filename)
except IOError, err:
if err.code == 404:
raise FullTextNotAvailable()
else:
raise
finally:
os.remove(filename)
return marcxml
def extract_references_from_file_xml(path, recid=1):
"""Extract references from a local pdf file
The single parameter is the path to the file
It raises FullTextNotAvailable if the file does not exist
The result is given in marcxml.
"""
if not os.path.isfile(path):
raise FullTextNotAvailable()
docbody, dummy = get_plaintext_document_body(path)
reflines, dummy, dummy = extract_references_from_fulltext(docbody)
if not len(reflines):
docbody, dummy = get_plaintext_document_body(path, keep_layout=True)
reflines, dummy, dummy = extract_references_from_fulltext(docbody)
return parse_references(reflines, recid=recid)
def extract_references_from_string_xml(source, is_only_references=True):
"""Extract references from a string
The single parameter is the document
The result is given in marcxml.
"""
docbody = source.split('\n')
if not is_only_references:
reflines, dummy, dummy = extract_references_from_fulltext(docbody)
else:
refs_info = get_reference_section_beginning(docbody)
if not refs_info:
refs_info, dummy = find_numeration_in_body(docbody)
refs_info['start_line'] = 0
refs_info['end_line'] = len(docbody) - 1
reflines = rebuild_reference_lines(docbody, refs_info['marker_pattern'])
return parse_references(reflines)
def extract_references_from_record_xml(recid):
"""Extract references from a record id
The single parameter is the record id.
The result is given in marcxml.
"""
path = look_for_fulltext(recid)
if not path:
raise FullTextNotAvailable()
return extract_references_from_file_xml(path, recid=recid)
def replace_references(recid):
"""Replace references for a record
The record itself is not updated, the marc xml of the document with updated
references is returned
Parameters:
* recid: the id of the record
"""
# Parse references
references_xml = extract_references_from_record_xml(recid)
references = create_record(references_xml.encode('utf-8'))
# Record marc xml
record = get_record(recid)
if references[0]:
fields_to_add = record_get_field_instances(references[0],
tag='999',
ind1='%',
ind2='%')
# Replace 999 fields
record_delete_fields(record, '999')
record_add_fields(record, '999', fields_to_add)
# Update record references
out_xml = record_xml_output(record)
else:
out_xml = None
return out_xml
def update_references(recid, overwrite=True):
"""Update references for a record
First, we extract references from a record.
Then, we are not updating the record directly but adding a bibupload
task in -c mode which takes care of updating the record.
Parameters:
* recid: the id of the record
"""
if not overwrite:
# Check for references in record
record = get_record(recid)
if record and record_has_field(record, '999'):
raise RecordHasReferences('Record has references and overwrite ' \
'mode is disabled: %s' % recid)
if get_fieldvalues(recid, '999C59'):
raise RecordHasReferences('Record has been curated: %s' % recid)
# Parse references
references_xml = extract_references_from_record_xml(recid)
# Save new record to file
(temp_fd, temp_path) = mkstemp(prefix=CFG_REFEXTRACT_FILENAME,
dir=CFG_TMPSHAREDDIR)
temp_file = os.fdopen(temp_fd, 'w')
temp_file.write(references_xml.encode('utf-8'))
temp_file.close()
# Update record
task_low_level_submission('bibupload', 'refextract', '-P', '5',
'-c', temp_path)
def list_pdfs(recid):
rec_info = BibRecDocs(recid)
docs = rec_info.list_bibdocs()
for doc in docs:
for ext in ('pdf', 'pdfa', 'PDF'):
try:
yield doc.get_file(ext)
except InvenioBibDocFileError:
pass
def get_pdf_doc(recid):
try:
doc = list_pdfs(recid).next()
except StopIteration:
doc = None
return doc
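get_pdf_doc() implements a "first item or None" pattern over the list_pdfs() generator using the Python 2 `.next()` call. The same pattern, sketched in a version-neutral way (the helper name is ours, not the module's):

```python
# Version-neutral sketch of the "first item or None" pattern used by
# get_pdf_doc() above; the built-in next(it) works on Python 2.6+ and 3.
def first_or_none(iterable):
    it = iter(iterable)
    try:
        return next(it)
    except StopIteration:
        return None
```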
def look_for_fulltext(recid):
doc = get_pdf_doc(recid)
path = None
if doc:
path = doc.get_full_path()
return path
def record_has_fulltext(recid):
"""Checks if we can access the fulltext for the given recid"""
path = look_for_fulltext(recid)
return path is not None
def search_from_reference(text):
"""Convert a raw reference to a search query
Called by the search engine to convert a raw reference:
find rawref John, JINST 4 (1994) 45
is converted to
journal:"JINST,4,45"
"""
field = ''
pattern = ''
kbs = get_kbs()
references, dummy_m, dummy_c, dummy_co = parse_reference_line(text, kbs)
for elements in references:
for el in elements:
if el['type'] == 'JOURNAL':
field = 'journal'
pattern = CFG_JOURNAL_PUBINFO_STANDARD_FORM \
.replace(CFG_JOURNAL_TAG.replace('%', 'p'), el['title']) \
.replace(CFG_JOURNAL_TAG.replace('%', 'v'), el['volume']) \
.replace(CFG_JOURNAL_TAG.replace('%', 'c'), el['page']) \
.replace(CFG_JOURNAL_TAG.replace('%', 'y'), el['year'])
break
elif el['type'] == 'REPORTNUMBER':
field = 'report'
pattern = el['report_num']
break
return field, pattern.encode('utf-8')
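search_from_reference() builds the journal query by substituting the parsed title, volume and page into CFG_JOURNAL_PUBINFO_STANDARD_FORM via the tag placeholders. A standalone sketch of that substitution; the two CFG_* values below are invented stand-ins, not the real configuration defaults:

```python
# Hypothetical stand-ins for the Invenio configuration values; real
# installations define their own CFG_JOURNAL_TAG / standard form.
CFG_JOURNAL_TAG = '909C4%'
CFG_JOURNAL_PUBINFO_STANDARD_FORM = '909C4p,909C4v,909C4c'

el = {'title': 'JINST', 'volume': '4', 'page': '45'}
pattern = CFG_JOURNAL_PUBINFO_STANDARD_FORM \
    .replace(CFG_JOURNAL_TAG.replace('%', 'p'), el['title']) \
    .replace(CFG_JOURNAL_TAG.replace('%', 'v'), el['volume']) \
    .replace(CFG_JOURNAL_TAG.replace('%', 'c'), el['page'])
```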
diff --git a/invenio/legacy/refextract/cli.py b/invenio/legacy/refextract/cli.py
index 43a58cdeb..ad0b00583 100644
--- a/invenio/legacy/refextract/cli.py
+++ b/invenio/legacy/refextract/cli.py
@@ -1,316 +1,316 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""This file handles the command line interface
* We parse the options for both daemon and standalone usage
* When using the standalone mode, we use the function "main"
defined here to begin the extraction of references
"""
__revision__ = "$Id$"
import traceback
import optparse
import sys
import os
from invenio.refextract_config import \
CFG_REFEXTRACT_XML_VERSION, \
CFG_REFEXTRACT_XML_COLLECTION_OPEN, \
CFG_REFEXTRACT_XML_COLLECTION_CLOSE
from invenio.docextract_utils import write_message, setup_loggers
-from invenio.bibtask import task_update_progress
+from invenio.legacy.bibsched.bibtask import task_update_progress
from invenio.refextract_api import extract_references_from_file_xml, \
extract_references_from_string_xml
# Is refextract running standalone? (Default = yes)
RUNNING_INDEPENDENTLY = False
DESCRIPTION = ""
# Help message, used by bibtask's 'task_init()' and 'usage()'
HELP_MESSAGE = """
-i, --inspire Output journal standard reference form in the INSPIRE
recognised format: [series]volume,page.
--kb-journals Manually specify the location of a journal title
knowledge-base file.
--kb-journals-re Manually specify the location of a journal title regexps
knowledge-base file.
--kb-report-numbers Manually specify the location of a report number
knowledge-base file.
--kb-authors Manually specify the location of an author
knowledge-base file.
--kb-books Manually specify the location of a book
knowledge-base file.
--no-overwrite Do not touch record if it already has references
Standalone Refextract options:
-o, --out Write the extracted references, in xml form, to a file
rather than standard output.
--dictfile Write statistics about all matched title abbreviations
(i.e. LHS terms in the titles knowledge base) to a file.
--output-raw-refs Output raw references, as extracted from the document.
No MARC XML mark-up - just each extracted line, prefixed
by the recid of the document that it came from.
--raw-references Treat the input file as pure references. i.e. skip the
stage of trying to locate the reference section within a
document and instead move to the stage of recognition
and standardisation of citations within lines.
"""
USAGE_MESSAGE = """Usage: docextract [options] file1 [file2 ...]
Command options: %s
Examples:
docextract -o /home/chayward/refs.xml /home/chayward/thesis.pdf
""" % HELP_MESSAGE
def get_cli_options():
"""Get the various arguments and options from the command line and populate
a dictionary of cli_options.
@return: (tuple) of 2 elements. First element is a dictionary of cli
options and flags, set as appropriate; Second element is a list of cli
arguments.
"""
parser = optparse.OptionParser(description=DESCRIPTION,
usage=USAGE_MESSAGE,
add_help_option=False)
# Display help and exit
parser.add_option('-h', '--help', action='store_true')
# Display version and exit
parser.add_option('-V', '--version', action='store_true')
# Output recognised journal titles in the Inspire compatible format
parser.add_option('-i', '--inspire', action='store_true')
# The location of the report number kb requested to override
# a 'configuration file'-specified kb
parser.add_option('--kb-report-numbers', dest='kb_report_numbers')
# The location of the journal title kb requested to override
# a 'configuration file'-specified kb, holding
# 'seek---replace' terms, used when matching titles in references
parser.add_option('--kb-journals', dest='kb_journals')
parser.add_option('--kb-journals-re', dest='kb_journals_re')
# The location of the author kb requested to override
parser.add_option('--kb-authors', dest='kb_authors')
# The location of the book kb requested to override
parser.add_option('--kb-books', dest='kb_books')
# The location of the conference kb requested to override
parser.add_option('--kb-conferences', dest='kb_conferences')
# Write out the statistics of all titles matched during the
# extraction job to the specified file
parser.add_option('--dictfile')
# Write out MARC XML references to the specified file
parser.add_option('-o', '--out', dest='xmlfile')
# Handle verbosity
parser.add_option('-v', '--verbose', type=int, dest='verbosity', default=0)
# Output a raw list of refs
parser.add_option('--output-raw-refs', action='store_true',
dest='output_raw')
# Treat input as pure reference lines:
# (bypass the reference section lookup)
parser.add_option('--raw-references', action='store_true',
dest='treat_as_reference_section')
return parser.parse_args()
def halt(err=StandardError, msg=None, exit_code=1):
""" Stop extraction, and deal with the error in the appropriate
manner, based on whether Refextract is running in standalone or
bibsched mode.
@param err: (exception) The exception raised from an error, if any
@param msg: (string) The brief error message, either displayed
on the bibsched interface, or written to stderr.
@param exit_code: (integer) Either 0 or 1, depending on the cause
of the halting. This is only used when running standalone."""
# If refextract is running independently, exit.
# 'RUNNING_INDEPENDENTLY' is a global variable
if RUNNING_INDEPENDENTLY:
if msg:
write_message(msg, stream=sys.stderr, verbose=0)
sys.exit(exit_code)
# Else, raise an exception so Bibsched will flag this task.
else:
if msg:
# Update the status of refextract inside the Bibsched UI
task_update_progress(msg.strip())
raise err(msg)
def usage(wmsg=None, err_code=0):
"""Display a usage message for refextract on the standard error stream and
then exit.
@param wmsg: (string) some kind of brief warning message for the user.
@param err_code: (integer) an error code to be passed to halt,
which is called after the usage message has been printed.
@return: None.
"""
if wmsg:
wmsg = wmsg.strip()
# Display the help information and the warning in the stderr stream
# 'help_message' is global
print >> sys.stderr, USAGE_MESSAGE
# Output error message, either to the stderr stream also or
# on the interface. Stop the extraction procedure
halt(msg=wmsg, exit_code=err_code)
def main(config, args, run):
"""Main wrapper function for begin_extraction; it is always called in a
standalone/independent way (i.e. calling main will cause refextract to
run in an independent mode, outside of bibsched)."""
# Flag as running out of bibtask
global RUNNING_INDEPENDENTLY
RUNNING_INDEPENDENTLY = True
if config.verbosity not in range(0, 10):
usage("Error: Verbosity must be an integer between 0 and 9")
setup_loggers(config.verbosity)
if config.version:
# version message and exit
write_message(__revision__, verbose=0)
halt(exit_code=0)
if config.help:
usage()
if not args:
# no files provided for reference extraction - error message
usage("Error: No valid input file specified (file1 [file2 ...])")
try:
run(config, args)
write_message("Extraction complete", verbose=2)
except StandardError, e:
# Remove extra '\n'
write_message(traceback.format_exc()[:-1], verbose=9)
write_message("Error: %s" % e, verbose=0)
halt(exit_code=1)
def extract_one(config, pdf_path):
"""Extract references from one file"""
# Either treat the whole input as reference lines, or first locate
# the reference section within the document:
if config.treat_as_reference_section:
docbody = open(pdf_path).read().decode('utf-8')
out = extract_references_from_string_xml(docbody)
else:
write_message("* processing pdffile: %s" % pdf_path, verbose=2)
out = extract_references_from_file_xml(pdf_path)
return out
def begin_extraction(config, files):
"""Starts the core extraction procedure. [Entry point from main]
Only refextract_daemon calls this directly, from _task_run_core()
@param config: the parsed cli options and flags; fully populated when
called as a scheduled bibtask inside bibsched
@param files: the list of input files to extract references from
"""
# Store xml records here
output = []
for num, path in enumerate(files):
# Announce the document extraction number
write_message("Extracting %d of %d" % (num + 1, len(files)),
verbose=1)
out = extract_one(config, path)
output.append(out)
# Write our references
write_references(config, output)
def write_references(config, xml_references):
"""Write marcxml to file
* Output xml header
* Output collection opening tag
* Output xml for each record
* Output collection closing tag
"""
if config.xmlfile:
ofilehdl = open(config.xmlfile, 'w')
else:
ofilehdl = sys.stdout
try:
print >>ofilehdl, CFG_REFEXTRACT_XML_VERSION.encode("utf-8")
print >>ofilehdl, CFG_REFEXTRACT_XML_COLLECTION_OPEN.encode("utf-8")
for out in xml_references:
print >>ofilehdl, out.encode("utf-8")
print >>ofilehdl, CFG_REFEXTRACT_XML_COLLECTION_CLOSE.encode("utf-8")
ofilehdl.flush()
except IOError, err:
write_message("%s\n%s\n" % (config.xmlfile, err), \
sys.stderr, verbose=0)
halt(err=IOError, msg="Error: Unable to write to '%s'" \
% config.xmlfile, exit_code=1)
if config.xmlfile:
ofilehdl.close()
# limit m tag data to something less than infinity
limit_m_tags(config.xmlfile, 2048)
def limit_m_tags(xml_file, length_limit):
"""Limit size of miscellaneous tags"""
temp_xml_file = xml_file + '.temp'
try:
ofilehdl = open(xml_file, 'r')
except IOError:
write_message("***%s\n" % xml_file, verbose=0)
raise IOError("Error: Unable to read from '%s'" % xml_file)
try:
nfilehdl = open(temp_xml_file, 'w')
except IOError:
write_message("***%s\n" % temp_xml_file, verbose=0)
raise IOError("Error: Unable to write to '%s'" % temp_xml_file)
for line in ofilehdl:
line_dec = line.decode("utf-8")
start_ind = line_dec.find('<subfield code="m">')
if start_ind != -1:
# This line is an "m" line:
last_ind = line_dec.find('</subfield>')
if last_ind != -1:
# This line contains the end-tag for the "m" section
leng = last_ind - start_ind - 19
if leng > length_limit:
# We want to truncate on a blank to avoid splitting a word.
end = start_ind + 19 + length_limit
for lett in range(end - 1, last_ind):
xx = line_dec[lett:lett+1]
if xx == ' ':
break
else:
end += 1
middle = line_dec[start_ind+19:end-1]
line_dec = start_ind * ' ' + '<subfield code="m">' + \
middle + ' !Data truncated! ' + '</subfield>\n'
nfilehdl.write("%s" % line_dec.encode("utf-8"))
nfilehdl.close()
# copy back to original file name
os.rename(temp_xml_file, xml_file)
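The truncation rule in limit_m_tags() — cut the "m" subfield content at the first blank past the limit, then flag it — can be sketched standalone. Values and offsets below are simplified for illustration, not the module's exact arithmetic:

```python
# Simplified sketch of limit_m_tags()'s truncation of an "m" subfield.
OPEN, CLOSE = '<subfield code="m">', '</subfield>'
line = OPEN + 'one two three four five six seven' + CLOSE
length_limit = 10
start = line.find(OPEN) + len(OPEN)
stop = line.find(CLOSE)
content = line[start:stop]
if len(content) > length_limit:
    # Truncate at the first blank after the limit to keep words whole.
    cut = content.find(' ', length_limit)
    if cut == -1:
        cut = len(content)
    content = content[:cut] + ' !Data truncated! '
line = OPEN + content + CLOSE
```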
diff --git a/invenio/legacy/refextract/task.py b/invenio/legacy/refextract/task.py
index ad40e8715..d95fdd64f 100644
--- a/invenio/legacy/refextract/task.py
+++ b/invenio/legacy/refextract/task.py
@@ -1,239 +1,239 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Refextract task
Sends references to parse through bibsched
"""
import sys
-from invenio.bibtask import task_init, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, task_set_option, \
task_get_option, write_message
from invenio.config import CFG_VERSION, \
CFG_SITE_SECURE_URL, \
CFG_BIBCATALOG_SYSTEM, \
CFG_REFEXTRACT_TICKET_QUEUE
from invenio.legacy.search_engine import perform_request_search
# Help message is the usage() print out of how to use Refextract
from invenio.refextract_cli import HELP_MESSAGE, DESCRIPTION
from invenio.refextract_api import update_references, \
FullTextNotAvailable, \
RecordHasReferences
from invenio.docextract_task import task_run_core_wrapper, split_ids
-from invenio.bibcatalog_system_rt import BibCatalogSystemRT
-from invenio.bibedit_utils import get_bibrecord
+from invenio.legacy.bibcatalog.system_rt import BibCatalogSystemRT
+from invenio.legacy.bibedit.utils import get_bibrecord
from invenio.legacy.bibrecord import record_get_field_instances, \
field_get_subfield_values
def check_options():
""" Reimplement this method for having the possibility to check options
before submitting the task, in order for example to provide default
values. It must return False if there are errors in the options.
"""
if not task_get_option('new') \
and not task_get_option('modified') \
and not task_get_option('recids') \
and not task_get_option('collections') \
and not task_get_option('arxiv'):
print >>sys.stderr, 'Error: No records specified, you need' \
' to specify which files to run on'
return False
return True
def cb_parse_option(key, value, opts, args):
""" Must be defined for bibtask to create a task """
if args and len(args) > 0:
# There should be no standalone arguments for any refextract job
# This will catch args before the job is shipped to Bibsched
raise StandardError("Error: Unrecognised argument '%s'." % args[0])
if key in ('-a', '--new'):
task_set_option('new', True)
task_set_option('no-overwrite', True)
elif key in ('-m', '--modified'):
task_set_option('modified', True)
task_set_option('no-overwrite', True)
elif key in ('-i', '--inspire', ):
task_set_option('inspire', True)
elif key in ('--kb-reports', ):
task_set_option('kb-reports', value)
elif key in ('--kb-journals', ):
task_set_option('kb-journals', value)
elif key in ('--kb-journals-re', ):
task_set_option('kb-journals-re', value)
elif key in ('--kb-authors', ):
task_set_option('kb-authors', value)
elif key in ('--kb-books', ):
task_set_option('kb-books', value)
elif key in ('--kb-conferences', ):
task_set_option('kb-conferences', value)
elif key in ('--create-ticket', ):
task_set_option('create-ticket', True)
elif key in ('--no-overwrite', ):
task_set_option('no-overwrite', True)
elif key in ('--arxiv', ):
task_set_option('arxiv', True)
elif key in ('-c', '--collections'):
collections = task_get_option('collections')
if not collections:
collections = set()
task_set_option('collections', collections)
for v in value.split(","):
collections.update(perform_request_search(c=v))
elif key in ('-r', '--recids'):
recids = task_get_option('recids')
if not recids:
recids = set()
task_set_option('recids', recids)
recids.update(split_ids(value))
return True
def create_ticket(recid, bibcatalog_system, queue=CFG_REFEXTRACT_TICKET_QUEUE):
write_message('bibcatalog_system %s' % bibcatalog_system, verbose=1)
write_message('queue %s' % queue, verbose=1)
if bibcatalog_system and queue:
subject = "Refs for #%s" % recid
# Add the report number to the subject
report_number = ""
record = get_bibrecord(recid)
in_hep = False
for collection_tag in record_get_field_instances(record, "980"):
for collection in field_get_subfield_values(collection_tag, 'a'):
if collection == 'HEP':
in_hep = True
# Only create tickets for HEP
if not in_hep:
write_message("not in hep", verbose=1)
return
for report_tag in record_get_field_instances(record, "037"):
for category in field_get_subfield_values(report_tag, 'c'):
if category.startswith('astro-ph'):
write_message("astro-ph", verbose=1)
# We do not curate astro-ph
return
for report_number in field_get_subfield_values(report_tag, 'a'):
subject += " " + report_number
break
text = '%s/record/edit/#state=edit&recid=%s' % (CFG_SITE_SECURE_URL, \
recid)
bibcatalog_system.ticket_submit(subject=subject,
queue=queue,
text=text,
recordid=recid)
def task_run_core(recid, bibcatalog_system=None, _arxiv=False):
if _arxiv:
overwrite = True
else:
overwrite = not task_get_option('no-overwrite')
try:
update_references(recid,
overwrite=overwrite)
msg = "Extracted references for %s" % recid
if overwrite:
write_message("%s (overwrite)" % msg)
else:
write_message(msg)
# Create a RT ticket if necessary
if not _arxiv and task_get_option('new') \
or task_get_option('create-ticket'):
write_message("Checking if we should create a ticket", verbose=1)
create_ticket(recid, bibcatalog_system)
except FullTextNotAvailable:
write_message("No full text available for %s" % recid)
except RecordHasReferences:
write_message("Record %s has references, skipping" % recid)
def main():
"""Constructs the refextract bibtask."""
if CFG_BIBCATALOG_SYSTEM == 'RT':
bibcatalog_system = BibCatalogSystemRT()
else:
bibcatalog_system = None
extra_vars = {'bibcatalog_system': bibcatalog_system}
# Build and submit the task
task_init(authorization_action='runrefextract',
authorization_msg="Refextract Task Submission",
description=DESCRIPTION,
# get the global help_message variable imported from refextract.py
help_specific_usage=HELP_MESSAGE + """
Scheduled (daemon) options:
-a, --new Run on all newly inserted records.
-m, --modified Run on all newly modified records.
-r, --recids Record IDs for extraction.
-c, --collections Entire collection for extraction.
--arxiv All arXiv records modified within the last week
Special (daemon) options:
--create-ticket Create a RT ticket for record references
Examples:
(run a daemon job)
refextract -a
(run on a set of records)
refextract --recids 1,2 -r 3
(run on a collection)
refextract --collections "Reports"
(run as standalone)
refextract -o /home/chayward/refs.xml /home/chayward/thesis.pdf
""",
version="Invenio v%s" % CFG_VERSION,
specific_params=("hVv:x:r:c:nai",
["help",
"version",
"verbose=",
"inspire",
"kb-journals=",
"kb-journals-re=",
"kb-report-numbers=",
"kb-authors=",
"kb-books=",
"recids=",
"collections=",
"new",
"modified",
"no-overwrite",
"arxiv",
"create-ticket"]),
task_submit_elaborate_specific_parameter_fnc=cb_parse_option,
task_submit_check_options_fnc=check_options,
task_run_fnc=task_run_core_wrapper('refextract',
task_run_core,
extra_vars=extra_vars))
diff --git a/invenio/legacy/search_engine/__init__.py b/invenio/legacy/search_engine/__init__.py
index 62f1ba465..7079cb6b7 100644
--- a/invenio/legacy/search_engine/__init__.py
+++ b/invenio/legacy/search_engine/__init__.py
@@ -1,6701 +1,6701 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Invenio Search Engine in mod_python."""
__lastupdated__ = """$Date$"""
__revision__ = "$Id$"
## import general modules:
import cgi
import cStringIO
import copy
import string
import os
import re
import time
import urllib
import urlparse
import zlib
import sys
try:
## import optional module:
import numpy
CFG_NUMPY_IMPORTABLE = True
except ImportError:
CFG_NUMPY_IMPORTABLE = False
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
## import Invenio stuff:
from invenio.base.globals import cfg
from invenio.config import \
CFG_CERN_SITE, \
CFG_INSPIRE_SITE, \
CFG_OAI_ID_FIELD, \
CFG_WEBCOMMENT_ALLOW_REVIEWS, \
CFG_WEBSEARCH_CALL_BIBFORMAT, \
CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX, \
CFG_WEBSEARCH_FIELDS_CONVERT, \
CFG_WEBSEARCH_NB_RECORDS_TO_SORT, \
CFG_WEBSEARCH_SEARCH_CACHE_SIZE, \
CFG_WEBSEARCH_SEARCH_CACHE_TIMEOUT, \
CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS, \
CFG_WEBSEARCH_USE_ALEPH_SYSNOS, \
CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS, \
CFG_WEBSEARCH_FULLTEXT_SNIPPETS, \
CFG_WEBSEARCH_DISPLAY_NEAREST_TERMS, \
CFG_WEBSEARCH_WILDCARD_LIMIT, \
CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE, \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, \
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS, \
CFG_WEBSEARCH_SYNONYM_KBRS, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_LOGDIR, \
CFG_SITE_URL, \
CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS, \
CFG_SOLR_URL, \
CFG_WEBSEARCH_DETAILED_META_FORMAT, \
CFG_SITE_RECORD, \
CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT, \
CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY, \
CFG_BIBSORT_BUCKETS, \
CFG_XAPIAN_ENABLED, \
CFG_BIBINDEX_CHARS_PUNCTUATION
from invenio.modules.search.errors import \
InvenioWebSearchUnknownCollectionError, \
InvenioWebSearchWildcardLimitError
from invenio.legacy.bibrecord import get_fieldvalues, get_fieldvalues_alephseq_like
from invenio.legacy.bibrecord import create_record, record_xml_output
from invenio.legacy.bibrank.record_sorter import get_bibrank_methods, is_method_valid, rank_records as rank_records_bibrank
from invenio.legacy.bibrank.downloads_similarity import register_page_view_event, calculate_reading_similarity_list
-from invenio.bibindex_engine_stemmer import stem
+from invenio.legacy.bibindex.engine_stemmer import stem
from invenio.bibindex_tokenizers.BibIndexDefaultTokenizer import BibIndexDefaultTokenizer
from invenio.bibindex_tokenizers.BibIndexCJKTokenizer import BibIndexCJKTokenizer, is_there_any_CJK_character_in_text
-from invenio.bibindex_engine_utils import author_name_requires_phrase_search
-from invenio.bibindex_engine_washer import wash_index_term, lower_index_term, wash_author_name
-from invenio.bibindex_engine_config import CFG_BIBINDEX_SYNONYM_MATCH_TYPE
-from invenio.bibindex_engine_utils import get_idx_indexer
+from invenio.legacy.bibindex.engine_utils import author_name_requires_phrase_search
+from invenio.legacy.bibindex.engine_washer import wash_index_term, lower_index_term, wash_author_name
+from invenio.legacy.bibindex.engine_config import CFG_BIBINDEX_SYNONYM_MATCH_TYPE
+from invenio.legacy.bibindex.adminlib import get_idx_indexer
from invenio.modules.formatter import format_record, format_records, get_output_format_content_type, create_excel
from invenio.modules.formatter.config import CFG_BIBFORMAT_USE_OLD_BIBFORMAT
from invenio.legacy.bibrank.downloads_grapher import create_download_history_graph_and_box
from invenio.modules.knowledge.api import get_kbr_values
from invenio.legacy.miscutil.data_cacher import DataCacher
from invenio.legacy.websearch_external_collections import print_external_results_overview, perform_external_collection_search
from invenio.modules.access.control import acc_get_action_id
from invenio.modules.access.local_config import VIEWRESTRCOLL, \
CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS, \
CFG_ACC_GRANT_VIEWER_RIGHTS_TO_EMAILS_IN_TAGS
-from invenio.websearchadminlib import get_detailed_page_tabs, get_detailed_page_tabs_counts
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs, get_detailed_page_tabs_counts
from invenio.intbitset import intbitset
from invenio.legacy.dbquery import DatabaseError, deserialize_via_marshal, InvenioDbQueryWildcardLimitError
from invenio.modules.access.engine import acc_authorize_action
from invenio.ext.logging import register_exception
from invenio.ext.cache import cache
from invenio.utils.text import encode_for_xml, wash_for_utf8, strip_accents
from invenio.utils.html import get_mathjax_header
from invenio.utils.html import nmtoken_from_string
import invenio.legacy.template
webstyle_templates = invenio.legacy.template.load('webstyle')
webcomment_templates = invenio.legacy.template.load('webcomment')
from invenio.legacy.bibrank.citation_searcher import calculate_cited_by_list, \
calculate_co_cited_with_list, get_records_with_num_cites, get_self_cited_by, \
get_refersto_hitset, get_citedby_hitset
from invenio.legacy.bibrank.citation_grapher import create_citation_history_graph_and_box
from invenio.legacy.dbquery import run_sql, run_sql_with_limit, wash_table_column_name, \
get_table_update_time
from invenio.legacy.webuser import getUid, collect_user_info, session_param_set
from invenio.legacy.webpage import pageheaderonly, pagefooteronly, create_error_box, write_warning
from invenio.base.i18n import gettext_set_language
from invenio.legacy.search_engine.query_parser import SearchQueryParenthesisedParser, \
SpiresToInvenioSyntaxConverter
from invenio.utils import apache
-from invenio.solrutils_bibindex_searcher import solr_get_bitset
-from invenio.xapianutils_bibindex_searcher import xapian_get_bitset
+from invenio.legacy.miscutil.solrutils_bibindex_searcher import solr_get_bitset
+from invenio.legacy.miscutil.xapianutils_bibindex_searcher import xapian_get_bitset
try:
import invenio.legacy.template
websearch_templates = invenio.legacy.template.load('websearch')
except:
pass
from invenio.legacy.websearch_external_collections import calculate_hosted_collections_results, do_calculate_hosted_collections_results
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_HOSTED_COLLECTION_TIMEOUT_ANTE_SEARCH
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_HOSTED_COLLECTION_TIMEOUT_POST_SEARCH
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_EXTERNAL_COLLECTION_MAXRESULTS
VIEWRESTRCOLL_ID = acc_get_action_id(VIEWRESTRCOLL)
## global vars:
cfg_nb_browse_seen_records = 100 # limit of the number of records to check when browsing certain collection
cfg_nicely_ordered_collection_list = 0 # do we propose collection list nicely ordered or alphabetical?
## precompile some often-used regexp for speed reasons:
re_word = re.compile('[\s]')
re_quotes = re.compile('[\'\"]')
re_doublequote = re.compile('\"')
re_logical_and = re.compile('\sand\s', re.I)
re_logical_or = re.compile('\sor\s', re.I)
re_logical_not = re.compile('\snot\s', re.I)
re_operators = re.compile(r'\s([\+\-\|])\s')
re_pattern_wildcards_after_spaces = re.compile(r'(\s)[\*\%]+')
re_pattern_single_quotes = re.compile("'(.*?)'")
re_pattern_double_quotes = re.compile("\"(.*?)\"")
re_pattern_parens_quotes = re.compile(r'[\'\"]{1}[^\'\"]*(\([^\'\"]*\))[^\'\"]*[\'\"]{1}')
re_pattern_regexp_quotes = re.compile("\/(.*?)\/")
re_pattern_spaces_after_colon = re.compile(r'(:\s+)')
re_pattern_short_words = re.compile(r'([\s\"]\w{1,3})[\*\%]+')
re_pattern_space = re.compile("__SPACE__")
re_pattern_today = re.compile("\$TODAY\$")
re_pattern_parens = re.compile(r'\([^\)]+\s+[^\)]+\)')
re_punctuation_followed_by_space = re.compile(CFG_BIBINDEX_CHARS_PUNCTUATION + '\s')
## em possible values
EM_REPOSITORY={"body" : "B",
"header" : "H",
"footer" : "F",
"search_box" : "S",
"see_also_box" : "L",
"basket" : "K",
"alert" : "A",
"search_info" : "I",
"overview" : "O",
"all_portalboxes" : "P",
"te_portalbox" : "Pte",
"tp_portalbox" : "Ptp",
"np_portalbox" : "Pnp",
"ne_portalbox" : "Pne",
"lt_portalbox" : "Plt",
"rt_portalbox" : "Prt"};
class RestrictedCollectionDataCacher(DataCacher):
def __init__(self):
def cache_filler():
ret = []
try:
res = run_sql("""SELECT DISTINCT ar.value
FROM accROLE_accACTION_accARGUMENT raa JOIN accARGUMENT ar ON raa.id_accARGUMENT = ar.id
WHERE ar.keyword = 'collection' AND raa.id_accACTION = %s""", (VIEWRESTRCOLL_ID,), run_on_slave=True)
except Exception:
# database problems, return empty cache
return []
for coll in res:
ret.append(coll[0])
return ret
def timestamp_verifier():
return max(get_table_update_time('accROLE_accACTION_accARGUMENT'), get_table_update_time('accARGUMENT'))
DataCacher.__init__(self, cache_filler, timestamp_verifier)
def collection_restricted_p(collection, recreate_cache_if_needed=True):
if recreate_cache_if_needed:
restricted_collection_cache.recreate_cache_if_needed()
return collection in restricted_collection_cache.cache
try:
restricted_collection_cache.is_ok_p
except Exception:
restricted_collection_cache = RestrictedCollectionDataCacher()
def ziplist(*lists):
"""Just like zip(), but returns lists of lists instead of lists of tuples
Example:
zip([f1, f2, f3], [p1, p2, p3], [op1, op2, '']) =>
[(f1, p1, op1), (f2, p2, op2), (f3, p3, '')]
ziplist([f1, f2, f3], [p1, p2, p3], [op1, op2, '']) =>
[[f1, p1, op1], [f2, p2, op2], [f3, p3, '']]
FIXME: This is handy to have, and should live somewhere else, like
miscutil.really_useful_functions or something.
XXX: Starting in python 2.6, the same can be achieved (faster) by
using itertools.izip_longest(); when the minimum recommended Python
is bumped, we should use that instead.
"""
def l(*items):
return list(items)
return map(l, *lists)
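For equal-length inputs, the behaviour of `ziplist` above can be reproduced with a plain list comprehension; this sketch also runs on Python 3, where `map` returns a lazy iterator (note that on Python 2, `map` over several sequences pads shorter ones with None, while `zip` truncates, so the two differ for unequal lengths):

```python
def ziplist(*lists):
    # Like zip(), but each grouped row is a mutable list instead of a tuple.
    return [list(items) for items in zip(*lists)]

rows = ziplist([1, 2, 3], ['a', 'b', 'c'])
print(rows)  # [[1, 'a'], [2, 'b'], [3, 'c']]
```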
def get_permitted_restricted_collections(user_info, recreate_cache_if_needed=True):
"""Return a list of collection that are restricted but for which the user
is authorized."""
if recreate_cache_if_needed:
restricted_collection_cache.recreate_cache_if_needed()
ret = []
for collection in restricted_collection_cache.cache:
if acc_authorize_action(user_info, 'viewrestrcoll', collection=collection)[0] == 0:
ret.append(collection)
return ret
def get_all_restricted_recids():
"""
Return the set of all the restricted recids, i.e. the ids of those records
which belong to at least one restricted collection.
"""
ret = intbitset()
for collection in restricted_collection_cache.cache:
ret |= get_collection_reclist(collection)
return ret
def get_restricted_collections_for_recid(recid, recreate_cache_if_needed=True):
"""
Return the list of restricted collection names to which recid belongs.
"""
if recreate_cache_if_needed:
restricted_collection_cache.recreate_cache_if_needed()
collection_reclist_cache.recreate_cache_if_needed()
return [collection for collection in restricted_collection_cache.cache if recid in get_collection_reclist(collection, recreate_cache_if_needed=False)]
def is_user_owner_of_record(user_info, recid):
"""
Check if the user is the owner of the record, i.e. they are the submitter
and/or belong to an owner-like group authorized to 'see' the record.
@param user_info: the user_info dictionary that describe the user.
@type user_info: user_info dictionary
@param recid: the record identifier.
@type recid: positive integer
@return: True if the user is 'owner' of the record; False otherwise
@rtype: bool
"""
authorized_emails_or_group = []
for tag in CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS:
authorized_emails_or_group.extend(get_fieldvalues(recid, tag))
for email_or_group in authorized_emails_or_group:
if email_or_group in user_info['group']:
return True
email = email_or_group.strip().lower()
if user_info['email'].strip().lower() == email:
return True
return False
###FIXME: This method needs to be refactored
def is_user_viewer_of_record(user_info, recid):
"""
Check if the user is allowed to view the record based on the MARC tags
inside CFG_ACC_GRANT_VIEWER_RIGHTS_TO_EMAILS_IN_TAGS,
i.e. their email is inside the 506__m tag or they belong to an e-group listed
in the 506__m tag
@param user_info: the user_info dictionary that describe the user.
@type user_info: user_info dictionary
@param recid: the record identifier.
@type recid: positive integer
@return: True if the user is 'allowed to view' the record; False otherwise
@rtype: bool
"""
authorized_emails_or_group = []
for tag in CFG_ACC_GRANT_VIEWER_RIGHTS_TO_EMAILS_IN_TAGS:
authorized_emails_or_group.extend(get_fieldvalues(recid, tag))
for email_or_group in authorized_emails_or_group:
if email_or_group in user_info['group']:
return True
email = email_or_group.strip().lower()
if user_info['email'].strip().lower() == email:
return True
return False
def check_user_can_view_record(user_info, recid):
"""
Check if the user is authorized to view the given recid. The function
grants access in two cases: either the user has author rights on this
record, or they have view rights to the primary collection this record
belongs to.
@param user_info: the user_info dictionary that describe the user.
@type user_info: user_info dictionary
@param recid: the record identifier.
@type recid: positive integer
@return: (0, ''), when authorization is granted, (>0, 'message') when
authorization is not granted
@rtype: (int, string)
"""
policy = CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY.strip().upper()
if isinstance(recid, str):
recid = int(recid)
## At this point, either webcoll has not yet run or there are some
## restricted collections. Let's first see if the user owns the record.
if is_user_owner_of_record(user_info, recid):
## Perfect! It's authorized then!
return (0, '')
if is_user_viewer_of_record(user_info, recid):
## Perfect! It's authorized then!
return (0, '')
restricted_collections = get_restricted_collections_for_recid(recid, recreate_cache_if_needed=False)
if not restricted_collections and record_public_p(recid):
## The record is public and not part of any restricted collection
return (0, '')
if restricted_collections:
## If there are restricted collections the user must be authorized to all/any of them (depending on the policy)
auth_code, auth_msg = 0, ''
for collection in restricted_collections:
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=collection)
if auth_code and policy != 'ANY':
## Ouch! the user is not authorized to this collection
return (auth_code, auth_msg)
elif auth_code == 0 and policy == 'ANY':
## Good! At least one collection is authorized
return (0, '')
## Depending on the policy, the user will be either authorized or not
return auth_code, auth_msg
if is_record_in_any_collection(recid, recreate_cache_if_needed=False):
## the record is not in any restricted collection
return (0, '')
elif record_exists(recid) > 0:
## We are in the case where webcoll has not run.
## Let's authorize SUPERADMIN
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=None)
if auth_code == 0:
return (0, '')
else:
## Too bad. Let's print a nice message:
return (1, """The record you are trying to access has just been
submitted to the system and needs to be assigned to the
proper collections. It is currently restricted for security reasons
until the assignment is fully completed. Please come back later to
properly access this record.""")
else:
## The record either does not exist or has been deleted.
## Let's handle these situations outside of this code.
return (0, '')
class IndexStemmingDataCacher(DataCacher):
"""
Provides cache for stemming information for word/phrase indexes.
This class is not to be used directly; use function
get_index_stemming_language() instead.
"""
def __init__(self):
def cache_filler():
try:
res = run_sql("""SELECT id, stemming_language FROM idxINDEX""")
except DatabaseError:
# database problems, return empty cache
return {}
return dict(res)
def timestamp_verifier():
return get_table_update_time('idxINDEX')
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
index_stemming_cache.is_ok_p
except Exception:
index_stemming_cache = IndexStemmingDataCacher()
def get_index_stemming_language(index_id, recreate_cache_if_needed=True):
"""Return stemming langugage for given index."""
if recreate_cache_if_needed:
index_stemming_cache.recreate_cache_if_needed()
return index_stemming_cache.cache[index_id]
class FieldTokenizerDataCacher(DataCacher):
"""
Provides cache for tokenizer information for fields corresponding to indexes.
This class is not to be used directly; use function
get_field_tokenizer_type() instead.
"""
def __init__(self):
def cache_filler():
try:
res = run_sql("""SELECT fld.code, ind.tokenizer FROM idxINDEX AS ind, field AS fld, idxINDEX_field AS indfld WHERE ind.id = indfld.id_idxINDEX AND indfld.id_field = fld.id""")
except DatabaseError:
# database problems, return empty cache
return {}
return dict(res)
def timestamp_verifier():
return get_table_update_time('idxINDEX')
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
field_tokenizer_cache.is_ok_p
except Exception:
field_tokenizer_cache = FieldTokenizerDataCacher()
def get_field_tokenizer_type(field_name, recreate_cache_if_needed=True):
"""Return tokenizer type for given field corresponding to an index if applicable."""
if recreate_cache_if_needed:
field_tokenizer_cache.recreate_cache_if_needed()
tokenizer = None
try:
tokenizer = field_tokenizer_cache.cache[field_name]
except KeyError:
return None
return tokenizer
class CollectionRecListDataCacher(DataCacher):
"""
Provides cache for collection reclist hitsets. This class is not
to be used directly; use function get_collection_reclist() instead.
"""
def __init__(self):
def cache_filler():
ret = {}
try:
res = run_sql("SELECT name FROM collection")
except Exception:
# database problems, return empty cache
return {}
for name in res:
ret[name[0]] = None # this will be filled later during runtime by calling get_collection_reclist(coll)
return ret
def timestamp_verifier():
return get_table_update_time('collection')
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
if not collection_reclist_cache.is_ok_p:
raise Exception
except Exception:
collection_reclist_cache = CollectionRecListDataCacher()
def get_collection_reclist(coll, recreate_cache_if_needed=True):
"""Return hitset of recIDs that belong to the collection 'coll'."""
if recreate_cache_if_needed:
collection_reclist_cache.recreate_cache_if_needed()
if coll not in collection_reclist_cache.cache:
return intbitset() # collection does not exist; return empty set
if not collection_reclist_cache.cache[coll]:
# collection's reclist not in the cache yet, so calculate it
# and fill the cache:
reclist = intbitset()
query = "SELECT nbrecs,reclist FROM collection WHERE name=%s"
res = run_sql(query, (coll, ), 1)
if res:
try:
reclist = intbitset(res[0][1])
except:
pass
collection_reclist_cache.cache[coll] = reclist
# finally, return reclist:
return collection_reclist_cache.cache[coll]
def get_available_output_formats(visible_only=False):
"""
Return the list of available output formats. When visible_only is
True, returns only those output formats that have visibility flag
set to 1.
"""
formats = []
query = "SELECT code,name FROM format"
if visible_only:
query += " WHERE visibility='1'"
query += " ORDER BY name ASC"
res = run_sql(query)
if res:
# propose found formats:
for code, name in res:
formats.append({ 'value' : code,
'text' : name
})
else:
formats.append({'value' : 'hb',
'text' : "HTML brief"
})
return formats
# Flask cache for search results.
from invenio.modules.search.cache import search_results_cache, get_search_results_cache_key
class CollectionI18nNameDataCacher(DataCacher):
"""
Provides cache for I18N collection names. This class is not to be
used directly; use function get_coll_i18nname() instead.
"""
def __init__(self):
def cache_filler():
ret = {}
try:
res = run_sql("SELECT c.name,cn.ln,cn.value FROM collectionname AS cn, collection AS c WHERE cn.id_collection=c.id AND cn.type='ln'") # ln=long name
except Exception:
# database problems
return {}
for c, ln, i18nname in res:
if i18nname:
if not ret.has_key(c):
ret[c] = {}
ret[c][ln] = i18nname
return ret
def timestamp_verifier():
return get_table_update_time('collectionname')
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
if not collection_i18nname_cache.is_ok_p:
raise Exception
except Exception:
collection_i18nname_cache = CollectionI18nNameDataCacher()
def get_coll_i18nname(c, ln=CFG_SITE_LANG, verify_cache_timestamp=True):
"""
Return nicely formatted collection name (of the name type `ln'
(=long name)) for collection C in language LN.
This function uses collection_i18nname_cache, but it verifies
whether the cache is up-to-date first by default. This
verification step is performed by checking the DB table update
time. So, if you call this function 1000 times, it can get very
slow because it will do 1000 table update time verifications, even
though collection names do not change that often.
Hence the parameter VERIFY_CACHE_TIMESTAMP which, when set to
False, will assume the cache is already up-to-date. This is
useful namely in the generation of collection lists for the search
results page.
"""
if verify_cache_timestamp:
collection_i18nname_cache.recreate_cache_if_needed()
out = c
try:
out = collection_i18nname_cache.cache[c][ln]
except KeyError:
pass # translation in LN does not exist
return out
class FieldI18nNameDataCacher(DataCacher):
"""
Provides cache for I18N field names. This class is not to be used
directly; use function get_field_i18nname() instead.
"""
def __init__(self):
def cache_filler():
ret = {}
try:
res = run_sql("SELECT f.name,fn.ln,fn.value FROM fieldname AS fn, field AS f WHERE fn.id_field=f.id AND fn.type='ln'") # ln=long name
except Exception:
# database problems, return empty cache
return {}
for f, ln, i18nname in res:
if i18nname:
if not ret.has_key(f):
ret[f] = {}
ret[f][ln] = i18nname
return ret
def timestamp_verifier():
return get_table_update_time('fieldname')
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
if not field_i18nname_cache.is_ok_p:
raise Exception
except Exception:
field_i18nname_cache = FieldI18nNameDataCacher()
def get_field_i18nname(f, ln=CFG_SITE_LANG, verify_cache_timestamp=True):
"""
Return nicely formatted field name (of type 'ln', 'long name') for
field F in language LN.
If VERIFY_CACHE_TIMESTAMP is set to True, then verify DB timestamp
and field I18N name cache timestamp and refresh cache from the DB
if needed. Otherwise don't bother checking DB timestamp and
return the cached value. (This is useful when get_field_i18nname
is called inside a loop.)
"""
if verify_cache_timestamp:
field_i18nname_cache.recreate_cache_if_needed()
out = f
try:
out = field_i18nname_cache.cache[f][ln]
except KeyError:
pass # translation in LN does not exist
return out
def get_alphabetically_ordered_collection_list(level=0, ln=CFG_SITE_LANG):
"""Returns nicely ordered (score respected) list of collections, more exactly list of tuples
(collection name, printable collection name).
Suitable for create_search_box()."""
out = []
res = run_sql("SELECT name FROM collection ORDER BY name ASC")
for c_name in res:
c_name = c_name[0]
# make a nice printable name (e.g. truncate c_printable for
# long collection names in given language):
c_printable_fullname = get_coll_i18nname(c_name, ln, False)
c_printable = wash_index_term(c_printable_fullname, 30, False)
if c_printable != c_printable_fullname:
c_printable = c_printable + "..."
if level:
c_printable = " " + level * '-' + " " + c_printable
out.append([c_name, c_printable])
return out
def get_nicely_ordered_collection_list(collid=1, level=0, ln=CFG_SITE_LANG):
"""Returns nicely ordered (score respected) list of collections, more exactly list of tuples
(collection name, printable collection name).
Suitable for create_search_box()."""
colls_nicely_ordered = []
res = run_sql("""SELECT c.name,cc.id_son FROM collection_collection AS cc, collection AS c
WHERE c.id=cc.id_son AND cc.id_dad=%s ORDER BY score DESC""", (collid, ))
for c, cid in res:
# make a nice printable name (e.g. truncate c_printable for
# long collection names in given language):
c_printable_fullname = get_coll_i18nname(c, ln, False)
c_printable = wash_index_term(c_printable_fullname, 30, False)
if c_printable != c_printable_fullname:
c_printable = c_printable + "..."
if level:
c_printable = " " + level * '-' + " " + c_printable
colls_nicely_ordered.append([c, c_printable])
colls_nicely_ordered = colls_nicely_ordered + get_nicely_ordered_collection_list(cid, level+1, ln=ln)
return colls_nicely_ordered
def get_index_id_from_field(field):
"""
Return index id with name corresponding to FIELD, or the first
index id where the logical field code named FIELD is indexed.
Return zero in case there is no index defined for this field.
Example: field='author', output=4.
"""
out = 0
if not field:
field = 'global' # empty string field means 'global' index (field 'anyfield')
# first look in the index table:
res = run_sql("""SELECT id FROM idxINDEX WHERE name=%s""", (field,))
if res:
out = res[0][0]
return out
# not found in the index table, now look in the field table:
res = run_sql("""SELECT w.id FROM idxINDEX AS w, idxINDEX_field AS wf, field AS f
WHERE f.code=%s AND wf.id_field=f.id AND w.id=wf.id_idxINDEX
LIMIT 1""", (field,))
if res:
out = res[0][0]
return out
def get_words_from_pattern(pattern):
"""
Returns list of whitespace-separated words from pattern, removing any
trailing punctuation-like signs from words in pattern.
"""
words = {}
# clean trailing punctuation signs inside pattern
pattern = re_punctuation_followed_by_space.sub(' ', pattern)
for word in string.split(pattern):
if not words.has_key(word):
words[word] = 1
return words.keys()
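A self-contained sketch of the word-splitting logic above, with a simplified punctuation class standing in for `CFG_BIBINDEX_CHARS_PUNCTUATION` (an assumption; the real character class comes from the Invenio configuration):

```python
import re

# Simplified stand-in for CFG_BIBINDEX_CHARS_PUNCTUATION (assumed here).
re_punctuation_followed_by_space = re.compile(r'[.,;:!?]\s')

def get_words_from_pattern(pattern):
    # Strip punctuation that trails a word, then deduplicate the
    # whitespace-separated words while keeping first-seen order.
    pattern = re_punctuation_followed_by_space.sub(' ', pattern)
    words = {}
    for word in pattern.split():
        if word not in words:
            words[word] = 1
    return list(words.keys())

print(get_words_from_pattern('muon, decay muon kaon'))  # ['muon', 'decay', 'kaon']
```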
def create_basic_search_units(req, p, f, m=None, of='hb'):
"""Splits search pattern and search field into a list of independently searchable units.
- A search unit consists of '(operator, pattern, field, type, hitset)' tuples where
'operator' is set union (|), set intersection (+) or set exclusion (-);
'pattern' is either a word (e.g. muon*) or a phrase (e.g. 'nuclear physics');
'field' is either a code like 'title' or MARC tag like '100__a';
'type' is the search type ('w' for word file search, 'a' for access file search).
- Optionally, the function accepts the match type argument 'm'.
If it is set (e.g. from advanced search interface), then it
performs this kind of matching. If it is not set, then a guess is made.
'm' can have values: 'a'='all of the words', 'o'='any of the words',
'p'='phrase/substring', 'r'='regular expression',
'e'='exact value'.
- Warnings are printed on req (when not None) in case of HTML output formats."""
opfts = [] # will hold (o,p,f,t,h) units
# FIXME: quick hack for the journal index
if f == 'journal':
opfts.append(['+', p, f, 'w'])
return opfts
## check arguments: is desired matching type set?
if m:
## A - matching type is known; good!
if m == 'e':
# A1 - exact value:
opfts.append(['+', p, f, 'a']) # '+' since we have only one unit
elif m == 'p':
# A2 - phrase/substring:
opfts.append(['+', "%" + p + "%", f, 'a']) # '+' since we have only one unit
elif m == 'r':
# A3 - regular expression:
opfts.append(['+', p, f, 'r']) # '+' since we have only one unit
elif m == 'a' or m == 'w':
# A4 - all of the words:
p = strip_accents(p) # strip accents for 'w' mode, FIXME: delete when not needed
for word in get_words_from_pattern(p):
opfts.append(['+', word, f, 'w']) # '+' in all units
elif m == 'o':
# A5 - any of the words:
p = strip_accents(p) # strip accents for 'w' mode, FIXME: delete when not needed
for word in get_words_from_pattern(p):
if len(opfts)==0:
opfts.append(['+', word, f, 'w']) # '+' in the first unit
else:
opfts.append(['|', word, f, 'w']) # '|' in further units
else:
if of.startswith("h"):
write_warning("Matching type '%s' is not implemented yet." % cgi.escape(m), "Warning", req=req)
opfts.append(['+', "%" + p + "%", f, 'w'])
else:
## B - matching type is not known: let us try to determine it by some heuristics
if f and p[0] == '"' and p[-1] == '"':
## B0 - does 'p' start and end by double quote, and is 'f' defined? => doing ACC search
opfts.append(['+', p[1:-1], f, 'a'])
elif f in ('author', 'firstauthor', 'exactauthor', 'exactfirstauthor', 'authorityauthor') and author_name_requires_phrase_search(p):
## B1 - do we search in author, and does 'p' contain space/comma/dot/etc?
## => doing washed ACC search
opfts.append(['+', p, f, 'a'])
elif f and p[0] == "'" and p[-1] == "'":
## B0bis - does 'p' start and end by single quote, and is 'f' defined? => doing ACC search
opfts.append(['+', '%' + p[1:-1] + '%', f, 'a'])
elif f and p[0] == "/" and p[-1] == "/":
## B0ter - does 'p' start and end by a slash, and is 'f' defined? => doing regexp search
opfts.append(['+', p[1:-1], f, 'r'])
elif f and string.find(p, ',') >= 0:
## B1 - does 'p' contain comma, and is 'f' defined? => doing ACC search
opfts.append(['+', p, f, 'a'])
elif f and str(f[0:2]).isdigit():
## B2 - does 'f' exist and starts by two digits? => doing ACC search
opfts.append(['+', p, f, 'a'])
else:
## B3 - doing WRD search, but maybe ACC too
# search units are separated by spaces unless the space is within single or double quotes
# so, let us replace temporarily any space within quotes by '__SPACE__'
p = re_pattern_single_quotes.sub(lambda x: "'"+string.replace(x.group(1), ' ', '__SPACE__')+"'", p)
p = re_pattern_double_quotes.sub(lambda x: "\""+string.replace(x.group(1), ' ', '__SPACE__')+"\"", p)
p = re_pattern_regexp_quotes.sub(lambda x: "/"+string.replace(x.group(1), ' ', '__SPACE__')+"/", p)
# and spaces after colon as well:
p = re_pattern_spaces_after_colon.sub(lambda x: string.replace(x.group(1), ' ', '__SPACE__'), p)
# wash argument:
p = re_logical_and.sub(" ", p)
p = re_logical_or.sub(" |", p)
p = re_logical_not.sub(" -", p)
p = re_operators.sub(r' \1', p)
for pi in string.split(p): # iterate through separated units (or items, as "pi" stands for "p item")
pi = re_pattern_space.sub(" ", pi) # replace back '__SPACE__' by ' '
# firstly, determine set operator
if pi[0] == '+' or pi[0] == '-' or pi[0] == '|':
oi = pi[0]
pi = pi[1:]
else:
# okay, there is no operator, so let us decide what to do by default
oi = '+' # by default we are doing set intersection...
# secondly, determine search pattern and field:
if string.find(pi, ":") > 0:
fi, pi = string.split(pi, ":", 1)
fi = wash_field(fi)
# test whether fi is a real index code or a MARC-tag defined code:
if fi in get_fieldcodes() or '00' <= fi[:2] <= '99':
pass
else:
# it is not, so join it back:
fi, pi = f, fi + ":" + pi
else:
fi, pi = f, pi
# wash 'fi' argument:
fi = wash_field(fi)
# wash 'pi' argument:
pi = pi.strip() # strip any surrounding spaces
if re_quotes.match(pi):
# B3a - quotes are found => do ACC search (phrase search)
if pi[0] == '"' and pi[-1] == '"':
pi = string.replace(pi, '"', '') # remove quote signs
opfts.append([oi, pi, fi, 'a'])
elif pi[0] == "'" and pi[-1] == "'":
pi = string.replace(pi, "'", "") # remove quote signs
opfts.append([oi, "%" + pi + "%", fi, 'a'])
else: # unbalanced quotes, so fall back to WRD query:
opfts.append([oi, pi, fi, 'w'])
elif pi.startswith('/') and pi.endswith('/'):
# B3b - pi has slashes around => do regexp search
opfts.append([oi, pi[1:-1], fi, 'r'])
elif fi and len(fi) > 1 and str(fi[0]).isdigit() and str(fi[1]).isdigit():
# B3c - fi exists and starts by two digits => do ACC search
opfts.append([oi, pi, fi, 'a'])
elif fi and not get_index_id_from_field(fi) and get_field_name(fi):
# B3d - logical field fi exists but there is no WRD index for fi => try ACC search
opfts.append([oi, pi, fi, 'a'])
else:
# B3e - general case => do WRD search
pi = strip_accents(pi) # strip accents for 'w' mode, FIXME: delete when not needed
for pii in get_words_from_pattern(pi):
opfts.append([oi, pii, fi, 'w'])
## sanity check:
for i in range(0, len(opfts)):
try:
pi = opfts[i][1]
if pi == '*':
if of.startswith("h"):
write_warning("Ignoring standalone wildcard word.", "Warning", req=req)
del opfts[i]
if pi == '' or pi == ' ':
fi = opfts[i][2]
if fi:
if of.startswith("h"):
write_warning("Ignoring empty <em>%s</em> search term." % fi, "Warning", req=req)
del opfts[i]
except IndexError:
# the 'del opfts[i]' above shortens the list, so the index may run out
pass
## replace old logical field names if applicable:
if CFG_WEBSEARCH_FIELDS_CONVERT:
opfts = [[o, p, wash_field(f), t] for o, p, f, t in opfts]
## return search units:
return opfts
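The quote-protection trick used in case B3 above (hiding spaces inside quotes as `__SPACE__` before splitting on whitespace) can be sketched in isolation. The regexps below are simplified assumptions, not the module's actual `re_pattern_double_quotes` / `re_pattern_space` definitions:

```python
import re

# Assumed simplified stand-ins for re_pattern_double_quotes / re_pattern_space:
re_double_quotes = re.compile(r'"([^"]*)"')
re_space_marker = re.compile(r'__SPACE__')

def split_protecting_quotes(p):
    """Split P on whitespace, keeping double-quoted phrases as single units."""
    # temporarily hide spaces inside quotes:
    p = re_double_quotes.sub(lambda m: '"' + m.group(1).replace(' ', '__SPACE__') + '"', p)
    # split on the remaining (unquoted) spaces, then restore:
    return [re_space_marker.sub(' ', unit) for unit in p.split()]
```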
def page_start(req, of, cc, aas, ln, uid, title_message=None,
description='', keywords='', recID=-1, tab='', p='', em=''):
"""
Start page according to given output format.
@param title_message: title of the page, not escaped for HTML
@param description: description of the page, not escaped for HTML
@param keywords: keywords of the page, not escaped for HTML
"""
_ = gettext_set_language(ln)
if not req or isinstance(req, cStringIO.OutputType):
return # we were called from CLI
if not title_message:
title_message = _("Search Results")
content_type = get_output_format_content_type(of)
if of.startswith('x'):
if of == 'xr':
# we are doing RSS output
req.content_type = "application/rss+xml"
req.send_http_header()
req.write("""<?xml version="1.0" encoding="UTF-8"?>\n""")
else:
# we are doing XML output:
req.content_type = get_output_format_content_type(of, 'text/xml')
req.send_http_header()
req.write("""<?xml version="1.0" encoding="UTF-8"?>\n""")
elif of.startswith('t') or str(of[0:3]).isdigit():
# we are doing plain text output:
req.content_type = "text/plain"
req.send_http_header()
elif of == "intbitset":
req.content_type = "application/octet-stream"
req.send_http_header()
elif of == "id":
pass # nothing to do, we shall only return list of recIDs
elif content_type == 'text/html':
# we are doing HTML output:
req.content_type = "text/html"
req.send_http_header()
if not description:
description = "%s %s." % (cc, _("Search Results"))
if not keywords:
keywords = "%s, WebSearch, %s" % (get_coll_i18nname(CFG_SITE_NAME, ln, False), get_coll_i18nname(cc, ln, False))
## generate RSS URL:
argd = {}
if req.args:
argd = cgi.parse_qs(req.args)
rssurl = websearch_templates.build_rss_url(argd)
## add MathJax if displaying single records (FIXME: find
## a better place for this code)
if of.lower() in CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS:
metaheaderadd = get_mathjax_header(req.is_https())
else:
metaheaderadd = ''
# Add metadata in meta tags for Google scholar-esque harvesting...
# only if we have a detailed meta format and we are looking at a
# single record
if (recID != -1 and CFG_WEBSEARCH_DETAILED_META_FORMAT):
metaheaderadd += format_record(recID, \
CFG_WEBSEARCH_DETAILED_META_FORMAT, \
ln = ln)
## generate navtrail:
navtrail = create_navtrail_links(cc, aas, ln)
if navtrail != '':
navtrail += ' &gt; '
if (tab != '' or ((of != '' and of.lower() != 'hd') and of != 'hb')) and \
recID != -1:
# If we are not in information tab in HD format, customize
# the nav. trail to have a link back to main record. (Due
# to the way perform_request_search() works, hb
# (lowercase) is equal to hd)
navtrail += ' <a class="navtrail" href="%s/%s/%s">%s</a>' % \
(CFG_SITE_URL, CFG_SITE_RECORD, recID, cgi.escape(title_message))
if (of != '' and of.lower() != 'hd') and of != 'hb':
# Export
format_name = of
query = "SELECT name FROM format WHERE code=%s"
res = run_sql(query, (of,))
if res:
format_name = res[0][0]
navtrail += ' &gt; ' + format_name
else:
# Discussion, citations, etc. tabs
tab_label = get_detailed_page_tabs(cc, ln=ln)[tab]['label']
navtrail += ' &gt; ' + _(tab_label)
else:
navtrail += cgi.escape(title_message)
if p:
# we are serving search/browse results pages, so insert pattern:
navtrail += ": " + cgi.escape(p)
title_message = p + " - " + title_message
body_css_classes = []
if cc:
# we know the collection, so let's allow page styles based on cc.
# Collection names may not satisfy the rules for CSS classes, which
# are something like: -?[_a-zA-Z]+[_a-zA-Z0-9-]*
# However it isn't clear what we should do about cases with
# numbers, so we leave them to fail. Everything else becomes "_".
css = nmtoken_from_string(cc).replace('.','_').replace('-','_').replace(':','_')
body_css_classes.append(css)
## finally, print page header:
if em == '' or EM_REPOSITORY["header"] in em:
req.write(pageheaderonly(req=req, title=title_message,
navtrail=navtrail,
description=description,
keywords=keywords,
metaheaderadd=metaheaderadd,
uid=uid,
language=ln,
navmenuid='search',
navtrail_append_title_p=0,
rssurl=rssurl,
body_css_classes=body_css_classes))
req.write(websearch_templates.tmpl_search_pagestart(ln=ln))
else:
req.content_type = content_type
req.send_http_header()
def page_end(req, of="hb", ln=CFG_SITE_LANG, em=""):
"End page according to given output format: e.g. close XML tags, add HTML footer, etc."
if of == "id":
return [] # empty recID list
if of == "intbitset":
return intbitset()
if not req:
return # we were called from CLI
if of.startswith('h'):
req.write(websearch_templates.tmpl_search_pageend(ln = ln)) # pagebody end
if em == "" or EM_REPOSITORY["footer"] in em:
req.write(pagefooteronly(lastupdated=__lastupdated__, language=ln, req=req))
return
def create_page_title_search_pattern_info(p, p1, p2, p3):
"""Create the search pattern bit for the page <title> web page
HTML header. Basically combine p and (p1,p2,p3) together so that
the page header may be filled whether we are in the Simple Search
or Advanced Search interface contexts."""
out = ""
if p:
out = p
else:
out = p1
if p2:
out += ' ' + p2
if p3:
out += ' ' + p3
return out
def create_inputdate_box(name="d1", selected_year=0, selected_month=0, selected_day=0, ln=CFG_SITE_LANG):
"Produces 'From Date', 'Until Date' kind of selection box. Suitable for search options."
_ = gettext_set_language(ln)
box = ""
# day
box += """<select name="%sd">""" % name
box += """<option value="">%s""" % _("any day")
for day in range(1, 32):
box += """<option value="%02d"%s>%02d""" % (day, is_selected(day, selected_day), day)
box += """</select>"""
# month
box += """<select name="%sm">""" % name
box += """<option value="">%s""" % _("any month")
# trailing space in May distinguishes short/long form of the month name
for mm, month in [(1, _("January")), (2, _("February")), (3, _("March")), (4, _("April")), \
(5, _("May ")), (6, _("June")), (7, _("July")), (8, _("August")), \
(9, _("September")), (10, _("October")), (11, _("November")), (12, _("December"))]:
box += """<option value="%02d"%s>%s""" % (mm, is_selected(mm, selected_month), month.strip())
box += """</select>"""
# year
box += """<select name="%sy">""" % name
box += """<option value="">%s""" % _("any year")
this_year = int(time.strftime("%Y", time.localtime()))
for year in range(this_year-20, this_year+1):
box += """<option value="%d"%s>%d""" % (year, is_selected(year, selected_year), year)
box += """</select>"""
return box
def create_search_box(cc, colls, p, f, rg, sf, so, sp, rm, of, ot, aas,
ln, p1, f1, m1, op1, p2, f2, m2, op2, p3, f3,
m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec,
action="", em=""):
"""Create search box for 'search again in the results page' functionality."""
if em != "" and EM_REPOSITORY["search_box"] not in em:
if EM_REPOSITORY["body"] in em and cc != CFG_SITE_NAME:
return '''
<h1 class="headline">%(ccname)s</h1>''' % {'ccname' : cgi.escape(cc), }
else:
return ""
# load the right message language
_ = gettext_set_language(ln)
# some computations
cc_intl = get_coll_i18nname(cc, ln, False)
cc_colID = get_colID(cc)
colls_nicely_ordered = []
if cfg_nicely_ordered_collection_list:
colls_nicely_ordered = get_nicely_ordered_collection_list(ln=ln)
else:
colls_nicely_ordered = get_alphabetically_ordered_collection_list(ln=ln)
colls_nice = []
for (cx, cx_printable) in colls_nicely_ordered:
if not cx.startswith("Unnamed collection"):
colls_nice.append({ 'value' : cx,
'text' : cx_printable
})
coll_selects = []
if colls and colls[0] != CFG_SITE_NAME:
# some collections are defined, so print these first, and only then print 'add another collection' heading:
for c in colls:
if c:
temp = []
temp.append({ 'value' : CFG_SITE_NAME,
'text' : '*** %s ***' % _("any public collection")
})
# this field is used to remove the current collection from the ones to be searched.
temp.append({ 'value' : '',
'text' : '*** %s ***' % _("remove this collection")
})
for val in colls_nice:
# print collection:
if not val['value'].startswith("Unnamed collection"):
temp.append({ 'value' : val['value'],
'text' : val['text'],
'selected' : (c == re.sub("^[\s\-]*","", val['value']))
})
coll_selects.append(temp)
coll_selects.append([{ 'value' : '',
'text' : '*** %s ***' % _("add another collection")
}] + colls_nice)
else: # we searched in CFG_SITE_NAME, so print 'any public collection' heading
coll_selects.append([{ 'value' : CFG_SITE_NAME,
'text' : '*** %s ***' % _("any public collection")
}] + colls_nice)
## ranking methods
ranks = [{
'value' : '',
'text' : "- %s %s -" % (_("OR").lower(), _("rank by")),
}]
for (code, name) in get_bibrank_methods(cc_colID, ln):
# propose found rank methods:
ranks.append({
'value' : code,
'text' : name,
})
formats = get_available_output_formats(visible_only=True)
# show collections in the search box? (not if there is only one
# collection defined, and not if we are in light search)
show_colls = True
show_title = True
if len(collection_reclist_cache.cache.keys()) == 1 or \
aas == -1:
show_colls = False
show_title = False
if cc == CFG_SITE_NAME:
show_title = False
if CFG_INSPIRE_SITE:
show_title = False
return websearch_templates.tmpl_search_box(
ln = ln,
aas = aas,
cc_intl = cc_intl,
cc = cc,
ot = ot,
sp = sp,
action = action,
fieldslist = get_searchwithin_fields(ln=ln, colID=cc_colID),
f1 = f1,
f2 = f2,
f3 = f3,
m1 = m1,
m2 = m2,
m3 = m3,
p1 = p1,
p2 = p2,
p3 = p3,
op1 = op1,
op2 = op2,
rm = rm,
p = p,
f = f,
coll_selects = coll_selects,
d1y = d1y, d2y = d2y, d1m = d1m, d2m = d2m, d1d = d1d, d2d = d2d,
dt = dt,
sort_fields = get_sortby_fields(ln=ln, colID=cc_colID),
sf = sf,
so = so,
ranks = ranks,
sc = sc,
rg = rg,
formats = formats,
of = of,
pl = pl,
jrec = jrec,
ec = ec,
show_colls = show_colls,
show_title = show_title and (em=="" or EM_REPOSITORY["body"] in em)
)
def create_exact_author_browse_help_link(p=None, p1=None, p2=None, p3=None, f=None, f1=None, f2=None, f3=None,
rm=None, cc=None, ln=None, jrec=None, rg=None, aas=0, action=""):
"""Creates a link to help switch from author to exact author while browsing"""
if action == 'browse':
search_fields = (f, f1, f2, f3)
if ('author' in search_fields) or ('firstauthor' in search_fields):
def add_exact(field):
if field == 'author' or field == 'firstauthor':
return 'exact' + field
return field
(fe, f1e, f2e, f3e) = map(add_exact, search_fields)
link_name = f or f1
link_name = (link_name == 'firstauthor' and 'exact first author') or 'exact author'
return websearch_templates.tmpl_exact_author_browse_help_link(p=p, p1=p1, p2=p2, p3=p3, f=fe, f1=f1e, f2=f2e, f3=f3e,
rm=rm, cc=cc, ln=ln, jrec=jrec, rg=rg, aas=aas, action=action,
link_name=link_name)
return ""
def create_navtrail_links(cc=CFG_SITE_NAME, aas=0, ln=CFG_SITE_LANG, self_p=1, tab=''):
"""Creates navigation trail links, i.e. links to collection
ancestors (except Home collection). If aas==1, then links to
Advanced Search interfaces; otherwise Simple Search.
"""
dads = []
for dad in get_coll_ancestors(cc):
if dad != CFG_SITE_NAME: # exclude Home collection
dads.append ((dad, get_coll_i18nname(dad, ln, False)))
if self_p and cc != CFG_SITE_NAME:
dads.append((cc, get_coll_i18nname(cc, ln, False)))
return websearch_templates.tmpl_navtrail_links(
aas=aas, ln=ln, dads=dads)
def get_searchwithin_fields(ln='en', colID=None):
"""Retrieves the fields name used in the 'search within' selection box for the collection ID colID."""
res = None
if colID:
res = run_sql("""SELECT f.code,f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE cff.type='sew' AND cff.id_collection=%s AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""", (colID,))
if not res:
res = run_sql("SELECT code,name FROM field ORDER BY name ASC")
fields = [{
'value' : '',
'text' : get_field_i18nname("any field", ln, False)
}]
for field_code, field_name in res:
if field_code and field_code != "anyfield":
fields.append({ 'value' : field_code,
'text' : get_field_i18nname(field_name, ln, False)
})
return fields
def get_sortby_fields(ln='en', colID=None):
"""Retrieves the fields name used in the 'sort by' selection box for the collection ID colID."""
_ = gettext_set_language(ln)
res = None
if colID:
res = run_sql("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE cff.type='soo' AND cff.id_collection=%s AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""", (colID,))
if not res:
# no sort fields defined for this colID, try to take Home collection:
res = run_sql("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE cff.type='soo' AND cff.id_collection=%s AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""", (1,))
if not res:
# no sort fields defined for the Home collection, take all sort fields defined wherever they are:
res = run_sql("""SELECT DISTINCT(f.code),f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE cff.type='soo' AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""",)
fields = [{
'value' : '',
'text' : _("latest first")
}]
for field_code, field_name in res:
if field_code and field_code != "anyfield":
fields.append({ 'value' : field_code,
'text' : get_field_i18nname(field_name, ln, False)
})
return fields
def create_andornot_box(name='op', value='', ln='en'):
"Returns HTML code for the AND/OR/NOT selection box."
_ = gettext_set_language(ln)
out = """
<select name="%s">
<option value="a"%s>%s
<option value="o"%s>%s
<option value="n"%s>%s
</select>
""" % (name,
is_selected('a', value), _("AND"),
is_selected('o', value), _("OR"),
is_selected('n', value), _("AND NOT"))
return out
def create_matchtype_box(name='m', value='', ln='en'):
"Returns HTML code for the 'match type' selection box."
_ = gettext_set_language(ln)
out = """
<select name="%s">
<option value="a"%s>%s
<option value="o"%s>%s
<option value="e"%s>%s
<option value="p"%s>%s
<option value="r"%s>%s
</select>
""" % (name,
is_selected('a', value), _("All of the words:"),
is_selected('o', value), _("Any of the words:"),
is_selected('e', value), _("Exact phrase:"),
is_selected('p', value), _("Partial phrase:"),
is_selected('r', value), _("Regular expression:"))
return out
def is_selected(var, fld):
"Checks if the two are equal, and if yes, returns ' selected'. Useful for select boxes."
if type(var) is int and type(fld) is int:
if var == fld:
return " selected"
elif str(var) == str(fld):
return " selected"
elif fld and len(fld)==3 and fld[0] == "w" and var == fld[1:]:
return " selected"
return ""
def wash_colls(cc, c, split_colls=0, verbose=0):
"""Wash collection list by checking whether user has deselected
anything under 'Narrow search'. Checks also if cc is a list or not.
Return (cc, colls_to_display, colls_to_search, hosted_colls, debug);
the list of collections to display is different from that to search in.
This is because users might have chosen 'split by collection'
functionality.
The behaviour of "collections to display" depends solely on whether
user has deselected a particular collection: e.g. if it started
from 'Articles and Preprints' page, and deselected 'Preprints',
then collection to display is 'Articles'. If he did not deselect
anything, then collection to display is 'Articles & Preprints'.
The behaviour of "collections to search in" depends on the
'split_colls' parameter:
* if it is equal to 1, then we can wash the colls list down
and search solely in the collection the user started from;
* if it is equal to 0, then we are splitting to the first level
of collections, i.e. collections as they appear on the page
we started to search from;
The function raises exception
InvenioWebSearchUnknownCollectionError
if cc or one of c collections is not known.
"""
colls_out = []
colls_out_for_display = []
# list to hold the hosted collections to be searched and displayed
hosted_colls_out = []
debug = ""
if verbose:
debug += "<br />"
debug += "<br />1) --- initial parameters ---"
debug += "<br />cc : %s" % cc
debug += "<br />c : %s" % c
debug += "<br />"
# check what type is 'cc':
if type(cc) is list:
for ci in cc:
if collection_reclist_cache.cache.has_key(ci):
# yes this collection is real, so use it:
cc = ci
break
else:
# check once if cc is real:
if not collection_reclist_cache.cache.has_key(cc):
if cc:
raise InvenioWebSearchUnknownCollectionError(cc)
else:
cc = CFG_SITE_NAME # cc is not set, so replace it with Home collection
# check type of 'c' argument:
if type(c) is list:
colls = c
else:
colls = [c]
if verbose:
debug += "<br />2) --- after checking the integrity of cc and whether c is a list ---"
debug += "<br />cc : %s" % cc
debug += "<br />c : %s" % c
debug += "<br />"
# remove all 'unreal' collections:
colls_real = []
for coll in colls:
if collection_reclist_cache.cache.has_key(coll):
colls_real.append(coll)
else:
if coll:
raise InvenioWebSearchUnknownCollectionError(coll)
colls = colls_real
if verbose:
debug += "<br />3) --- keeping only the real colls of c ---"
debug += "<br />colls : %s" % colls
debug += "<br />"
# check if some real collections remain:
if len(colls)==0:
colls = [cc]
if verbose:
debug += "<br />4) --- in case no colls were left we use cc directly ---"
debug += "<br />colls : %s" % colls
debug += "<br />"
# then let us check the list of non-restricted "real" sons of 'cc' and compare it to 'coll':
res = run_sql("""SELECT c.name FROM collection AS c,
collection_collection AS cc,
collection AS ccc
WHERE c.id=cc.id_son AND cc.id_dad=ccc.id
AND ccc.name=%s AND cc.type='r'""", (cc,))
# list that holds all the non restricted sons of cc that are also not hosted collections
l_cc_nonrestricted_sons_and_nonhosted_colls = []
res_hosted = run_sql("""SELECT c.name FROM collection AS c,
collection_collection AS cc,
collection AS ccc
WHERE c.id=cc.id_son AND cc.id_dad=ccc.id
AND ccc.name=%s AND cc.type='r'
AND (c.dbquery NOT LIKE 'hostedcollection:%%' OR c.dbquery IS NULL)""", (cc,))
for row_hosted in res_hosted:
l_cc_nonrestricted_sons_and_nonhosted_colls.append(row_hosted[0])
l_cc_nonrestricted_sons_and_nonhosted_colls.sort()
l_cc_nonrestricted_sons = []
l_c = colls[:]
for row in res:
if not collection_restricted_p(row[0]):
l_cc_nonrestricted_sons.append(row[0])
l_c.sort()
l_cc_nonrestricted_sons.sort()
if l_cc_nonrestricted_sons == l_c:
colls_out_for_display = [cc] # yep, washing permitted, it is sufficient to display 'cc'
# the following elif is a hack that preserves the above functionality when we start searching from
# the frontpage with some hosted collections deselected (either by default or manually)
elif set(l_cc_nonrestricted_sons_and_nonhosted_colls).issubset(set(l_c)):
colls_out_for_display = colls
split_colls = 0
else:
colls_out_for_display = colls # nope, we need to display all 'colls' successively
# remove duplicates:
#colls_out_for_display_nondups=filter(lambda x, colls_out_for_display=colls_out_for_display: colls_out_for_display[x-1] not in colls_out_for_display[x:], range(1, len(colls_out_for_display)+1))
#colls_out_for_display = map(lambda x, colls_out_for_display=colls_out_for_display:colls_out_for_display[x-1], colls_out_for_display_nondups)
#colls_out_for_display = list(set(colls_out_for_display))
#remove duplicates while preserving the order
set_out = set()
colls_out_for_display = [coll for coll in colls_out_for_display if coll not in set_out and not set_out.add(coll)]
if verbose:
debug += "<br />5) --- decide whether colls_out_for_display should be colls or whether cc alone is sufficient; remove duplicates ---"
debug += "<br />colls_out_for_display : %s" % colls_out_for_display
debug += "<br />"
# FIXME: The below quoted part of the code has been commented out
# because it prevents searching in individual restricted daughter
# collections when both parent and all its public daughter
# collections were asked for, in addition to some restricted
# daughter collections. The removal was introduced for hosted
# collections, so we may want to double check in this context.
# the following piece of code takes care of removing collections whose ancestors are going to be searched anyway
# list to hold the collections to be removed
#colls_to_be_removed = []
# first calculate the collections that can safely be removed
#for coll in colls_out_for_display:
# for ancestor in get_coll_ancestors(coll):
# #if ancestor in colls_out_for_display: colls_to_be_removed.append(coll)
# if ancestor in colls_out_for_display and not is_hosted_collection(coll): colls_to_be_removed.append(coll)
# secondly remove the collections
#for coll in colls_to_be_removed:
# colls_out_for_display.remove(coll)
if verbose:
debug += "<br />6) --- remove collections that have ancestors about to be searched, unless they are hosted ---"
debug += "<br />colls_out_for_display : %s" % colls_out_for_display
debug += "<br />"
# calculate the hosted collections to be searched.
if colls_out_for_display == [cc]:
if is_hosted_collection(cc):
hosted_colls_out.append(cc)
else:
for coll in get_coll_sons(cc):
if is_hosted_collection(coll):
hosted_colls_out.append(coll)
else:
for coll in colls_out_for_display:
if is_hosted_collection(coll):
hosted_colls_out.append(coll)
if verbose:
debug += "<br />7) --- calculate the hosted_colls_out ---"
debug += "<br />hosted_colls_out : %s" % hosted_colls_out
debug += "<br />"
# second, let us decide on collection splitting:
if split_colls == 0:
# type A - no sons are wanted
colls_out = colls_out_for_display
else:
# type B - sons (first-level descendants) are wanted
for coll in colls_out_for_display:
coll_sons = get_coll_sons(coll)
if coll_sons == []:
colls_out.append(coll)
else:
for coll_son in coll_sons:
if not is_hosted_collection(coll_son):
colls_out.append(coll_son)
#else:
# colls_out = colls_out + coll_sons
# remove duplicates:
#colls_out_nondups=filter(lambda x, colls_out=colls_out: colls_out[x-1] not in colls_out[x:], range(1, len(colls_out)+1))
#colls_out = map(lambda x, colls_out=colls_out:colls_out[x-1], colls_out_nondups)
#colls_out = list(set(colls_out))
#remove duplicates while preserving the order
set_out = set()
colls_out = [coll for coll in colls_out if coll not in set_out and not set_out.add(coll)]
if verbose:
debug += "<br />8) --- calculate the colls_out; remove duplicates ---"
debug += "<br />colls_out : %s" % colls_out
debug += "<br />"
# remove the hosted collections from the collections to be searched
if hosted_colls_out:
for coll in hosted_colls_out:
try:
colls_out.remove(coll)
except ValueError:
# in case coll was not found in colls_out
pass
if verbose:
debug += "<br />9) --- remove the hosted_colls from the colls_out ---"
debug += "<br />colls_out : %s" % colls_out
return (cc, colls_out_for_display, colls_out, hosted_colls_out, debug)
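The order-preserving duplicate removal used twice above relies on `set.add` returning `None`; shown standalone:

```python
def dedup_preserving_order(items):
    """Remove duplicates from ITEMS while keeping first-seen order."""
    seen = set()
    # set.add returns None (falsy), so the membership test drives the filter:
    return [x for x in items if x not in seen and not seen.add(x)]
```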
def get_synonym_terms(term, kbr_name, match_type, use_memoise=False):
"""
Return list of synonyms for TERM by looking in KBR_NAME in
MATCH_TYPE style.
@param term: search-time term or index-time term
@type term: str
@param kbr_name: knowledge base name
@type kbr_name: str
@param match_type: specifies how the term matches against the KBR
before doing the lookup. Could be 'exact' (default),
'leading_to_comma', 'leading_to_number'.
@type match_type: str
@param use_memoise: can we memoise while doing lookups?
@type use_memoise: bool
@return: list of term synonyms
@rtype: list of strings
"""
dterms = {}
## exact match is default:
term_for_lookup = term
term_remainder = ''
## but maybe match different term:
if match_type == CFG_BIBINDEX_SYNONYM_MATCH_TYPE['leading_to_comma']:
mmm = re.match(r'^(.*?)(\s*,.*)$', term)
if mmm:
term_for_lookup = mmm.group(1)
term_remainder = mmm.group(2)
elif match_type == CFG_BIBINDEX_SYNONYM_MATCH_TYPE['leading_to_number']:
mmm = re.match(r'^(.*?)(\s*\d.*)$', term)
if mmm:
term_for_lookup = mmm.group(1)
term_remainder = mmm.group(2)
## FIXME: workaround: escape SQL wild-card signs, since KBR's
## exact search does a LIKE query and would otherwise match everything:
term_for_lookup = term_for_lookup.replace('%', '\%')
## OK, now find synonyms:
for kbr_values in get_kbr_values(kbr_name,
searchkey=term_for_lookup,
searchtype='e',
use_memoise=use_memoise):
for kbr_value in kbr_values:
dterms[kbr_value + term_remainder] = 1
## return list of term synonyms:
return dterms.keys()
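The two non-exact match types split the term with the same regexps as above; shown standalone for illustration (the helper names are hypothetical):

```python
import re

def split_leading_to_comma(term):
    """Return (lookup_part, remainder) for 'leading_to_comma' matching."""
    m = re.match(r'^(.*?)(\s*,.*)$', term)
    return (m.group(1), m.group(2)) if m else (term, '')

def split_leading_to_number(term):
    """Return (lookup_part, remainder) for 'leading_to_number' matching."""
    m = re.match(r'^(.*?)(\s*\d.*)$', term)
    return (m.group(1), m.group(2)) if m else (term, '')
```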
def wash_output_format(format):
"""Wash output format FORMAT. Currently only prevents input like
'of=9' for the backwards-compatible format that prints certain
fields only (for this task, 'of=tm' is preferred)."""
if str(format[0:3]).isdigit() and len(format) != 6:
# asked to print MARC tags, but not enough digits,
# so let's switch back to HTML brief default
return 'hb'
else:
return format
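Behaviour sketch (the same check restated standalone): a numeric prefix with the wrong length falls back to 'hb', while a well-formed six-character MARC tag passes through:

```python
def wash_output_format_sketch(fmt):
    """Fall back to HTML brief ('hb') unless FMT is a full 6-char MARC tag."""
    if fmt[0:3].isdigit() and len(fmt) != 6:
        return 'hb'
    return fmt
```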
def wash_pattern(p):
"""Wash pattern passed by URL. Check for sanity of the wildcard by
removing wildcards if they are appended to extremely short words
(1-3 letters). TODO: instead of this approximate treatment, it
would be much better to introduce a time limit, e.g. to kill a
query if it does not finish in 10 seconds."""
# strip accents:
# p = strip_accents(p) # FIXME: when available, strip accents all the time
# add leading/trailing whitespace for the two following wildcard-sanity checking regexps:
p = " " + p + " "
# replace spaces within quotes by __SPACE__ temporarily:
p = re_pattern_single_quotes.sub(lambda x: "'"+string.replace(x.group(1), ' ', '__SPACE__')+"'", p)
p = re_pattern_double_quotes.sub(lambda x: "\""+string.replace(x.group(1), ' ', '__SPACE__')+"\"", p)
p = re_pattern_regexp_quotes.sub(lambda x: "/"+string.replace(x.group(1), ' ', '__SPACE__')+"/", p)
# get rid of unquoted wildcards after spaces:
p = re_pattern_wildcards_after_spaces.sub("\\1", p)
# get rid of extremely short words (1-3 letters with wildcards):
#p = re_pattern_short_words.sub("\\1", p)
# replace back __SPACE__ by spaces:
p = re_pattern_space.sub(" ", p)
# replace special terms:
p = re_pattern_today.sub(time.strftime("%Y-%m-%d", time.localtime()), p)
# remove unnecessary whitespace:
p = string.strip(p)
# remove potentially wrong UTF-8 characters:
p = wash_for_utf8(p)
return p
def wash_field(f):
"""Wash field passed by URL."""
if f:
# get rid of unnecessary whitespace and make it lowercase
# (e.g. Author -> author) to better suit iPhone etc input
# mode:
f = f.strip().lower()
# wash legacy 'f' field names, e.g. replace 'wau' or `au' by
# 'author', if applicable:
if CFG_WEBSEARCH_FIELDS_CONVERT:
f = CFG_WEBSEARCH_FIELDS_CONVERT.get(f, f)
return f
def wash_dates(d1="", d1y=0, d1m=0, d1d=0, d2="", d2y=0, d2m=0, d2d=0):
"""
Take user-submitted date arguments D1 (full datetime string) or
(D1Y, D1M, D1D) year, month, day tuple and D2 or (D2Y, D2M, D2D)
and return (datetext1, datetext2) datetime
strings in the YYYY-MM-DD HH:MM:SS format suitable for time
restricted searching.
Note that when both D1 and (D1Y, D1M, D1D) parameters are present,
the precedence goes to D1. Ditto for D2*.
Note that when (D1Y, D1M, D1D) are taken into account, some values
may be missing and are completed e.g. to 01 or 12 according to
whether it is the starting or the ending date.
"""
datetext1, datetext2 = "", ""
# sanity checking:
if d1 == "" and d1y == 0 and d1m == 0 and d1d == 0 and d2 == "" and d2y == 0 and d2m == 0 and d2d == 0:
return ("", "") # nothing selected, so return empty values
# wash first (starting) date:
if d1:
# full datetime string takes precedence:
datetext1 = d1
else:
# okay, first date passed as (year,month,day):
if d1y:
datetext1 += "%04d" % d1y
else:
datetext1 += "0000"
if d1m:
datetext1 += "-%02d" % d1m
else:
datetext1 += "-01"
if d1d:
datetext1 += "-%02d" % d1d
else:
datetext1 += "-01"
datetext1 += " 00:00:00"
# wash second (ending) date:
if d2:
# full datetime string takes precedence:
datetext2 = d2
else:
# okay, second date passed as (year,month,day):
if d2y:
datetext2 += "%04d" % d2y
else:
datetext2 += "9999"
if d2m:
datetext2 += "-%02d" % d2m
else:
datetext2 += "-12"
if d2d:
datetext2 += "-%02d" % d2d
else:
datetext2 += "-31" # NOTE: perhaps we should add max(datenumber) in
# given month, but for our querying it's not
# needed, 31 will always do
datetext2 += " 00:00:00"
# okay, return constructed YYYY-MM-DD HH:MM:SS datetexts:
return (datetext1, datetext2)
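The completion rules can be sketched in a compact, hypothetical form (plain `%`-formatting instead of the incremental string building above): missing components of the starting date are completed towards the earliest moment, missing components of the ending date towards the latest.

```python
def complete_dates_sketch(d1y=0, d1m=0, d1d=0, d2y=0, d2m=0, d2d=0):
    # Starting date defaults: year 0000, month 01, day 01.
    datetext1 = "%04d-%02d-%02d 00:00:00" % (d1y, d1m or 1, d1d or 1)
    # Ending date defaults: year 9999, month 12, day 31.
    datetext2 = "%04d-%02d-%02d 00:00:00" % (d2y or 9999, d2m or 12, d2d or 31)
    return datetext1, datetext2
```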
def is_hosted_collection(coll):
"""Check whether the given collection is a hosted one, i.e. its dbquery starts with 'hostedcollection:'.
Return True if it is; False if it is not, if the result is empty, or if the query failed."""
res = run_sql("SELECT dbquery FROM collection WHERE name=%s", (coll, ))
try:
return res[0][0].startswith("hostedcollection:")
except:
return False
def get_colID(c):
"Return collection ID for collection name C. Return None if no match found."
colID = None
res = run_sql("SELECT id FROM collection WHERE name=%s", (c,), 1)
if res:
colID = res[0][0]
return colID
def get_coll_normalised_name(c):
"""Returns normalised collection name (case sensitive) for collection name
C (case insensitive).
Returns None if no match found."""
try:
return run_sql("SELECT name FROM collection WHERE name=%s", (c,))[0][0]
except:
return None
def get_coll_ancestors(coll):
"Returns a list of ancestors for collection 'coll'."
coll_ancestors = []
coll_ancestor = coll
while 1:
res = run_sql("""SELECT c.name FROM collection AS c
LEFT JOIN collection_collection AS cc ON c.id=cc.id_dad
LEFT JOIN collection AS ccc ON ccc.id=cc.id_son
WHERE ccc.name=%s ORDER BY cc.id_dad ASC LIMIT 1""",
(coll_ancestor,))
if res:
coll_name = res[0][0]
coll_ancestors.append(coll_name)
coll_ancestor = coll_name
else:
break
# ancestors found, return reversed list:
coll_ancestors.reverse()
return coll_ancestors
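The parent-chasing loop above can be sketched over a plain dictionary (a hypothetical stand-in for the `collection_collection` table; collection names are made up):

```python
def coll_ancestors_sketch(parent_of, coll):
    # parent_of maps a collection name to its dad collection; the walk
    # stops at the root, which has no entry.
    ancestors = []
    while coll in parent_of:
        coll = parent_of[coll]
        ancestors.append(coll)
    ancestors.reverse()  # ancestors found, return reversed list
    return ancestors
```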
def get_coll_sons(coll, type='r', public_only=1):
"""Return a list of sons (first-level descendants) of type 'type' for collection 'coll'.
If public_only, then return only non-restricted son collections.
"""
coll_sons = []
query = "SELECT c.name FROM collection AS c "\
"LEFT JOIN collection_collection AS cc ON c.id=cc.id_son "\
"LEFT JOIN collection AS ccc ON ccc.id=cc.id_dad "\
"WHERE cc.type=%s AND ccc.name=%s"
query += " ORDER BY cc.score DESC"
res = run_sql(query, (type, coll))
for name in res:
if not public_only or not collection_restricted_p(name[0]):
coll_sons.append(name[0])
return coll_sons
class CollectionAllChildrenDataCacher(DataCacher):
"""Cache for all children of a collection (regular & virtual, public & private)"""
def __init__(self):
def cache_filler():
def get_all_children(coll, type='r', public_only=1):
"""Return a list of all children of type 'type' for collection 'coll'.
If public_only, then return only non-restricted child collections.
If type='*', then return both regular and virtual collections.
"""
children = []
if type == '*':
sons = get_coll_sons(coll, 'r', public_only) + get_coll_sons(coll, 'v', public_only)
else:
sons = get_coll_sons(coll, type, public_only)
for child in sons:
children.append(child)
children.extend(get_all_children(child, type, public_only))
return children
ret = {}
collections = collection_reclist_cache.cache.keys()
for collection in collections:
ret[collection] = get_all_children(collection, '*', public_only=0)
return ret
def timestamp_verifier():
return max(get_table_update_time('collection'), get_table_update_time('collection_collection'))
DataCacher.__init__(self, cache_filler, timestamp_verifier)
try:
if not collection_allchildren_cache.is_ok_p:
raise Exception
except Exception:
collection_allchildren_cache = CollectionAllChildrenDataCacher()
def get_collection_allchildren(coll, recreate_cache_if_needed=True):
"""Returns the list of all children of a collection."""
if recreate_cache_if_needed:
collection_allchildren_cache.recreate_cache_if_needed()
if coll not in collection_allchildren_cache.cache:
return [] # collection does not exist; return empty list
return collection_allchildren_cache.cache[coll]
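The recursive expansion performed by the cache filler can be sketched over a plain dictionary (a hypothetical stand-in for `get_coll_sons()`; collection names are made up):

```python
def all_children_sketch(sons_of, coll):
    # sons_of maps a collection to its first-level sons; each son is
    # listed before its own descendants, depth-first, as in the
    # get_all_children() cache filler above.
    children = []
    for son in sons_of.get(coll, []):
        children.append(son)
        children.extend(all_children_sketch(sons_of, son))
    return children
```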
def get_coll_real_descendants(coll, type='_', get_hosted_colls=True):
"""Return a list of all descendants of collection 'coll' that are defined by a 'dbquery'.
IOW, we need to decompose compound collections like "A & B" into "A" and "B" provided
that "A & B" has no associated database query defined.
"""
coll_sons = []
res = run_sql("""SELECT c.name,c.dbquery FROM collection AS c
LEFT JOIN collection_collection AS cc ON c.id=cc.id_son
LEFT JOIN collection AS ccc ON ccc.id=cc.id_dad
WHERE ccc.name=%s AND cc.type LIKE %s ORDER BY cc.score DESC""",
(coll, type,))
for name, dbquery in res:
if dbquery: # this is 'real' collection, so return it:
if get_hosted_colls:
coll_sons.append(name)
else:
if not dbquery.startswith("hostedcollection:"):
coll_sons.append(name)
else: # this is 'composed' collection, so recurse:
coll_sons.extend(get_coll_real_descendants(name))
return coll_sons
def browse_pattern_phrases(req, colls, p, f, rg, ln=CFG_SITE_LANG):
"""Return either bibliographic phrases or word indexes."""
## is p enclosed in quotes? (coming from exact search)
if p.startswith('"') and p.endswith('"'):
p = p[1:-1]
p_orig = p
## okay, "real browse" follows:
## FIXME: the maths in the get_nearest_terms_in_bibxxx is just a test
if not f and string.find(p, ":") > 0: # does 'p' contain ':'?
f, p = string.split(p, ":", 1)
## do we search in words indexes?
# FIXME uncomment this
#if not f:
# return browse_in_bibwords(req, p, f)
coll_hitset = intbitset()
for coll_name in colls:
coll_hitset |= get_collection_reclist(coll_name)
index_id = get_index_id_from_field(f)
if index_id != 0:
browsed_phrases_in_colls = get_nearest_terms_in_idxphrase_with_collection(p, index_id, rg/2, rg/2, coll_hitset)
else:
browsed_phrases = get_nearest_terms_in_bibxxx(p, f, (rg+1)/2+1, (rg-1)/2+1)
while not browsed_phrases:
# try again and again with shorter and shorter pattern:
try:
p = p[:-1]
browsed_phrases = get_nearest_terms_in_bibxxx(p, f, (rg+1)/2+1, (rg-1)/2+1)
except:
# probably there are no hits at all:
#req.write(_("No values found."))
return []
## try to check hits in this particular collection selection:
browsed_phrases_in_colls = []
if 0:
for phrase in browsed_phrases:
phrase_hitset = intbitset()
phrase_hitsets = search_pattern("", phrase, f, 'e')
for coll in colls:
phrase_hitset.union_update(phrase_hitsets[coll])
if len(phrase_hitset) > 0:
# okay, this phrase has some hits in colls, so add it:
browsed_phrases_in_colls.append([phrase, len(phrase_hitset)])
## were there hits in collections?
if browsed_phrases_in_colls == []:
if browsed_phrases != []:
#write_warning(req, """<p>No match close to <em>%s</em> found in given collections.
#Please try different term.<p>Displaying matches in any collection...""" % p_orig)
## try to get nbhits for these phrases in any collection:
for phrase in browsed_phrases:
nbhits = get_nbhits_in_bibxxx(phrase, f, coll_hitset)
if nbhits > 0:
browsed_phrases_in_colls.append([phrase, nbhits])
return browsed_phrases_in_colls
def browse_pattern(req, colls, p, f, rg, ln=CFG_SITE_LANG):
"""Display either bibliographic phrases or word indexes."""
# load the right message language
_ = gettext_set_language(ln)
browsed_phrases_in_colls = browse_pattern_phrases(req, colls, p, f, rg, ln)
if len(browsed_phrases_in_colls) == 0:
req.write(_("No values found."))
return
## display results now:
out = websearch_templates.tmpl_browse_pattern(
f=f,
fn=get_field_i18nname(get_field_name(f) or f, ln, False),
ln=ln,
browsed_phrases_in_colls=browsed_phrases_in_colls,
colls=colls,
rg=rg,
)
req.write(out)
return
def browse_in_bibwords(req, p, f, ln=CFG_SITE_LANG):
"""Browse inside words indexes."""
if not p:
return
_ = gettext_set_language(ln)
urlargd = {}
urlargd.update(req.argd)
urlargd['action'] = 'search'
nearest_box = create_nearest_terms_box(urlargd, p, f, 'w', ln=ln, intro_text_p=0)
req.write(websearch_templates.tmpl_search_in_bibwords(
p = p,
f = f,
ln = ln,
nearest_box = nearest_box
))
return
def search_pattern(req=None, p=None, f=None, m=None, ap=0, of="id", verbose=0, ln=CFG_SITE_LANG, display_nearest_terms_box=True, wl=0):
"""Search for complex pattern 'p' within field 'f' according to
matching type 'm'. Return hitset of recIDs.
The function uses multi-stage searching algorithm in case of no
exact match found. See the Search Internals document for
detailed description.
The 'ap' argument governs whether alternative patterns are to
be used in case there is no direct hit for (p,f,m). For
example, whether to replace non-alphanumeric characters by
spaces if it would give some hits. See the Search Internals
document for detailed description. (ap=0 forbids the
alternative pattern usage, ap=1 permits it.)
'ap' is also internally used for allowing hidden tag search
(for requests coming from webcoll, for example). In this
case ap=-9
The 'of' argument governs whether to print or not some
information to the user in case of no match found. (Usually it
prints the information in case of HTML formats, otherwise it's
silent).
The 'verbose' argument controls the level of debugging information
to be printed (0=least, 9=most).
All the parameters are assumed to have been previously washed.
This function is suitable as a mid-level API.
"""
_ = gettext_set_language(ln)
hitset_empty = intbitset()
# sanity check:
if not p:
hitset_full = intbitset(trailing_bits=1)
hitset_full.discard(0)
# no pattern, so return all universe
return hitset_full
# search stage 1: break up arguments into basic search units:
if verbose and of.startswith("h"):
t1 = os.times()[4]
basic_search_units = create_basic_search_units(req, p, f, m, of)
if verbose and of.startswith("h"):
t2 = os.times()[4]
write_warning("Search stage 1: basic search units are: %s" % cgi.escape(repr(basic_search_units)), req=req)
write_warning("Search stage 1: execution took %.2f seconds." % (t2 - t1), req=req)
# search stage 2: do search for each search unit and verify hit presence:
if verbose and of.startswith("h"):
t1 = os.times()[4]
basic_search_units_hitsets = []
#prepare hiddenfield-related..
myhiddens = cfg['CFG_BIBFORMAT_HIDDEN_TAGS']
can_see_hidden = False
if req:
user_info = collect_user_info(req)
can_see_hidden = user_info.get('precached_canseehiddenmarctags', False)
if not req and ap == -9: # special request, coming from webcoll
can_see_hidden = True
if can_see_hidden:
myhiddens = []
if CFG_INSPIRE_SITE and of.startswith('h'):
# fulltext/caption search warnings for INSPIRE:
fields_to_be_searched = [f for o, p, f, m in basic_search_units]
if 'fulltext' in fields_to_be_searched:
write_warning( _("Warning: full-text search is only available for a subset of papers mostly from %(x_range_from_year)s-%(x_range_to_year)s.") % \
{'x_range_from_year': '2006',
'x_range_to_year': '2012'}, req=req)
elif 'caption' in fields_to_be_searched:
write_warning(_("Warning: figure caption search is only available for a subset of papers mostly from %(x_range_from_year)s-%(x_range_to_year)s.") % \
{'x_range_from_year': '2008',
'x_range_to_year': '2012'}, req=req)
for idx_unit in xrange(len(basic_search_units)):
bsu_o, bsu_p, bsu_f, bsu_m = basic_search_units[idx_unit]
if bsu_f and len(bsu_f) < 2:
if of.startswith("h"):
write_warning(_("There is no index %s. Searching for %s in all fields.") % (bsu_f, bsu_p), req=req)
bsu_f = ''
bsu_m = 'w'
if of.startswith("h") and verbose:
write_warning(_('Instead searching %s.') % str([bsu_o, bsu_p, bsu_f, bsu_m]), req=req)
try:
basic_search_unit_hitset = search_unit(bsu_p, bsu_f, bsu_m, wl)
except InvenioWebSearchWildcardLimitError, excp:
basic_search_unit_hitset = excp.res
if of.startswith("h"):
write_warning(_("Search term too generic, displaying only partial results..."), req=req)
# FIXME: print warning if we use native full-text indexing
if bsu_f == 'fulltext' and bsu_m != 'w' and of.startswith('h') and not CFG_SOLR_URL:
write_warning(_("No phrase index available for fulltext yet, looking for word combination..."), req=req)
#check that the user is allowed to search with this tag
#if he/she tries it
if bsu_f and len(bsu_f) > 1 and bsu_f[0].isdigit() and bsu_f[1].isdigit():
for htag in myhiddens:
ltag = len(htag)
samelenfield = bsu_f[0:ltag]
if samelenfield == htag: #user searches by a hidden tag
#we won't show you anything..
basic_search_unit_hitset = intbitset()
if verbose >= 9 and of.startswith("h"):
write_warning("Pattern %s hitlist omitted since \
it queries in a hidden tag %s" %
(cgi.escape(repr(bsu_p)), repr(myhiddens)), req=req)
display_nearest_terms_box = False #..and stop spying, too.
if verbose >= 9 and of.startswith("h"):
write_warning("Search stage 1: pattern %s gave hitlist %s" % (cgi.escape(bsu_p), basic_search_unit_hitset), req=req)
if len(basic_search_unit_hitset) > 0 or \
ap<1 or \
bsu_o=="|" or \
((idx_unit+1)<len(basic_search_units) and basic_search_units[idx_unit+1][0]=="|"):
# stage 2-1: this basic search unit is retained, since
# either the hitset is non-empty, or the approximate
# pattern treatment is switched off, or the search unit
# was joined by an OR operator to preceding/following
# units so we do not require that it exists
basic_search_units_hitsets.append(basic_search_unit_hitset)
else:
# stage 2-2: no hits found for this search unit, try to replace non-alphanumeric chars inside pattern:
if re.search(r'[^a-zA-Z0-9\s\:]', bsu_p) and bsu_f != 'refersto' and bsu_f != 'citedby':
if bsu_p.startswith('"') and bsu_p.endswith('"'): # is it ACC query?
bsu_pn = re.sub(r'[^a-zA-Z0-9\s\:]+', "*", bsu_p)
else: # it is WRD query
bsu_pn = re.sub(r'[^a-zA-Z0-9\s\:]+', " ", bsu_p)
if verbose and of.startswith('h') and req:
write_warning("Trying (%s,%s,%s)" % (cgi.escape(bsu_pn), cgi.escape(bsu_f), cgi.escape(bsu_m)), req=req)
basic_search_unit_hitset = search_pattern(req=None, p=bsu_pn, f=bsu_f, m=bsu_m, of="id", ln=ln, wl=wl)
if len(basic_search_unit_hitset) > 0:
# we retain the new unit instead
if of.startswith('h'):
write_warning(_("No exact match found for %(x_query1)s, using %(x_query2)s instead...") % \
{'x_query1': "<em>" + cgi.escape(bsu_p) + "</em>",
'x_query2': "<em>" + cgi.escape(bsu_pn) + "</em>"}, req=req)
basic_search_units[idx_unit][1] = bsu_pn
basic_search_units_hitsets.append(basic_search_unit_hitset)
else:
# stage 2-3: no hits found either, propose nearest indexed terms:
if of.startswith('h') and display_nearest_terms_box:
if req:
if bsu_f == "recid":
write_warning(_("Requested record does not seem to exist."), req=req)
else:
write_warning(create_nearest_terms_box(req.argd, bsu_p, bsu_f, bsu_m, ln=ln), req=req)
return hitset_empty
else:
# stage 2-3: no hits found either, propose nearest indexed terms:
if of.startswith('h') and display_nearest_terms_box:
if req:
if bsu_f == "recid":
write_warning(_("Requested record does not seem to exist."), req=req)
else:
write_warning(create_nearest_terms_box(req.argd, bsu_p, bsu_f, bsu_m, ln=ln), req=req)
return hitset_empty
if verbose and of.startswith("h"):
t2 = os.times()[4]
for idx_unit in range(0, len(basic_search_units)):
write_warning("Search stage 2: basic search unit %s gave %d hits." %
(basic_search_units[idx_unit][1:], len(basic_search_units_hitsets[idx_unit])), req=req)
write_warning("Search stage 2: execution took %.2f seconds." % (t2 - t1), req=req)
# search stage 3: apply boolean query for each search unit:
if verbose and of.startswith("h"):
t1 = os.times()[4]
# let the initial set be the complete universe:
hitset_in_any_collection = intbitset(trailing_bits=1)
hitset_in_any_collection.discard(0)
for idx_unit in xrange(len(basic_search_units)):
this_unit_operation = basic_search_units[idx_unit][0]
this_unit_hitset = basic_search_units_hitsets[idx_unit]
if this_unit_operation == '+':
hitset_in_any_collection.intersection_update(this_unit_hitset)
elif this_unit_operation == '-':
hitset_in_any_collection.difference_update(this_unit_hitset)
elif this_unit_operation == '|':
hitset_in_any_collection.union_update(this_unit_hitset)
else:
if of.startswith("h"):
write_warning("Invalid set operation %s." % cgi.escape(this_unit_operation), "Error", req=req)
if len(hitset_in_any_collection) == 0:
# no hits found, propose alternative boolean query:
if of.startswith('h') and display_nearest_terms_box:
nearestterms = []
for idx_unit in range(0, len(basic_search_units)):
bsu_o, bsu_p, bsu_f, bsu_m = basic_search_units[idx_unit]
if bsu_p.startswith("%") and bsu_p.endswith("%"):
bsu_p = "'" + bsu_p[1:-1] + "'"
bsu_nbhits = len(basic_search_units_hitsets[idx_unit])
# create a similar query, but with the basic search unit only
argd = {}
argd.update(req.argd)
argd['p'] = bsu_p
argd['f'] = bsu_f
nearestterms.append((bsu_p, bsu_nbhits, argd))
text = websearch_templates.tmpl_search_no_boolean_hits(
ln=ln, nearestterms=nearestterms)
write_warning(text, req=req)
if verbose and of.startswith("h"):
t2 = os.times()[4]
write_warning("Search stage 3: boolean query gave %d hits." % len(hitset_in_any_collection), req=req)
write_warning("Search stage 3: execution took %.2f seconds." % (t2 - t1), req=req)
return hitset_in_any_collection
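Search stage 3 above can be sketched with plain Python sets standing in for `intbitset` (hypothetical helper; `units` is a list of (operator, hitset) pairs as produced by stage 2):

```python
def combine_hitsets_sketch(universe, units):
    # The running result starts as the complete universe of record IDs
    # and is narrowed ('+'), reduced ('-') or extended ('|') unit by
    # unit, mirroring the boolean query loop above.
    result = set(universe)
    for op, hits in units:
        if op == '+':
            result &= hits
        elif op == '-':
            result -= hits
        elif op == '|':
            result |= hits
    return result
```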
def search_pattern_parenthesised(req=None, p=None, f=None, m=None, ap=0, of="id", verbose=0, ln=CFG_SITE_LANG, display_nearest_terms_box=True, wl=0):
"""Search for complex pattern 'p' containing parenthesis within field 'f' according to
matching type 'm'. Return hitset of recIDs.
For more details on the parameters see 'search_pattern'
"""
_ = gettext_set_language(ln)
spires_syntax_converter = SpiresToInvenioSyntaxConverter()
spires_syntax_query = False
# if the pattern uses SPIRES search syntax, convert it to Invenio syntax
if spires_syntax_converter.is_applicable(p):
spires_syntax_query = True
p = spires_syntax_converter.convert_query(p)
# sanity check: do not call parenthesised parser for search terms
# like U(1) but still call it for searches like ('U(1)' | 'U(2)'):
if not re_pattern_parens.search(re_pattern_parens_quotes.sub('_', p)):
return search_pattern(req, p, f, m, ap, of, verbose, ln, display_nearest_terms_box=display_nearest_terms_box, wl=wl)
# Try searching with parentheses
try:
parser = SearchQueryParenthesisedParser()
# get a hitset with all recids
result_hitset = intbitset(trailing_bits=1)
# parse the query. The result is list of [op1, expr1, op2, expr2, ..., opN, exprN]
parsing_result = parser.parse_query(p)
if verbose and of.startswith("h"):
write_warning("Search stage 1: search_pattern_parenthesised() searched %s." % repr(p), req=req)
write_warning("Search stage 1: search_pattern_parenthesised() returned %s." % repr(parsing_result), req=req)
# go through every pattern
# calculate hitset for it
# combine pattern's hitset with the result using the corresponding operator
for index in xrange(0, len(parsing_result)-1, 2 ):
current_operator = parsing_result[index]
current_pattern = parsing_result[index+1]
if CFG_INSPIRE_SITE and spires_syntax_query:
# setting ap=0 to turn off approximate matching for 0 results.
# Doesn't work well in combinations.
# FIXME: The right fix involves collecting statuses for each
# hitset, then showing a nearest terms box exactly once,
# outside this loop.
ap = 0
display_nearest_terms_box = False
# obtain a hitset for the current pattern
current_hitset = search_pattern(req, current_pattern, f, m, ap, of, verbose, ln, display_nearest_terms_box=display_nearest_terms_box, wl=wl)
# combine the current hitset with resulting hitset using the current operator
if current_operator == '+':
result_hitset = result_hitset & current_hitset
elif current_operator == '-':
result_hitset = result_hitset - current_hitset
elif current_operator == '|':
result_hitset = result_hitset | current_hitset
else:
assert False, "Unknown operator in search_pattern_parenthesised()"
return result_hitset
# If searching with parentheses fails, perform search ignoring parentheses
except SyntaxError:
write_warning(_("Search syntax misunderstood. Ignoring all parentheses in the query. If this doesn't help, please check your search and try again."), req=req)
# remove the parentheses in the query. Current implementation removes all the parentheses,
# but it could be improved to remove only those that are not inside quotes
p = p.replace('(', ' ')
p = p.replace(')', ' ')
return search_pattern(req, p, f, m, ap, of, verbose, ln, display_nearest_terms_box=display_nearest_terms_box, wl=wl)
def search_unit(p, f=None, m=None, wl=0, ignore_synonyms=None):
"""Search for basic search unit defined by pattern 'p' and field
'f' and matching type 'm'. Return hitset of recIDs.
All the parameters are assumed to have been previously washed.
'p' is assumed to be already a ``basic search unit'' so that it
is searched as such and is not broken up in any way. Only
wildcard and span queries are being detected inside 'p'.
If CFG_WEBSEARCH_SYNONYM_KBRS is set and we are searching in
one of the indexes that has defined runtime synonym knowledge
base, then look up there and automatically enrich search
results with results for synonyms.
In case the wildcard limit (wl) is greater than 0 and this limit
is reached an InvenioWebSearchWildcardLimitError will be raised.
In case you want to call this function with no limit for the
wildcard queries, wl should be 0.
Parameter 'ignore_synonyms' is a list of terms for which we
should not try to further find a synonym.
This function is suitable as a low-level API.
"""
## create empty output results set:
hitset = intbitset()
if not p: # sanity checking
return hitset
tokenizer = get_field_tokenizer_type(f)
hitset_cjk = intbitset()
if tokenizer == "BibIndexCJKTokenizer":
if is_there_any_CJK_character_in_text(p):
cjk_tok = BibIndexCJKTokenizer()
chars = cjk_tok.tokenize_for_words(p)
for char in chars:
hitset_cjk |= search_unit_in_bibwords(char, f, m, wl)
## eventually look up runtime synonyms:
hitset_synonyms = intbitset()
if CFG_WEBSEARCH_SYNONYM_KBRS.has_key(f):
if ignore_synonyms is None:
ignore_synonyms = []
ignore_synonyms.append(p)
for p_synonym in get_synonym_terms(p,
CFG_WEBSEARCH_SYNONYM_KBRS[f][0],
CFG_WEBSEARCH_SYNONYM_KBRS[f][1]):
if p_synonym != p and \
not p_synonym in ignore_synonyms:
hitset_synonyms |= search_unit(p_synonym, f, m, wl,
ignore_synonyms)
## look up hits:
if f == 'fulltext' and get_idx_indexer('fulltext') == 'SOLR' and CFG_SOLR_URL:
# redirect to Solr
try:
return search_unit_in_solr(p, f, m)
except:
# There were troubles with getting full-text search
# results from Solr. Let us alert the admin of these
# problems and let us simply return empty results to the
# end user.
register_exception()
return hitset
elif f == 'fulltext' and get_idx_indexer('fulltext') == 'XAPIAN' and CFG_XAPIAN_ENABLED:
# redirect to Xapian
try:
return search_unit_in_xapian(p, f, m)
except:
# There were troubles with getting full-text search
# results from Xapian. Let us alert the admin of these
# problems and let us simply return empty results to the
# end user.
register_exception()
return hitset
if f == 'datecreated':
hitset = search_unit_in_bibrec(p, p, 'c')
elif f == 'datemodified':
hitset = search_unit_in_bibrec(p, p, 'm')
elif f == 'refersto':
# we are doing search by the citation count
hitset = search_unit_refersto(p)
elif f == 'rawref':
from invenio.refextract_api import search_from_reference
field, pattern = search_from_reference(p)
return search_unit(pattern, field)
elif f == 'citedby':
# we are doing search by the citation count
hitset = search_unit_citedby(p)
elif f == 'collection':
# we are doing search by the collection name or MARC field
hitset = search_unit_collection(p, m, wl=wl)
elif f == 'tag':
module_found = False
try:
from invenio.modules.tags.search_units import search_unit_in_tags
module_found = True
except:
# WebTag module is disabled, so ignore 'tag' selector
pass
if module_found:
return search_unit_in_tags(p)
elif m == 'a' or m == 'r':
# we are doing either phrase search or regexp search
if f == 'fulltext':
# FIXME: workaround for not having phrase index yet
return search_pattern(None, p, f, 'w')
index_id = get_index_id_from_field(f)
if index_id != 0:
if m == 'a' and index_id in get_idxpair_field_ids():
#for exact match on the admin configured fields we are searching in the pair tables
hitset = search_unit_in_idxpairs(p, f, m, wl)
else:
hitset = search_unit_in_idxphrases(p, f, m, wl)
else:
hitset = search_unit_in_bibxxx(p, f, m, wl)
# if not hitset and m == 'a' and (p[0] != '%' and p[-1] != '%'):
# #if we have no results by doing exact matching, do partial matching
# #for removing the distinction between simple and double quotes
# hitset = search_unit_in_bibxxx('%' + p + '%', f, m, wl)
elif p.startswith("cited:"):
# we are doing search by the citation count
hitset = search_unit_by_times_cited(p[6:])
else:
# we are doing bibwords search by default
hitset = search_unit_in_bibwords(p, f, m, wl=wl)
## merge synonym results and return total:
hitset |= hitset_synonyms
hitset |= hitset_cjk
return hitset
def get_idxpair_field_ids():
"""Returns the list of ids for the fields that idxPAIRS should be used on"""
index_dict = dict(run_sql("SELECT name, id FROM idxINDEX"))
return [index_dict[field] for field in index_dict if field in cfg['CFG_WEBSEARCH_IDXPAIRS_FIELDS']]
def search_unit_in_bibwords(word, f, m=None, decompress=zlib.decompress, wl=0):
"""Searches for 'word' inside bibwordsX table for field 'f' and returns hitset of recIDs."""
set = intbitset() # will hold output result set
set_used = 0 # not-yet-used flag, to be able to circumvent set operations
limit_reached = 0 # flag for knowing if the query limit has been reached
# if no field is specified, search in the global index.
f = f or 'anyfield'
index_id = get_index_id_from_field(f)
if index_id:
bibwordsX = "idxWORD%02dF" % index_id
stemming_language = get_index_stemming_language(index_id)
else:
return intbitset() # word index f does not exist
# wash 'word' argument and run query:
if f.endswith('count') and word.endswith('+'):
# field count query of the form N+ so transform N+ to N->99999:
word = word[:-1] + '->99999'
word = string.replace(word, '*', '%') # we now use '*' as the truncation character
words = string.split(word, "->", 1) # check for span query
if len(words) == 2:
word0 = re_word.sub('', words[0])
word1 = re_word.sub('', words[1])
if stemming_language:
word0 = lower_index_term(word0)
word1 = lower_index_term(word1)
word0 = stem(word0, stemming_language)
word1 = stem(word1, stemming_language)
word0_washed = wash_index_term(word0)
word1_washed = wash_index_term(word1)
if f.endswith('count'):
# field count query; convert to integers in order
# to have numerical behaviour for 'BETWEEN n1 AND n2' query
try:
word0_washed = int(word0_washed)
word1_washed = int(word1_washed)
except ValueError:
pass
try:
res = run_sql_with_limit("SELECT term,hitlist FROM %s WHERE term BETWEEN %%s AND %%s" % bibwordsX,
(word0_washed, word1_washed), wildcard_limit = wl)
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
if f == 'journal':
pass # FIXME: quick hack for the journal index
else:
word = re_word.sub('', word)
if stemming_language:
word = lower_index_term(word)
word = stem(word, stemming_language)
if string.find(word, '%') >= 0: # do we have wildcard in the word?
if f == 'journal':
# FIXME: quick hack for the journal index
# FIXME: we can run a sanity check here for all indexes
res = ()
else:
try:
res = run_sql_with_limit("SELECT term,hitlist FROM %s WHERE term LIKE %%s" % bibwordsX,
(wash_index_term(word),), wildcard_limit = wl)
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
res = run_sql("SELECT term,hitlist FROM %s WHERE term=%%s" % bibwordsX,
(wash_index_term(word),))
# fill the result set:
for word, hitlist in res:
hitset_bibwrd = intbitset(hitlist)
# add the results:
if set_used:
set.union_update(hitset_bibwrd)
else:
set = hitset_bibwrd
set_used = 1
#check to see if the query limit was reached
if limit_reached:
#raise an exception, so we can print a nice message to the user
raise InvenioWebSearchWildcardLimitError(set)
# okay, return result set:
return set
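The query-shape detection in the word washing above can be sketched on its own (hypothetical helper; only the shape classification, no SQL): `N+` on a count field becomes the open-ended span `N->99999`, the public truncation character `*` maps to the SQL wildcard `%`, and `->` marks a span query that translates to BETWEEN.

```python
def classify_word_query_sketch(word, is_count_field=False):
    # Field count query of the form N+ becomes the span N->99999.
    if is_count_field and word.endswith('+'):
        word = word[:-1] + '->99999'
    # '*' is the public truncation character; SQL uses '%'.
    word = word.replace('*', '%')
    # A '->' in the term means a span query, i.e. a BETWEEN lookup.
    parts = word.split('->', 1)
    if len(parts) == 2:
        return ('BETWEEN', parts[0], parts[1])
    if '%' in word:
        return ('LIKE', word)
    return ('=', word)
```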
def search_unit_in_idxpairs(p, f, type, wl=0):
"""Searches for pair 'p' inside idxPAIR table for field 'f' and
returns hitset of recIDs found."""
limit_reached = 0 # flag for knowing if the query limit has been reached
do_exact_search = True # flag to know when it makes sense to try to do exact matching
result_set = intbitset()
#determine the idxPAIR table to read from
index_id = get_index_id_from_field(f)
if not index_id:
return intbitset()
stemming_language = get_index_stemming_language(index_id)
pairs_tokenizer = BibIndexDefaultTokenizer(stemming_language)
idxpair_table_washed = wash_table_column_name("idxPAIR%02dF" % index_id)
if p.startswith("%") and p.endswith("%"):
p = p[1:-1]
original_pattern = p
p = string.replace(p, '*', '%') # we now use '*' as the truncation character
queries_releated_vars = [] # contains tuples of (query_addons, query_params, use_query_limit)
#is it a span query?
ps = string.split(p, "->", 1)
if len(ps) == 2 and not (ps[0].endswith(' ') or ps[1].startswith(' ')):
#so we are dealing with a span query
pairs_left = pairs_tokenizer.tokenize_for_pairs(ps[0])
pairs_right = pairs_tokenizer.tokenize_for_pairs(ps[1])
if not pairs_left or not pairs_right:
# we are not actually dealing with pairs but with words
return search_unit_in_bibwords(original_pattern, f, type, wl)
elif len(pairs_left) != len(pairs_right):
# it is kind of hard to know what the user actually wanted
# we have to do: foo bar baz -> qux xyz, so let's switch to phrase
return search_unit_in_idxphrases(original_pattern, f, type, wl)
elif len(pairs_left) > 1 and \
len(pairs_right) > 1 and \
pairs_left[:-1] != pairs_right[:-1]:
# again we have something like: foo bar baz -> abc xyz qux
# so we'd better switch to phrase
return search_unit_in_idxphrases(original_pattern, f, type, wl)
else:
# finally, we can treat the search using idxPairs
# at this step we have either: foo bar -> abc xyz
# or foo bar abc -> foo bar xyz
queries_releated_vars = [("BETWEEN %s AND %s", (pairs_left[-1], pairs_right[-1]), True)]
for pair in pairs_left[:-1]:  # which should be equal to pairs_right[:-1]
queries_releated_vars.append(("= %s", (pair, ), False))
do_exact_search = False # no exact search for span queries
elif string.find(p, '%') > -1:
#tokenizing p will remove the '%', so we have to make sure it stays
replacement = 'xxxxxxxxxx'  # hopefully this will not clash with anything in the future
p = string.replace(p, '%', replacement)
pairs = pairs_tokenizer.tokenize_for_pairs(p)
if not pairs:
# we are not actually dealing with pairs but with words
return search_unit_in_bibwords(original_pattern, f, type, wl)
queries_releated_vars = []
for pair in pairs:
if string.find(pair, replacement) > -1:
pair = string.replace(pair, replacement, '%') #we replace back the % sign
queries_releated_vars.append(("LIKE %s", (pair, ), True))
else:
queries_releated_vars.append(("= %s", (pair, ), False))
do_exact_search = False
else:
#normal query
pairs = pairs_tokenizer.tokenize_for_pairs(p)
if not pairs:
# we are not actually dealing with pairs but with words
return search_unit_in_bibwords(original_pattern, f, type, wl)
queries_releated_vars = []
for pair in pairs:
queries_releated_vars.append(("= %s", (pair, ), False))
first_results = 1 # flag to know if it's the first set of results or not
for query_var in queries_releated_vars:
query_addons = query_var[0]
query_params = query_var[1]
use_query_limit = query_var[2]
if use_query_limit:
try:
res = run_sql_with_limit("SELECT term, hitlist FROM %s WHERE term %s" \
% (idxpair_table_washed, query_addons), query_params, wildcard_limit=wl) #kwalitee:disable=sql
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
res = run_sql("SELECT term, hitlist FROM %s WHERE term %s" \
% (idxpair_table_washed, query_addons), query_params) #kwalitee:disable=sql
if not res:
return intbitset()
for pair, hitlist in res:
hitset_idxpairs = intbitset(hitlist)
if first_results:
result_set = hitset_idxpairs
first_results = 0
else:
result_set.intersection_update(hitset_idxpairs)
#check to see if the query limit was reached
if limit_reached:
#raise an exception, so we can print a nice message to the user
raise InvenioWebSearchWildcardLimitError(result_set)
# check if we need to eliminate the false positives
if cfg['CFG_WEBSEARCH_IDXPAIRS_EXACT_SEARCH'] and do_exact_search:
# we need to eliminate the false positives
idxphrase_table_washed = wash_table_column_name("idxPHRASE%02dR" % index_id)
not_exact_search = intbitset()
for recid in result_set:
res = run_sql("SELECT termlist FROM %s WHERE id_bibrec %s" %(idxphrase_table_washed, '=%s'), (recid, )) #kwalitee:disable=sql
if res:
termlist = deserialize_via_marshal(res[0][0])
if not [term for term in termlist if term.lower().find(p.lower()) > -1]:
not_exact_search.add(recid)
else:
not_exact_search.add(recid)
# remove the recs that are false positives from the final result
result_set.difference_update(not_exact_search)
return result_set
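The pattern washing performed at the top of this function can be summarized by a minimal, self-contained sketch (hypothetical helper name; plain `str` methods instead of the `string` module and no tokenizer):

```python
# Sketch of the pattern washing above: strip a surrounding %...%
# partial-phrase marker, map the user-facing '*' truncation character
# to SQL's '%', and detect a span query of the form 'a->b'.
def wash_pattern(p):
    if p.startswith('%') and p.endswith('%'):
        p = p[1:-1]                      # partial phrase: drop the markers
    p = p.replace('*', '%')              # '*' is the truncation character
    parts = p.split('->', 1)
    if len(parts) == 2 and not (parts[0].endswith(' ') or parts[1].startswith(' ')):
        return ('span', parts[0], parts[1])
    return ('plain', p, None)
```

For example, '2001->2002' is recognized as a span query, while 'muon*' becomes a plain 'muon%' wildcard pattern.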
def search_unit_in_idxphrases(p, f, type, wl=0):
"""Searches for phrase 'p' inside idxPHRASE*F table for field 'f' and returns hitset of recIDs found.
The search type is defined by 'type' (e.g. equals to 'r' for a regexp search)."""
# call word search method in some cases:
if f.endswith('count'):
return search_unit_in_bibwords(p, f, wl=wl)
set = intbitset() # will hold output result set
set_used = 0 # not-yet-used flag, to be able to circumvent set operations
limit_reached = 0 # flag for knowing if the query limit has been reached
use_query_limit = False # flag for knowing if to limit the query results or not
# deduce in which idxPHRASE table we will search:
idxphraseX = "idxPHRASE%02dF" % get_index_id_from_field("anyfield")
if f:
index_id = get_index_id_from_field(f)
if index_id:
idxphraseX = "idxPHRASE%02dF" % index_id
else:
return intbitset() # phrase index f does not exist
# detect query type (exact phrase, partial phrase, regexp):
if type == 'r':
query_addons = "REGEXP %s"
query_params = (p,)
use_query_limit = True
else:
p = string.replace(p, '*', '%') # we now use '*' as the truncation character
ps = string.split(p, "->", 1) # check for span query:
if len(ps) == 2 and not (ps[0].endswith(' ') or ps[1].startswith(' ')):
query_addons = "BETWEEN %s AND %s"
query_params = (ps[0], ps[1])
use_query_limit = True
else:
if string.find(p, '%') > -1:
query_addons = "LIKE %s"
query_params = (p,)
use_query_limit = True
else:
query_addons = "= %s"
query_params = (p,)
# special washing for fuzzy author index:
if f in ('author', 'firstauthor', 'exactauthor', 'exactfirstauthor', 'authorityauthor'):
query_params_washed = ()
for query_param in query_params:
query_params_washed += (wash_author_name(query_param),)
query_params = query_params_washed
# perform search:
if use_query_limit:
try:
res = run_sql_with_limit("SELECT term,hitlist FROM %s WHERE term %s" % (idxphraseX, query_addons),
query_params, wildcard_limit=wl)
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
res = run_sql("SELECT term,hitlist FROM %s WHERE term %s" % (idxphraseX, query_addons), query_params)
# fill the result set:
for word, hitlist in res:
hitset_bibphrase = intbitset(hitlist)
# add the results:
if set_used:
set.union_update(hitset_bibphrase)
else:
set = hitset_bibphrase
set_used = 1
#check to see if the query limit was reached
if limit_reached:
#raise an exception, so we can print a nice message to the user
raise InvenioWebSearchWildcardLimitError(set)
# okay, return result set:
return set
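The query-type detection above (regexp, span, wildcard, exact) can be condensed into a small sketch; the helper name is hypothetical, and the third element of the returned tuple mirrors the `use_query_limit` flag:

```python
# Sketch of how the branches above pick the SQL comparison operator
# and its bound parameters (illustrative only, not the module's API).
def build_term_clause(p, search_type):
    if search_type == 'r':
        return "REGEXP %s", (p,), True         # regexp search: always limited
    p = p.replace('*', '%')                    # '*' is the truncation character
    ps = p.split('->', 1)
    if len(ps) == 2 and not (ps[0].endswith(' ') or ps[1].startswith(' ')):
        return "BETWEEN %s AND %s", (ps[0], ps[1]), True   # span query
    if '%' in p:
        return "LIKE %s", (p,), True           # wildcard query: apply limit
    return "= %s", (p,), False                 # exact match: no limit needed
```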
def search_unit_in_bibxxx(p, f, type, wl=0):
"""Searches for pattern 'p' inside bibxxx tables for field 'f' and returns hitset of recIDs found.
The search type is defined by 'type' (e.g. equals to 'r' for a regexp search)."""
# call word search method in some cases:
if f == 'journal' or f.endswith('count'):
return search_unit_in_bibwords(p, f, wl=wl)
p_orig = p # saving for eventual future 'no match' reporting
limit_reached = 0 # flag for knowing if the query limit has been reached
use_query_limit = False # flag for knowing if to limit the query results or not
query_addons = "" # will hold additional SQL code for the query
query_params = () # will hold parameters for the query (their number may vary depending on TYPE argument)
# wash arguments:
f = string.replace(f, '*', '%') # replace truncation char '*' in field definition
if type == 'r':
query_addons = "REGEXP %s"
query_params = (p,)
use_query_limit = True
else:
p = string.replace(p, '*', '%') # we now use '*' as the truncation character
ps = string.split(p, "->", 1) # check for span query:
if len(ps) == 2 and not (ps[0].endswith(' ') or ps[1].startswith(' ')):
query_addons = "BETWEEN %s AND %s"
query_params = (ps[0], ps[1])
use_query_limit = True
else:
if string.find(p, '%') > -1:
query_addons = "LIKE %s"
query_params = (p,)
use_query_limit = True
else:
query_addons = "= %s"
query_params = (p,)
# construct 'tl' which defines the tag list (MARC tags) to search in:
tl = []
if len(f) >= 2 and str(f[0]).isdigit() and str(f[1]).isdigit():
tl.append(f) # 'f' seems to be okay as it starts by two digits
else:
# deduce desired MARC tags on the basis of chosen 'f'
tl = get_field_tags(f)
if not tl:
# f index does not exist, nevermind
pass
# okay, start search:
l = [] # will hold list of recID that matched
for t in tl:
# deduce into which bibxxx table we will search:
digit1, digit2 = int(t[0]), int(t[1])
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
# construct and run query:
if t == "001":
if query_addons.find('BETWEEN') > -1 or query_addons.find('=') > -1:
# verify that the params are integers (to avoid returning record 123 when searching for 123foo)
try:
query_params = tuple(int(param) for param in query_params)
except ValueError:
return intbitset()
if use_query_limit:
try:
res = run_sql_with_limit("SELECT id FROM bibrec WHERE id %s" % query_addons,
query_params, wildcard_limit=wl)
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
res = run_sql("SELECT id FROM bibrec WHERE id %s" % query_addons,
query_params)
else:
query = "SELECT bibx.id_bibrec FROM %s AS bx LEFT JOIN %s AS bibx ON bx.id=bibx.id_bibxxx WHERE bx.value %s" % \
(bx, bibx, query_addons)
if len(t) != 6 or t[-1:]=='%':
# wildcard query, or only the beginning of field 't'
# is defined, so add wildcard character:
query += " AND bx.tag LIKE %s"
query_params_and_tag = query_params + (t + '%',)
else:
# exact query for 't':
query += " AND bx.tag=%s"
query_params_and_tag = query_params + (t,)
if use_query_limit:
try:
res = run_sql_with_limit(query, query_params_and_tag, wildcard_limit=wl)
except InvenioDbQueryWildcardLimitError, excp:
res = excp.res
limit_reached = 1 # set the limit reached flag to true
else:
res = run_sql(query, query_params_and_tag)
# fill the result set:
for id_bibrec in res:
if id_bibrec[0]:
l.append(id_bibrec[0])
# check no of hits found:
nb_hits = len(l)
# okay, return result set:
set = intbitset(l)
#check to see if the query limit was reached
if limit_reached:
#raise an exception, so we can print a nice message to the user
raise InvenioWebSearchWildcardLimitError(set)
return set
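The digit-to-table mapping used here (and in the other bibxxx helpers below) can be shown in isolation: a MARC tag such as '100__a' is stored in `bib10x`, linked to records through `bibrec_bib10x`. A minimal sketch with a hypothetical helper name:

```python
# Sketch of the bibxxx table-name derivation above: the first two
# digits of a MARC tag select the value table and its link table.
def bibxxx_tables(tag):
    digit1, digit2 = int(tag[0]), int(tag[1])
    bx = "bib%d%dx" % (digit1, digit2)           # holds the field values
    bibx = "bibrec_bib%d%dx" % (digit1, digit2)  # links values to recIDs
    return bx, bibx
```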
def search_unit_in_solr(p, f=None, m=None):
"""
Query a Solr index and return an intbitset corresponding
to the result. Parameters (p,f,m) are usual search unit ones.
"""
if m and (m == 'a' or m == 'r'): # phrase/regexp query
if p.startswith('%') and p.endswith('%'):
p = p[1:-1] # fix for partial phrase
p = '"' + p + '"'
return solr_get_bitset(f, p)
def search_unit_in_xapian(p, f=None, m=None):
"""
Query a Xapian index and return an intbitset corresponding
to the result. Parameters (p,f,m) are usual search unit ones.
"""
if m and (m == 'a' or m == 'r'): # phrase/regexp query
if p.startswith('%') and p.endswith('%'):
p = p[1:-1] # fix for partial phrase
p = '"' + p + '"'
return xapian_get_bitset(f, p)
def search_unit_in_bibrec(datetext1, datetext2, type='c'):
"""
Return hitset of recIDs found that were either created or modified
(according to 'type' arg being 'c' or 'm') from datetext1 until datetext2, inclusive.
Does not pay attention to pattern, collection, anything. Useful
to intersect later on with the 'real' query.
"""
set = intbitset()
if type and type.startswith("m"):
type = "modification_date"
else:
type = "creation_date" # by default we are searching for creation dates
parts = datetext1.split('->')
if len(parts) > 1 and datetext1 == datetext2:
datetext1 = parts[0]
datetext2 = parts[1]
if datetext1 == datetext2:
res = run_sql("SELECT id FROM bibrec WHERE %s LIKE %%s" % (type,),
(datetext1 + '%',))
else:
res = run_sql("SELECT id FROM bibrec WHERE %s>=%%s AND %s<=%%s" % (type, type),
(datetext1, datetext2))
for row in res:
set += row[0]
return set
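The date handling above can be sketched as a standalone helper (hypothetical name; as in the code above, the column name is interpolated while the date values are bound as parameters):

```python
# Sketch of the date clause construction above: a 'd1->d2' value is
# split into an inclusive range; a single value becomes a prefix
# (LIKE) match on the chosen date column.
def date_clause(datetext1, datetext2, col='creation_date'):
    parts = datetext1.split('->')
    if len(parts) > 1 and datetext1 == datetext2:
        datetext1, datetext2 = parts[0], parts[1]
    if datetext1 == datetext2:
        return "%s LIKE %%s" % col, (datetext1 + '%',)
    return "%s>=%%s AND %s<=%%s" % (col, col), (datetext1, datetext2)
```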
def search_unit_by_times_cited(p):
"""
Return hitset of recIDs found that are cited P times.
Usually P looks like '10->23'.
"""
numstr = '"'+p+'"'
# this is sort of stupid, but since we may need to get the records
# that do _not_ have any cites, we have to know the ids of all
# records too; this is needed only if p is 0, "0", "0->N" or "N->0"
allrecs = []
if p == 0 or p == "0" or \
p.startswith("0->") or p.endswith("->0"):
allrecs = intbitset(run_sql("SELECT id FROM bibrec"))
return get_records_with_num_cites(numstr, allrecs)
def search_unit_refersto(query):
"""
Search for records satisfying the query (e.g. author:ellis) and
return list of records referred to by these records.
"""
if query:
ahitset = search_pattern(p=query)
if ahitset:
return get_refersto_hitset(ahitset)
else:
return intbitset([])
else:
return intbitset([])
def search_unit_citedby(query):
"""
Search for records satisfying the query (e.g. author:ellis) and
return list of records cited by these records.
"""
if query:
ahitset = search_pattern(p=query)
if ahitset:
return get_citedby_hitset(ahitset)
else:
return intbitset([])
else:
return intbitset([])
def search_unit_collection(query, m, wl=None):
"""
Search for records satisfying the query (e.g. collection:"BOOK" or
collection:"Books") and return list of records in the collection.
"""
if len(query):
ahitset = get_collection_reclist(query)
if not ahitset:
return search_unit_in_bibwords(query, 'collection', m, wl=wl)
return ahitset
else:
return intbitset([])
def get_records_that_can_be_displayed(user_info,
hitset_in_any_collection,
current_coll=CFG_SITE_NAME,
colls=None):
"""
Return records that can be displayed.
"""
records_that_can_be_displayed = intbitset()
if colls is None:
colls = [current_coll]
# let's get the restricted collections the user has rights to view
permitted_restricted_collections = user_info.get('precached_permitted_restricted_collections', [])
policy = CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY.strip().upper()
current_coll_children = get_collection_allchildren(current_coll) # real & virtual
# add all restricted collections, that the user has access to, and are under the current collection
# do not use set here, in order to maintain a specific order:
# children of 'cc' (real, virtual, restricted), rest of 'c' that are not cc's children
colls_to_be_displayed = [coll for coll in current_coll_children if coll in colls or coll in permitted_restricted_collections]
colls_to_be_displayed.extend([coll for coll in colls if coll not in colls_to_be_displayed])
if policy == 'ANY':# the user needs to have access to at least one collection that restricts the records
#we need this to be able to remove records that are both in a public and restricted collection
permitted_recids = intbitset()
notpermitted_recids = intbitset()
for collection in restricted_collection_cache.cache:
if collection in permitted_restricted_collections:
permitted_recids |= get_collection_reclist(collection)
else:
notpermitted_recids |= get_collection_reclist(collection)
records_that_can_be_displayed = hitset_in_any_collection - (notpermitted_recids - permitted_recids)
else:# the user needs to have access to all collections that restrict a records
notpermitted_recids = intbitset()
for collection in restricted_collection_cache.cache:
if collection not in permitted_restricted_collections:
notpermitted_recids |= get_collection_reclist(collection)
records_that_can_be_displayed = hitset_in_any_collection - notpermitted_recids
if records_that_can_be_displayed.is_infinite():
# We should not return infinite results for user.
records_that_can_be_displayed = intbitset()
for coll in colls_to_be_displayed:
records_that_can_be_displayed |= get_collection_reclist(coll)
return records_that_can_be_displayed
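The two `CFG_WEBSEARCH_VIEWRESTRCOLL_POLICY` branches above can be illustrated with a minimal sketch, using plain Python sets in place of intbitset and hypothetical names:

```python
# Sketch of the 'ANY' vs. 'ALL' restriction policies above.
def displayable(hits, restricted, permitted, policy='ANY'):
    """restricted maps collection name -> set of recids it restricts;
    permitted lists the restricted collections the user may view."""
    permitted_recids, notpermitted_recids = set(), set()
    for coll, recids in restricted.items():
        if coll in permitted:
            permitted_recids |= recids
        else:
            notpermitted_recids |= recids
    if policy == 'ANY':
        # access to any one restricting collection is enough
        return hits - (notpermitted_recids - permitted_recids)
    # otherwise the user needs access to every restricting collection
    return hits - notpermitted_recids
```

With `restricted = {'A': {1}, 'B': {1, 2}}` and access only to 'A', record 1 survives under 'ANY' but not under the stricter policy.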
def intersect_results_with_collrecs(req, hitset_in_any_collection, colls, ap=0, of="hb", verbose=0, ln=CFG_SITE_LANG, display_nearest_terms_box=True):
"""Return dict of hitsets given by intersection of hitset with the collection universes."""
_ = gettext_set_language(ln)
# search stage 4: intersect with the collection universe
if verbose and of.startswith("h"):
t1 = os.times()[4]
results = {} # all final results
results_nbhits = 0
# calculate the list of recids (restricted or not) that the user has rights to access and we should display (only those)
if not req or isinstance(req, cStringIO.OutputType): # called from CLI
user_info = {}
for coll in colls:
results[coll] = hitset_in_any_collection & get_collection_reclist(coll)
results_nbhits += len(results[coll])
records_that_can_be_displayed = hitset_in_any_collection
permitted_restricted_collections = []
else:
user_info = collect_user_info(req)
# let's get the restricted collections the user has rights to view
if user_info['guest'] == '1':
permitted_restricted_collections = []
## For guest users that are actually authorized to some restricted
## collection (by virtue of the IP address in a FireRole rule)
## we explicitly build the list of permitted_restricted_collections
for coll in colls:
if collection_restricted_p(coll) and (acc_authorize_action(user_info, 'viewrestrcoll', collection=coll)[0] == 0):
permitted_restricted_collections.append(coll)
else:
permitted_restricted_collections = user_info.get('precached_permitted_restricted_collections', [])
# let's build the list of the both public and restricted
# child collections of the collection from which the user
# started his/her search. This list of children colls will be
# used in the warning proposing a search in those collections
try:
current_coll = req.argd['cc'] # current_coll: coll from which user started his/her search
except:
from flask import request
current_coll = request.args.get('cc', CFG_SITE_NAME) # current_coll: coll from which user started his/her search
current_coll_children = get_collection_allchildren(current_coll) # real & virtual
# add all restricted collections, that the user has access to, and are under the current collection
# do not use set here, in order to maintain a specific order:
# children of 'cc' (real, virtual, restricted), rest of 'c' that are not cc's children
colls_to_be_displayed = [coll for coll in current_coll_children if coll in colls or coll in permitted_restricted_collections]
colls_to_be_displayed.extend([coll for coll in colls if coll not in colls_to_be_displayed])
records_that_can_be_displayed = get_records_that_can_be_displayed(
user_info,
hitset_in_any_collection,
current_coll, colls)
for coll in colls_to_be_displayed:
results[coll] = results.get(coll, intbitset()).union_update(records_that_can_be_displayed & get_collection_reclist(coll))
results_nbhits += len(results[coll])
if results_nbhits == 0:
# no hits found, try to search in Home and restricted and/or hidden collections:
results = {}
results_in_Home = records_that_can_be_displayed & get_collection_reclist(CFG_SITE_NAME)
results_in_restricted_collections = intbitset()
results_in_hidden_collections = intbitset()
for coll in permitted_restricted_collections:
if not get_coll_ancestors(coll): # hidden collection
results_in_hidden_collections.union_update(records_that_can_be_displayed & get_collection_reclist(coll))
else:
results_in_restricted_collections.union_update(records_that_can_be_displayed & get_collection_reclist(coll))
# in this way, we do not count twice, records that are both in Home collection and in a restricted collection
total_results = len(results_in_Home.union(results_in_restricted_collections))
if total_results > 0:
# some hits found in Home and/or restricted collections, so propose this search:
if of.startswith("h") and display_nearest_terms_box:
url = websearch_templates.build_search_url(req.argd, cc=CFG_SITE_NAME, c=[])
len_colls_to_display = len(colls_to_be_displayed)
# trim the list of collections to first two, since it might get very large
write_warning(_("No match found in collection %(x_collection)s. Other collections gave %(x_url_open)s%(x_nb_hits)d hits%(x_url_close)s.") %\
{'x_collection': '<em>' + \
string.join([get_coll_i18nname(coll, ln, False) for coll in colls_to_be_displayed[:2]], ', ') + \
(len_colls_to_display > 2 and ' et al' or '') + '</em>',
'x_url_open': '<a class="nearestterms" href="%s">' % (url),
'x_nb_hits': total_results,
'x_url_close': '</a>'}, req=req)
# display the whole list of collections in a comment
if len_colls_to_display > 2:
write_warning("<!--No match found in collection <em>%(x_collection)s</em>.-->" %\
{'x_collection': string.join([get_coll_i18nname(coll, ln, False) for coll in colls_to_be_displayed], ', ')},
req=req)
else:
# no hits found: either the user is looking for a document
# without having the rights to view it, or for a hidden document:
if of.startswith("h") and display_nearest_terms_box:
if len(results_in_hidden_collections) > 0:
write_warning(_("No public collection matched your query. "
"If you were looking for a hidden document, please type "
"the correct URL for this record."), req=req)
else:
write_warning(_("No public collection matched your query. "
"If you were looking for a non-public document, please choose "
"the desired restricted collection first."), req=req)
if verbose and of.startswith("h"):
t2 = os.times()[4]
write_warning("Search stage 4: intersecting with collection universe gave %d hits." % results_nbhits, req=req)
write_warning("Search stage 4: execution took %.2f seconds." % (t2 - t1), req=req)
return results
def intersect_results_with_hitset(req, results, hitset, ap=0, aptext="", of="hb"):
"""Return intersection of search 'results' (a dict of hitsets
with collection as key) with the 'hitset', i.e. apply
'hitset' intersection to each collection within search
'results'.
If the final set would be empty and 'ap'
(approximate pattern) is true, then print 'aptext'
and return the original 'results' set unchanged. If 'ap' is
false, then return the empty results set.
"""
if ap:
results_ap = copy.deepcopy(results)
else:
results_ap = {} # will return empty dict in case of no hits found
nb_total = 0
final_results = {}
for coll in results.keys():
final_results[coll] = results[coll].intersection(hitset)
nb_total += len(final_results[coll])
if nb_total == 0:
if of.startswith("h"):
write_warning(aptext, req=req)
final_results = results_ap
return final_results
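The approximate-pattern fallback above amounts to the following sketch, with plain sets standing in for intbitset and an illustrative helper name:

```python
# Sketch of the fallback above: intersect every per-collection hitset;
# if everything vanishes and 'ap' is set, keep the original results.
def intersect_with_hitset(results, hitset, ap=False):
    final = dict((coll, hits & hitset) for coll, hits in results.items())
    if sum(len(h) for h in final.values()) == 0:
        # nothing left: fall back to the original results if 'ap' is set
        return dict(results) if ap else {}
    return final
```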
def create_similarly_named_authors_link_box(author_name, ln=CFG_SITE_LANG):
"""Return a box, similar to the ``Not satisfied...'' one, proposing
author searches for similar names. Namely, take AUTHOR_NAME
and the first initial of the first name (after the comma) and look
into the author index whether authors with e.g. middle names exist.
Useful mainly for CERN Library that sometimes contains name
forms like Ellis-N, Ellis-Nick, Ellis-Nicolas all denoting the
same person. The box isn't proposed if no similarly named
authors are found to exist.
"""
# return nothing if not configured:
if CFG_WEBSEARCH_CREATE_SIMILARLY_NAMED_AUTHORS_LINK_BOX == 0:
return ""
# return empty box if there is no initial:
if re.match(r'[^ ,]+, [^ ]', author_name) is None:
return ""
# firstly find name comma initial:
author_name_to_search = re.sub(r'^([^ ,]+, +[^ ,]).*$', '\\1', author_name)
# secondly search for similar name forms:
similar_author_names = {}
for name in author_name_to_search, strip_accents(author_name_to_search):
for tag in get_field_tags("author"):
# deduce into which bibxxx table we will search:
digit1, digit2 = int(tag[0]), int(tag[1])
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
if len(tag) != 6 or tag[-1:]=='%':
# only the beginning of field 't' is defined, so add wildcard character:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value LIKE %%s AND bx.tag LIKE %%s""" % bx,
(name + "%", tag + "%"))
else:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value LIKE %%s AND bx.tag=%%s""" % bx,
(name + "%", tag))
for row in res:
similar_author_names[row[0]] = 1
# remove the original name and sort the list:
try:
del similar_author_names[author_name]
except KeyError:
pass
# thirdly print the box:
out = ""
if similar_author_names:
out_authors = similar_author_names.keys()
out_authors.sort()
tmp_authors = []
for out_author in out_authors:
nbhits = get_nbhits_in_bibxxx(out_author, "author")
if nbhits:
tmp_authors.append((out_author, nbhits))
out += websearch_templates.tmpl_similar_author_names(
authors=tmp_authors, ln=ln)
return out
def create_nearest_terms_box(urlargd, p, f, t='w', n=5, ln=CFG_SITE_LANG, intro_text_p=True):
"""Return text box containing list of 'n' nearest terms above/below 'p'
for the field 'f' for matching type 't' (words/phrases) in
language 'ln'.
Propose new searches according to `urlargs' with the new words.
If `intro_text_p' is true, then display the introductory message,
otherwise print only the nearest terms in the box content.
"""
# load the right message language
_ = gettext_set_language(ln)
if not CFG_WEBSEARCH_DISPLAY_NEAREST_TERMS:
return _("Your search did not match any records. Please try again.")
nearest_terms = []
if not p: # sanity check
p = "."
if p.startswith('%') and p.endswith('%'):
p = p[1:-1] # fix for partial phrase
index_id = get_index_id_from_field(f)
if f == 'fulltext':
if CFG_SOLR_URL:
return _("No match found, please enter different search terms.")
else:
# FIXME: workaround for not having native phrase index yet
t = 'w'
# special indexes:
if f == 'refersto':
return _("There are no records referring to %s.") % cgi.escape(p)
if f == 'citedby':
return _("There are no records cited by %s.") % cgi.escape(p)
# look for nearest terms:
if t == 'w':
nearest_terms = get_nearest_terms_in_bibwords(p, f, n, n)
if not nearest_terms:
return _("No word index is available for %s.") % \
('<em>' + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + '</em>')
else:
nearest_terms = []
if index_id:
nearest_terms = get_nearest_terms_in_idxphrase(p, index_id, n, n)
if f == 'datecreated' or f == 'datemodified':
nearest_terms = get_nearest_terms_in_bibrec(p, f, n, n)
if not nearest_terms:
nearest_terms = get_nearest_terms_in_bibxxx(p, f, n, n)
if not nearest_terms:
return _("No phrase index is available for %s.") % \
('<em>' + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + '</em>')
terminfo = []
for term in nearest_terms:
if t == 'w':
hits = get_nbhits_in_bibwords(term, f)
else:
if index_id:
hits = get_nbhits_in_idxphrases(term, f)
elif f == 'datecreated' or f == 'datemodified':
hits = get_nbhits_in_bibrec(term, f)
else:
hits = get_nbhits_in_bibxxx(term, f)
argd = {}
argd.update(urlargd)
# check which fields contained the requested parameter, and replace it.
for (px, fx) in ('p', 'f'), ('p1', 'f1'), ('p2', 'f2'), ('p3', 'f3'):
if px in argd:
argd_px = argd[px]
if t == 'w':
# p was stripped of accents, so do the same:
argd_px = strip_accents(argd_px)
#argd[px] = string.replace(argd_px, p, term, 1)
#we need something similar, but case insensitive
pattern_index = string.find(argd_px.lower(), p.lower())
if pattern_index > -1:
argd[px] = argd_px[:pattern_index] + term + argd_px[pattern_index+len(p):]
break
#this is doing exactly the same as:
#argd[px] = re.sub('(?i)' + re.escape(p), term, argd_px, 1)
#but is ~4x faster (2us vs. 8.25us)
terminfo.append((term, hits, argd))
intro = ""
if intro_text_p: # add full leading introductory text
if f:
intro = _("Search term %(x_term)s inside index %(x_index)s did not match any record. Nearest terms in any collection are:") % \
{'x_term': "<em>" + cgi.escape(p.startswith("%") and p.endswith("%") and p[1:-1] or p) + "</em>",
'x_index': "<em>" + cgi.escape(get_field_i18nname(get_field_name(f) or f, ln, False)) + "</em>"}
else:
intro = _("Search term %s did not match any record. Nearest terms in any collection are:") % \
("<em>" + cgi.escape(p.startswith("%") and p.endswith("%") and p[1:-1] or p) + "</em>")
return websearch_templates.tmpl_nearest_term_box(p=p, ln=ln, f=f, terminfo=terminfo,
intro=intro)
def get_nearest_terms_in_bibwords(p, f, n_below, n_above):
"""Return list of +n -n nearest terms to word `p' in index for field `f'."""
nearest_words = [] # will hold the (sorted) list of nearest words to return
# deduce into which bibwordsX table we will search:
bibwordsX = "idxWORD%02dF" % get_index_id_from_field("anyfield")
if f:
index_id = get_index_id_from_field(f)
if index_id:
bibwordsX = "idxWORD%02dF" % index_id
else:
return nearest_words
# firstly try to get `n' closest words above `p':
res = run_sql("SELECT term FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % bibwordsX,
(p, n_above))
for row in res:
nearest_words.append(row[0])
nearest_words.reverse()
# secondly insert given word `p':
nearest_words.append(p)
# finally try to get `n' closest words below `p':
res = run_sql("SELECT term FROM %s WHERE term>%%s ORDER BY term ASC LIMIT %%s" % bibwordsX,
(p, n_below))
for row in res:
nearest_words.append(row[0])
return nearest_words
def get_nearest_terms_in_idxphrase(p, index_id, n_below, n_above):
"""Browse (-n_above, +n_below) closest bibliographic phrases
for the given pattern p in the given field idxPHRASE table,
regardless of collection.
Return list of [phrase1, phrase2, ... , phrase_n]."""
if CFG_INSPIRE_SITE and index_id in (3, 15): # FIXME: workaround due to new fuzzy index
return [p,]
idxphraseX = "idxPHRASE%02dF" % index_id
res_above = run_sql("SELECT term FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % idxphraseX, (p, n_above))
res_above = map(lambda x: x[0], res_above)
res_above.reverse()
res_below = run_sql("SELECT term FROM %s WHERE term>=%%s ORDER BY term ASC LIMIT %%s" % idxphraseX, (p, n_below))
res_below = map(lambda x: x[0], res_below)
return res_above + res_below
def get_nearest_terms_in_idxphrase_with_collection(p, index_id, n_below, n_above, collection):
"""Browse (-n_above, +n_below) closest bibliographic phrases
for the given pattern p in the given field idxPHRASE table,
considering the collection (intbitset).
Return list of [(phrase1, hitset), (phrase2, hitset), ... , (phrase_n, hitset)]."""
idxphraseX = "idxPHRASE%02dF" % index_id
res_above = run_sql("SELECT term,hitlist FROM %s WHERE term<%%s ORDER BY term DESC LIMIT %%s" % idxphraseX, (p, n_above * 3))
res_above = [(term, intbitset(hitlist) & collection) for term, hitlist in res_above]
res_above = [(term, len(hitlist)) for term, hitlist in res_above if hitlist]
res_below = run_sql("SELECT term,hitlist FROM %s WHERE term>=%%s ORDER BY term ASC LIMIT %%s" % idxphraseX, (p, n_below * 3))
res_below = [(term, intbitset(hitlist) & collection) for term, hitlist in res_below]
res_below = [(term, len(hitlist)) for term, hitlist in res_below if hitlist]
res_above.reverse()
return res_above[-n_above:] + res_below[:n_below]
def get_nearest_terms_in_bibxxx(p, f, n_below, n_above):
"""Browse (-n_above, +n_below) closest bibliographic phrases
for the given pattern p in the given field f, regardless
of collection.
Return list of [phrase1, phrase2, ... , phrase_n]."""
## determine browse field:
if not f and string.find(p, ":") > 0: # does 'p' contain ':'?
f, p = string.split(p, ":", 1)
# FIXME: quick hack for the journal index
if f == 'journal':
return get_nearest_terms_in_bibwords(p, f, n_below, n_above)
## We are going to take max(n_below, n_above) as the number of
## values to fetch from bibXXx. This is needed to work around
## MySQL UTF-8 sorting troubles in 4.0.x. Proper solution is to
## use MySQL 4.1.x or our own idxPHRASE in the future.
index_id = get_index_id_from_field(f)
if index_id:
return get_nearest_terms_in_idxphrase(p, index_id, n_below, n_above)
n_fetch = 2*max(n_below, n_above)
## construct 'tl' which defines the tag list (MARC tags) to search in:
tl = []
if str(f[0]).isdigit() and str(f[1]).isdigit():
tl.append(f) # 'f' seems to be okay as it starts by two digits
else:
# deduce desired MARC tags on the basis of chosen 'f'
tl = get_field_tags(f)
## start browsing to fetch list of hits:
browsed_phrases = {} # will hold {phrase1: 1, phrase2: 1, ..., phraseN: 1} dict of browsed phrases (to make them unique)
# always add self to the results set:
browsed_phrases[p.startswith("%") and p.endswith("%") and p[1:-1] or p] = 1
for t in tl:
# deduce into which bibxxx table we will search:
digit1, digit2 = int(t[0]), int(t[1])
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
# firstly try to get `n' closest phrases above `p':
if len(t) != 6 or t[-1:]=='%': # only the beginning of field 't' is defined, so add wildcard character:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value<%%s AND bx.tag LIKE %%s
ORDER BY bx.value DESC LIMIT %%s""" % bx,
(p, t + "%", n_fetch))
else:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value<%%s AND bx.tag=%%s
ORDER BY bx.value DESC LIMIT %%s""" % bx,
(p, t, n_fetch))
for row in res:
browsed_phrases[row[0]] = 1
# secondly try to get `n' closest phrases equal to or below `p':
if len(t) != 6 or t[-1:]=='%': # only the beginning of field 't' is defined, so add wildcard character:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value>=%%s AND bx.tag LIKE %%s
ORDER BY bx.value ASC LIMIT %%s""" % bx,
(p, t + "%", n_fetch))
else:
res = run_sql("""SELECT bx.value FROM %s AS bx
WHERE bx.value>=%%s AND bx.tag=%%s
ORDER BY bx.value ASC LIMIT %%s""" % bx,
(p, t, n_fetch))
for row in res:
browsed_phrases[row[0]] = 1
# select first n words only: (this is needed as we were searching
# in many different tables and so aren't sure we have more than n
# words right; this of course won't be needed when we shall have
# one ACC table only for given field):
phrases_out = browsed_phrases.keys()
phrases_out.sort(lambda x, y: cmp(string.lower(strip_accents(x)),
string.lower(strip_accents(y))))
# find position of self:
try:
idx_p = phrases_out.index(p)
except ValueError: # p not among the fetched phrases
idx_p = len(phrases_out)/2
# return n_above and n_below:
return phrases_out[max(0, idx_p-n_above):idx_p+n_below]
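The final windowing step above (locate `p` in the sorted phrase list, then slice `n_above`/`n_below` around it) can be sketched as a standalone helper; `window_around` is a hypothetical name used for illustration only, not part of the module:

```python
def window_around(sorted_phrases, p, n_above, n_below):
    """Pick the browse window around phrase p in an already-sorted list,
    mirroring the slice at the end of the function above. Hypothetical
    helper for illustration only."""
    try:
        idx_p = sorted_phrases.index(p)
    except ValueError:
        # p is not among the fetched phrases: fall back to the middle
        idx_p = len(sorted_phrases) // 2
    return sorted_phrases[max(0, idx_p - n_above):idx_p + n_below]
```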
def get_nearest_terms_in_bibrec(p, f, n_below, n_above):
"""Return list of nearest terms and counts from bibrec table.
p is usually a date, and f either datecreated or datemodified.
Note: below/above count is very approximative, not really respected.
"""
col = 'creation_date'
if f == 'datemodified':
col = 'modification_date'
res_above = run_sql("""SELECT DATE_FORMAT(%s,'%%%%Y-%%%%m-%%%%d %%%%H:%%%%i:%%%%s')
FROM bibrec WHERE %s < %%s
ORDER BY %s DESC LIMIT %%s""" % (col, col, col),
(p, n_above))
res_below = run_sql("""SELECT DATE_FORMAT(%s,'%%%%Y-%%%%m-%%%%d %%%%H:%%%%i:%%%%s')
FROM bibrec WHERE %s > %%s
ORDER BY %s ASC LIMIT %%s""" % (col, col, col),
(p, n_below))
out = set()
for row in res_above:
out.add(row[0])
for row in res_below:
out.add(row[0])
out_list = list(out)
out_list.sort()
return out_list
def get_nbhits_in_bibrec(term, f):
"""Return number of hits in bibrec table. term is usually a date,
and f is either 'datecreated' or 'datemodified'."""
col = 'creation_date'
if f == 'datemodified':
col = 'modification_date'
res = run_sql("SELECT COUNT(*) FROM bibrec WHERE %s LIKE %%s" % (col,),
(term + '%',))
return res[0][0]
def get_nbhits_in_bibwords(word, f):
"""Return number of hits for word 'word' inside words index for field 'f'."""
out = 0
# deduce into which bibwordsX table we will search:
bibwordsX = "idxWORD%02dF" % get_index_id_from_field("anyfield")
if f:
index_id = get_index_id_from_field(f)
if index_id:
bibwordsX = "idxWORD%02dF" % index_id
else:
return 0
if word:
res = run_sql("SELECT hitlist FROM %s WHERE term=%%s" % bibwordsX,
(word,))
for hitlist in res:
out += len(intbitset(hitlist[0]))
return out
def get_nbhits_in_idxphrases(word, f):
"""Return number of hits for word 'word' inside phrase index for field 'f'."""
out = 0
# deduce into which bibwordsX table we will search:
idxphraseX = "idxPHRASE%02dF" % get_index_id_from_field("anyfield")
if f:
index_id = get_index_id_from_field(f)
if index_id:
idxphraseX = "idxPHRASE%02dF" % index_id
else:
return 0
if word:
res = run_sql("SELECT hitlist FROM %s WHERE term=%%s" % idxphraseX,
(word,))
for hitlist in res:
out += len(intbitset(hitlist[0]))
return out
def get_nbhits_in_bibxxx(p, f, in_hitset=None):
"""Return number of hits for phrase 'p' inside the bibxxx tables for field 'f'."""
## determine browse field:
if not f and string.find(p, ":") > 0: # does 'p' contain ':'?
f, p = string.split(p, ":", 1)
# FIXME: quick hack for the journal index
if f == 'journal':
return get_nbhits_in_bibwords(p, f)
## construct 'tl' which defines the tag list (MARC tags) to search in:
tl = []
if str(f[0]).isdigit() and str(f[1]).isdigit():
tl.append(f) # 'f' seems to be okay as it starts by two digits
else:
# deduce desired MARC tags on the basis of chosen 'f'
tl = get_field_tags(f)
# start searching:
recIDs = {} # will hold dict of {recID1: 1, recID2: 1, ..., } (unique recIDs, therefore)
for t in tl:
# deduce into which bibxxx table we will search:
digit1, digit2 = int(t[0]), int(t[1])
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
if len(t) != 6 or t[-1:]=='%': # only the beginning of field 't' is defined, so add wildcard character:
res = run_sql("""SELECT bibx.id_bibrec FROM %s AS bibx, %s AS bx
WHERE bx.value=%%s AND bx.tag LIKE %%s
AND bibx.id_bibxxx=bx.id""" % (bibx, bx),
(p, t + "%"))
else:
res = run_sql("""SELECT bibx.id_bibrec FROM %s AS bibx, %s AS bx
WHERE bx.value=%%s AND bx.tag=%%s
AND bibx.id_bibxxx=bx.id""" % (bibx, bx),
(p, t))
for row in res:
recIDs[row[0]] = 1
if in_hitset is None:
nbhits = len(recIDs)
else:
nbhits = len(intbitset(recIDs.keys()).intersection(in_hitset))
return nbhits
def get_mysql_recid_from_aleph_sysno(sysno):
"""Returns DB's recID for ALEPH sysno passed in the argument (e.g. "002379334CER").
Returns None in case of failure."""
out = None
res = run_sql("""SELECT bb.id_bibrec FROM bibrec_bib97x AS bb, bib97x AS b
WHERE b.value=%s AND b.tag='970__a' AND bb.id_bibxxx=b.id""",
(sysno,))
if res:
out = res[0][0]
return out
def guess_primary_collection_of_a_record(recID):
"""Return primary collection name a record recid belongs to, by
testing 980 identifier.
May lead to bad guesses when a collection is defined dynamically
via dbquery.
In that case, return 'CFG_SITE_NAME'."""
out = CFG_SITE_NAME
dbcollids = get_fieldvalues(recID, "980__a")
for dbcollid in dbcollids:
variants = ("collection:" + dbcollid,
'collection:"' + dbcollid + '"',
"980__a:" + dbcollid,
'980__a:"' + dbcollid + '"',
'980:' + dbcollid ,
'980:"' + dbcollid + '"')
res = run_sql("SELECT name FROM collection WHERE dbquery IN (%s,%s,%s,%s,%s,%s)", variants)
if res:
out = res[0][0]
break
if CFG_CERN_SITE:
recID = int(recID)
# dirty hack for ATLAS collections at CERN:
if out in ('ATLAS Communications', 'ATLAS Internal Notes'):
for alternative_collection in ('ATLAS Communications Physics',
'ATLAS Communications General',
'ATLAS Internal Notes Physics',
'ATLAS Internal Notes General',):
if recID in get_collection_reclist(alternative_collection):
return alternative_collection
# dirty hack for FP
FP_collections = {'DO': ['Current Price Enquiries', 'Archived Price Enquiries'],
'IT': ['Current Invitation for Tenders', 'Archived Invitation for Tenders'],
'MS': ['Current Market Surveys', 'Archived Market Surveys']}
fp_coll_ids = [coll for coll in dbcollids if coll in FP_collections]
for coll in fp_coll_ids:
for coll_name in FP_collections[coll]:
if recID in get_collection_reclist(coll_name):
return coll_name
return out
_re_collection_url = re.compile('/collection/(.+)')
def guess_collection_of_a_record(recID, referer=None, recreate_cache_if_needed=True):
"""Return collection name a record recid belongs to, by first testing
the referer URL if provided and otherwise returning the
primary collection."""
if referer:
dummy, hostname, path, dummy, query, dummy = urlparse.urlparse(referer)
#requests can come from different invenio installations, with different collections
if CFG_SITE_URL.find(hostname) < 0:
return guess_primary_collection_of_a_record(recID)
g = _re_collection_url.match(path)
if g:
name = urllib.unquote_plus(g.group(1))
#check if this collection actually exist (also normalize the name if case-insensitive)
name = get_coll_normalised_name(name)
if name and recID in get_collection_reclist(name):
return name
elif path.startswith('/search'):
if recreate_cache_if_needed:
collection_reclist_cache.recreate_cache_if_needed()
query = cgi.parse_qs(query)
for name in query.get('cc', []) + query.get('c', []):
name = get_coll_normalised_name(name)
if name and recID in get_collection_reclist(name, recreate_cache_if_needed=False):
return name
return guess_primary_collection_of_a_record(recID)
def is_record_in_any_collection(recID, recreate_cache_if_needed=True):
"""Return True if the record belongs to at least one collection. This is a
good, although not perfect, indicator to guess if webcoll has already run
after this record has been entered into the system.
"""
if recreate_cache_if_needed:
collection_reclist_cache.recreate_cache_if_needed()
for name in collection_reclist_cache.cache.keys():
if recID in get_collection_reclist(name, recreate_cache_if_needed=False):
return True
return False
def get_all_collections_of_a_record(recID, recreate_cache_if_needed=True):
"""Return all the collection names a record belongs to.
Note this function is O(n_collections)."""
ret = []
if recreate_cache_if_needed:
collection_reclist_cache.recreate_cache_if_needed()
for name in collection_reclist_cache.cache.keys():
if recID in get_collection_reclist(name, recreate_cache_if_needed=False):
ret.append(name)
return ret
def get_tag_name(tag_value, prolog="", epilog=""):
"""Return tag name from the known tag value, by looking up the 'tag' table.
Return empty string in case of failure.
Example: input='100__%', output='first author'."""
out = ""
res = run_sql("SELECT name FROM tag WHERE value=%s", (tag_value,))
if res:
out = prolog + res[0][0] + epilog
return out
def get_fieldcodes():
"""Returns a list of field codes that may have been passed as 'search options' in URL.
Example: output=['subject','division']."""
out = []
res = run_sql("SELECT DISTINCT(code) FROM field")
for row in res:
out.append(row[0])
return out
def get_field_name(code):
"""Return the corresponding field_name given the field code.
e.g. reportnumber -> report number."""
res = run_sql("SELECT name FROM field WHERE code=%s", (code, ))
if res:
return res[0][0]
else:
return ""
def get_field_tags(field):
"""Returns a list of MARC tags for the field code 'field'.
Returns empty list in case of error.
Example: field='author', output=['100__%','700__%']."""
out = []
query = """SELECT t.value FROM tag AS t, field_tag AS ft, field AS f
WHERE f.code=%s AND ft.id_field=f.id AND t.id=ft.id_tag
ORDER BY ft.score DESC"""
res = run_sql(query, (field, ))
for val in res:
out.append(val[0])
return out
def get_merged_recid(recID):
""" Return the record ID of the record with
which the given record has been merged.
@param recID: deleted record recID
@type recID: int
@return: merged record recID
@rtype: int or None
"""
merged_recid = None
for val in get_fieldvalues(recID, "970__d"):
try:
merged_recid = int(val)
break
except ValueError:
pass
return merged_recid
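The loop in get_merged_recid() above is the common "first parseable integer wins" pattern; a minimal standalone sketch (`first_int` is a hypothetical name, not part of the module):

```python
def first_int(values):
    """Return the first value that parses as an int, else None --
    the pattern used by get_merged_recid() above."""
    for val in values:
        try:
            return int(val)
        except ValueError:
            # skip non-numeric values and keep looking
            pass
    return None
```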
def record_exists(recID):
"""Return 1 if record RECID exists.
Return 0 if it doesn't exist.
Return -1 if it exists but is marked as deleted.
"""
out = 0
res = run_sql("SELECT id FROM bibrec WHERE id=%s", (recID,), 1)
if res:
try: # if recid is '123foo', mysql will return id=123, and we don't want that
recID = int(recID)
except ValueError:
return 0
# record exists; now check whether it isn't marked as deleted:
dbcollids = get_fieldvalues(recID, "980__%")
if ("DELETED" in dbcollids) or (CFG_CERN_SITE and "DUMMY" in dbcollids):
out = -1 # exists, but marked as deleted
else:
out = 1 # exists fine
return out
def record_empty(recID):
"""
Is this record empty, e.g. has only 001, waiting for integration?
@param recID: the record identifier.
@type recID: int
@return: 1 if the record is empty, 0 otherwise.
@rtype: int
"""
record = get_record(recID)
if record is None or len(record) < 2:
return 1
else:
return 0
def record_public_p(recID, recreate_cache_if_needed=True):
"""Return 1 if the record is public, i.e. if it can be found in the Home collection.
Return 0 otherwise.
"""
return recID in get_collection_reclist(CFG_SITE_NAME, recreate_cache_if_needed=recreate_cache_if_needed)
def get_creation_date(recID, fmt="%Y-%m-%d"):
"Returns the creation date of the record 'recID'."
out = ""
res = run_sql("SELECT DATE_FORMAT(creation_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1)
if res:
out = res[0][0]
return out
def get_modification_date(recID, fmt="%Y-%m-%d"):
"Returns the date of last modification for the record 'recID'."
out = ""
res = run_sql("SELECT DATE_FORMAT(modification_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1)
if res:
out = res[0][0]
return out
def print_search_info(p, f, sf, so, sp, rm, of, ot, collection=CFG_SITE_NAME, nb_found=-1, jrec=1, rg=CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS,
aas=0, ln=CFG_SITE_LANG, p1="", p2="", p3="", f1="", f2="", f3="", m1="", m2="", m3="", op1="", op2="",
sc=1, pl_in_url="",
d1y=0, d1m=0, d1d=0, d2y=0, d2m=0, d2d=0, dt="",
cpu_time=-1, middle_only=0, em=""):
"""Prints stripe with the information on 'collection' and 'nb_found' results and CPU time.
Also, prints navigation links (beg/next/prev/end) inside the results set.
If middle_only is set to 1, it will only print the middle box information (beg/next/prev/end/etc) links.
This is suitable for displaying navigation links at the bottom of the search results page."""
if em != '' and EM_REPOSITORY["search_info"] not in em:
return ""
# sanity check:
if jrec < 1:
jrec = 1
if jrec > nb_found:
jrec = max(nb_found-rg+1, 1)
return websearch_templates.tmpl_print_search_info(
ln = ln,
collection = collection,
aas = aas,
collection_name = get_coll_i18nname(collection, ln, False),
collection_id = get_colID(collection),
middle_only = middle_only,
rg = rg,
nb_found = nb_found,
sf = sf,
so = so,
rm = rm,
of = of,
ot = ot,
p = p,
f = f,
p1 = p1,
p2 = p2,
p3 = p3,
f1 = f1,
f2 = f2,
f3 = f3,
m1 = m1,
m2 = m2,
m3 = m3,
op1 = op1,
op2 = op2,
pl_in_url = pl_in_url,
d1y = d1y,
d1m = d1m,
d1d = d1d,
d2y = d2y,
d2m = d2m,
d2d = d2d,
dt = dt,
jrec = jrec,
sc = sc,
sp = sp,
all_fieldcodes = get_fieldcodes(),
cpu_time = cpu_time,
)
def print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, collection=CFG_SITE_NAME, nb_found=-1, jrec=1, rg=CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS,
aas=0, ln=CFG_SITE_LANG, p1="", p2="", p3="", f1="", f2="", f3="", m1="", m2="", m3="", op1="", op2="",
sc=1, pl_in_url="",
d1y=0, d1m=0, d1d=0, d2y=0, d2m=0, d2d=0, dt="",
cpu_time=-1, middle_only=0, em=""):
"""Prints stripe with the information on 'collection' and 'nb_found' results and CPU time.
Also, prints navigation links (beg/next/prev/end) inside the results set.
If middle_only is set to 1, it will only print the middle box information (beg/next/prev/end/etc) links.
This is suitable for displaying navigation links at the bottom of the search results page."""
if em != '' and EM_REPOSITORY["search_info"] not in em:
return ""
# sanity check:
if jrec < 1:
jrec = 1
if jrec > nb_found:
jrec = max(nb_found-rg+1, 1)
return websearch_templates.tmpl_print_hosted_search_info(
ln = ln,
collection = collection,
aas = aas,
collection_name = get_coll_i18nname(collection, ln, False),
collection_id = get_colID(collection),
middle_only = middle_only,
rg = rg,
nb_found = nb_found,
sf = sf,
so = so,
rm = rm,
of = of,
ot = ot,
p = p,
f = f,
p1 = p1,
p2 = p2,
p3 = p3,
f1 = f1,
f2 = f2,
f3 = f3,
m1 = m1,
m2 = m2,
m3 = m3,
op1 = op1,
op2 = op2,
pl_in_url = pl_in_url,
d1y = d1y,
d1m = d1m,
d1d = d1d,
d2y = d2y,
d2m = d2m,
d2d = d2d,
dt = dt,
jrec = jrec,
sc = sc,
sp = sp,
all_fieldcodes = get_fieldcodes(),
cpu_time = cpu_time,
)
def print_results_overview(colls, results_final_nb_total, results_final_nb, cpu_time, ln=CFG_SITE_LANG, ec=[], hosted_colls_potential_results_p=False, em=""):
"""Prints results overview box with links to particular collections below."""
if em != "" and EM_REPOSITORY["overview"] not in em:
return ""
new_colls = []
for coll in colls:
new_colls.append({
'id': get_colID(coll),
'code': coll,
'name': get_coll_i18nname(coll, ln, False),
})
return websearch_templates.tmpl_print_results_overview(
ln = ln,
results_final_nb_total = results_final_nb_total,
results_final_nb = results_final_nb,
cpu_time = cpu_time,
colls = new_colls,
ec = ec,
hosted_colls_potential_results_p = hosted_colls_potential_results_p,
)
def print_hosted_results(url_and_engine, ln=CFG_SITE_LANG, of=None, req=None, no_records_found=False, search_timed_out=False, limit=CFG_EXTERNAL_COLLECTION_MAXRESULTS, em = ""):
"""Prints the full results of a hosted collection"""
if of.startswith("h"):
if no_records_found:
return "<br />No results found."
if search_timed_out:
return "<br />The search engine did not respond in time."
return websearch_templates.tmpl_print_hosted_results(
url_and_engine=url_and_engine,
ln=ln,
of=of,
req=req,
limit=limit,
display_body = em == "" or EM_REPOSITORY["body"] in em,
display_add_to_basket = em == "" or EM_REPOSITORY["basket"] in em)
class BibSortDataCacher(DataCacher):
"""
Cache holding all structures created by bibsort
( _data, data_dict).
"""
def __init__(self, method_name):
self.method_name = method_name
self.method_id = 0
try:
res = run_sql("""SELECT id from bsrMETHOD where name = %s""", (self.method_name,))
except Exception:
res = None # avoid NameError below if the query failed
if res and res[0]:
self.method_id = res[0][0]
else:
self.method_id = 0
def cache_filler():
method_id = self.method_id
alldicts = {}
if self.method_id == 0:
return {}
try:
res_data = run_sql("""SELECT data_dict_ordered from bsrMETHODDATA \
where id_bsrMETHOD = %s""", (method_id,))
res_buckets = run_sql("""SELECT bucket_no, bucket_data from bsrMETHODDATABUCKET\
where id_bsrMETHOD = %s""", (method_id,))
except Exception:
# database problems, return empty cache
return {}
try:
data_dict_ordered = deserialize_via_marshal(res_data[0][0])
except Exception:
data_dict_ordered = {}
alldicts['data_dict_ordered'] = data_dict_ordered # recid: weight
if not res_buckets:
alldicts['bucket_data'] = {}
return alldicts
for row in res_buckets:
bucket_no = row[0]
try:
bucket_data = intbitset(row[1])
except Exception:
bucket_data = intbitset([])
alldicts.setdefault('bucket_data', {})[bucket_no] = bucket_data
return alldicts
def timestamp_verifier():
method_id = self.method_id
res = run_sql("""SELECT last_updated from bsrMETHODDATA where id_bsrMETHOD = %s""", (method_id,))
try:
update_time_methoddata = str(res[0][0])
except IndexError:
update_time_methoddata = '1970-01-01 00:00:00'
res = run_sql("""SELECT max(last_updated) from bsrMETHODDATABUCKET where id_bsrMETHOD = %s""", (method_id,))
try:
update_time_buckets = str(res[0][0])
except IndexError:
update_time_buckets = '1970-01-01 00:00:00'
return max(update_time_methoddata, update_time_buckets)
DataCacher.__init__(self, cache_filler, timestamp_verifier)
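timestamp_verifier() above compares 'YYYY-MM-DD HH:MM:SS' strings with max(); this is safe because zero-padded, fixed-width timestamps order lexicographically exactly as they order chronologically. A minimal sketch of that property (`newest` is a hypothetical name):

```python
def newest(*timestamps):
    """Return the most recent of several 'YYYY-MM-DD HH:MM:SS' strings.
    Plain string comparison works here because every field is zero-padded,
    which is what timestamp_verifier() above relies on."""
    return max(timestamps)
```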
def get_sorting_methods():
if not CFG_BIBSORT_BUCKETS: # we do not want to use buckets
return {}
try: # make sure the method has some data
res = run_sql("""SELECT m.name, m.definition FROM bsrMETHOD m, bsrMETHODDATA md WHERE m.id = md.id_bsrMETHOD""")
except Exception:
return {}
return dict(res)
sorting_methods = get_sorting_methods()
cache_sorted_data = {}
for sorting_method in sorting_methods:
try:
cache_sorted_data[sorting_method].is_ok_p
except Exception:
cache_sorted_data[sorting_method] = BibSortDataCacher(sorting_method)
def get_tags_from_sort_fields(sort_fields):
"""Given a list of sort_fields, return the tags associated with it and
also the name of the field that has no tags associated, to be able to
display a message to the user."""
tags = []
if not sort_fields:
return [], ''
for sort_field in sort_fields:
if sort_field and str(sort_field[0:2]).isdigit():
# sort_field starts by two digits, so this is probably a MARC tag already
tags.append(sort_field)
else:
# let us check the 'field' table
field_tags = get_field_tags(sort_field)
if field_tags:
tags.extend(field_tags)
else:
return [], sort_field
return tags, ''
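The dispatch in get_tags_from_sort_fields() above can be exercised without the database by swapping the 'field' table lookup for a plain dict; `field_tags_lookup` is a stand-in for get_field_tags(), not the real API:

```python
def tags_from_sort_fields(sort_fields, field_tags_lookup):
    """Sketch of get_tags_from_sort_fields() above, with the database
    lookup replaced by a dict for illustration."""
    tags = []
    if not sort_fields:
        return [], ''
    for sort_field in sort_fields:
        if sort_field and sort_field[0:2].isdigit():
            # starts with two digits: treat it as a MARC tag already
            tags.append(sort_field)
        else:
            field_tags = field_tags_lookup.get(sort_field, [])
            if field_tags:
                tags.extend(field_tags)
            else:
                # unknown field code: report it back to the caller
                return [], sort_field
    return tags, ''
```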
def rank_records(req, rank_method_code, rank_limit_relevance, hitset_global, pattern=None, verbose=0, sort_order='d', of='hb', ln=CFG_SITE_LANG, rg=None, jrec=None, field=''):
"""Initial entry point for ranking records, acts like a dispatcher.
(i) rank_method_code is in bsrMETHOD, bibsort buckets can be used;
(ii)rank_method_code is not in bsrMETHOD, use bibrank;
"""
if CFG_BIBSORT_BUCKETS and sorting_methods:
for sort_method in sorting_methods:
definition = sorting_methods[sort_method]
if definition.startswith('RNK') and \
definition.replace('RNK:','').strip().lower() == string.lower(rank_method_code):
(solution_recs, solution_scores) = sort_records_bibsort(req, hitset_global, sort_method, '', sort_order, verbose, of, ln, rg, jrec, 'r')
#return (solution_recs, solution_scores, '', '', '')
comment = ''
if verbose > 0:
comment = 'find_citations retlist %s' % [[solution_recs[i], solution_scores[i]] for i in range(len(solution_recs))]
return (solution_recs, solution_scores, '(', ')', comment)
return rank_records_bibrank(rank_method_code, rank_limit_relevance, hitset_global, pattern, verbose, field, rg, jrec)
def sort_records(req, recIDs, sort_field='', sort_order='d', sort_pattern='', verbose=0, of='hb', ln=CFG_SITE_LANG, rg=None, jrec=None):
"""Initial entry point for sorting records, acts like a dispatcher.
(i) sort_field is in the bsrMETHOD, and thus, the BibSort has sorted the data for this field, so we can use the cache;
(ii)sort_field is not in bsrMETHOD, and thus, the cache does not contain any information regarding this sorting method"""
_ = gettext_set_language(ln)
#we should return sorted records up to irec_max(exclusive)
dummy, irec_max = get_interval_for_records_to_sort(len(recIDs), jrec, rg)
#calculate the min index on the reverted list
index_min = max(len(recIDs) - irec_max, 0) #just to be sure that the min index is not negative
#bibsort does not handle sort_pattern for now, use bibxxx
if sort_pattern:
return sort_records_bibxxx(req, recIDs, None, sort_field, sort_order, sort_pattern, verbose, of, ln, rg, jrec)
use_sorting_buckets = True
if not CFG_BIBSORT_BUCKETS or not sorting_methods: #ignore the use of buckets, use old-fashioned sorting
use_sorting_buckets = False
if not sort_field:
if use_sorting_buckets:
return sort_records_bibsort(req, recIDs, 'latest first', sort_field, sort_order, verbose, of, ln, rg, jrec)
else:
return recIDs[index_min:]
sort_fields = string.split(sort_field, ",")
if len(sort_fields) == 1:
# we have only one sorting_field, check if it is treated by BibSort
for sort_method in sorting_methods:
definition = sorting_methods[sort_method]
if use_sorting_buckets and \
((definition.startswith('FIELD') and \
definition.replace('FIELD:','').strip().lower() == string.lower(sort_fields[0])) or \
sort_method == sort_fields[0]):
#use BibSort
return sort_records_bibsort(req, recIDs, sort_method, sort_field, sort_order, verbose, of, ln, rg, jrec)
#deduce sorting MARC tag out of the 'sort_field' argument:
tags, error_field = get_tags_from_sort_fields(sort_fields)
if error_field:
if use_sorting_buckets:
return sort_records_bibsort(req, recIDs, 'latest first', sort_field, sort_order, verbose, of, ln, rg, jrec)
else:
if of.startswith('h'):
write_warning(_("Sorry, %s does not seem to be a valid sort option. The records will not be sorted.") % cgi.escape(error_field), "Error", req=req)
return recIDs[index_min:]
if tags:
for sort_method in sorting_methods:
definition = sorting_methods[sort_method]
if definition.startswith('MARC') \
and definition.replace('MARC:','').strip().split(',') == tags \
and use_sorting_buckets:
#this list of tags have a designated method in BibSort, so use it
return sort_records_bibsort(req, recIDs, sort_method, sort_field, sort_order, verbose, of, ln, rg, jrec)
#we do not have this sort_field in BibSort tables -> do the old-fashioned sorting
return sort_records_bibxxx(req, recIDs, tags, sort_field, sort_order, sort_pattern, verbose, of, ln, rg, jrec)
return recIDs[index_min:]
def sort_records_bibsort(req, recIDs, sort_method, sort_field='', sort_order='d', verbose=0, of='hb', ln=CFG_SITE_LANG, rg=None, jrec=None, sort_or_rank = 's'):
"""This function orders the recIDs list, based on a sorting method(sort_field) using the BibSortDataCacher for speed"""
_ = gettext_set_language(ln)
#sanity check
if sort_method not in sorting_methods:
if sort_or_rank == 'r':
return rank_records_bibrank(sort_method, 0, recIDs, None, verbose)
else:
return sort_records_bibxxx(req, recIDs, None, sort_field, sort_order, '', verbose, of, ln, rg, jrec)
if verbose >= 3 and of.startswith('h'):
write_warning("Sorting (using BibSort cache) by method %s (definition %s)." \
% (cgi.escape(repr(sort_method)), cgi.escape(repr(sorting_methods[sort_method]))), req=req)
#we should return sorted records up to irec_max(exclusive)
dummy, irec_max = get_interval_for_records_to_sort(len(recIDs), jrec, rg)
solution = intbitset([])
input_recids = intbitset(recIDs)
cache_sorted_data[sort_method].recreate_cache_if_needed()
sort_cache = cache_sorted_data[sort_method].cache
bucket_numbers = sort_cache['bucket_data'].keys()
#check if all buckets have been constructed
if len(bucket_numbers) != CFG_BIBSORT_BUCKETS:
if verbose > 3 and of.startswith('h'):
write_warning("Not all buckets have been constructed; switching to old-fashioned sorting.", req=req)
if sort_or_rank == 'r':
return rank_records_bibrank(sort_method, 0, recIDs, None, verbose)
else:
return sort_records_bibxxx(req, recIDs, None, sort_field, sort_order, '', verbose, of, ln, rg, jrec)
if sort_order == 'd':
bucket_numbers.reverse()
for bucket_no in bucket_numbers:
solution.union_update(input_recids & sort_cache['bucket_data'][bucket_no])
if len(solution) >= irec_max:
break
dict_solution = {}
missing_records = []
for recid in solution:
try:
dict_solution[recid] = sort_cache['data_dict_ordered'][recid]
except KeyError:
#recid is in buckets, but not in the bsrMETHODDATA,
#maybe because the value has been deleted, but the change has not yet been propagated to the buckets
missing_records.append(recid)
#check if there are recids that are not in any bucket -> to be added at the end/top, ordered by insertion date
if len(solution) < irec_max:
#some records have not been yet inserted in the bibsort structures
#or, some records have no value for the sort_method
missing_records = sorted(missing_records + list(input_recids.difference(solution)))
#the records need to be sorted in reverse order for the print record function
#the return statement should be equivalent with the following statements
#(these are clearer, but less efficient, since they revert the same list twice)
#sorted_solution = (missing_records + sorted(dict_solution, key=dict_solution.__getitem__, reverse=sort_order=='d'))[:irec_max]
#sorted_solution.reverse()
#return sorted_solution
if sort_method.strip().lower().startswith('latest') and sort_order == 'd':
# if we want to sort the records on their insertion date, add the missing records at the top
solution = sorted(dict_solution, key=dict_solution.__getitem__, reverse=sort_order=='a') + missing_records
else:
solution = missing_records + sorted(dict_solution, key=dict_solution.__getitem__, reverse=sort_order=='a')
#calculate the min index on the reverted list
index_min = max(len(solution) - irec_max, 0) #just to be sure that the min index is not negative
#return all the records up to irec_max, but on the reverted list
if sort_or_rank == 'r':
# we need the recids, with values
return (solution[index_min:], [dict_solution.get(record, 0) for record in solution[index_min:]])
else:
return solution[index_min:]
def sort_records_bibxxx(req, recIDs, tags, sort_field='', sort_order='d', sort_pattern='', verbose=0, of='hb', ln=CFG_SITE_LANG, rg=None, jrec=None):
"""OLD-FASHIONED SORTING WITH NO CACHE, for sort fields that are not handled by BibSort.
Sort records in 'recIDs' list according to sort field 'sort_field' in order 'sort_order'.
If more than one instance of 'sort_field' is found for a given record, try to choose the one given by
'sort_pattern', for example "sort by report number that starts by CERN-PS".
Note that 'sort_field' can be a field code like 'author' or a MARC tag like '100__a' directly."""
_ = gettext_set_language(ln)
#we should return sorted records up to irec_max(exclusive)
dummy, irec_max = get_interval_for_records_to_sort(len(recIDs), jrec, rg)
#calculate the min index on the reverted list
index_min = max(len(recIDs) - irec_max, 0) #just to be sure that the min index is not negative
## check arguments:
if not sort_field:
return recIDs[index_min:]
if len(recIDs) > CFG_WEBSEARCH_NB_RECORDS_TO_SORT:
if of.startswith('h'):
write_warning(_("Sorry, sorting is allowed on sets of up to %d records only. Using default sort order.") % CFG_WEBSEARCH_NB_RECORDS_TO_SORT, "Warning", req=req)
return recIDs[index_min:]
recIDs_dict = {}
recIDs_out = []
if not tags:
# tags have not been computed yet
sort_fields = string.split(sort_field, ",")
tags, error_field = get_tags_from_sort_fields(sort_fields)
if error_field:
if of.startswith('h'):
write_warning(_("Sorry, %s does not seem to be a valid sort option. The records will not be sorted.") % cgi.escape(error_field), "Error", req=req)
return recIDs[index_min:]
if verbose >= 3 and of.startswith('h'):
write_warning("Sorting by tags %s." % cgi.escape(repr(tags)), req=req)
if sort_pattern:
write_warning("Sorting preferentially by %s." % cgi.escape(sort_pattern), req=req)
## check if we have sorting tag defined:
if tags:
# fetch the necessary field values:
for recID in recIDs:
val = "" # will hold value for recID according to which sort
vals = [] # will hold all values found in sorting tag for recID
for tag in tags:
if CFG_CERN_SITE and tag == '773__c':
# CERN hack: journal sorting
# 773__c contains page numbers, e.g. 3-13, and we want to sort by 3, and numerically:
vals.extend(["%050s" % x.split("-", 1)[0] for x in get_fieldvalues(recID, tag)])
else:
vals.extend(get_fieldvalues(recID, tag))
if sort_pattern:
# try to pick that tag value that corresponds to sort pattern
bingo = 0
for v in vals:
if v.lower().startswith(sort_pattern.lower()): # bingo!
bingo = 1
val = v
break
if not bingo: # sort_pattern not present, so add other vals after spaces
val = sort_pattern + " " + string.join(vals)
else:
# no sort pattern defined, so join them all together
val = string.join(vals)
val = strip_accents(val.lower()) # sort values regardless of accents and case
if recIDs_dict.has_key(val):
recIDs_dict[val].append(recID)
else:
recIDs_dict[val] = [recID]
# sort them:
recIDs_dict_keys = recIDs_dict.keys()
recIDs_dict_keys.sort()
# now that keys are sorted, create output array:
for k in recIDs_dict_keys:
for s in recIDs_dict[k]:
recIDs_out.append(s)
# ascending or descending?
if sort_order == 'a':
recIDs_out.reverse()
# okay, we are done
# return only up to the maximum that we need to sort
if len(recIDs_out) != len(recIDs):
dummy, irec_max = get_interval_for_records_to_sort(len(recIDs_out), jrec, rg)
index_min = max(len(recIDs_out) - irec_max, 0) #just to be sure that the min index is not negative
return recIDs_out[index_min:]
else:
# good, no sort needed
return recIDs[index_min:]
def get_interval_for_records_to_sort(nb_found, jrec=None, rg=None):
"""Calculate the interval in which the sorted records should lie.
A value of 'rg=-9999' means to print all records: to be used with care."""
if not jrec:
jrec = 1
if not rg:
#return all
return jrec-1, nb_found
if rg == -9999: # print all records
rg = nb_found
else:
rg = abs(rg)
if jrec < 1: # sanity checks
jrec = 1
if jrec > nb_found:
jrec = max(nb_found-rg+1, 1)
# will sort records from irec_min to irec_max excluded
irec_min = jrec - 1
irec_max = irec_min + rg
if irec_min < 0:
irec_min = 0
if irec_max > nb_found:
irec_max = nb_found
return irec_min, irec_max
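The pagination arithmetic above can be tested standalone; this sketch reproduces the same (irec_min, irec_max) contract, with irec_max exclusive (`interval_for_records_to_sort` is an illustrative reimplementation, not the module's function):

```python
def interval_for_records_to_sort(nb_found, jrec=None, rg=None):
    """Standalone sketch of the interval computation above:
    returns (irec_min, irec_max) with irec_max exclusive."""
    if not jrec:
        jrec = 1
    if not rg:
        # no range given: return everything from jrec on
        return jrec - 1, nb_found
    rg = nb_found if rg == -9999 else abs(rg)  # -9999 means "all records"
    if jrec < 1:  # sanity checks
        jrec = 1
    if jrec > nb_found:
        # clamp to the last full page of results
        jrec = max(nb_found - rg + 1, 1)
    irec_min = max(jrec - 1, 0)
    irec_max = min(irec_min + rg, nb_found)
    return irec_min, irec_max
```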
def print_records(req, recIDs, jrec=1, rg=CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS, format='hb', ot='', ln=CFG_SITE_LANG,
relevances=[], relevances_prologue="(", relevances_epilogue="%%)",
decompress=zlib.decompress, search_pattern='', print_records_prologue_p=True,
print_records_epilogue_p=True, verbose=0, tab='', sf='', so='d', sp='',
rm='', em=''):
"""
Prints list of records 'recIDs' formatted according to 'format' in
groups of 'rg' starting from 'jrec'.
Assumes that the input list 'recIDs' is sorted in reverse order,
so it counts records from tail to head.
A value of 'rg=-9999' means to print all records: to be used with care.
Print also list of RELEVANCES for each record (if defined), in
between RELEVANCE_PROLOGUE and RELEVANCE_EPILOGUE.
Print prologue and/or epilogue specific to 'format' if
'print_records_prologue_p' and/or print_records_epilogue_p' are
True.
'sf' is sort field and 'rm' is ranking method that are passed here
only for proper linking purposes: e.g. when a certain ranking
method or a certain sort field was selected, keep it selected in
any dynamic search links that may be printed.
"""
if em != "" and EM_REPOSITORY["body"] not in em:
return
# load the right message language
_ = gettext_set_language(ln)
# sanity checking:
if req is None:
return
# get user_info (for formatting based on user)
if isinstance(req, cStringIO.OutputType):
user_info = {}
else:
user_info = collect_user_info(req)
if len(recIDs):
nb_found = len(recIDs)
if rg == -9999: # print all records
rg = nb_found
else:
rg = abs(rg)
if jrec < 1: # sanity checks
jrec = 1
if jrec > nb_found:
jrec = max(nb_found-rg+1, 1)
# will print records from irec_max to irec_min excluded:
irec_max = nb_found - jrec
irec_min = nb_found - jrec - rg
if irec_min < 0:
irec_min = -1
if irec_max >= nb_found:
irec_max = nb_found - 1
#req.write("%s:%d-%d" % (recIDs, irec_min, irec_max))
if format.startswith('x'):
# print header if needed
if print_records_prologue_p:
print_records_prologue(req, format)
# print records
recIDs_to_print = [recIDs[x] for x in range(irec_max, irec_min, -1)]
if ot:
# asked to print some filtered fields only, so call print_record() on the fly:
for irec in range(irec_max, irec_min, -1):
x = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose, sf=sf, so=so, sp=sp, rm=rm)
req.write(x)
if x:
req.write('\n')
else:
format_records(recIDs_to_print,
format,
ln=ln,
search_pattern=search_pattern,
record_separator="\n",
user_info=user_info,
req=req)
# print footer if needed
if print_records_epilogue_p:
print_records_epilogue(req, format)
elif format.startswith('t') or str(format[0:3]).isdigit():
# we are doing plain text output:
for irec in range(irec_max, irec_min, -1):
x = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose, sf=sf, so=so, sp=sp, rm=rm)
req.write(x)
if x:
req.write('\n')
elif format == 'excel':
recIDs_to_print = [recIDs[x] for x in range(irec_max, irec_min, -1)]
create_excel(recIDs=recIDs_to_print, req=req, ln=ln, ot=ot, user_info=user_info)
else:
# we are doing HTML output:
if format == 'hp' or format.startswith("hb_") or format.startswith("hd_"):
# portfolio and on-the-fly formats:
for irec in range(irec_max, irec_min, -1):
req.write(print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose, sf=sf, so=so, sp=sp, rm=rm))
elif format.startswith("hb"):
# HTML brief format:
display_add_to_basket = True
if user_info:
if user_info['email'] == 'guest':
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS > 4:
display_add_to_basket = False
else:
if not user_info['precached_usebaskets']:
display_add_to_basket = False
if em != "" and EM_REPOSITORY["basket"] not in em:
display_add_to_basket = False
req.write(websearch_templates.tmpl_record_format_htmlbrief_header(
ln = ln))
for irec in range(irec_max, irec_min, -1):
row_number = jrec+irec_max-irec
recid = recIDs[irec]
if relevances and relevances[irec]:
relevance = relevances[irec]
else:
relevance = ''
record = print_record(recIDs[irec], format, ot, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose, sf=sf, so=so, sp=sp, rm=rm)
req.write(websearch_templates.tmpl_record_format_htmlbrief_body(
ln = ln,
recid = recid,
row_number = row_number,
relevance = relevance,
record = record,
relevances_prologue = relevances_prologue,
relevances_epilogue = relevances_epilogue,
display_add_to_basket = display_add_to_basket
))
req.write(websearch_templates.tmpl_record_format_htmlbrief_footer(
ln = ln,
display_add_to_basket = display_add_to_basket))
elif format.startswith("hd"):
# HTML detailed format:
for irec in range(irec_max, irec_min, -1):
if record_exists(recIDs[irec]) == -1:
write_warning(_("The record has been deleted."), req=req)
merged_recid = get_merged_recid(recIDs[irec])
if merged_recid:
write_warning(_("The record %d replaces it.") % merged_recid, req=req)
continue
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(recIDs[irec])),
recIDs[irec], ln=ln)
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if ln != CFG_SITE_LANG:
link_ln = '?ln=%s' % ln
recid = recIDs[irec]
recid_to_display = recid # Record ID used to build the URL.
if CFG_WEBSEARCH_USE_ALEPH_SYSNOS:
try:
recid_to_display = get_fieldvalues(recid,
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG)[0]
except IndexError:
# No external sysno is available, keep using
# internal recid.
pass
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, recid_to_display, tab_id, link_ln), \
tab_id == tab,
unordered_tabs[tab_id]['enabled']) \
for (tab_id, order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible']]
tabs_counts = get_detailed_page_tabs_counts(recid)
citedbynum = tabs_counts['Citations']
references = tabs_counts['References']
discussions = tabs_counts['Discussions']
# load content
if tab == 'usage':
req.write(webstyle_templates.detailed_record_container_top(recIDs[irec],
tabs,
ln,
citationnum=citedbynum,
referencenum=references,
discussionnum=discussions))
r = calculate_reading_similarity_list(recIDs[irec], "downloads")
downloadsimilarity = None
downloadhistory = None
#if r:
# downloadsimilarity = r
if CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS:
downloadhistory = create_download_history_graph_and_box(recIDs[irec], ln)
r = calculate_reading_similarity_list(recIDs[irec], "pageviews")
viewsimilarity = None
if r: viewsimilarity = r
content = websearch_templates.tmpl_detailed_record_statistics(recIDs[irec],
ln,
downloadsimilarity=downloadsimilarity,
downloadhistory=downloadhistory,
viewsimilarity=viewsimilarity)
req.write(content)
req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec],
tabs,
ln))
elif tab == 'citations':
recid = recIDs[irec]
req.write(webstyle_templates.detailed_record_container_top(recid,
tabs,
ln,
citationnum=citedbynum,
referencenum=references,
discussionnum=discussions))
req.write(websearch_templates.tmpl_detailed_record_citations_prologue(recid, ln))
# Citing
citinglist = calculate_cited_by_list(recid)
req.write(websearch_templates.tmpl_detailed_record_citations_citing_list(recid,
ln,
citinglist,
sf=sf,
so=so,
sp=sp,
rm=rm))
# Self-cited
selfcited = get_self_cited_by(recid)
req.write(websearch_templates.tmpl_detailed_record_citations_self_cited(recid,
ln, selfcited=selfcited, citinglist=citinglist))
# Co-cited
s = calculate_co_cited_with_list(recid)
cociting = None
if s:
cociting = s
req.write(websearch_templates.tmpl_detailed_record_citations_co_citing(recid,
ln,
cociting=cociting))
# Citation history, if needed
citationhistory = None
if citinglist:
citationhistory = create_citation_history_graph_and_box(recid, ln)
#debug
if verbose > 3:
write_warning("Citation graph debug: " + \
str(len(citationhistory)), req=req)
req.write(websearch_templates.tmpl_detailed_record_citations_citation_history(recid, ln, citationhistory))
req.write(websearch_templates.tmpl_detailed_record_citations_epilogue(recid, ln))
req.write(webstyle_templates.detailed_record_container_bottom(recid,
tabs,
ln))
elif tab == 'references':
req.write(webstyle_templates.detailed_record_container_top(recIDs[irec],
tabs,
ln,
citationnum=citedbynum,
referencenum=references,
discussionnum=discussions))
req.write(format_record(recIDs[irec], 'HDREF', ln=ln, user_info=user_info, verbose=verbose))
req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec],
tabs,
ln))
elif tab == 'keywords':
from invenio.bibclassify_webinterface import \
record_get_keywords, write_keywords_body, \
generate_keywords
from invenio.webinterface_handler import wash_urlargd
form = req.form
argd = wash_urlargd(form, {
'generate': (str, 'no'),
'sort': (str, 'occurrences'),
'type': (str, 'tagcloud'),
'numbering': (str, 'off'),
})
recid = recIDs[irec]
req.write(webstyle_templates.detailed_record_container_top(recid,
tabs, ln, citationnum=citedbynum, referencenum=references))
if argd['generate'] == 'yes':
# The user asked to generate the keywords.
keywords = generate_keywords(req, recid, argd)
else:
# Get the keywords contained in the MARC.
keywords = record_get_keywords(recid, argd)
if argd['sort'] == 'related' and not keywords:
req.write('You may want to run BibIndex.')
# Output the keywords or the generate button.
write_keywords_body(keywords, req, recid, argd)
req.write(webstyle_templates.detailed_record_container_bottom(recid,
tabs, ln))
elif tab == 'plots':
req.write(webstyle_templates.detailed_record_container_top(recIDs[irec],
tabs,
ln))
content = websearch_templates.tmpl_record_plots(recID=recIDs[irec],
ln=ln)
req.write(content)
req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec],
tabs,
ln))
else:
# Metadata tab
req.write(webstyle_templates.detailed_record_container_top(recIDs[irec],
tabs,
ln,
show_short_rec_p=False,
citationnum=citedbynum, referencenum=references,
discussionnum=discussions))
creationdate = None
modificationdate = None
if record_exists(recIDs[irec]) == 1:
creationdate = get_creation_date(recIDs[irec])
modificationdate = get_modification_date(recIDs[irec])
content = print_record(recIDs[irec], format, ot, ln,
search_pattern=search_pattern,
user_info=user_info, verbose=verbose,
sf=sf, so=so, sp=sp, rm=rm)
content = websearch_templates.tmpl_detailed_record_metadata(
recID = recIDs[irec],
ln = ln,
format = format,
creationdate = creationdate,
modificationdate = modificationdate,
content = content)
# display of the next-hit/previous-hit/back-to-search links
# on the detailed record pages
content += websearch_templates.tmpl_display_back_to_search(req,
recIDs[irec],
ln)
req.write(content)
req.write(webstyle_templates.detailed_record_container_bottom(recIDs[irec],
tabs,
ln,
creationdate=creationdate,
modificationdate=modificationdate,
show_short_rec_p=False))
if len(tabs) > 0:
# Add the mini box at bottom of the page
if CFG_WEBCOMMENT_ALLOW_REVIEWS:
from invenio.modules.comments.api import get_mini_reviews
reviews = get_mini_reviews(recid = recIDs[irec], ln=ln)
else:
reviews = ''
actions = format_record(recIDs[irec], 'HDACT', ln=ln, user_info=user_info, verbose=verbose)
files = format_record(recIDs[irec], 'HDFILE', ln=ln, user_info=user_info, verbose=verbose)
req.write(webstyle_templates.detailed_record_mini_panel(recIDs[irec],
ln,
format,
files=files,
reviews=reviews,
actions=actions))
else:
# Other formats
for irec in range(irec_max, irec_min, -1):
req.write(print_record(recIDs[irec], format, ot, ln,
search_pattern=search_pattern,
user_info=user_info, verbose=verbose,
sf=sf, so=so, sp=sp, rm=rm))
else:
write_warning(_("Use different search terms."), req=req)
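# A minimal, hypothetical re-statement (not part of the original module) of
# the jrec/rg window arithmetic used by print_records() above: 'recIDs' is
# assumed sorted in reverse order, so records are emitted with
# range(irec_max, irec_min, -1), irec_min itself being excluded.

```python
def compute_record_window(nb_found, jrec=1, rg=10):
    """Sketch of the (irec_min, irec_max) computation in print_records()."""
    if jrec < 1:  # sanity checks, as in print_records()
        jrec = 1
    if jrec > nb_found:
        jrec = max(nb_found - rg + 1, 1)
    # counting from the tail of the reverse-sorted list:
    irec_max = nb_found - jrec
    irec_min = nb_found - jrec - rg
    if irec_min < 0:
        irec_min = -1
    if irec_max >= nb_found:
        irec_max = nb_found - 1
    return irec_min, irec_max
```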
def print_records_prologue(req, format, cc=None):
"""
Print the appropriate prologue for list of records in the given
format.
"""
prologue = "" # no prologue needed for HTML or Text formats
if format.startswith('xm'):
prologue = websearch_templates.tmpl_xml_marc_prologue()
elif format.startswith('xn'):
prologue = websearch_templates.tmpl_xml_nlm_prologue()
elif format.startswith('xw'):
prologue = websearch_templates.tmpl_xml_refworks_prologue()
elif format.startswith('xr'):
prologue = websearch_templates.tmpl_xml_rss_prologue(cc=cc)
elif format.startswith('xe8x'):
prologue = websearch_templates.tmpl_xml_endnote_8x_prologue()
elif format.startswith('xe'):
prologue = websearch_templates.tmpl_xml_endnote_prologue()
elif format.startswith('xo'):
prologue = websearch_templates.tmpl_xml_mods_prologue()
elif format.startswith('xp'):
prologue = websearch_templates.tmpl_xml_podcast_prologue(cc=cc)
elif format.startswith('x'):
prologue = websearch_templates.tmpl_xml_default_prologue()
req.write(prologue)
def print_records_epilogue(req, format):
"""
Print the appropriate epilogue for list of records in the given
format.
"""
epilogue = "" # no epilogue needed for HTML or Text formats
if format.startswith('xm'):
epilogue = websearch_templates.tmpl_xml_marc_epilogue()
elif format.startswith('xn'):
epilogue = websearch_templates.tmpl_xml_nlm_epilogue()
elif format.startswith('xw'):
epilogue = websearch_templates.tmpl_xml_refworks_epilogue()
elif format.startswith('xr'):
epilogue = websearch_templates.tmpl_xml_rss_epilogue()
elif format.startswith('xe8x'):
epilogue = websearch_templates.tmpl_xml_endnote_8x_epilogue()
elif format.startswith('xe'):
epilogue = websearch_templates.tmpl_xml_endnote_epilogue()
elif format.startswith('xo'):
epilogue = websearch_templates.tmpl_xml_mods_epilogue()
elif format.startswith('xp'):
epilogue = websearch_templates.tmpl_xml_podcast_epilogue()
elif format.startswith('x'):
epilogue = websearch_templates.tmpl_xml_default_epilogue()
req.write(epilogue)
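# The if/elif chains in print_records_prologue()/print_records_epilogue()
# rely on testing the more specific prefix first ('xe8x' before 'xe', and
# every family before the bare 'x' fallback).  A table-driven sketch of the
# same dispatch rule, for illustration only:

```python
_XML_FORMAT_PREFIXES = ['xm', 'xn', 'xw', 'xr', 'xe8x', 'xe', 'xo', 'xp', 'x']

def match_xml_family(format):
    """Return the XML format family for `format` (longest prefix wins),
    mirroring the ordering of the if/elif chains above, or None for
    non-XML formats such as 'hb' or 't'."""
    candidates = [p for p in _XML_FORMAT_PREFIXES if format.startswith(p)]
    return max(candidates, key=len) if candidates else None
```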
def get_record(recid):
"""Directly the record object corresponding to the recid."""
if CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE:
value = run_sql("SELECT value FROM bibfmt WHERE id_bibrec=%s AND FORMAT='recstruct'", (recid, ))
if value:
try:
return deserialize_via_marshal(value[0][0])
except:
### In case of corruption, let's rebuild it!
pass
return create_record(print_record(recid, 'xm'))[0]
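# The 'recstruct' cache consulted by get_record() stores a serialized record
# structure in the bibfmt table; deserialize_via_marshal() presumably
# reverses a marshal-dump-plus-zlib pipeline.  A hypothetical round-trip
# sketch (helper names are ours, not Invenio's):

```python
import marshal
import zlib

def serialize_via_marshal_sketch(obj):
    # marshal-dump the record structure, then compress it for storage
    return zlib.compress(marshal.dumps(obj))

def deserialize_via_marshal_sketch(blob):
    # inverse of the above: decompress, then marshal-load
    return marshal.loads(zlib.decompress(blob))
```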
def print_record(recID, format='hb', ot='', ln=CFG_SITE_LANG, decompress=zlib.decompress,
search_pattern=None, user_info=None, verbose=0, sf='', so='d',
sp='', rm='', brief_links=True):
"""
Prints record 'recID' formatted according to 'format'.
'sf' is sort field and 'rm' is ranking method that are passed here
only for proper linking purposes: e.g. when a certain ranking
method or a certain sort field was selected, keep it selected in
any dynamic search links that may be printed.
"""
if format == 'recstruct':
return get_record(recID)
_ = gettext_set_language(ln)
display_claim_this_paper = False
try:
display_claim_this_paper = user_info["precached_viewclaimlink"]
except (KeyError, TypeError):
display_claim_this_paper = False
#check from user information if the user has the right to see hidden fields/tags in the
#records as well
can_see_hidden = False
if user_info:
can_see_hidden = user_info.get('precached_canseehiddenmarctags', False)
out = ""
# sanity check:
record_exist_p = record_exists(recID)
if record_exist_p == 0: # doesn't exist
return out
# New Python BibFormat procedure for formatting
# Old procedure follows further below
# We must still check some special formats, but these
# should disappear when BibFormat improves.
if not (CFG_BIBFORMAT_USE_OLD_BIBFORMAT \
or format.lower().startswith('t') \
or format.lower().startswith('hm') \
or str(format[0:3]).isdigit() \
or ot):
# Unspecified format is hd
if format == '':
format = 'hd'
if record_exist_p == -1 and get_output_format_content_type(format) == 'text/html':
# HTML output displays a default value for deleted records.
# Other formats have to deal with it themselves.
out += _("The record has been deleted.")
# was record deleted-but-merged ?
merged_recid = get_merged_recid(recID)
if merged_recid:
out += ' ' + _("The record %d replaces it.") % merged_recid
else:
out += call_bibformat(recID, format, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose)
# at the end of HTML brief mode, print the "Detailed record" functionality:
if brief_links and format.lower().startswith('hb') and \
format.lower() != 'hb_p':
out += websearch_templates.tmpl_print_record_brief_links(ln=ln,
recID=recID,
sf=sf,
so=so,
sp=sp,
rm=rm,
display_claim_link=display_claim_this_paper)
return out
# Old PHP BibFormat procedure for formatting
# print record opening tags, if needed:
if format == "marcxml" or format == "oai_dc":
out += " <record>\n"
out += " <header>\n"
for oai_id in get_fieldvalues(recID, CFG_OAI_ID_FIELD):
out += " <identifier>%s</identifier>\n" % oai_id
out += " <datestamp>%s</datestamp>\n" % get_modification_date(recID)
out += " </header>\n"
out += " <metadata>\n"
if format.startswith("xm") or format == "marcxml":
# look for detailed format existence:
query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s"
res = run_sql(query, (recID, format), 1)
if res and record_exist_p == 1 and not ot:
# record 'recID' is formatted in 'format', and we are not
# asking for field-filtered output; so print it:
out += "%s" % decompress(res[0][0])
elif ot:
# field-filtered output was asked for; print only some fields
if not can_see_hidden:
ot = list(set(ot) - set(cfg['CFG_BIBFORMAT_HIDDEN_TAGS']))
out += record_xml_output(get_record(recID), ot)
else:
# record 'recID' is not formatted in 'format' or we ask
# for field-filtered output -- they are not in "bibfmt"
# table; so fetch all the data from "bibXXx" tables:
if format == "marcxml":
out += """ <record xmlns="http://www.loc.gov/MARC21/slim">\n"""
out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID)
elif format.startswith("xm"):
out += """ <record>\n"""
out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID)
if record_exist_p == -1:
# deleted record, so display only OAI ID and 980:
oai_ids = get_fieldvalues(recID, CFG_OAI_ID_FIELD)
if oai_ids:
out += "<datafield tag=\"%s\" ind1=\"%s\" ind2=\"%s\"><subfield code=\"%s\">%s</subfield></datafield>\n" % \
(CFG_OAI_ID_FIELD[0:3], CFG_OAI_ID_FIELD[3:4], CFG_OAI_ID_FIELD[4:5], CFG_OAI_ID_FIELD[5:6], oai_ids[0])
out += "<datafield tag=\"980\" ind1=\"\" ind2=\"\"><subfield code=\"c\">DELETED</subfield></datafield>\n"
else:
# controlfields
query = "SELECT b.tag,b.value,bb.field_number FROM bib00x AS b, bibrec_bib00x AS bb "\
"WHERE bb.id_bibrec=%s AND b.id=bb.id_bibxxx AND b.tag LIKE '00%%' "\
"ORDER BY bb.field_number, b.tag ASC"
res = run_sql(query, (recID, ))
for row in res:
field, value = row[0], row[1]
value = encode_for_xml(value)
out += """ <controlfield tag="%s">%s</controlfield>\n""" % \
(encode_for_xml(field[0:3]), value)
# datafields
i = 1 # Do not process bib00x and bibrec_bib00x, as
# they are controlfields. So start at bib01x and
# bibrec_bib01x (and reset i = 0 at the end of the
# first iteration of the outer loop)
for digit1 in range(0, 10):
for digit2 in range(i, 10):
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\
"WHERE bb.id_bibrec=%%s AND b.id=bb.id_bibxxx AND b.tag LIKE %%s"\
"ORDER BY bb.field_number, b.tag ASC" % (bx, bibx)
res = run_sql(query, (recID, str(digit1)+str(digit2)+'%'))
field_number_old = -999
field_old = ""
for row in res:
field, value, field_number = row[0], row[1], row[2]
ind1, ind2 = field[3], field[4]
if ind1 == "_" or ind1 == "":
ind1 = " "
if ind2 == "_" or ind2 == "":
ind2 = " "
# print field tag, unless hidden
printme = True
if not can_see_hidden:
for htag in cfg['CFG_BIBFORMAT_HIDDEN_TAGS']:
ltag = len(htag)
samelenfield = field[0:ltag]
if samelenfield == htag:
printme = False
if printme:
if field_number != field_number_old or field[:-1] != field_old[:-1]:
if field_number_old != -999:
out += """ </datafield>\n"""
out += """ <datafield tag="%s" ind1="%s" ind2="%s">\n""" % \
(encode_for_xml(field[0:3]), encode_for_xml(ind1), encode_for_xml(ind2))
field_number_old = field_number
field_old = field
# print subfield value
value = encode_for_xml(value)
out += """ <subfield code="%s">%s</subfield>\n""" % \
(encode_for_xml(field[-1:]), value)
# all fields/subfields printed in this run, so close the tag:
if field_number_old != -999:
out += """ </datafield>\n"""
i = 0 # Subsequent outer-loop iterations start at digit2 = 0 (bib10x, bib20x, ...)
# we are at the end of printing the record:
out += " </record>\n"
elif format == "xd" or format == "oai_dc":
# XML Dublin Core format, possibly OAI -- select only some bibXXx fields:
out += """ <dc xmlns="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://purl.org/dc/elements/1.1/
http://www.openarchives.org/OAI/1.1/dc.xsd">\n"""
if record_exist_p == -1:
out += ""
else:
for f in get_fieldvalues(recID, "041__a"):
out += " <language>%s</language>\n" % f
for f in get_fieldvalues(recID, "100__a"):
out += " <creator>%s</creator>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "700__a"):
out += " <creator>%s</creator>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "245__a"):
out += " <title>%s</title>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "65017a"):
out += " <subject>%s</subject>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "8564_u"):
if f.split('.')[-1] == 'png':
continue
out += " <identifier>%s</identifier>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "520__a"):
out += " <description>%s</description>\n" % encode_for_xml(f)
out += " <date>%s</date>\n" % get_creation_date(recID)
out += " </dc>\n"
elif len(format) == 6 and str(format[0:3]).isdigit():
# user has asked to print some fields only
if format == "001":
out += "<!--%s-begin-->%s<!--%s-end-->\n" % (format, recID, format)
else:
vals = get_fieldvalues(recID, format)
for val in vals:
out += "<!--%s-begin-->%s<!--%s-end-->\n" % (format, val, format)
elif format.startswith('t'):
## user directly asked for some tags to be displayed only
if record_exist_p == -1:
out += get_fieldvalues_alephseq_like(recID, ["001", CFG_OAI_ID_FIELD, "980"], can_see_hidden)
else:
out += get_fieldvalues_alephseq_like(recID, ot, can_see_hidden)
elif format == "hm":
if record_exist_p == -1:
out += "\n<pre>" + cgi.escape(get_fieldvalues_alephseq_like(recID, ["001", CFG_OAI_ID_FIELD, "980"], can_see_hidden)) + "</pre>"
else:
out += "\n<pre>" + cgi.escape(get_fieldvalues_alephseq_like(recID, ot, can_see_hidden)) + "</pre>"
elif format.startswith("h") and ot:
## user directly asked for some tags to be displayed only
if record_exist_p == -1:
out += "\n<pre>" + get_fieldvalues_alephseq_like(recID, ["001", CFG_OAI_ID_FIELD, "980"], can_see_hidden) + "</pre>"
else:
out += "\n<pre>" + get_fieldvalues_alephseq_like(recID, ot, can_see_hidden) + "</pre>"
elif format == "hd":
# HTML detailed format
if record_exist_p == -1:
out += _("The record has been deleted.")
else:
# look for detailed format existence:
query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s"
res = run_sql(query, (recID, format), 1)
if res:
# record 'recID' is formatted in 'format', so print it
out += "%s" % decompress(res[0][0])
else:
# record 'recID' is not formatted in 'format', so try to call BibFormat on the fly or use default format:
out_record_in_format = call_bibformat(recID, format, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose)
if out_record_in_format:
out += out_record_in_format
else:
out += websearch_templates.tmpl_print_record_detailed(
ln = ln,
recID = recID,
)
elif format.startswith("hb_") or format.startswith("hd_"):
# underscore means that HTML brief/detailed formats should be called on-the-fly; suitable for testing formats
if record_exist_p == -1:
out += _("The record has been deleted.")
else:
out += call_bibformat(recID, format, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose)
elif format.startswith("hx"):
# BibTeX format, called on the fly:
if record_exist_p == -1:
out += _("The record has been deleted.")
else:
out += call_bibformat(recID, format, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose)
elif format.startswith("hs"):
# for citation/download similarity navigation links:
if record_exist_p == -1:
out += _("The record has been deleted.")
else:
out += '<a href="%s">' % websearch_templates.build_search_url(recid=recID, ln=ln)
# firstly, title:
titles = get_fieldvalues(recID, "245__a")
if titles:
for title in titles:
out += "<strong>%s</strong>" % title
else:
# usual title not found, try conference title:
titles = get_fieldvalues(recID, "111__a")
if titles:
for title in titles:
out += "<strong>%s</strong>" % title
else:
# just print record ID:
out += "<strong>%s %d</strong>" % (get_field_i18nname("record ID", ln, False), recID)
out += "</a>"
# secondly, authors:
authors = get_fieldvalues(recID, "100__a") + get_fieldvalues(recID, "700__a")
if authors:
out += " - %s" % authors[0]
if len(authors) > 1:
out += " <em>et al</em>"
# thirdly publication info:
publinfos = get_fieldvalues(recID, "773__s")
if not publinfos:
publinfos = get_fieldvalues(recID, "909C4s")
if not publinfos:
publinfos = get_fieldvalues(recID, "037__a")
if not publinfos:
publinfos = get_fieldvalues(recID, "088__a")
if publinfos:
out += " - %s" % publinfos[0]
else:
# fourthly publication year (if not publication info):
years = get_fieldvalues(recID, "773__y")
if not years:
years = get_fieldvalues(recID, "909C4y")
if not years:
years = get_fieldvalues(recID, "260__c")
if years:
out += " (%s)" % years[0]
else:
# HTML brief format by default
if record_exist_p == -1:
out += _("The record has been deleted.")
else:
query = "SELECT value FROM bibfmt WHERE id_bibrec=%s AND format=%s"
res = run_sql(query, (recID, format))
if res:
# record 'recID' is formatted in 'format', so print it
out += "%s" % decompress(res[0][0])
else:
# record 'recID' is not formatted in 'format', so try to call BibFormat on the fly or use the default format:
if CFG_WEBSEARCH_CALL_BIBFORMAT:
out_record_in_format = call_bibformat(recID, format, ln, search_pattern=search_pattern,
user_info=user_info, verbose=verbose)
if out_record_in_format:
out += out_record_in_format
else:
out += websearch_templates.tmpl_print_record_brief(
ln = ln,
recID = recID,
)
else:
out += websearch_templates.tmpl_print_record_brief(
ln = ln,
recID = recID,
)
# at the end of HTML brief mode, print the "Detailed record" functionality:
if format == 'hp' or format.startswith("hb_") or format.startswith("hd_"):
pass # do nothing for portfolio and on-the-fly formats
else:
out += websearch_templates.tmpl_print_record_brief_links(ln=ln,
recID=recID,
sf=sf,
so=so,
sp=sp,
rm=rm,
display_claim_link=display_claim_this_paper)
# print record closing tags, if needed:
if format == "marcxml" or format == "oai_dc":
out += " </metadata>\n"
out += " </record>\n"
return out
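# For deleted records, print_record() slices CFG_OAI_ID_FIELD (a 6-character
# MARC tag specification such as '909COo') into tag, indicators and subfield
# code.  That slicing rule restated as a stand-alone sketch (the helper name
# is ours, not Invenio's):

```python
def split_marc_tag_spec(spec):
    """Split a 6-char MARC tag spec into (tag, ind1, ind2, code),
    using the same [0:3]/[3:4]/[4:5]/[5:6] slices as print_record()."""
    return spec[0:3], spec[3:4], spec[4:5], spec[5:6]
```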
def call_bibformat(recID, format="HD", ln=CFG_SITE_LANG, search_pattern=None, user_info=None, verbose=0):
"""
Calls BibFormat and returns formatted record.
BibFormat will decide by itself if old or new BibFormat must be used.
"""
from invenio.modules.formatter.utils import get_pdf_snippets
keywords = []
if search_pattern is not None:
for unit in create_basic_search_units(None, str(search_pattern), None):
bsu_o, bsu_p, bsu_f, bsu_m = unit[0], unit[1], unit[2], unit[3]
if (bsu_o != '-' and bsu_f in [None, 'fulltext']):
if bsu_m == 'a' and bsu_p.startswith('%') and bsu_p.endswith('%'):
# remove leading and trailing `%' representing partial phrase search
keywords.append(bsu_p[1:-1])
else:
keywords.append(bsu_p)
out = format_record(recID,
of=format,
ln=ln,
search_pattern=keywords,
user_info=user_info,
verbose=verbose)
if CFG_WEBSEARCH_FULLTEXT_SNIPPETS and user_info and \
'fulltext' in user_info['uri'].lower():
# check snippets only if URL contains fulltext
# FIXME: make it work for CLI too, via new function arg
if keywords:
snippets = ''
try:
snippets = get_pdf_snippets(recID, keywords, user_info)
except:
register_exception()
if snippets:
out += snippets
return out
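# call_bibformat() strips the leading/trailing `%' that mark a partial
# phrase ('a'-type) search unit before using the pattern as a highlight
# keyword.  That rule in isolation (a sketch, not the original helper):

```python
def strip_partial_phrase(pattern, matching_type):
    # mirrors the bsu_m == 'a' branch in call_bibformat() above
    if matching_type == 'a' and pattern.startswith('%') and pattern.endswith('%'):
        return pattern[1:-1]
    return pattern
```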
def log_query(hostname, query_args, uid=-1):
"""
Log query into the query and user_query tables.
Return id_query or None in case of problems.
"""
id_query = None
if uid >= 0:
# log the query only if uid is reasonable
res = run_sql("SELECT id FROM query WHERE urlargs=%s", (query_args,), 1)
try:
id_query = res[0][0]
except IndexError:
id_query = run_sql("INSERT INTO query (type, urlargs) VALUES ('r', %s)", (query_args,))
if id_query:
run_sql("INSERT INTO user_query (id_user, id_query, hostname, date) VALUES (%s, %s, %s, %s)",
(uid, id_query, hostname,
time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())))
return id_query
def log_query_info(action, p, f, colls, nb_records_found_total=-1):
"""Write some info to the log file for later analysis."""
try:
log = open(CFG_LOGDIR + "/search.log", "a")
log.write(time.strftime("%Y%m%d%H%M%S#", time.localtime()))
log.write(action+"#")
log.write(p+"#")
log.write(f+"#")
for coll in colls[:-1]:
log.write("%s," % coll)
log.write("%s#" % colls[-1])
log.write("%d" % nb_records_found_total)
log.write("\n")
log.close()
except:
pass
return
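# One line of search.log as written by log_query_info() has the shape
# YYYYmmddHHMMSS#action#p#f#coll1,coll2,...#nb_found.  A hypothetical parser
# for later analysis (assumes '#' does not occur in the pattern or field):

```python
def parse_search_log_line(line):
    date, action, p, f, colls, nb = line.rstrip('\n').split('#')
    return {'date': date, 'action': action, 'p': p, 'f': f,
            'colls': colls.split(','), 'nb_found': int(nb)}
```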
def clean_dictionary(dictionary, list_of_items):
"""Returns a copy of the dictionary with all the items
in the list_of_items as empty strings"""
out_dictionary = dictionary.copy()
out_dictionary.update((item, '') for item in list_of_items)
return out_dictionary
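# Usage example for clean_dictionary() (the helper is restated under a
# different name so the example is self-contained): the listed keys are
# blanked while other entries, and the input dictionary itself, are left
# untouched.

```python
def clean_dictionary_sketch(dictionary, list_of_items):
    out_dictionary = dictionary.copy()
    out_dictionary.update((item, '') for item in list_of_items)
    return out_dictionary

args = {'p': 'ellis', 'f': 'author', 'jrec': 10}
cleaned = clean_dictionary_sketch(args, ['p', 'f'])
# cleaned blanks 'p' and 'f'; args still holds 'ellis'
```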
### CALLABLES
def perform_request_search(req=None, cc=CFG_SITE_NAME, c=None, p="", f="", rg=CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS, sf="", so="d", sp="", rm="", of="id", ot="", aas=0,
p1="", f1="", m1="", op1="", p2="", f2="", m2="", op2="", p3="", f3="", m3="", sc=0, jrec=0,
recid=-1, recidb=-1, sysno="", id=-1, idb=-1, sysnb="", action="", d1="",
d1y=0, d1m=0, d1d=0, d2="", d2y=0, d2m=0, d2d=0, dt="", verbose=0, ap=0, ln=CFG_SITE_LANG, ec=None, tab="",
wl=0, em=""):
"""Perform search or browse request, without checking for
authentication. Return list of recIDs found, if of=id.
Otherwise create web page.
The arguments are as follows:
req - mod_python Request class instance.
cc - current collection (e.g. "ATLAS"). The collection the
user started to search/browse from.
c - collection list (e.g. ["Theses", "Books"]). The
collections user may have selected/deselected when
starting to search from 'cc'.
p - pattern to search for (e.g. "ellis and muon or kaon").
f - field to search within (e.g. "author").
rg - records in groups of (e.g. "10"). Defines how many hits
per collection in the search results page are
displayed. (Note that `rg' is ignored in case of `of=id'.)
sf - sort field (e.g. "title").
so - sort order ("a"=ascending, "d"=descending).
sp - sort pattern (e.g. "CERN-") -- in case there are more
values in a sort field, this argument tells which one
to prefer
rm - ranking method (e.g. "jif"). Defines whether results
should be ranked by some known ranking method.
of - output format (e.g. "hb"). Usually starting "h" means
HTML output (and "hb" for HTML brief, "hd" for HTML
detailed), "x" means XML output, "t" means plain text
output, "id" means no output at all but to return list
of recIDs found, "intbitset" means to return an intbitset
representation of the recIDs found (no sorting or ranking
will be performed). (Suitable for high-level API.)
ot - output only these MARC tags (e.g. "100,700,909C0b").
Useful if only some fields are to be shown in the
output, e.g. for library to control some fields.
em - output only part of the page.
aas - advanced search ("0" means no, "1" means yes). Whether
search was called from within the advanced search
interface.
p1 - first pattern to search for in the advanced search
interface. Much like 'p'.
f1 - first field to search within in the advanced search
interface. Much like 'f'.
m1 - first matching type in the advanced search interface.
("a" all of the words, "o" any of the words, "e" exact
phrase, "p" partial phrase, "r" regular expression).
op1 - first operator, to join the first and the second unit
in the advanced search interface. ("a" add, "o" or,
"n" not).
p2 - second pattern to search for in the advanced search
interface. Much like 'p'.
f2 - second field to search within in the advanced search
interface. Much like 'f'.
m2 - second matching type in the advanced search interface.
("a" all of the words, "o" any of the words, "e" exact
phrase, "p" partial phrase, "r" regular expression).
op2 - second operator, to join the second and the third unit
in the advanced search interface. ("a" add, "o" or,
"n" not).
p3 - third pattern to search for in the advanced search
interface. Much like 'p'.
f3 - third field to search within in the advanced search
interface. Much like 'f'.
m3 - third matching type in the advanced search interface.
("a" all of the words, "o" any of the words, "e" exact
phrase, "p" partial phrase, "r" regular expression).
sc - split by collection ("0" no, "1" yes). Governs whether
we want to present the results in a single huge list,
or split by collection.
jrec - jump to record (e.g. "234"). Used for navigation
inside the search results. (Note that `jrec' is ignored
in case of `of=id'.)
recid - display record ID (e.g. "20000"). Do not
search/browse but go straight away to the Detailed
record page for the given recID.
recidb - display record ID bis (e.g. "20010"). If greater than
'recid', then display records from recid to recidb.
Useful for example for dumping records from the
database for reformatting.
sysno - display old system SYS number (e.g. ""). If you
migrate to Invenio from another system, and store your
old SYS call numbers, you can use them instead of recid
if you wish so.
id - the same as recid, in case recid is not set. For
backwards compatibility.
idb - the same as recid, in case recidb is not set. For
backwards compatibility.
sysnb - the same as sysno, in case sysno is not set. For
backwards compatibility.
action - action to do. "SEARCH" for searching, "Browse" for
browsing. Default is to search.
d1 - first datetime in full YYYY-mm-dd HH:MM:SS format
(e.g. "1998-08-23 12:34:56"). Useful for search limits
on creation/modification date (see 'dt' argument
below). Note that 'd1' takes precedence over d1y, d1m,
d1d if these are defined.
d1y - first date's year (e.g. "1998"). Useful for search
limits on creation/modification date.
d1m - first date's month (e.g. "08"). Useful for search
limits on creation/modification date.
d1d - first date's day (e.g. "23"). Useful for search
limits on creation/modification date.
d2 - second datetime in full YYYY-mm-dd HH:MM:SS format
(e.g. "1998-09-02 12:34:56"). Useful for search limits
on creation/modification date (see 'dt' argument
below). Note that 'd2' takes precedence over d2y, d2m,
d2d if these are defined.
d2y - second date's year (e.g. "1998"). Useful for search
limits on creation/modification date.
d2m - second date's month (e.g. "09"). Useful for search
limits on creation/modification date.
d2d - second date's day (e.g. "02"). Useful for search
limits on creation/modification date.
dt - first and second date's type (e.g. "c"). Specifies
whether to search in creation dates ("c") or in
modification dates ("m"). When dt is not set and d1*
and d2* are set, the default is "c".
verbose - verbose level (0=min, 9=max). Useful to print some
internal information on the searching process in case
something goes wrong.
ap - alternative patterns (0=no, 1=yes). In case no exact
match is found, the search engine can try alternative
patterns e.g. to replace non-alphanumeric characters by
a boolean query. ap defines if this is wanted.
ln - language of the search interface (e.g. "en"). Useful
for internationalization.
ec - list of external search engines to search as well
(e.g. "SPIRES HEP").
wl - wildcard limit (e.g. 100): wildcard queries will be
limited to at most that many results.
"""
kwargs = prs_wash_arguments(req=req, cc=cc, c=c, p=p, f=f, rg=rg, sf=sf, so=so, sp=sp, rm=rm, of=of, ot=ot, aas=aas,
p1=p1, f1=f1, m1=m1, op1=op1, p2=p2, f2=f2, m2=m2, op2=op2, p3=p3, f3=f3, m3=m3, sc=sc, jrec=jrec,
recid=recid, recidb=recidb, sysno=sysno, id=id, idb=idb, sysnb=sysnb, action=action, d1=d1,
d1y=d1y, d1m=d1m, d1d=d1d, d2=d2, d2y=d2y, d2m=d2m, d2d=d2d, dt=dt, verbose=verbose, ap=ap, ln=ln, ec=ec,
tab=tab, wl=wl, em=em)
return prs_perform_search(kwargs=kwargs, **kwargs)
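# Illustrative sketch only (not executed here; assumes a fully configured
# Invenio instance and database). The PRS stages wrapped above are meant
# to be usable independently, e.g.:
#
#     kwargs = prs_wash_arguments(p="higgs boson", of="id", rg=10)
#     if prs_wash_arguments_colls(kwargs=kwargs, **kwargs):
#         recids = prs_search(kwargs=kwargs, **kwargs)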
def prs_perform_search(kwargs=None, **dummy):
"""Internal call which does the search, it is calling standard Invenio;
Unless you know what you are doing, don't use this call as an API
"""
# separately because we can call it independently
out = prs_wash_arguments_colls(kwargs=kwargs, **kwargs)
if not out:
return out
return prs_search(kwargs=kwargs, **kwargs)
def prs_wash_arguments_colls(kwargs=None, of=None, req=None, cc=None, c=None, sc=None, verbose=None,
aas=None, ln=None, em="", **dummy):
"""
Check and wash collection list argument before we start searching.
If there is a problem, e.g. a collection is not defined, print a
warning to the browser.
@return: True if collection list is OK, and various False values
(empty string, empty list) if there was an error.
"""
# raise an exception when trying to print out html from the cli
if of.startswith("h"):
assert req
# for every search engine request asking for an HTML output, we
# first regenerate cache of collection and field I18N names if
# needed; so that later we won't bother checking timestamps for
# I18N names at all:
if of.startswith("h"):
collection_i18nname_cache.recreate_cache_if_needed()
field_i18nname_cache.recreate_cache_if_needed()
try:
(cc, colls_to_display, colls_to_search, hosted_colls, wash_colls_debug) = wash_colls(cc, c, sc, verbose) # which colls to search and to display?
kwargs['colls_to_display'] = colls_to_display
kwargs['colls_to_search'] = colls_to_search
kwargs['hosted_colls'] = hosted_colls
kwargs['wash_colls_debug'] = wash_colls_debug
except InvenioWebSearchUnknownCollectionError, exc:
colname = exc.colname
if of.startswith("h"):
page_start(req, of, cc, aas, ln, getUid(req),
websearch_templates.tmpl_collection_not_found_page_title(colname, ln))
req.write(websearch_templates.tmpl_collection_not_found_page_body(colname, ln))
page_end(req, of, ln, em)
return ''
elif of == "id":
return []
elif of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
page_end(req, of, ln, em)
return ''
else:
page_end(req, of, ln, em)
return ''
return True
def prs_wash_arguments(req=None, cc=CFG_SITE_NAME, c=None, p="", f="", rg=CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS,
sf="", so="d", sp="", rm="", of="id", ot="", aas=0,
p1="", f1="", m1="", op1="", p2="", f2="", m2="", op2="", p3="", f3="", m3="",
sc=0, jrec=0, recid=-1, recidb=-1, sysno="", id=-1, idb=-1, sysnb="", action="", d1="",
d1y=0, d1m=0, d1d=0, d2="", d2y=0, d2m=0, d2d=0, dt="", verbose=0, ap=0, ln=CFG_SITE_LANG,
ec=None, tab="", uid=None, wl=0, em="", **dummy):
"""
Set default values and wash/check the remaining arguments for the PRS call.
"""
# wash output format:
of = wash_output_format(of)
# wash all arguments requiring special care
p = wash_pattern(p)
f = wash_field(f)
p1 = wash_pattern(p1)
f1 = wash_field(f1)
p2 = wash_pattern(p2)
f2 = wash_field(f2)
p3 = wash_pattern(p3)
f3 = wash_field(f3)
(d1y, d1m, d1d, d2y, d2m, d2d) = map(int, (d1y, d1m, d1d, d2y, d2m, d2d))
datetext1, datetext2 = wash_dates(d1, d1y, d1m, d1d, d2, d2y, d2m, d2d)
# wash ranking method:
if not is_method_valid(None, rm):
rm = ""
# backwards compatibility: id, idb, sysnb -> recid, recidb, sysno (if applicable)
if sysnb != "" and sysno == "":
sysno = sysnb
if id > 0 and recid == -1:
recid = id
if idb > 0 and recidb == -1:
recidb = idb
# TODO deduce passed search limiting criteria (if applicable)
pl, pl_in_url = "", "" # no limits by default
if action != "browse" and req and not isinstance(req, cStringIO.OutputType) \
and req.args and not isinstance(req.args, dict): # we do not want to add options while browsing or while calling via command-line
fieldargs = cgi.parse_qs(req.args)
for fieldcode in get_fieldcodes():
if fieldcode in fieldargs:
for val in fieldargs[fieldcode]:
pl += "+%s:\"%s\" " % (fieldcode, val)
pl_in_url += "&amp;%s=%s" % (urllib.quote(fieldcode), urllib.quote(val))
# deduce recid from sysno argument (if applicable):
if sysno: # ALEPH SYS number was passed, so deduce DB recID for the record:
recid = get_mysql_recid_from_aleph_sysno(sysno)
if recid is None:
recid = 0 # use recid 0 to indicate that this sysno does not exist
# deduce collection we are in (if applicable):
if recid > 0:
referer = None
if req:
referer = req.headers_in.get('Referer')
cc = guess_collection_of_a_record(recid, referer)
# deduce user id (if applicable):
if uid is None:
try:
uid = getUid(req)
except:
uid = 0
_ = gettext_set_language(ln)
kwargs = {'req':req,'cc':cc, 'c':c, 'p':p, 'f':f, 'rg':rg, 'sf':sf, 'so':so, 'sp':sp, 'rm':rm, 'of':of, 'ot':ot, 'aas':aas,
'p1':p1, 'f1':f1, 'm1':m1, 'op1':op1, 'p2':p2, 'f2':f2, 'm2':m2, 'op2':op2, 'p3':p3, 'f3':f3, 'm3':m3, 'sc':sc, 'jrec':jrec,
'recid':recid, 'recidb':recidb, 'sysno':sysno, 'id':id, 'idb':idb, 'sysnb':sysnb, 'action':action, 'd1':d1,
'd1y':d1y, 'd1m':d1m, 'd1d':d1d, 'd2':d2, 'd2y':d2y, 'd2m':d2m, 'd2d':d2d, 'dt':dt, 'verbose':verbose, 'ap':ap, 'ln':ln, 'ec':ec,
'tab':tab, 'wl':wl, 'em': em,
'datetext1': datetext1, 'datetext2': datetext2, 'uid': uid, 'cc':cc, 'pl': pl, 'pl_in_url': pl_in_url, '_': _,
'selected_external_collections_infos':None,
}
kwargs.update(**dummy)
return kwargs
def prs_search(kwargs=None, recid=0, req=None, cc=None, p=None, p1=None, p2=None, p3=None,
f=None, ec=None, verbose=None, ln=None, selected_external_collections_infos=None,
action=None,rm=None, of=None, em=None,
**dummy):
"""
This function writes various bits into the req object as the search
proceeds (so that pieces of the page are rendered even before the
search has ended).
"""
## 0 - start output
if recid >= 0: # recid can be 0 if deduced from sysno and if such sysno does not exist
output = prs_detailed_record(kwargs=kwargs, **kwargs)
if output is not None:
return output
elif action == "browse":
## 2 - browse needed
of = 'hb'
output = prs_browse(kwargs=kwargs, **kwargs)
if output is not None:
return output
elif rm and p.startswith("recid:"):
## 3-ter - similarity search (or old-style citation search) needed
output = prs_search_similar_records(kwargs=kwargs, **kwargs)
if output is not None:
return output
elif p.startswith("cocitedwith:"): #WAS EXPERIMENTAL
## 3-terter - cited by search needed
output = prs_search_cocitedwith(kwargs=kwargs, **kwargs)
if output is not None:
return output
else:
## 3 - common search needed
output = prs_search_common(kwargs=kwargs, **kwargs)
if output is not None:
return output
# External searches
if of.startswith("h"):
if of not in ['hcs', 'hcs2']:
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
return page_end(req, of, ln, em)
def prs_detailed_record(kwargs=None, req=None, of=None, cc=None, aas=None, ln=None, uid=None, recid=None, recidb=None,
p=None, verbose=None, tab=None, sf=None, so=None, sp=None, rm=None, ot=None, _=None, em=None,
**dummy):
"""Formats and prints one record"""
## 1 - detailed record display
title, description, keywords = \
websearch_templates.tmpl_record_page_header_content(req, recid, ln)
if req is not None and req.method != 'HEAD':
page_start(req, of, cc, aas, ln, uid, title, description, keywords, recid, tab, em)
# Default format is hb but we are in detailed -> change 'of'
if of == "hb":
of = "hd"
if record_exists(recid):
if recidb <= recid: # sanity check
recidb = recid + 1
if of in ["id", "intbitset"]:
result = [recidx for recidx in range(recid, recidb) if record_exists(recidx)]
if of == "intbitset":
return intbitset(result)
else:
return result
else:
print_records(req, range(recid, recidb), -1, -9999, of, ot, ln, search_pattern=p, verbose=verbose,
tab=tab, sf=sf, so=so, sp=sp, rm=rm, em=em)
if req and of.startswith("h"): # register detailed record page view event
client_ip_address = str(req.remote_ip)
register_page_view_event(recid, uid, client_ip_address)
else: # record does not exist
if of == "id":
return []
elif of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
elif of.startswith("h"):
if req.method == 'HEAD':
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
else:
write_warning(_("Requested record does not seem to exist."), req=req)
def prs_browse(kwargs=None, req=None, of=None, cc=None, aas=None, ln=None, uid=None, _=None, p=None,
p1=None, p2=None, p3=None, colls_to_display=None, f=None, rg=None, sf=None,
so=None, sp=None, rm=None, ot=None, f1=None, m1=None, op1=None,
f2=None, m2=None, op2=None, f3=None, m3=None, sc=None, pl=None,
d1y=None, d1m=None, d1d=None, d2y=None, d2m=None, d2d=None,
dt=None, jrec=None, ec=None, action=None,
colls_to_search=None, verbose=None, em=None, **dummy):
page_start(req, of, cc, aas, ln, uid, _("Browse"), p=create_page_title_search_pattern_info(p, p1, p2, p3), em=em)
req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1,
p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action,
em
))
write_warning(create_exact_author_browse_help_link(p, p1, p2, p3, f, f1, f2, f3,
rm, cc, ln, jrec, rg, aas, action),
req=req)
try:
if aas == 1 or (p1 or p2 or p3):
browse_pattern(req, colls_to_search, p1, f1, rg, ln)
browse_pattern(req, colls_to_search, p2, f2, rg, ln)
browse_pattern(req, colls_to_search, p3, f3, rg, ln)
else:
browse_pattern(req, colls_to_search, p, f, rg, ln)
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
return page_end(req, of, ln, em)
def prs_search_similar_records(kwargs=None, req=None, of=None, cc=None, pl_in_url=None, ln=None, uid=None, _=None, p=None,
p1=None, p2=None, p3=None, colls_to_display=None, f=None, rg=None, sf=None,
so=None, sp=None, rm=None, ot=None, aas=None, f1=None, m1=None, op1=None,
f2=None, m2=None, op2=None, f3=None, m3=None, sc=None, pl=None,
d1y=None, d1m=None, d1d=None, d2y=None, d2m=None, d2d=None,
dt=None, jrec=None, ec=None, action=None, em=None,
verbose=None, **dummy):
if req and req.method != 'HEAD':
page_start(req, of, cc, aas, ln, uid, _("Search Results"), p=create_page_title_search_pattern_info(p, p1, p2, p3),
em=em)
if of.startswith("h"):
req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1,
p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action,
em
))
if record_exists(p[6:]) != 1:
# record does not exist
if of.startswith("h"):
if req.method == 'HEAD':
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
else:
write_warning(_("Requested record does not seem to exist."), req=req)
if of == "id":
return []
if of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
else:
# record exists, so find records similar to it
t1 = os.times()[4]
results_similar_recIDs, results_similar_relevances, results_similar_relevances_prologue, results_similar_relevances_epilogue, results_similar_comments = \
rank_records_bibrank(rm, 0, get_collection_reclist(cc), p.split(), verbose, f, rg, jrec)
if results_similar_recIDs:
t2 = os.times()[4]
cpu_time = t2 - t1
if of.startswith("h"):
req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, cc, len(results_similar_recIDs),
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, em=em))
write_warning(results_similar_comments, req=req)
print_records(req, results_similar_recIDs, jrec, rg, of, ot, ln,
results_similar_relevances, results_similar_relevances_prologue,
results_similar_relevances_epilogue,
search_pattern=p, verbose=verbose, sf=sf, so=so, sp=sp, rm=rm, em=em)
elif of == "id":
return results_similar_recIDs
elif of == "intbitset":
return intbitset(results_similar_recIDs)
elif of.startswith("x"):
print_records(req, results_similar_recIDs, jrec, rg, of, ot, ln,
results_similar_relevances, results_similar_relevances_prologue,
results_similar_relevances_epilogue, search_pattern=p, verbose=verbose,
sf=sf, so=so, sp=sp, rm=rm, em=em)
else:
# rank_records failed and returned some error message to display:
if of.startswith("h"):
write_warning(results_similar_relevances_prologue, req=req)
write_warning(results_similar_relevances_epilogue, req=req)
write_warning(results_similar_comments, req=req)
if of == "id":
return []
elif of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
def prs_search_cocitedwith(kwargs=None, req=None, of=None, cc=None, pl_in_url=None, ln=None, uid=None, _=None, p=None,
p1=None, p2=None, p3=None, colls_to_display=None, f=None, rg=None, sf=None,
so=None, sp=None, rm=None, ot=None, aas=None, f1=None, m1=None, op1=None,
f2=None, m2=None, op2=None, f3=None, m3=None, sc=None, pl=None,
d1y=None, d1m=None, d1d=None, d2y=None, d2m=None, d2d=None,
dt=None, jrec=None, ec=None, action=None,
verbose=None, em=None, **dummy):
page_start(req, of, cc, aas, ln, uid, _("Search Results"), p=create_page_title_search_pattern_info(p, p1, p2, p3),
em=em)
if of.startswith("h"):
req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1,
p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action,
em
))
recID = p[12:]
if record_exists(recID) != 1:
# record does not exist
if of.startswith("h"):
write_warning(_("Requested record does not seem to exist."), req=req)
if of == "id":
return []
elif of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
else:
# record exists, so find co-cited records:
t1 = os.times()[4]
results_cocited_recIDs = [x[0] for x in calculate_co_cited_with_list(int(recID))]
if results_cocited_recIDs:
t2 = os.times()[4]
cpu_time = t2 - t1
if of.startswith("h"):
req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, CFG_SITE_NAME, len(results_cocited_recIDs),
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, em=em))
print_records(req, results_cocited_recIDs, jrec, rg, of, ot, ln, search_pattern=p, verbose=verbose,
sf=sf, so=so, sp=sp, rm=rm, em=em)
elif of == "id":
return results_cocited_recIDs
elif of == "intbitset":
return intbitset(results_cocited_recIDs)
elif of.startswith("x"):
print_records(req, results_cocited_recIDs, jrec, rg, of, ot, ln, search_pattern=p, verbose=verbose,
sf=sf, so=so, sp=sp, rm=rm, em=em)
else:
# cited rank_records failed and returned some error message to display:
if of.startswith("h"):
write_warning("nothing found", req=req)
if of == "id":
return []
elif of == "intbitset":
return intbitset()
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
def prs_search_hosted_collections(kwargs=None, req=None, of=None, ln=None, _=None, p=None,
p1=None, p2=None, p3=None, hosted_colls=None, f=None,
colls_to_search=None, hosted_colls_actual_or_potential_results_p=None,
verbose=None, **dummy):
hosted_colls_results = hosted_colls_timeouts = hosted_colls_true_results = None
# search into the hosted collections only if the output format is html or xml
if hosted_colls and (of.startswith("h") or of.startswith("x")) and not p.startswith("recid:"):
# hosted_colls_results : the hosted collections' searches that did not timeout
# hosted_colls_timeouts : the hosted collections' searches that timed out and will be searched later on again
(hosted_colls_results, hosted_colls_timeouts) = calculate_hosted_collections_results(req, [p, p1, p2, p3], f, hosted_colls, verbose, ln, CFG_HOSTED_COLLECTION_TIMEOUT_ANTE_SEARCH)
# successful searches
if hosted_colls_results:
hosted_colls_true_results = []
for result in hosted_colls_results:
# if the number of results is None or 0 (or False) then just do nothing
if not result[1]:
# these are the searches that returned no or zero results
if verbose:
write_warning("Hosted collections (perform_search_request): %s returned no results" % result[0][1].name, req=req)
else:
# these are the searches that actually returned results on time
hosted_colls_true_results.append(result)
if verbose:
write_warning("Hosted collections (perform_search_request): %s returned %s results in %s seconds" % (result[0][1].name, result[1], result[2]), req=req)
else:
if verbose:
write_warning("Hosted collections (perform_search_request): there were no hosted collections results to be printed at this time", req=req)
if hosted_colls_timeouts:
if verbose:
for timeout in hosted_colls_timeouts:
write_warning("Hosted collections (perform_search_request): %s timed out and will be searched again later" % timeout[0][1].name, req=req)
# remember for later whether there were hosted collections to be searched, even if in the end they were not
elif hosted_colls and ((not (of.startswith("h") or of.startswith("x"))) or p.startswith("recid:")):
(hosted_colls_results, hosted_colls_timeouts) = (None, None)
else:
if verbose:
write_warning("Hosted collections (perform_search_request): there were no hosted collections to be searched", req=req)
## let's define some useful boolean variables:
# True means there are actual or potential hosted collections results to be printed
kwargs['hosted_colls_actual_or_potential_results_p'] = bool(hosted_colls and ((hosted_colls_results and hosted_colls_true_results) or hosted_colls_timeouts))
# True means there are hosted collections timeouts to take care of later
# (useful for more accurate printing of results later)
kwargs['hosted_colls_potential_results_p'] = bool(hosted_colls and hosted_colls_timeouts)
# True means we only have hosted collections to deal with
kwargs['only_hosted_colls_actual_or_potential_results_p'] = not colls_to_search and hosted_colls_actual_or_potential_results_p
kwargs['hosted_colls_results'] = hosted_colls_results
kwargs['hosted_colls_timeouts'] = hosted_colls_timeouts
kwargs['hosted_colls_true_results'] = hosted_colls_true_results
def prs_advanced_search(results_in_any_collection, kwargs=None, req=None, of=None,
cc=None, ln=None, _=None, p=None, p1=None, p2=None, p3=None,
f=None, f1=None, m1=None, op1=None, f2=None, m2=None,
op2=None, f3=None, m3=None, ap=None, ec=None,
selected_external_collections_infos=None, verbose=None,
wl=None, em=None, **dummy):
len_results_p1 = 0
len_results_p2 = 0
len_results_p3 = 0
try:
results_in_any_collection.union_update(search_pattern_parenthesised(req, p1, f1, m1, ap=ap, of=of, verbose=verbose, ln=ln, wl=wl))
len_results_p1 = len(results_in_any_collection)
if len_results_p1 == 0:
if of.startswith("h"):
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec,
verbose, ln, selected_external_collections_infos, em=em)
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
return page_end(req, of, ln, em)
if p2:
results_tmp = search_pattern_parenthesised(req, p2, f2, m2, ap=ap, of=of, verbose=verbose, ln=ln, wl=wl)
len_results_p2 = len(results_tmp)
if op1 == "a": # add
results_in_any_collection.intersection_update(results_tmp)
elif op1 == "o": # or
results_in_any_collection.union_update(results_tmp)
elif op1 == "n": # not
results_in_any_collection.difference_update(results_tmp)
else:
if of.startswith("h"):
write_warning("Invalid set operation %s." % cgi.escape(op1), "Error", req=req)
if len(results_in_any_collection) == 0:
if of.startswith("h"):
if len_results_p2:
#each individual query returned results, but the boolean operation did not
nearestterms = []
nearest_search_args = req.argd.copy()
if p1:
nearestterms.append((p1, len_results_p1, clean_dictionary(nearest_search_args, ['p2', 'f2', 'm2', 'p3', 'f3', 'm3'])))
nearestterms.append((p2, len_results_p2, clean_dictionary(nearest_search_args, ['p1', 'f1', 'm1', 'p3', 'f3', 'm3'])))
write_warning(websearch_templates.tmpl_search_no_boolean_hits(ln=ln, nearestterms=nearestterms), req=req)
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
if p3:
results_tmp = search_pattern_parenthesised(req, p3, f3, m3, ap=ap, of=of, verbose=verbose, ln=ln, wl=wl)
len_results_p3 = len(results_tmp)
if op2 == "a": # add
results_in_any_collection.intersection_update(results_tmp)
elif op2 == "o": # or
results_in_any_collection.union_update(results_tmp)
elif op2 == "n": # not
results_in_any_collection.difference_update(results_tmp)
else:
if of.startswith("h"):
write_warning("Invalid set operation %s." % cgi.escape(op2), "Error", req=req)
if len(results_in_any_collection) == 0 and len_results_p3 and of.startswith("h"):
#each individual query returned results but the boolean operation did not
nearestterms = []
nearest_search_args = req.argd.copy()
if p1:
nearestterms.append((p1, len_results_p1, clean_dictionary(nearest_search_args, ['p2', 'f2', 'm2', 'p3', 'f3', 'm3'])))
if p2:
nearestterms.append((p2, len_results_p2, clean_dictionary(nearest_search_args, ['p1', 'f1', 'm1', 'p3', 'f3', 'm3'])))
nearestterms.append((p3, len_results_p3, clean_dictionary(nearest_search_args, ['p1', 'f1', 'm1', 'p2', 'f2', 'm2'])))
write_warning(websearch_templates.tmpl_search_no_boolean_hits(ln=ln, nearestterms=nearestterms), req=req)
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
return page_end(req, of, ln, em)
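# The op1/op2 handling above maps the advanced-search operators onto
# intbitset set operations; with plain Python sets the same logic reads
# (illustrative only):
#
#     results = set([1, 2, 3])
#     tmp = set([2, 3, 4])
#     # op == "a": results &= tmp   (and)     -> set([2, 3])
#     # op == "o": results |= tmp   (or)      -> set([1, 2, 3, 4])
#     # op == "n": results -= tmp   (and not) -> set([1])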
def prs_simple_search(results_in_any_collection, kwargs=None, req=None, of=None, cc=None, ln=None, p=None, f=None,
p1=None, p2=None, p3=None, ec=None, verbose=None, selected_external_collections_infos=None,
only_hosted_colls_actual_or_potential_results_p=None, query_representation_in_cache=None,
ap=None, hosted_colls_actual_or_potential_results_p=None, wl=None, em=None,
**dummy):
try:
results_in_cache = intbitset().fastload(
search_results_cache.get(query_representation_in_cache))
except:
results_in_cache = None
if results_in_cache is not None:
# the query is already in the cache, so reuse the cached results:
results_in_any_collection.union_update(results_in_cache)
if verbose and of.startswith("h"):
write_warning("Search stage 0: query found in cache, reusing cached results.", req=req)
else:
try:
# added the display_nearest_terms_box parameter to avoid printing out the "Nearest terms in any collection"
# recommendations when there are results only in the hosted collections. Also added the if clause to avoid
# searching in case we know we only have actual or potential hosted collections results
if not only_hosted_colls_actual_or_potential_results_p:
results_in_any_collection.union_update(search_pattern_parenthesised(req, p, f, ap=ap, of=of, verbose=verbose, ln=ln,
display_nearest_terms_box=not hosted_colls_actual_or_potential_results_p,
wl=wl))
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
return page_end(req, of, ln, em)
def prs_intersect_results_with_collrecs(results_final, results_in_any_collection, kwargs=None, colls_to_search=None,
req=None, ap=None, of=None, ln=None,
cc=None, p=None, p1=None, p2=None, p3=None, f=None,
ec=None, verbose=None, selected_external_collections_infos=None, em=None,
**dummy):
display_nearest_terms_box = not kwargs['hosted_colls_actual_or_potential_results_p']
try:
# added the display_nearest_terms_box parameter to avoid printing out the "Nearest terms in any collection"
# recommendations when there are results only in the hosted collections. Also added the if clause to avoid
# searching in case we know since the last stage that we have no results in any collection
if len(results_in_any_collection) != 0:
results_final.update(intersect_results_with_collrecs(req, results_in_any_collection, colls_to_search, ap, of,
verbose, ln, display_nearest_terms_box=display_nearest_terms_box))
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
return page_end(req, of, ln, em)
def prs_store_results_in_cache(query_representation_in_cache, results_in_any_collection, req=None, verbose=None, of=None, **dummy):
if CFG_WEBSEARCH_SEARCH_CACHE_SIZE > 0:
search_results_cache.set(query_representation_in_cache,
results_in_any_collection.fastdump(),
timeout=CFG_WEBSEARCH_SEARCH_CACHE_TIMEOUT)
search_results_cache.set(query_representation_in_cache + '::cc',
dummy.get('cc', CFG_SITE_NAME),
timeout=CFG_WEBSEARCH_SEARCH_CACHE_TIMEOUT)
if req:
from flask import request
req = request
search_results_cache.set(query_representation_in_cache + '::p',
req.values.get('p', ''),
timeout=CFG_WEBSEARCH_SEARCH_CACHE_TIMEOUT)
if verbose and of.startswith("h"):
write_warning("Search stage 3: storing query results in cache.", req=req)
def prs_apply_search_limits(results_final, kwargs=None, req=None, of=None, cc=None, ln=None, _=None,
p=None, p1=None, p2=None, p3=None, f=None, pl=None, ap=None, dt=None,
ec=None, selected_external_collections_infos=None,
hosted_colls_actual_or_potential_results_p=None,
datetext1=None, datetext2=None, verbose=None, wl=None, em=None,
**dummy):
if datetext1 != "" and results_final != {}:
if verbose and of.startswith("h"):
write_warning("Search stage 5: applying time etc limits, from %s until %s..." % (datetext1, datetext2), req=req)
try:
results_final = intersect_results_with_hitset(req,
results_final,
search_unit_in_bibrec(datetext1, datetext2, dt),
ap,
aptext= _("No match within your time limits, "
"discarding this condition..."),
of=of)
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
return page_end(req, of, ln, em)
if results_final == {} and not hosted_colls_actual_or_potential_results_p:
if of.startswith("h"):
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
#if of.startswith("x"):
# # Print empty, but valid XML
# print_records_prologue(req, of)
# print_records_epilogue(req, of)
return page_end(req, of, ln, em)
if pl and results_final != {}:
pl = wash_pattern(pl)
if verbose and of.startswith("h"):
write_warning("Search stage 5: applying search pattern limit %s..." % cgi.escape(pl), req=req)
try:
results_final = intersect_results_with_hitset(req,
results_final,
search_pattern_parenthesised(req, pl, ap=0, ln=ln, wl=wl),
ap,
aptext=_("No match within your search limits, "
"discarding this condition..."),
of=of)
except:
register_exception(req=req, alert_admin=True)
if of.startswith("h"):
req.write(create_error_box(req, verbose=verbose, ln=ln))
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
return page_end(req, of, ln, em)
if results_final == {} and not hosted_colls_actual_or_potential_results_p:
if of.startswith("h"):
perform_external_collection_search_with_em(req, cc, [p, p1, p2, p3], f, ec, verbose,
ln, selected_external_collections_infos, em=em)
if of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
return page_end(req, of, ln, em)
def prs_split_into_collections(kwargs=None, results_final=None, colls_to_search=None, hosted_colls_results=None,
cpu_time=0, results_final_nb_total=None, hosted_colls_actual_or_potential_results_p=None,
hosted_colls_true_results=None, hosted_colls_timeouts=None, **dummy):
results_final_nb_total = 0
results_final_nb = {} # will hold number of records found in each collection
# (in simple dict to display overview more easily)
for coll in results_final.keys():
results_final_nb[coll] = len(results_final[coll])
#results_final_nb_total += results_final_nb[coll]
# Now let us calculate results_final_nb_total more precisely,
# in order to get the total number of "distinct" hits across
# searched collections; this is useful because a record might
# have been attributed to more than one primary collection; so
# we have to avoid counting it multiple times. The price to
# pay for this accuracy of results_final_nb_total is somewhat
# increased CPU time.
if len(results_final) == 1:
# only one collection; no need to union them
results_final_for_all_selected_colls = results_final.values()[0]
results_final_nb_total = results_final_nb.values()[0]
else:
# okay, some work ahead to union hits across collections:
results_final_for_all_selected_colls = intbitset()
for coll in results_final.keys():
results_final_for_all_selected_colls.union_update(results_final[coll])
results_final_nb_total = len(results_final_for_all_selected_colls)
#if hosted_colls and (of.startswith("h") or of.startswith("x")):
if hosted_colls_actual_or_potential_results_p:
if hosted_colls_results:
for result in hosted_colls_true_results:
colls_to_search.append(result[0][1].name)
results_final_nb[result[0][1].name] = result[1]
results_final_nb_total += result[1]
cpu_time += result[2]
if hosted_colls_timeouts:
for timeout in hosted_colls_timeouts:
colls_to_search.append(timeout[1].name)
# use -963 as a special number to identify the collections that timed out
results_final_nb[timeout[1].name] = -963
kwargs['results_final_nb'] = results_final_nb
kwargs['results_final_nb_total'] = results_final_nb_total
kwargs['results_final_for_all_selected_colls'] = results_final_for_all_selected_colls
kwargs['cpu_time'] = cpu_time #rca TODO: check where the cpu_time is used, this line was missing
return (results_final_nb, results_final_nb_total, results_final_for_all_selected_colls)
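The per-collection counting and union logic above can be sketched stand-alone, assuming plain Python sets as a simplified stand-in for intbitset (the helper name `split_hits_into_collections` is hypothetical, not part of the Invenio API):

```python
def split_hits_into_collections(results_final):
    """Return (per-collection hit counts, distinct total across collections)."""
    results_final_nb = {coll: len(hits) for coll, hits in results_final.items()}
    if len(results_final) == 1:
        # only one collection; no need to union
        distinct_total = next(iter(results_final_nb.values()))
    else:
        # union the hits so a record attributed to several primary
        # collections is counted only once in the total
        union = set()
        for hits in results_final.values():
            union |= hits
        distinct_total = len(union)
    return results_final_nb, distinct_total
```

Note that the per-collection counts may sum to more than the distinct total precisely when a record belongs to more than one collection, which is why the extra union pass is worth its CPU cost.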
def prs_summarize_records(kwargs=None, req=None, p=None, f=None, aas=None,
p1=None, p2=None, p3=None, f1=None, f2=None, f3=None, op1=None, op2=None,
ln=None, results_final_for_all_selected_colls=None, of='hcs', **dummy):
# feed the current search to be summarized:
from invenio.legacy.search_engine.summarizer import summarize_records
search_p = p
search_f = f
if not p and (aas == 1 or p1 or p2 or p3):
op_d = {'n': ' and not ', 'a': ' and ', 'o': ' or ', '': ''}
triples = ziplist([f1, f2, f3], [p1, p2, p3], [op1, op2, ''])
triples_len = len(triples)
for i in range(triples_len):
fi, pi, oi = triples[i] # e.g.:
if i < triples_len-1 and not triples[i+1][1]: # if p2 empty
triples[i+1][0] = '' # f2 must be too
oi = '' # and o1
if ' ' in pi:
pi = '"'+pi+'"'
if fi:
fi = fi + ':'
search_p += fi + pi + op_d[oi]
search_f = ''
summarize_records(results_final_for_all_selected_colls, of, ln, search_p, search_f, req)
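The pattern-reconstruction loop above can be sketched as a stand-alone helper (hypothetical name `rebuild_advanced_query`; it mirrors the triples logic: an empty pattern clears the following field and the joining operator, multi-word patterns are quoted, and non-empty fields are prefixed with a colon):

```python
def rebuild_advanced_query(fields, patterns, ops):
    """Join (field, pattern, operator) triples into a single query string."""
    op_d = {'n': ' and not ', 'a': ' and ', 'o': ' or ', '': ''}
    triples = [list(t) for t in zip(fields, patterns, ops)]
    out = ''
    for i, (fi, pi, oi) in enumerate(triples):
        if i < len(triples) - 1 and not triples[i + 1][1]:
            triples[i + 1][0] = ''  # next pattern is empty, so drop its field
            oi = ''                 # ...and the operator that would join it
        if ' ' in pi:
            pi = '"' + pi + '"'     # quote multi-word patterns
        if fi:
            fi = fi + ':'
        out += fi + pi + op_d[oi]
    return out
```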
def prs_print_records(kwargs=None, results_final=None, req=None, of=None, cc=None, pl_in_url=None,
ln=None, _=None, p=None, p1=None, p2=None, p3=None, f=None, rg=None, sf=None,
so=None, sp=None, rm=None, ot=None, aas=None, f1=None, m1=None, op1=None,
f2=None, m2=None, op2=None, f3=None, m3=None, sc=None, d1y=None, d1m=None,
d1d=None, d2y=None, d2m=None, d2d=None, dt=None, jrec=None, colls_to_search=None,
hosted_colls_actual_or_potential_results_p=None, hosted_colls_results=None,
hosted_colls_true_results=None, hosted_colls_timeouts=None, results_final_nb=None,
cpu_time=None, verbose=None, em=None, **dummy):
if len(colls_to_search)>1:
cpu_time = -1 # we do not want to have search time printed on each collection
print_records_prologue(req, of, cc=cc)
results_final_colls = []
wlqh_results_overlimit = 0
for coll in colls_to_search:
if results_final.has_key(coll) and len(results_final[coll]):
if of.startswith("h"):
req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, coll, results_final_nb[coll],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, em=em))
results_final_recIDs = list(results_final[coll])
results_final_relevances = []
results_final_relevances_prologue = ""
results_final_relevances_epilogue = ""
if rm: # do we have to rank?
results_final_recIDs_ranked, results_final_relevances, results_final_relevances_prologue, results_final_relevances_epilogue, results_final_comments = \
rank_records(req, rm, 0, results_final[coll],
string.split(p) + string.split(p1) +
string.split(p2) + string.split(p3), verbose, so, of, ln, rg, jrec, kwargs['f'])
if of.startswith("h"):
write_warning(results_final_comments, req=req)
if results_final_recIDs_ranked:
results_final_recIDs = results_final_recIDs_ranked
else:
# rank_records failed and returned some error message to display:
write_warning(results_final_relevances_prologue, req=req)
write_warning(results_final_relevances_epilogue, req=req)
elif sf or (CFG_BIBSORT_BUCKETS and sorting_methods): # do we have to sort?
results_final_recIDs = sort_records(req, results_final_recIDs, sf, so, sp, verbose, of, ln, rg, jrec)
if len(results_final_recIDs) < CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT:
results_final_colls.append(results_final_recIDs)
else:
wlqh_results_overlimit = 1
print_records(req, results_final_recIDs, jrec, rg, of, ot, ln,
results_final_relevances,
results_final_relevances_prologue,
results_final_relevances_epilogue,
search_pattern=p,
print_records_prologue_p=False,
print_records_epilogue_p=False,
verbose=verbose,
sf=sf,
so=so,
sp=sp,
rm=rm,
em=em)
if of.startswith("h"):
req.write(print_search_info(p, f, sf, so, sp, rm, of, ot, coll, results_final_nb[coll],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1, em=em))
if req and not isinstance(req, cStringIO.OutputType):
# store the last search results page
session_param_set(req, 'websearch-last-query', req.unparsed_uri)
if wlqh_results_overlimit:
results_final_colls = None
# store list of results if user wants to display hits
# in a single list, or store list of collections of records
# if user displays hits split by collections:
session_param_set(req, 'websearch-last-query-hits', results_final_colls)
#if hosted_colls and (of.startswith("h") or of.startswith("x")):
if hosted_colls_actual_or_potential_results_p:
if hosted_colls_results:
# TODO: add a verbose message here
for result in hosted_colls_true_results:
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, results_final_nb[result[0][1].name],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, em=em))
req.write(print_hosted_results(url_and_engine=result[0], ln=ln, of=of, req=req, limit=rg, em=em))
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, results_final_nb[result[0][1].name],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1))
if hosted_colls_timeouts:
# TODO: add a verbose message here
# TODO: check if verbose messages still work when dealing with (re)calculations of timeouts
(hosted_colls_timeouts_results, hosted_colls_timeouts_timeouts) = do_calculate_hosted_collections_results(req, ln, None, verbose, None, hosted_colls_timeouts, CFG_HOSTED_COLLECTION_TIMEOUT_POST_SEARCH)
if hosted_colls_timeouts_results:
for result in hosted_colls_timeouts_results:
if result[1] == None or result[1] == False:
## these are the searches that returned no or zero results
## also print a nearest terms box, in case this is the only
## collection being searched and it returns no results?
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, -963,
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time))
req.write(print_hosted_results(url_and_engine=result[0], ln=ln, of=of, req=req, no_records_found=True, limit=rg, em=em))
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, -963,
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1))
else:
# these are the searches that actually returned results on time
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, result[1],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time))
req.write(print_hosted_results(url_and_engine=result[0], ln=ln, of=of, req=req, limit=rg, em=em))
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, result[0][1].name, result[1],
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1))
if hosted_colls_timeouts_timeouts:
for timeout in hosted_colls_timeouts_timeouts:
if of.startswith("h"):
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, timeout[1].name, -963,
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time))
req.write(print_hosted_results(url_and_engine=timeout[0], ln=ln, of=of, req=req, search_timed_out=True, limit=rg, em=em))
req.write(print_hosted_search_info(p, f, sf, so, sp, rm, of, ot, timeout[1].name, -963,
jrec, rg, aas, ln, p1, p2, p3, f1, f2, f3, m1, m2, m3, op1, op2,
sc, pl_in_url,
d1y, d1m, d1d, d2y, d2m, d2d, dt, cpu_time, 1))
print_records_epilogue(req, of)
if f == "author" and of.startswith("h"):
req.write(create_similarly_named_authors_link_box(p, ln))
def prs_log_query(kwargs=None, req=None, uid=None, of=None, ln=None, p=None, f=None,
colls_to_search=None, results_final_nb_total=None, em=None, **dummy):
# FIXME move query logging to signal receiver
# log query:
try:
from flask.ext.login import current_user
if req:
from flask import request
req = request
id_query = log_query(req.host,
'&'.join(map(lambda (k,v): k+'='+v, request.values.iteritems(multi=True))),
uid)
#id_query = log_query(req.remote_host, req.args, uid)
#of = request.values.get('of', 'hb')
if of.startswith("h") and id_query and (em == '' or EM_REPOSITORY["alert"] in em):
if of not in ['hcs', 'hcs2']:
# display alert/RSS teaser for non-summary formats:
display_email_alert_part = True
if current_user:
if current_user['email'] == 'guest':
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS > 4:
display_email_alert_part = False
else:
if not current_user['precached_usealerts']:
display_email_alert_part = False
from flask import flash
flash(websearch_templates.tmpl_alert_rss_teaser_box_for_query(id_query, \
ln=ln, display_email_alert_part=display_email_alert_part), 'search-results-after')
except:
# do not log query if req is None (used by CLI interface)
pass
log_query_info("ss", p, f, colls_to_search, results_final_nb_total)
def prs_search_common(kwargs=None, req=None, of=None, cc=None, ln=None, uid=None, _=None, p=None,
p1=None, p2=None, p3=None, colls_to_display=None, f=None, rg=None, sf=None,
so=None, sp=None, rm=None, ot=None, aas=None, f1=None, m1=None, op1=None,
f2=None, m2=None, op2=None, f3=None, m3=None, sc=None, pl=None,
d1y=None, d1m=None, d1d=None, d2y=None, d2m=None, d2d=None,
dt=None, jrec=None, ec=None, action=None, colls_to_search=None, wash_colls_debug=None,
verbose=None, wl=None, em=None, **dummy):
query_representation_in_cache = get_search_results_cache_key(**kwargs)
page_start(req, of, cc, aas, ln, uid, p=create_page_title_search_pattern_info(p, p1, p2, p3), em=em)
if of.startswith("h") and verbose and wash_colls_debug:
write_warning("wash_colls debugging info : %s" % wash_colls_debug, req=req)
prs_search_hosted_collections(kwargs=kwargs, **kwargs)
if of.startswith("h"):
req.write(create_search_box(cc, colls_to_display, p, f, rg, sf, so, sp, rm, of, ot, aas, ln, p1, f1, m1, op1,
p2, f2, m2, op2, p3, f3, m3, sc, pl, d1y, d1m, d1d, d2y, d2m, d2d, dt, jrec, ec, action,
em
))
t1 = os.times()[4]
results_in_any_collection = intbitset()
if aas == 1 or (p1 or p2 or p3):
## 3A - advanced search
output = prs_advanced_search(results_in_any_collection, kwargs=kwargs, **kwargs)
if output is not None:
return output
else:
## 3B - simple search
output = prs_simple_search(results_in_any_collection, kwargs=kwargs, **kwargs)
if output is not None:
return output
if len(results_in_any_collection) == 0 and not kwargs['hosted_colls_actual_or_potential_results_p']:
if of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
return None
# store this search query results into search results cache if needed:
prs_store_results_in_cache(query_representation_in_cache, results_in_any_collection, **kwargs)
# search stage 4 and 5: intersection with collection universe and sorting/limiting
try:
output = prs_intersect_with_colls_and_apply_search_limits(results_in_any_collection, kwargs=kwargs, **kwargs)
if output is not None:
return output
except Exception: # no results to display
return None
t2 = os.times()[4]
cpu_time = t2 - t1
kwargs['cpu_time'] = cpu_time
## search stage 6: display results:
return prs_display_results(kwargs=kwargs, **kwargs)
def prs_intersect_with_colls_and_apply_search_limits(results_in_any_collection,
kwargs=None, req=None, of=None, ln=None, _=None,
p=None, p1=None, p2=None, p3=None, f=None, cc=None, ec=None,
verbose=None, em=None, **dummy):
# search stage 4: intersection with collection universe:
results_final = {}
output = prs_intersect_results_with_collrecs(results_final, results_in_any_collection, kwargs, **kwargs)
if output is not None:
return output
# another external search if we still don't have something
if results_final == {} and not kwargs['hosted_colls_actual_or_potential_results_p']:
if of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
kwargs['results_final'] = results_final
raise Exception
# search stage 5: apply search option limits and restrictions:
output = prs_apply_search_limits(results_final, kwargs=kwargs, **kwargs)
kwargs['results_final'] = results_final
if output is not None:
return output
def prs_display_results(kwargs=None, results_final=None, req=None, of=None, sf=None,
so=None, sp=None, verbose=None, p=None, p1=None, p2=None, p3=None,
cc=None, ln=None, _=None, ec=None, colls_to_search=None, rm=None, cpu_time=None,
f=None, em=None, **dummy
):
## search stage 6: display results:
# split result set into collections
(results_final_nb, results_final_nb_total, results_final_for_all_selected_colls) = prs_split_into_collections(kwargs=kwargs, **kwargs)
# we continue past this point only if there is a hosted collection that has timed out and might offer potential results
if results_final_nb_total == 0 and not kwargs['hosted_colls_potential_results_p']:
if of.startswith("h"):
write_warning("No match found, please enter different search terms.", req=req)
elif of.startswith("x"):
# Print empty, but valid XML
print_records_prologue(req, of)
print_records_epilogue(req, of)
else:
prs_log_query(kwargs=kwargs, **kwargs)
# yes, some hits found: good!
# collection list may have changed due to not-exact-match-found policy so check it out:
for coll in results_final.keys():
if coll not in colls_to_search:
colls_to_search.append(coll)
# print results overview:
if of == "intbitset":
#return the result as an intbitset
return results_final_for_all_selected_colls
elif of == "id":
# we have been asked to return list of recIDs
recIDs = list(results_final_for_all_selected_colls)
if rm: # do we have to rank?
results_final_for_all_colls_rank_records_output = rank_records(req, rm, 0, results_final_for_all_selected_colls,
string.split(p) + string.split(p1) +
string.split(p2) + string.split(p3), verbose, so, of, ln, kwargs['rg'], kwargs['jrec'], kwargs['f'])
if results_final_for_all_colls_rank_records_output[0]:
recIDs = results_final_for_all_colls_rank_records_output[0]
elif sf or (CFG_BIBSORT_BUCKETS and sorting_methods): # do we have to sort?
recIDs = sort_records(req, recIDs, sf, so, sp, verbose, of, ln)
return recIDs
elif of.startswith("h"):
if of not in ['hcs', 'hcs2']:
# added the hosted_colls_potential_results_p parameter to help print out the overview more accurately
req.write(print_results_overview(colls_to_search, results_final_nb_total, results_final_nb, cpu_time,
ln, ec, hosted_colls_potential_results_p=kwargs['hosted_colls_potential_results_p'], em=em))
kwargs['selected_external_collections_infos'] = print_external_results_overview(req, cc, [p, p1, p2, p3],
f, ec, verbose, ln, print_overview=em == "" or EM_REPOSITORY["overview"] in em)
# print number of hits found for XML outputs:
if of.startswith("x") or of == 'mobb':
req.write("<!-- Search-Engine-Total-Number-Of-Results: %s -->\n" % kwargs['results_final_nb_total'])
# print records:
if of in ['hcs', 'hcs2']:
prs_summarize_records(kwargs=kwargs, **kwargs)
else:
prs_print_records(kwargs=kwargs, **kwargs)
# this is a copy of the prs_display_results with output parts removed, needed for external modules
def prs_rank_results(kwargs=None, results_final=None, req=None, colls_to_search=None,
sf=None, so=None, sp=None, of=None, rm=None, p=None, p1=None, p2=None, p3=None,
verbose=None, **dummy
):
## search stage 6: display results:
# split result set into collections
(results_final_nb, results_final_nb_total, results_final_for_all_selected_colls) = prs_split_into_collections(kwargs=kwargs, **kwargs)
# yes, some hits found: good!
# collection list may have changed due to not-exact-match-found policy so check it out:
for coll in results_final.keys():
if coll not in colls_to_search:
colls_to_search.append(coll)
# we have been asked to return list of recIDs
recIDs = list(results_final_for_all_selected_colls)
if rm: # do we have to rank?
results_final_for_all_colls_rank_records_output = rank_records(req, rm, 0, results_final_for_all_selected_colls,
string.split(p) + string.split(p1) +
string.split(p2) + string.split(p3), verbose, so, of, field=kwargs['f'])
if results_final_for_all_colls_rank_records_output[0]:
recIDs = results_final_for_all_colls_rank_records_output[0]
elif sf or (CFG_BIBSORT_BUCKETS and sorting_methods): # do we have to sort?
recIDs = sort_records(req, recIDs, sf, so, sp, verbose, of)
return recIDs
def perform_request_cache(req, action="show"):
"""Manipulates the search engine cache."""
req.content_type = "text/html"
req.send_http_header()
req.write("<html>")
out = ""
out += "<h1>Search Cache</h1>"
req.write(out)
# show collection reclist cache:
out = "<h3>Collection reclist cache</h3>"
out += "- collection table last updated: %s" % get_table_update_time('collection')
out += "<br />- reclist cache timestamp: %s" % collection_reclist_cache.timestamp
out += "<br />- reclist cache contents:"
out += "<blockquote>"
for coll in collection_reclist_cache.cache.keys():
if collection_reclist_cache.cache[coll]:
out += "%s (%d)<br />" % (coll, len(collection_reclist_cache.cache[coll]))
out += "</blockquote>"
req.write(out)
# show field i18nname cache:
out = "<h3>Field I18N names cache</h3>"
out += "- fieldname table last updated: %s" % get_table_update_time('fieldname')
out += "<br />- i18nname cache timestamp: %s" % field_i18nname_cache.timestamp
out += "<br />- i18nname cache contents:"
out += "<blockquote>"
for field in field_i18nname_cache.cache.keys():
for ln in field_i18nname_cache.cache[field].keys():
out += "%s, %s = %s<br />" % (field, ln, field_i18nname_cache.cache[field][ln])
out += "</blockquote>"
req.write(out)
# show collection i18nname cache:
out = "<h3>Collection I18N names cache</h3>"
out += "- collectionname table last updated: %s" % get_table_update_time('collectionname')
out += "<br />- i18nname cache timestamp: %s" % collection_i18nname_cache.timestamp
out += "<br />- i18nname cache contents:"
out += "<blockquote>"
for coll in collection_i18nname_cache.cache.keys():
for ln in collection_i18nname_cache.cache[coll].keys():
out += "%s, %s = %s<br />" % (coll, ln, collection_i18nname_cache.cache[coll][ln])
out += "</blockquote>"
req.write(out)
req.write("</html>")
return "\n"
def perform_request_log(req, date=""):
"""Display search log information for given date."""
req.content_type = "text/html"
req.send_http_header()
req.write("<html>")
req.write("<h1>Search Log</h1>")
if date: # case A: display stats for a day
yyyymmdd = string.atoi(date)
req.write("<p><big><strong>Date: %d</strong></big></p>" % yyyymmdd)
req.write("""<table border="1">""")
req.write("<tr><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>" % ("No.", "Time", "Pattern", "Field", "Collection", "Number of Hits"))
# read file:
p = os.popen("grep ^%d %s/search.log" % (yyyymmdd, CFG_LOGDIR), 'r')
lines = p.readlines()
p.close()
# process lines:
i = 0
for line in lines:
try:
datetime, dummy_aas, p, f, c, nbhits = string.split(line,"#")
i += 1
req.write("<tr><td align=\"right\">#%d</td><td>%s:%s:%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>" \
% (i, datetime[8:10], datetime[10:12], datetime[12:], p, f, c, nbhits))
except:
pass # ignore any malformed log lines
req.write("</table>")
else: # case B: display summary stats per day
yyyymm01 = int(time.strftime("%Y%m01", time.localtime()))
yyyymmdd = int(time.strftime("%Y%m%d", time.localtime()))
req.write("""<table border="1">""")
req.write("<tr><td><strong>%s</strong></td><td><strong>%s</strong></td></tr>" % ("Day", "Number of Queries"))
for day in range(yyyymm01, yyyymmdd + 1):
p = os.popen("grep -c ^%d %s/search.log" % (day, CFG_LOGDIR), 'r')
for line in p.readlines():
req.write("""<tr><td>%s</td><td align="right"><a href="%s/search/log?date=%d">%s</a></td></tr>""" % \
(day, CFG_SITE_URL, day, line))
p.close()
req.write("</table>")
req.write("</html>")
return "\n"
def get_all_field_values(tag):
"""
Return all existing values stored for a given tag.
@param tag: the full tag, e.g. 909C0b
@type tag: string
@return: the list of values
@rtype: list of strings
"""
table = 'bib%02dx' % int(tag[:2])
return [row[0] for row in run_sql("SELECT DISTINCT(value) FROM %s WHERE tag=%%s" % table, (tag, ))]
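The table-name derivation used above can be illustrated on its own: the first two digits of the MARC tag select one of the `bibXXx` tables (`tag_to_table` is a hypothetical helper name for illustration):

```python
def tag_to_table(tag):
    """Map a full MARC tag like '909C0b' to its bibXXx storage table name."""
    # the first two digits of the tag pick the table, e.g. '90' -> 'bib90x'
    return 'bib%02dx' % int(tag[:2])
```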
def get_most_popular_field_values(recids, tags, exclude_values=None, count_repetitive_values=True, split_by=0):
"""
Analyze RECIDS and look for TAGS and return most popular values
and the frequency with which they occur sorted according to
descending frequency.
If a value is found in EXCLUDE_VALUES, then do not count it.
If COUNT_REPETITIVE_VALUES is True, then we count every occurrence
of value in the tags. If False, then we count the value only once
regardless of the number of times it may appear in a record.
(But, if the same value occurs in another record, we count it, of
course.)
@return: list of tuples containing tag and its frequency
Example:
>>> get_most_popular_field_values(range(11,20), '980__a')
[('PREPRINT', 10), ('THESIS', 7), ...]
>>> get_most_popular_field_values(range(11,20), ('100__a', '700__a'))
[('Ellis, J', 10), ('Ellis, N', 7), ...]
>>> get_most_popular_field_values(range(11,20), ('100__a', '700__a'), ('Ellis, J',))
[('Ellis, N', 7), ...]
"""
def _get_most_popular_field_values_helper_sorter(val1, val2):
"""Compare VAL1 and VAL2 according to, firstly, frequency, then
secondly, alphabetically."""
compared_via_frequencies = cmp(valuefreqdict[val2],
valuefreqdict[val1])
if compared_via_frequencies == 0:
return cmp(val1.lower(), val2.lower())
else:
return compared_via_frequencies
valuefreqdict = {}
## sanity check:
if not exclude_values:
exclude_values = []
if isinstance(tags, str):
tags = (tags,)
## find values to count:
vals_to_count = []
displaytmp = {}
if count_repetitive_values:
# counting technique A: can look up many records at once: (very fast)
for tag in tags:
vals_to_count.extend(get_fieldvalues(recids, tag, sort=False,
split_by=split_by))
else:
# counting technique B: must count record-by-record: (slow)
for recid in recids:
vals_in_rec = []
for tag in tags:
for val in get_fieldvalues(recid, tag, False):
vals_in_rec.append(val)
# do not count repetitive values within this record
# (even across various tags, so need to unify again):
dtmp = {}
for val in vals_in_rec:
dtmp[val.lower()] = 1
displaytmp[val.lower()] = val
vals_in_rec = dtmp.keys()
vals_to_count.extend(vals_in_rec)
## are we to exclude some of found values?
for val in vals_to_count:
if val not in exclude_values:
if val in valuefreqdict:
valuefreqdict[val] += 1
else:
valuefreqdict[val] = 1
## sort by descending frequency of values:
if not CFG_NUMPY_IMPORTABLE:
## original version
out = []
vals = valuefreqdict.keys()
vals.sort(_get_most_popular_field_values_helper_sorter)
for val in vals:
tmpdisplv = ''
if val in displaytmp:
tmpdisplv = displaytmp[val]
else:
tmpdisplv = val
out.append((tmpdisplv, valuefreqdict[val]))
return out
else:
f = [] # frequencies
n = [] # original names
ln = [] # lowercased names
## build lists within one iteration
for (val, freq) in valuefreqdict.iteritems():
f.append(-1 * freq)
if val in displaytmp:
n.append(displaytmp[val])
else:
n.append(val)
ln.append(val.lower())
## sort by frequency (desc) and then by lowercased name.
return [(n[i], -1 * f[i]) for i in numpy.lexsort([ln, f])]
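The core ordering used by both branches above (descending frequency, then ascending lowercased value) can be sketched compactly with a sort key instead of a comparator or `numpy.lexsort` (hypothetical helper name `most_popular`, shown without the repetitive-value and display-case handling):

```python
def most_popular(values, exclude=()):
    """Count values and sort by descending frequency, then alphabetically."""
    freqs = {}
    for val in values:
        if val not in exclude:
            freqs[val] = freqs.get(val, 0) + 1
    # negate the count so higher frequencies sort first; ties break on
    # the lowercased value, ascending
    return sorted(freqs.items(), key=lambda item: (-item[1], item[0].lower()))
```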
def profile(p="", f="", c=CFG_SITE_NAME):
"""Profile search time."""
import profile
import pstats
profile.run("perform_request_search(p='%s',f='%s', c='%s')" % (p, f, c), "perform_request_search_profile")
p = pstats.Stats("perform_request_search_profile")
p.strip_dirs().sort_stats("cumulative").print_stats()
return 0
def perform_external_collection_search_with_em(req, current_collection, pattern_list, field,
external_collection, verbosity_level=0, lang=CFG_SITE_LANG,
selected_external_collections_infos=None, em=""):
perform_external_collection_search(req, current_collection, pattern_list, field, external_collection,
verbosity_level, lang, selected_external_collections_infos,
print_overview=em == "" or EM_REPOSITORY["overview"] in em,
print_search_info=em == "" or EM_REPOSITORY["search_info"] in em,
print_see_also_box=em == "" or EM_REPOSITORY["see_also_box"] in em,
print_body=em == "" or EM_REPOSITORY["body"] in em)
@cache.memoize(timeout=5)
def get_fulltext_terms_from_search_pattern(search_pattern):
keywords = []
if search_pattern is not None:
for unit in create_basic_search_units(None, search_pattern.encode('utf-8'), None):
bsu_o, bsu_p, bsu_f, bsu_m = unit[0], unit[1], unit[2], unit[3]
if (bsu_o != '-' and bsu_f in [None, 'fulltext']):
if bsu_m == 'a' and bsu_p.startswith('%') and bsu_p.endswith('%'):
# remove leading and trailing `%' representing partial phrase search
keywords.append(bsu_p[1:-1])
else:
keywords.append(bsu_p)
return keywords
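The `%...%` handling above can be shown in isolation: a pattern wrapped in percent signs marks a partial-phrase search, and only the inner text is kept as a keyword (hypothetical helper name `strip_partial_phrase`; the real function additionally checks the search unit's match type and field):

```python
def strip_partial_phrase(term):
    """Drop the leading/trailing '%' that marks a partial-phrase search."""
    if len(term) >= 2 and term.startswith('%') and term.endswith('%'):
        return term[1:-1]
    return term
```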
diff --git a/invenio/legacy/webalert/webinterface.py b/invenio/legacy/webalert/webinterface.py
index 9785734f7..1d408c1cc 100644
--- a/invenio/legacy/webalert/webinterface.py
+++ b/invenio/legacy/webalert/webinterface.py
@@ -1,556 +1,556 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""PERSONAL FEATURES - YOUR ALERTS"""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
from invenio.config import CFG_SITE_SECURE_URL, CFG_SITE_NAME, \
CFG_ACCESS_CONTROL_LEVEL_SITE, CFG_SITE_NAME_INTL
from invenio.legacy.webpage import page
from invenio import webalert
from invenio.legacy.webuser import getUid, page_not_authorized, isGuestUser
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import redirect_to_url, make_canonical_urlargd
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
from invenio.ext.logging import register_exception
from invenio.legacy.webuser import collect_user_info
from invenio.base.i18n import gettext_set_language
import invenio.legacy.template
webalert_templates = invenio.legacy.template.load('webalert')
class WebInterfaceYourAlertsPages(WebInterfaceDirectory):
"""Defines the set of /youralerts pages."""
_exports = ['', 'display', 'input', 'modify', 'list', 'add',
'update', 'remove']
def index(self, req, dummy):
"""Index page."""
redirect_to_url(req, '%s/youralerts/list' % CFG_SITE_SECURE_URL)
def display(self, req, form):
"""Display search history page. A misnomer."""
argd = wash_urlargd(form, {'p': (str, "n")
})
uid = getUid(req)
# load the right language
_ = gettext_set_language(argd['ln'])
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/display" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/display%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
if argd['p'] == 'y':
_title = _("Popular Searches")
else:
_title = _("Your Searches")
# register event in webstat
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["display", "", user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_title,
body=webalert.perform_display(argd['p'], uid, ln=argd['ln']),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Display searches") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts',
secure_page_p=1)
def input(self, req, form):
argd = wash_urlargd(form, {'idq': (int, None),
'name': (str, ""),
'freq': (str, "week"),
'notif': (str, "y"),
'idb': (int, 0),
'error_msg': (str, ""),
})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/input" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/input%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
try:
html = webalert.perform_input_alert("add", argd['idq'], argd['name'], argd['freq'],
argd['notif'], argd['idb'], uid, ln=argd['ln'])
except webalert.AlertError, msg:
return page(title=_("Error"),
body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
if argd['error_msg'] != "":
html = webalert_templates.tmpl_errorMsg(
ln = argd['ln'],
error_msg = argd['error_msg'],
rest = html,
)
# register event in webstat
alert_str = "%s (%d)" % (argd['name'], argd['idq'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["input", alert_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_("Set a new alert"),
body=html,
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
def modify(self, req, form):
argd = wash_urlargd(form, {'idq': (int, None),
'old_idb': (int, None),
'name': (str, ""),
'freq': (str, "week"),
'notif': (str, "y"),
'idb': (int, 0),
'error_msg': (str, ""),
})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/modify" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/modify%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
try:
html = webalert.perform_input_alert("update", argd['idq'], argd['name'], argd['freq'],
argd['notif'], argd['idb'], uid, argd['old_idb'], ln=argd['ln'])
except webalert.AlertError, msg:
return page(title=_("Error"),
body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
if argd['error_msg'] != "":
html = webalert_templates.tmpl_errorMsg(
ln = argd['ln'],
error_msg = argd['error_msg'],
rest = html,
)
# register event in webstat
alert_str = "%s (%d)" % (argd['name'], argd['idq'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["modify", alert_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_("Modify alert settings"),
body=html,
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Modify alert settings") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
def list(self, req, form):
argd = wash_urlargd(form, {})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/list" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/list%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
# register event in webstat
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["list", "", user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_("Your Alerts"),
body=webalert.perform_list_alerts(uid, ln = argd['ln']),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
def add(self, req, form):
argd = wash_urlargd(form, {'idq': (int, None),
'name': (str, None),
'freq': (str, None),
'notif': (str, None),
'idb': (int, None),
})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/add" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/add%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
try:
html = webalert.perform_add_alert(argd['name'], argd['freq'], argd['notif'],
argd['idb'], argd['idq'], uid, ln=argd['ln'])
except webalert.AlertError, msg:
return page(title=_("Error"),
body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
# register event in webstat
alert_str = "%s (%d)" % (argd['name'], argd['idq'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["add", alert_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_("Display alerts"),
body=html,
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
def update(self, req, form):
argd = wash_urlargd(form, {'name': (str, None),
'freq': (str, None),
'notif': (str, None),
'idb': (int, None),
'idq': (int, None),
'old_idb': (int, None),
})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/update" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/update%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
try:
html = webalert.perform_update_alert(argd['name'], argd['freq'], argd['notif'],
argd['idb'], argd['idq'], argd['old_idb'], uid, ln=argd['ln'])
except webalert.AlertError, msg:
return page(title=_("Error"),
body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
# register event in webstat
alert_str = "%s (%d)" % (argd['name'], argd['idq'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["update", alert_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title=_("Display alerts"),
body=html,
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
def remove(self, req, form):
argd = wash_urlargd(form, {'name': (str, None),
'idq': (int, None),
'idb': (int, None),
})
uid = getUid(req)
if CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/youralerts/remove" % \
(CFG_SITE_SECURE_URL,),
navmenuid="youralerts")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/youralerts/remove%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
# load the right language
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usealerts']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use alerts."))
try:
html = webalert.perform_remove_alert(argd['name'], argd['idq'],
argd['idb'], uid, ln=argd['ln'])
except webalert.AlertError, msg:
return page(title=_("Error"),
body=webalert_templates.tmpl_errorMsg(ln=argd['ln'], error_msg=msg),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Set a new alert") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
# register event in webstat
alert_str = "%s (%d)" % (argd['name'], argd['idq'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("alerts", ["remove", alert_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
# display success
return page(title=_("Display alerts"),
body=html,
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s Personalize, Display alerts") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts')
diff --git a/invenio/legacy/webauthorprofile/cli.py b/invenio/legacy/webauthorprofile/cli.py
index 8eb652dd2..c003ba204 100644
--- a/invenio/legacy/webauthorprofile/cli.py
+++ b/invenio/legacy/webauthorprofile/cli.py
@@ -1,40 +1,40 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
webauthorprofile_cli
This module provides a command-line interface for WebAuthorProfile.
"""
-from invenio import bibauthorid_config as bconfig
+from invenio.legacy.bibauthorid import config as bconfig
def main():
"""Main function """
try:
from invenio import webauthorprofile_daemon as daemon
except ImportError:
bconfig.LOGGER.error("Could not import the WebAuthorProfile daemon module.")
return
daemon.webauthorprofile_daemon()
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/webauthorprofile/daemon.py b/invenio/legacy/webauthorprofile/daemon.py
index a46006f48..b83b90818 100644
--- a/invenio/legacy/webauthorprofile/daemon.py
+++ b/invenio/legacy/webauthorprofile/daemon.py
@@ -1,132 +1,132 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebAuthorProfile daemon
"""
import sys
from invenio import bibtask
-from invenio.bibauthorid_dbinterface import get_existing_personids
+from invenio.legacy.bibauthorid.dbinterface import get_existing_personids
from invenio.webauthorprofile_dbapi import get_expired_person_ids
from invenio.webauthorprofile_corefunctions import _compute_cache_for_person
def webauthorprofile_daemon():
"""Constructs the webauthorprofile bibtask."""
bibtask.task_init(authorization_action='runbibclassify',
authorization_msg="WebAuthorProfile Task Submission",
description="""
Purpose:
Precompute WebAuthorProfile caches.
Examples:
$webauthorprofile -u admin --all
""",
help_specific_usage="""
webauthorprofile [OPTIONS]
OPTIONS
Options for update personid
(default) Computes all caches for all persons with at least one expired cache
--all Computes all caches for all persons
--mp Enables multiprocessing computation
""",
version="Invenio WebAuthorProfile v 1.0",
specific_params=("i:",
[
"all",
"mp"
]),
task_submit_elaborate_specific_parameter_fnc=_task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=_task_submit_check_options,
task_run_fnc=_task_run_core)
def _task_submit_elaborate_specific_parameter(key, value, opts, args):
"""
Given the string key, it checks its meaning, possibly using the
value. Usually it fills in some key in the options dict.
It must return True if it has elaborated the key, or False if it
does not know that key.
"""
if key in ("--all",):
bibtask.task_set_option("all_pids", True)
elif key in ("--mp",):
bibtask.task_set_option("mp", True)
else:
return False
return True
def _task_run_core():
"""
Runs the requested task in the bibsched environment.
"""
all_pids = bibtask.task_get_option('all_pids', False)
mp = bibtask.task_get_option('mp', False)
if all_pids:
pids = list(get_existing_personids(with_papers_only=True))
else:
pids = get_expired_person_ids()
if mp:
compute_cache_mp(pids)
else:
compute_cache(pids)
return 1
def _task_submit_check_options():
"""
Required by bibtask. Checks the options.
"""
return True
def compute_cache(pids):
bibtask.write_message("WebAuthorProfile: %s persons to go" % len(pids),
stream=sys.stdout, verbose=0)
for i, p in enumerate(pids):
bibtask.write_message("WebAuthorProfile: doing %s out of %s" % (i + 1, len(pids)))
bibtask.task_update_progress("WebAuthorProfile: doing %s out of %s" % (i + 1, len(pids)))
_compute_cache_for_person(p)
bibtask.task_sleep_now_if_required(can_stop_too=True)
def compute_cache_mp(pids):
from multiprocessing import Pool
p = Pool()
bibtask.write_message("WebAuthorProfileMP: %s persons to go" % len(pids),
stream=sys.stdout, verbose=0)
sl = 100
ss = [pids[i: i + sl] for i in range(0, len(pids), sl)]
for i, bunch in enumerate(ss):
bibtask.write_message("WebAuthorProfileMP: doing bunch %s out of %s" % (str(i + 1), len(ss)))
bibtask.task_update_progress("WebAuthorProfileMP: doing bunch %s out of %s" % (str(i + 1), len(ss)))
p.map(_compute_cache_for_person, bunch)
bibtask.task_sleep_now_if_required(can_stop_too=True)
diff --git a/invenio/legacy/webauthorprofile/webinterface.py b/invenio/legacy/webauthorprofile/webinterface.py
index 3b6391011..123ab8c73 100644
--- a/invenio/legacy/webauthorprofile/webinterface.py
+++ b/invenio/legacy/webauthorprofile/webinterface.py
@@ -1,503 +1,503 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebAuthorProfile Web Interface Logic and URL handler."""
# pylint: disable=W0105
# pylint: disable=C0301
# pylint: disable=W0613
import sys
from operator import itemgetter
from invenio.bibauthorid_webauthorprofileinterface import is_valid_canonical_id, \
get_person_id_from_paper, get_person_id_from_canonical_id, \
search_person_ids_by_name, get_papers_by_person_id, get_person_redirect_link
from invenio.webauthorprofile_corefunctions import get_pubs, get_person_names_dicts, \
get_institute_pub_dict, get_coauthors, get_summarize_records, \
get_total_downloads, get_cited_by_list, get_kwtuples, get_venuetuples, \
get_veryfy_my_pubs_list_link, get_hepnames_data, get_self_pubs, \
get_collabtuples, get_person_oldest_date, expire_all_cache_for_person
-#from invenio.bibauthorid_config import EXTERNAL_CLAIMED_RECORDS_KEY
+#from invenio.legacy.bibauthorid.config import EXTERNAL_CLAIMED_RECORDS_KEY
from invenio.config import CFG_SITE_LANG
from invenio.config import CFG_SITE_URL
from invenio.config import CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID
from invenio.legacy.webpage import pageheaderonly
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import redirect_to_url
from invenio.utils.json import json_unicode_to_utf8
import datetime
import invenio.legacy.template
websearch_templates = invenio.legacy.template.load('websearch')
webauthorprofile_templates = invenio.legacy.template.load('webauthorprofile')
bibauthorid_template = invenio.legacy.template.load('bibauthorid')
from invenio.legacy.search_engine import page_end
JSON_OK = False
if sys.hexversion < 0x2060000:
try:
import simplejson as json
JSON_OK = True
except ImportError:
# Okay, no Ajax app will be possible, but continue anyway,
# since this package is only recommended, not mandatory.
JSON_OK = False
else:
try:
import json
JSON_OK = True
except ImportError:
JSON_OK = False
#tag constants
AUTHOR_TAG = "100__a"
AUTHOR_INST_TAG = "100__u"
COAUTHOR_TAG = "700__a"
COAUTHOR_INST_TAG = "700__u"
VENUE_TAG = "909C4p"
KEYWORD_TAG = "695__a"
FKEYWORD_TAG = "6531_a"
CFG_INSPIRE_UNWANTED_KEYWORDS_START = ['talk',
'conference',
'conference proceedings',
'numerical calculations',
'experimental results',
'review',
'bibliography',
'upper limit',
'lower limit',
'tables',
'search for',
'on-shell',
'off-shell',
'formula',
'lectures',
'book',
'thesis']
CFG_INSPIRE_UNWANTED_KEYWORDS_MIDDLE = ['GeV',
'((']
RECOMPUTE_ALLOWED_DELAY = datetime.timedelta(minutes=30)
class WebAuthorPages(WebInterfaceDirectory):
"""
Handle webauthorpages. /author/
"""
_exports = ['']
def _lookup(self, component, path):
"""
This handler parses dynamic URLs:
- /person/1332 shows the page of person 1332
- /person/100:5522,1431 shows the page of the person
identified by the table:bibref,bibrec pair
"""
if not component in self._exports:
return WebAuthorPages(component), path
def __init__(self, person_id=None):
"""
Constructor of the web interface.
@param person_id: The identifier of a user. Can be one of:
- a bibref: e.g. "100:1442,155"
- a person id: e.g. "14"
- a canonical id: e.g. "Ellis_J_1"
@type person_id: string
@return: will return an empty object if the identifier is of wrong type
@rtype: None (if something is not right)
"""
self.person_id = None
self.cid = None
self.original_search_parameter = person_id
if not CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID:
return
if (not person_id) or (not isinstance(person_id, str)):
return
try:
self.person_id = int(person_id)
self.cid = get_person_redirect_link(self.person_id)
return
except (TypeError, ValueError):
pass
try:
self.person_id = int(get_person_id_from_canonical_id(person_id))
if self.person_id < 0:
if is_valid_canonical_id(person_id):
self.cid = None
return
else:
raise ValueError
self.cid = get_person_redirect_link(self.person_id)
return
except (ValueError, TypeError):
pass
fail_bibrecref = False
if person_id.count(":") and person_id.count(","):
bibref = person_id
table, ref, bibrec = None, None, None
if not bibref.count(":"):
fail_bibrecref = True
if not bibref.count(","):
fail_bibrecref = True
try:
table = bibref.split(":")[0]
ref = bibref.split(":")[1].split(",")[0]
bibrec = bibref.split(":")[1].split(",")[1]
except IndexError:
fail_bibrecref = True
try:
table = int(table)
ref = int(ref)
bibrec = int(bibrec)
except (ValueError, TypeError):
fail_bibrecref = True
try:
pid = int(get_person_id_from_paper(person_id))
except (ValueError, TypeError):
fail_bibrecref = True
if not fail_bibrecref:
self.person_id = pid
self.cid = get_person_redirect_link(self.person_id)
return
self.person_id = -1
#self.person_id can be:
# -1 if not valid personid
def index(self, req, form):
'''
Serve the main person page.
Will use the object's person id to get a person's information.
@param req: Apache Request Object
@type req: Apache Request Object
@param form: Parameters sent via GET or POST request
@type form: dict
@return: a full page formatted in HTML
@rtype: string
'''
argd = wash_urlargd(form,
{'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0),
'recid': (int, -1),
'recompute': (int, 0)
})
ln = argd['ln']
expire_cache = False
if 'recompute' in argd and argd['recompute']:
expire_cache = True
if CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID:
try:
self.person_id = int(self.person_id)
except (TypeError, ValueError):
#In any case, if the parameter is invalid, go to a person search page
self.person_id = -1
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
if self.person_id < 0:
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
else:
self.person_id = self.original_search_parameter
if form.has_key('jsondata'):
req.content_type = "application/json"
self.create_authorpage_websearch(req, form, self.person_id, ln)
return
else:
req.content_type = "text/html"
req.send_http_header()
metaheaderadd = '<script type="text/javascript" src="%s/js/webauthorprofile.js"> </script>' % (CFG_SITE_URL)
metaheaderadd += """
<style>
.hidden {
display: none;
}
</style>
"""
title_message = "Author Publication Profile Page"
req.write(pageheaderonly(req=req, title=title_message,
metaheaderadd=metaheaderadd, language=ln))
req.write(websearch_templates.tmpl_search_pagestart(ln=ln))
self.create_authorpage_websearch(req, form, self.person_id, ln, expire_cache)
return page_end(req, 'hb', ln)
def __call__(self, req, form):
'''
Serve the main person page.
Will use the object's person id to get a person's information.
@param req: Apache Request Object
@type req: Apache Request Object
@param form: Parameters sent via GET or POST request
@type form: dict
@return: a full page formatted in HTML
@rtype: string
'''
argd = wash_urlargd(form,
{'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0),
'recid': (int, -1)
})
recid = argd['recid']
if not CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID:
return self.index(req, form)
if self.cid:
return redirect_to_url(req, '%s/author/%s/' % (CFG_SITE_URL, self.cid))
elif self.person_id and self.person_id >= 0:
return redirect_to_url(req, '%s/author/%s/' % (CFG_SITE_URL, self.person_id))
elif self.person_id and recid > -1:
#we got something different from person_id, canonical name or bibrefrec pair.
#try to figure out a personid
argd = wash_urlargd(form,
{'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0),
'recid': (int, -1)
})
recid = argd['recid']
if not recid:
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
# req.write("Not enough search parameters %s"%
# str(self.original_search_parameter))
nquery = self.original_search_parameter
sorted_results = search_person_ids_by_name(nquery)
test_results = None
authors = []
for results in sorted_results:
pid = results[0]
authorpapers = get_papers_by_person_id(pid, -1)
authorpapers = sorted(authorpapers, key=itemgetter(0),
reverse=True)
if (recid and
not (str(recid) in [row[0] for row in authorpapers])):
continue
authors.append([results[0], results[1],
authorpapers[0:4]])
test_results = authors
if len(test_results) == 1:
self.person_id = test_results[0][0]
self.cid = get_person_redirect_link(self.person_id)
if self.cid and self.person_id > -1:
redirect_to_url(req, '%s/author/%s/' % (CFG_SITE_URL, self.cid))
elif self.person_id > -1:
redirect_to_url(req, '%s/author/%s/' % (CFG_SITE_URL, self.person_id))
else:
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
#req.write("Could not determine personID from bibrec. What to do here? %s"%
#str(self.original_search_parameter))
else:
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
#req.write("Could not determine personID from bibrec. What to do here 2? %s"%
# (str(self.original_search_parameter),str(recid)))
else:
return redirect_to_url(req, "%s/person/search?q=%s" %
(CFG_SITE_URL, self.original_search_parameter))
# req.write("Search param %s does not represent a valid person, please correct your query"%
#(str(self.original_search_parameter),))
def create_authorpage_websearch(self, req, form, person_id, ln='en', expire_cache=False):
recompute_allowed = True
oldest_cache_date = get_person_oldest_date(person_id)
if oldest_cache_date:
delay = datetime.datetime.now() - oldest_cache_date
if delay > RECOMPUTE_ALLOWED_DELAY:
if expire_cache:
recompute_allowed = False
expire_all_cache_for_person(person_id)
else:
recompute_allowed = False
if CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID:
if person_id < 0:
return ("Critical Error. PersonID should never be less than 0!")
pubs, pubsStatus = get_pubs(person_id)
if not pubs:
pubs = []
selfpubs, selfpubsStatus = get_self_pubs(person_id)
if not selfpubs:
selfpubs = []
namesdict, namesdictStatus = get_person_names_dicts(person_id)
if not namesdict:
namesdict = {}
try:
authorname = namesdict['longest']
db_names_dict = namesdict['db_names_dict']
except (IndexError, KeyError):
authorname = 'None'
db_names_dict = {}
#author_aff_pubs, author_aff_pubsStatus = (None, None)
author_aff_pubs, author_aff_pubsStatus = get_institute_pub_dict(person_id)
if not author_aff_pubs:
author_aff_pubs = {}
coauthors, coauthorsStatus = get_coauthors(person_id)
if not coauthors:
coauthors = {}
summarize_records, summarize_recordsStatus = get_summarize_records(person_id, 'hcs', ln)
if not summarize_records:
summarize_records = 'None'
totaldownloads, totaldownloadsStatus = get_total_downloads(person_id)
if not totaldownloads:
totaldownloads = 0
citedbylist, citedbylistStatus = get_cited_by_list(person_id)
if not citedbylist:
citedbylist = 'None'
kwtuples, kwtuplesStatus = get_kwtuples(person_id)
if kwtuples:
pass
#kwtuples = kwtuples[0:MAX_KEYWORD_LIST]
else:
kwtuples = []
collab, collabStatus = get_collabtuples(person_id)
vtuples, venuetuplesStatus = get_venuetuples(person_id)
if vtuples:
pass
#vtuples = venuetuples[0:MAX_VENUE_LIST]
else:
vtuples = str(vtuples)
person_link, person_linkStatus = get_veryfy_my_pubs_list_link(person_id)
if not person_link or not person_linkStatus:
bibauthorid_data = {"is_baid": True, "pid":person_id, "cid": None}
person_link = str(person_id)
else:
bibauthorid_data = {"is_baid": True, "pid":person_id, "cid": person_link}
hepdict, hepdictStatus = get_hepnames_data(person_id)
oldest_cache_date = get_person_oldest_date(person_id)
#req.write("\nPAGE CONTENT START\n")
#req.write(str(time.time()))
#eval = [not_empty(x) or y for x, y in
beval = [y for _, y in
[(authorname, namesdictStatus) ,
(totaldownloads, totaldownloadsStatus),
(author_aff_pubs, author_aff_pubsStatus),
(citedbylist, citedbylistStatus),
(kwtuples, kwtuplesStatus),
(coauthors, coauthorsStatus),
(vtuples, venuetuplesStatus),
(db_names_dict, namesdictStatus),
(person_link, person_linkStatus),
(summarize_records, summarize_recordsStatus),
(pubs, pubsStatus),
(hepdict, hepdictStatus),
(selfpubs, selfpubsStatus),
(collab, collabStatus)]]
#not_complete = False in eval
#req.write(str(eval))
if form.has_key('jsondata'):
json_response = {'boxes_info': {}}
json_data = json.loads(str(form['jsondata']))
json_data = json_unicode_to_utf8(json_data)
# loop to check which boxes need content
json_response['boxes_info'].update({'name_variants': {'status':beval[0], 'html_content': webauthorprofile_templates.tmpl_author_name_variants_box(req, db_names_dict, bibauthorid_data, ln, add_box=False, loading=not beval[0])}})
json_response['boxes_info'].update({'combined_papers': {'status':(beval[3] and beval[12]), 'html_content': webauthorprofile_templates.tmpl_papers_with_self_papers_box(req, pubs, selfpubs, bibauthorid_data, totaldownloads, ln, add_box=False, loading=not beval[3])}})
#json_response['boxes_info'].update({'papers': {'status':beval[3], 'html_content': webauthorprofile_templates.tmpl_papers_box(req, pubs, bibauthorid_data, totaldownloads, ln, add_box=False, loading=not beval[3])}})
json_response['boxes_info'].update({'selfpapers': {'status':beval[12], 'html_content': webauthorprofile_templates.tmpl_self_papers_box(req, selfpubs, bibauthorid_data, totaldownloads, ln, add_box=False, loading=not beval[12])}})
json_response['boxes_info'].update({'keywords': {'status':beval[4], 'html_content': webauthorprofile_templates.tmpl_keyword_box(kwtuples, bibauthorid_data, ln, add_box=False, loading=not beval[4])}})
json_response['boxes_info'].update({'affiliations': {'status':beval[2], 'html_content': webauthorprofile_templates.tmpl_affiliations_box(author_aff_pubs, ln, add_box=False, loading=not beval[2])}})
json_response['boxes_info'].update({'coauthors': {'status':beval[5], 'html_content': webauthorprofile_templates.tmpl_coauthor_box(bibauthorid_data, coauthors, ln, add_box=False, loading=not beval[5])}})
json_response['boxes_info'].update({'numpaperstitle': {'status':beval[10], 'html_content': webauthorprofile_templates.tmpl_numpaperstitle(bibauthorid_data, pubs)}})
json_response['boxes_info'].update({'authornametitle': {'status':beval[7], 'html_content': webauthorprofile_templates.tmpl_authornametitle(db_names_dict)}})
json_response['boxes_info'].update({'citations': {'status':beval[9], 'html_content': summarize_records}})
json_response['boxes_info'].update({'hepdata': {'status':beval[11], 'html_content':webauthorprofile_templates.tmpl_hepnames(hepdict, ln, add_box=False, loading=not beval[11])}})
json_response['boxes_info'].update({'collaborations': {'status':beval[13], 'html_content': webauthorprofile_templates.tmpl_collab_box(collab, bibauthorid_data, ln, add_box=False, loading=not beval[13])}})
req.content_type = 'application/json'
req.write(json.dumps(json_response))
else:
gboxstatus = self.person_id
if False not in beval:
gboxstatus = 'noAjax'
req.write('<script type="text/javascript">var gBOX_STATUS = "%s" </script>' % (gboxstatus))
req.write(webauthorprofile_templates.tmpl_author_page(req,
pubs, \
selfpubs, \
authorname, \
totaldownloads, \
author_aff_pubs, \
citedbylist, kwtuples, \
coauthors, vtuples, \
db_names_dict, person_link, \
bibauthorid_data, \
summarize_records, \
hepdict, \
collab, \
ln, \
beval, \
oldest_cache_date,
recompute_allowed))
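The AJAX branch above assembles one JSON entry per profile box, pairing a readiness flag with the rendered HTML so the client can poll until every box has loaded. A minimal, self-contained sketch of that payload shape (function and variable names here are illustrative, not part of the Invenio API):

```python
import json

def build_box_payload(boxes):
    """Build the {'boxes_info': {...}} JSON document the client polls.

    boxes: iterable of (name, ready_flag, html) triples.
    """
    boxes_info = {}
    for name, ready, html in boxes:
        # Each box reports whether its content is final and the HTML to show.
        boxes_info[name] = {'status': ready, 'html_content': html}
    return json.dumps({'boxes_info': boxes_info})
```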
diff --git a/invenio/legacy/webbasket/webinterface.py b/invenio/legacy/webbasket/webinterface.py
index f9d5a6f62..6b9449229 100644
--- a/invenio/legacy/webbasket/webinterface.py
+++ b/invenio/legacy/webbasket/webinterface.py
@@ -1,1647 +1,1647 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebBasket Web Interface."""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
from invenio.utils import apache
import os
import cgi
import urllib
from invenio.config import CFG_SITE_SECURE_URL, \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS, \
CFG_SITE_SECURE_URL, CFG_PREFIX, CFG_SITE_LANG
from invenio.base.i18n import gettext_set_language
from invenio.legacy.webpage import page
from invenio.legacy.webuser import getUid, page_not_authorized, isGuestUser
from invenio.webbasket import \
check_user_can_comment, \
check_sufficient_rights, \
perform_request_display, \
perform_request_search, \
create_guest_warning_box, \
create_basket_navtrail, \
perform_request_write_note, \
perform_request_save_note, \
perform_request_delete_note, \
perform_request_add_group, \
perform_request_edit, \
perform_request_edit_topic, \
perform_request_list_public_baskets, \
perform_request_unsubscribe, \
perform_request_subscribe, \
perform_request_display_public, \
perform_request_write_public_note, \
perform_request_save_public_note, \
delete_record, \
move_record, \
perform_request_add, \
perform_request_create_basket, \
perform_request_delete, \
wash_topic, \
wash_group, \
perform_request_export_xml, \
page_start, \
page_end
from invenio.webbasket_config import CFG_WEBBASKET_CATEGORIES, \
CFG_WEBBASKET_ACTIONS, \
CFG_WEBBASKET_SHARE_LEVELS
from invenio.webbasket_dblayer import get_basket_name, \
get_max_user_rights_on_basket
from invenio.utils.url import get_referer, redirect_to_url, make_canonical_urlargd
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
from invenio.ext.logging import register_exception
from invenio.legacy.webuser import collect_user_info
from invenio.modules.comments.api import check_user_can_attach_file_to_comments
from invenio.modules.access.engine import acc_authorize_action
from invenio.utils.html import is_html_text_editor_installed
from invenio.ckeditor_invenio_connector import process_CKEditor_upload, send_response
-from invenio.bibdocfile import stream_file
+from invenio.legacy.bibdocfile.api import stream_file
class WebInterfaceBasketCommentsFiles(WebInterfaceDirectory):
"""Handle upload and access to files for comments in WebBasket.
The upload is currently only available through the CKEditor.
"""
def _lookup(self, component, path):
""" This handler is invoked for the dynamic URLs (for getting
and putting attachments), e.g.:
/yourbaskets/attachments/get/31/652/5/file/myfile.pdf
/yourbaskets/attachments/get/31/552/5/image/myfigure.png
bskid/recid/uid/
/yourbaskets/attachments/put/31/550/
bskid/recid
"""
if component == 'get' and len(path) > 4:
bskid = path[0] # Basket id
recid = path[1] # Record id
uid = path[2] # uid of the submitter
file_type = path[3] # file, image, flash or media (as
# defined by CKEditor)
if file_type in ['file', 'image', 'flash', 'media']:
file_name = '/'.join(path[4:]) # the filename
def answer_get(req, form):
"""Accessing files attached to comments."""
form['file'] = file_name
form['type'] = file_type
form['uid'] = uid
form['recid'] = recid
form['bskid'] = bskid
return self._get(req, form)
return answer_get, []
elif component == 'put' and len(path) > 1:
bskid = path[0] # Basket id
recid = path[1] # Record id
def answer_put(req, form):
"""Attaching file to a comment."""
form['recid'] = recid
form['bskid'] = bskid
return self._put(req, form)
return answer_put, []
# All other cases: file not found
return None, []
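The `_lookup` handler above dispatches on the first URL component and the number of remaining path segments: `get` URLs carry bskid/recid/uid/type/filename, `put` URLs only bskid/recid. A standalone sketch of that dispatch logic (hypothetical names, simplified to return a tuple instead of a handler):

```python
def lookup(component, path):
    """Illustrative dispatcher mirroring _lookup above (not the Invenio API)."""
    if component == 'get' and len(path) > 4:
        bskid, recid, uid, file_type = path[0], path[1], path[2], path[3]
        # Only the file types CKEditor produces are served.
        if file_type in ('file', 'image', 'flash', 'media'):
            file_name = '/'.join(path[4:])  # remainder of the URL is the file name
            return ('get', bskid, recid, uid, file_type, file_name)
    elif component == 'put' and len(path) > 1:
        return ('put', path[0], path[1])
    return None  # all other cases: file not found
```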
def _get(self, req, form):
"""
Returns a file attached to a comment.
A file is attached to a comment of a record of a basket, by a
user (who is the author of the comment), and is of a certain
type (file, image, etc). Therefore these 5 values are part of
the URL. Eg:
CFG_SITE_SECURE_URL/yourbaskets/attachments/get/31/91/5/file/myfile.pdf
bskid/recid/uid
"""
argd = wash_urlargd(form, {'file': (str, None),
'type': (str, None),
'uid': (int, 0),
'bskid': (int, 0),
'recid': (int, 0)})
_ = gettext_set_language(argd['ln'])
# Can user view this basket & record & comment, i.e. can user
# access its attachments?
#uid = getUid(req)
user_info = collect_user_info(req)
rights = get_max_user_rights_on_basket(argd['uid'], argd['bskid'])
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if user_info['email'] == 'guest':
# Ask to login
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif not(check_sufficient_rights(rights, CFG_WEBBASKET_SHARE_LEVELS['READITM'])):
return page_not_authorized(req, "../", \
text = _("You are not authorized to view this attachment"))
if not argd['file'] is None:
# Prepare path to file on disk. Normalize the path so that
# ../ and other dangerous components are removed.
path = os.path.abspath(CFG_PREFIX + '/var/data/baskets/comments/' + \
str(argd['bskid']) + '/' + str(argd['recid']) + '/' + \
str(argd['uid']) + '/' + argd['type'] + '/' + \
argd['file'])
# Check that we are really accessing the attachments
# directory for the declared basket and record.
if path.startswith(CFG_PREFIX + '/var/data/baskets/comments/' + \
str(argd['bskid']) + '/' + str(argd['recid'])) and \
os.path.exists(path):
return stream_file(req, path)
# Send error 404 in all other cases
return apache.HTTP_NOT_FOUND
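The path handling in `_get` above relies on `os.path.abspath` to collapse any `../` components before a prefix test confines access to the declared basket and record directory. A self-contained sketch of that guard (names and the base directory are illustrative; the existence check is omitted):

```python
import os

BASE = '/var/data/baskets/comments'

def is_safe_attachment_path(bskid, recid, uid, file_type, file_name):
    """Return the normalized path if it stays inside BASE/<bskid>/<recid>, else None."""
    path = os.path.abspath(os.path.join(
        BASE, str(bskid), str(recid), str(uid), file_type, file_name))
    # abspath has resolved any "../" in file_name; reject paths that escaped.
    prefix = os.path.join(BASE, str(bskid), str(recid)) + os.sep
    if path.startswith(prefix):
        return path
    return None
```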
def _put(self, req, form):
"""
Process requests received from CKEditor to upload files, etc.
URL eg:
CFG_SITE_SECURE_URL/yourbaskets/attachments/put/31/91/
bskid/recid/
"""
if not is_html_text_editor_installed():
return
argd = wash_urlargd(form, {'bskid': (int, 0),
'recid': (int, 0)})
uid = getUid(req)
# URL where the file can be fetched after upload
user_files_path = '%(CFG_SITE_SECURE_URL)s/yourbaskets/attachments/get/%(bskid)s/%(recid)i/%(uid)s' % \
{'uid': uid,
'recid': argd['recid'],
'bskid': argd['bskid'],
'CFG_SITE_SECURE_URL': CFG_SITE_SECURE_URL}
# Path to directory where uploaded files are saved
user_files_absolute_path = '%(CFG_PREFIX)s/var/data/baskets/comments/%(bskid)s/%(recid)s/%(uid)s' % \
{'uid': uid,
'recid': argd['recid'],
'bskid': argd['bskid'],
'CFG_PREFIX': CFG_PREFIX}
# Check that the user:
# 1. is logged in
# 2. can comment on records of this basket (to simplify, we use a
# WebComment function to check this, even if it is not
# entirely adequate)
# 3. can attach files
user_info = collect_user_info(req)
(auth_code, dummy) = check_user_can_attach_file_to_comments(user_info, argd['recid'])
fileurl = ''
callback_function = ''
if user_info['email'] == 'guest':
# 1. User is guest: must login prior to upload
msg = 'Please login before uploading file.'
if not user_info['precached_usebaskets']:
msg = 'Sorry, you are not allowed to use WebBasket'
elif not check_user_can_comment(uid, argd['bskid']):
# 2. User cannot edit comment of this basket
msg = 'Sorry, you are not allowed to submit files'
elif auth_code:
# 3. User cannot submit
msg = 'Sorry, you are not allowed to submit files.'
else:
# Process the upload and get the response
(msg, uploaded_file_path, filename, fileurl, callback_function) = \
process_CKEditor_upload(form, uid, user_files_path, user_files_absolute_path,
recid=argd['recid'])
send_response(req, msg, fileurl, callback_function)
class WebInterfaceYourBasketsPages(WebInterfaceDirectory):
"""Defines the set of /yourbaskets pages."""
_exports = ['',
'display_item',
'display',
'search',
'write_note',
'save_note',
'delete_note',
'add',
'delete',
'modify',
'edit',
'edit_topic',
'create_basket',
'display_public',
'list_public_baskets',
'subscribe',
'unsubscribe',
'write_public_note',
'save_public_note',
'attachments']
attachments = WebInterfaceBasketCommentsFiles()
def index(self, req, dummy):
"""Index page."""
redirect_to_url(req, '%s/yourbaskets/display?%s' % (CFG_SITE_SECURE_URL, req.args))
def display_item(self, req, dummy):
"""Legacy URL redirection."""
redirect_to_url(req, '%s/yourbaskets/display?%s' % (CFG_SITE_SECURE_URL, req.args))
def display(self, req, form):
"""Display basket interface."""
#import rpdb2; rpdb2.start_embedded_debugger('password', fAllowRemote=True)
argd = wash_urlargd(form, {'category':
(str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic': (str, ""),
'group': (int, 0),
'bskid': (int, 0),
'recid': (int, 0),
'bsk_to_sort': (int, 0),
'sort_by_title': (str, ""),
'sort_by_date': (str, ""),
'of': (str, "hb"),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/display",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/display%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, dummy, navtrail) = perform_request_display(uid=uid,
selected_category=argd['category'],
selected_topic=argd['topic'],
selected_group_id=argd['group'],
selected_bskid=argd['bskid'],
selected_recid=argd['recid'],
of=argd['of'],
ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
# register event in webstat
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["display", "", user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
rssurl = CFG_SITE_SECURE_URL + "/rss"
if argd['of'] != 'hb':
page_start(req, of=argd['of'])
if argd['of'].startswith('x'):
req.write(body)
page_end(req, of=argd['of'])
return
elif argd['bskid']:
rssurl = "%s/yourbaskets/display?category=%s&amp;topic=%s&amp;group=%i&amp;bskid=%i&amp;of=xr" % \
(CFG_SITE_SECURE_URL,
argd['category'],
urllib.quote(argd['topic']),
argd['group'],
argd['bskid'])
return page(title = _("Display baskets"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1,
rssurl=rssurl)
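Every handler in this file starts by passing the form through `wash_urlargd` with a spec of `(type, default)` pairs, so downstream code can rely on each argument's type. A hypothetical, simplified stand-in for that pattern (not the real `wash_urlargd`, which also handles language and unicode washing):

```python
def wash_args(form, spec):
    """Cast each form value to its declared type, falling back to the default.

    spec: mapping of argument name -> (type, default).
    """
    washed = {}
    for name, (typ, default) in spec.items():
        try:
            washed[name] = typ(form[name])
        except (KeyError, TypeError, ValueError):
            # Missing or uncastable values get the declared default.
            washed[name] = default
    return washed
```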
def search(self, req, form):
"""Search baskets interface."""
argd = wash_urlargd(form, {'category': (str, ""),
'topic': (str, ""),
'group': (int, 0),
'p': (str, ""),
'b': (str, ""),
'n': (int, 0),
'of': (str, "hb"),
'verbose': (int, 0),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/search",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/search%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_search(uid=uid,
selected_category=argd['category'],
selected_topic=argd['topic'],
selected_group_id=argd['group'],
p=argd['p'],
b=argd['b'],
n=argd['n'],
# format=argd['of'],
ln=argd['ln'])
# register event in webstat
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["search", "", user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Search baskets"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def write_note(self, req, form):
"""Write a comment (just interface for writing)"""
argd = wash_urlargd(form, {'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic': (str, ""),
'group': (int, 0),
'bskid': (int, 0),
'recid': (int, 0),
'cmtid': (int, 0),
'of' : (str, ''),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/write_note",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/write_note%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_write_note(uid=uid,
category=argd['category'],
topic=argd['topic'],
group_id=argd['group'],
bskid=argd['bskid'],
recid=argd['recid'],
cmtid=argd['cmtid'],
ln=argd['ln'])
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["write_note", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Add a note"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def save_note(self, req, form):
"""Save comment on record in basket"""
argd = wash_urlargd(form, {'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic': (str, ""),
'group': (int, 0),
'bskid': (int, 0),
'recid': (int, 0),
'note_title': (str, ""),
'note_body': (str, ""),
'date_creation': (str, ""),
'editor_type': (str, ""),
'of': (str, ''),
'ln': (str, CFG_SITE_LANG),
'reply_to': (int, 0)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/save_note",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/save_note%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_save_note(uid=uid,
category=argd['category'],
topic=argd['topic'],
group_id=argd['group'],
bskid=argd['bskid'],
recid=argd['recid'],
note_title=argd['note_title'],
note_body=argd['note_body'],
date_creation=argd['date_creation'],
editor_type=argd['editor_type'],
ln=argd['ln'],
reply_to=argd['reply_to'])
# TODO: do not stat event if save was not successful
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["save_note", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Display item and notes"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def delete_note(self, req, form):
"""Delete a comment
@param bskid: id of basket (int)
@param recid: id of record (int)
@param cmtid: id of comment (int)
@param category: category (see webbasket_config) (str)
@param topic: nb of topic currently displayed (int)
@param group: id of group baskets currently displayed (int)
@param ln: language"""
argd = wash_urlargd(form, {'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic': (str, ""),
'group': (int, 0),
'bskid': (int, 0),
'recid': (int, 0),
'cmtid': (int, 0),
'of' : (str, ''),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/delete_note",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/display%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_delete_note(uid=uid,
category=argd['category'],
topic=argd['topic'],
group_id=argd['group'],
bskid=argd['bskid'],
recid=argd['recid'],
cmtid=argd['cmtid'],
ln=argd['ln'])
# TODO: do not stat event if delete was not successful
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
user_info = collect_user_info(req)
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["delete_note", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Display item and notes"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def add(self, req, form):
"""Add records to baskets.
@param recid: list of records to add
@param colid: in case of external collections, the id of the collection the records belong to
@param bskids: list of baskets to add records to. If not provided,
a page where the user can select baskets is returned
@param referer: URL of the referring page
@param new_basket_name: add record to new basket
@param new_topic_name: new basket goes into new topic
@param create_in_topic: # of topic to put basket into
@param ln: language"""
# TODO: apply a maximum limit of items (100) that can be added to a basket
# at once. Also see the build_search_url function of websearch_..._searcher.py
# for the "rg" GET variable.
argd = wash_urlargd(form, {'recid': (list, []),
'category': (str, ""),
'bskid': (int, 0),
'colid': (int, 0),
'es_title': (str, ""),
'es_desc': (str, ""),
'es_url': (str, ""),
'note_body': (str, ""),
'date_creation': (str, ""),
'editor_type': (str, ""),
'b': (str, ""),
'copy': (int, 0),
'move_from_basket': (int, 0),
'wait': (int, 0),
'referer': (str, ""),
'of': (str, ''),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/add",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/add%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if not argd['referer']:
argd['referer'] = get_referer(req)
(body, navtrail) = perform_request_add(uid=uid,
recids=argd['recid'],
colid=argd['colid'],
bskid=argd['bskid'],
es_title=argd['es_title'],
es_desc=argd['es_desc'],
es_url=argd['es_url'],
note_body=argd['note_body'],
date_creation=argd['date_creation'],
editor_type=argd['editor_type'],
category=argd['category'],
b=argd['b'],
copy=argd['copy'],
move_from_basket=argd['move_from_basket'],
wait=argd['wait'],
referer=argd['referer'],
ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
# register event in webstat
bskid = argd['bskid']
basket_str = "%s (%s)" % (get_basket_name(bskid), bskid)
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["add", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _('Add to basket'),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def delete(self, req, form):
"""Delete basket interface"""
argd = wash_urlargd(form, {'bskid' : (int, -1),
'confirmed' : (int, 0),
'category' : (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic' : (str, ""),
'group' : (int, 0),
'of' : (str, ''),
'ln' : (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/delete",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/delete%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
body=perform_request_delete(uid=uid,
bskid=argd['bskid'],
confirmed=argd['confirmed'],
category=argd['category'],
selected_topic=argd['topic'],
selected_group_id=argd['group'],
ln=argd['ln'])
if argd['confirmed']:
if argd['category'] == CFG_WEBBASKET_CATEGORIES['PRIVATE']:
argd['topic'] = wash_topic(uid, argd['topic'])[0]
elif argd['category'] == CFG_WEBBASKET_CATEGORIES['GROUP']:
argd['group'] = wash_group(uid, argd['group'])[0]
url = """%s/yourbaskets/display?category=%s&topic=%s&group=%i&ln=%s""" % \
(CFG_SITE_SECURE_URL,
argd['category'],
urllib.quote(argd['topic']),
argd['group'],
argd['ln'])
redirect_to_url(req, url)
else:
navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\
'%s</a>'
navtrail %= (CFG_SITE_SECURE_URL, argd['ln'], _("Your Account"))
navtrail_end = create_basket_navtrail(uid=uid,
category=argd['category'],
topic=argd['topic'],
group=argd['group'],
bskid=argd['bskid'],
ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["delete", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Delete a basket"),
body = body,
navtrail = navtrail + navtrail_end,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def modify(self, req, form):
"""Modify basket content interface (reorder, suppress record, etc.)"""
argd = wash_urlargd(form, {'action': (str, ""),
'bskid': (int, -1),
'recid': (int, 0),
'category': (str, CFG_WEBBASKET_CATEGORIES['PRIVATE']),
'topic': (str, ""),
'group': (int, 0),
'of' : (str, ''),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/modify",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/modify%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
url = CFG_SITE_SECURE_URL
url += '/yourbaskets/display?category=%s&topic=%s&group=%i&bskid=%i&ln=%s' % \
(argd['category'], urllib.quote(argd['topic']), argd['group'], argd['bskid'], argd['ln'])
if argd['action'] == CFG_WEBBASKET_ACTIONS['DELETE']:
delete_record(uid, argd['bskid'], argd['recid'])
redirect_to_url(req, url)
elif argd['action'] == CFG_WEBBASKET_ACTIONS['UP']:
move_record(uid, argd['bskid'], argd['recid'], argd['action'])
redirect_to_url(req, url)
elif argd['action'] == CFG_WEBBASKET_ACTIONS['DOWN']:
move_record(uid, argd['bskid'], argd['recid'], argd['action'])
redirect_to_url(req, url)
elif argd['action'] == CFG_WEBBASKET_ACTIONS['COPY'] or \
argd['action'] == CFG_WEBBASKET_ACTIONS['MOVE']:
if(argd['action'] == CFG_WEBBASKET_ACTIONS['MOVE']):
title = _("Move record to basket")
from_bsk = argd['bskid']
else:
title = _("Copy record to basket")
from_bsk = 0
referer = get_referer(req)
(body, navtrail) = perform_request_add(uid=uid,
recids=argd['recid'],
copy=True,
move_from_basket=from_bsk,
referer=referer,
ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
else:
title = ''
body = ''
# warnings = [('WRN_WEBBASKET_UNDEFINED_ACTION',)]
navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\
'%s</a>'
navtrail %= (CFG_SITE_SECURE_URL, argd['ln'], _("Your Account"))
navtrail_end = create_basket_navtrail(uid=uid,
category=argd['category'],
topic=argd['topic'],
group=argd['group'],
bskid=argd['bskid'],
ln=argd['ln'])
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["modify", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = title,
body = body,
navtrail = navtrail + navtrail_end,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def edit(self, req, form):
"""Edit basket interface"""
argd = wash_urlargd(form, {'bskid': (int, 0),
'groups': (list, []),
'topic': (str, ""),
'add_group': (str, ""),
'group_cancel': (str, ""),
'submit': (str, ""),
'cancel': (str, ""),
'delete': (str, ""),
'new_name': (str, ""),
'new_topic': (str, ""),
'new_topic_name': (str, ""),
'new_group': (str, ""),
'external': (str, ""),
'of' : (str, ''),
'ln': (str, CFG_SITE_LANG)})
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/edit",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/edit%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourbaskets/display?category=%s&topic=%s&ln=%s'
url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'],
urllib.quote(argd['topic']),
argd['ln'])
redirect_to_url(req, url)
elif argd['delete']:
url = CFG_SITE_SECURE_URL
url += '/yourbaskets/delete?bskid=%i&category=%s&topic=%s&ln=%s' % \
(argd['bskid'],
CFG_WEBBASKET_CATEGORIES['PRIVATE'],
urllib.quote(argd['topic']),
argd['ln'])
redirect_to_url(req, url)
elif argd['add_group'] and not(argd['new_group']):
body = perform_request_add_group(uid=uid,
bskid=argd['bskid'],
topic=argd['topic'],
ln=argd['ln'])
# warnings = []
elif (argd['add_group'] and argd['new_group']) or argd['group_cancel']:
if argd['add_group']:
perform_request_add_group(uid=uid,
bskid=argd['bskid'],
topic=argd['topic'],
group_id=argd['new_group'],
ln=argd['ln'])
body = perform_request_edit(uid=uid,
bskid=argd['bskid'],
topic=argd['topic'],
ln=argd['ln'])
elif argd['submit']:
body = perform_request_edit(uid=uid,
bskid=argd['bskid'],
topic=argd['topic'],
new_name=argd['new_name'],
new_topic=argd['new_topic'],
new_topic_name=argd['new_topic_name'],
groups=argd['groups'],
external=argd['external'],
ln=argd['ln'])
if argd['new_topic'] != "-1":
argd['topic'] = argd['new_topic']
url = CFG_SITE_SECURE_URL + '/yourbaskets/display?category=%s&topic=%s&ln=%s' % \
(CFG_WEBBASKET_CATEGORIES['PRIVATE'],
urllib.quote(argd['topic']),
argd['ln'])
redirect_to_url(req, url)
else:
body = perform_request_edit(uid=uid,
bskid=argd['bskid'],
topic=argd['topic'],
ln=argd['ln'])
navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\
'%s</a>'
navtrail %= (CFG_SITE_SECURE_URL, argd['ln'], _("Your Account"))
navtrail_end = create_basket_navtrail(
uid=uid,
category=CFG_WEBBASKET_CATEGORIES['PRIVATE'],
topic=argd['topic'],
group=0,
bskid=argd['bskid'],
ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["edit", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Edit basket"),
body = body,
navtrail = navtrail + navtrail_end,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def edit_topic(self, req, form):
"""Edit topic interface"""
argd = wash_urlargd(form, {'topic': (str, ""),
'bskid': (int, 0), # needed by the 'delete' branch below
'submit': (str, ""),
'cancel': (str, ""),
'delete': (str, ""),
'new_name': (str, ""),
'of' : (str, ''),
'ln': (str, CFG_SITE_LANG)})
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/edit",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/edit_topic%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourbaskets/display?category=%s&ln=%s'
url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'], argd['ln'])
redirect_to_url(req, url)
elif argd['delete']:
url = CFG_SITE_SECURE_URL
url += '/yourbaskets/delete?bskid=%i&category=%s&topic=%s&ln=%s' % \
(argd['bskid'],
CFG_WEBBASKET_CATEGORIES['PRIVATE'],
urllib.quote(argd['topic']),
argd['ln'])
redirect_to_url(req, url)
elif argd['submit']:
body = perform_request_edit_topic(uid=uid,
topic=argd['topic'],
new_name=argd['new_name'],
ln=argd['ln'])
url = CFG_SITE_SECURE_URL + '/yourbaskets/display?category=%s&ln=%s' % \
(CFG_WEBBASKET_CATEGORIES['PRIVATE'], argd['ln'])
redirect_to_url(req, url)
else:
body = perform_request_edit_topic(uid=uid,
topic=argd['topic'],
ln=argd['ln'])
navtrail = '<a class="navtrail" href="%s/youraccount/display?ln=%s">'\
'%s</a>'
navtrail %= (CFG_SITE_SECURE_URL, argd['ln'], _("Your Account"))
navtrail_end = ""
#navtrail_end = create_basket_navtrail(
# uid=uid,
# category=CFG_WEBBASKET_CATEGORIES['PRIVATE'],
# topic=argd['topic'],
# group=0,
# ln=argd['ln'])
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
# register event in webstat
#basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
#if user_info['email']:
# user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
#else:
# user_str = ""
#try:
# register_customevent("baskets", ["edit", basket_str, user_str])
#except:
# register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Edit topic"),
body = body,
navtrail = navtrail + navtrail_end,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def create_basket(self, req, form):
"""Create basket interface"""
argd = wash_urlargd(form, {'new_basket_name': (str, ""),
'new_topic_name' : (str, ""),
'create_in_topic': (str, "-1"),
'topic' : (str, ""),
'recid' : (list, []),
'colid' : (int, -1),
'es_title' : (str, ''),
'es_desc' : (str, ''),
'es_url' : (str, ''),
'copy' : (int, 0),
'move_from_basket':(int, 0),
'referer' : (str, ''),
'of' : (str, ''),
'ln' : (str, CFG_SITE_LANG)})
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/create_basket",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/create_basket%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
_ = gettext_set_language(argd['ln'])
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if argd['new_basket_name'] and \
(argd['new_topic_name'] or argd['create_in_topic'] != "-1"):
(bskid, topic) = perform_request_create_basket(
req,
uid=uid,
new_basket_name=argd['new_basket_name'],
new_topic_name=argd['new_topic_name'],
create_in_topic=argd['create_in_topic'],
recids=argd['recid'],
colid=argd['colid'],
es_title=argd['es_title'],
es_desc=argd['es_desc'],
es_url=argd['es_url'],
copy=argd['copy'],
move_from_basket=argd['move_from_basket'],
referer=argd['referer'],
ln=argd['ln'])
# register event in webstat
basket_str = "%s ()" % argd['new_basket_name']
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["create_basket", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
if ( argd['recid'] and argd['colid'] >= 0 ):
url = CFG_SITE_SECURE_URL + '/yourbaskets/add?category=%s&copy=%i&referer=%s&bskid=%i&colid=%i&move_from_basket=%i&recid=%s&wait=1&ln=%s'
url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'],
argd['copy'],
urllib.quote(argd['referer']),
bskid,
argd['colid'],
argd['move_from_basket'],
'&recid='.join(str(recid) for recid in argd['recid']),
argd['ln'])
elif ( argd['es_title'] and argd['es_desc'] and argd['es_url'] and argd['colid'] == -1 ):
# Adding NEW external record - this does not need 'move_from_basket' data
url = CFG_SITE_SECURE_URL + '/yourbaskets/add?category=%s&bskid=%i&colid=%i&es_title=%s&es_desc=%s&es_url=%s&wait=1&ln=%s'
url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'],
bskid,
argd['colid'],
urllib.quote(argd['es_title']),
urllib.quote(argd['es_desc']),
urllib.quote(argd['es_url']),
argd['ln'])
else:
url = CFG_SITE_SECURE_URL + '/yourbaskets/display?category=%s&topic=%s&ln=%s'
url %= (CFG_WEBBASKET_CATEGORIES['PRIVATE'],
urllib.quote(topic),
argd['ln'])
redirect_to_url(req, url)
else:
body = perform_request_create_basket(req,
uid=uid,
new_basket_name=argd['new_basket_name'],
new_topic_name=argd['new_topic_name'],
create_in_topic=argd['create_in_topic'],
topic=argd['topic'],
recids=argd['recid'],
colid=argd['colid'],
es_title=argd['es_title'],
es_desc=argd['es_desc'],
es_url=argd['es_url'],
copy=argd['copy'],
move_from_basket=argd['move_from_basket'],
referer=argd['referer'],
ln=argd['ln'])
navtrail = '<a class="navtrail" href="%s/youraccount/'\
'display?ln=%s">%s</a>'
navtrail %= (CFG_SITE_SECURE_URL, argd['ln'], _("Your Account"))
if isGuestUser(uid):
body = create_guest_warning_box(argd['ln']) + body
return page(title = _("Create basket"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def display_public(self, req, form):
"""Display a public basket"""
argd = wash_urlargd(form, {'bskid': (int, 0),
'recid': (int, 0),
'of': (str, "hb"),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/display",
navmenuid = 'yourbaskets')
user_info = collect_user_info(req)
if not argd['bskid']:
(body, navtrail) = perform_request_list_public_baskets(uid)
title = _('List of public baskets')
# register event in webstat
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["list_public_baskets", "", user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
else:
(body, dummy, navtrail) = perform_request_display_public(uid=uid,
selected_bskid=argd['bskid'],
selected_recid=argd['recid'],
of=argd['of'],
ln=argd['ln'])
title = _('Public basket')
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["display_public", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
rssurl = CFG_SITE_SECURE_URL + "/rss"
if argd['of'] != 'hb':
page_start(req, of=argd['of'])
if argd['of'].startswith('x'):
req.write(body)
page_end(req, of=argd['of'])
return
elif argd['bskid']:
rssurl = "%s/yourbaskets/display_public?bskid=%i&amp;of=xr" % \
(CFG_SITE_SECURE_URL,
argd['bskid'])
return page(title = title,
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1,
rssurl=rssurl)
def list_public_baskets(self, req, form):
"""List of public baskets interface."""
argd = wash_urlargd(form, {'limit': (int, 1),
'sort': (str, 'name'),
'asc': (int, 1),
'of': (str, ''),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2:
return page_not_authorized(req, "../yourbaskets/list_public_baskets",
navmenuid = 'yourbaskets')
user_info = collect_user_info(req)
nb_views_show = acc_authorize_action(user_info, 'runwebstatadmin')
nb_views_show_p = not(nb_views_show[0])
(body, navtrail) = perform_request_list_public_baskets(uid,
argd['limit'],
argd['sort'],
argd['asc'],
nb_views_show_p,
argd['ln'])
return page(title = _("List of public baskets"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def subscribe(self, req, form):
"""Subscribe to a basket pseudo-interface."""
argd = wash_urlargd(form, {'bskid': (int, 0),
'of': (str, 'hb'),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2:
return page_not_authorized(req, "../yourbaskets/subscribe",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/subscribe%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if not argd['bskid']:
(body, navtrail) = perform_request_list_public_baskets(uid)
title = _('List of public baskets')
else:
# TODO: Take care of XML output as shown below
#req.content_type = "text/xml"
#req.send_http_header()
#return perform_request_display_public(bskid=argd['bskid'], of=argd['of'], ln=argd['ln'])
subscribe_warnings_html = perform_request_subscribe(uid, argd['bskid'], argd['ln'])
(body, dummy, navtrail) = perform_request_display_public(uid=uid,
selected_bskid=argd['bskid'],
selected_recid=0,
of=argd['of'],
ln=argd['ln'])
#warnings.extend(subscribe_warnings)
body = subscribe_warnings_html + body
title = _('Public basket')
return page(title = title,
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def unsubscribe(self, req, form):
"""Unsubscribe from basket pseudo-interface."""
argd = wash_urlargd(form, {'bskid': (int, 0),
'of': (str, 'hb'),
'ln': (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE == 2:
return page_not_authorized(req, "../yourbaskets/unsubscribe",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/unsubscribe%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
if not argd['bskid']:
(body, navtrail) = perform_request_list_public_baskets(uid)
title = _('List of public baskets')
else:
# TODO: Take care of XML output as shown below
#req.content_type = "text/xml"
#req.send_http_header()
#return perform_request_display_public(bskid=argd['bskid'], of=argd['of'], ln=argd['ln'])
unsubscribe_warnings_html = perform_request_unsubscribe(uid, argd['bskid'], argd['ln'])
(body, dummy, navtrail) = perform_request_display_public(uid=uid,
selected_bskid=argd['bskid'],
selected_recid=0,
of=argd['of'],
ln=argd['ln'])
# warnings.extend(unsubscribe_warnings)
body = unsubscribe_warnings_html + body
title = _('Public basket')
return page(title = title,
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
def write_public_note(self, req, form):
"""Write a comment (just interface for writing)"""
argd = wash_urlargd(form, {'bskid': (int, 0),
'recid': (int, 0),
'cmtid': (int, 0),
'of' : (str, ''),
'ln' : (str, CFG_SITE_LANG)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/write_public_note",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/write_public_note%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_write_public_note(uid=uid,
bskid=argd['bskid'],
recid=argd['recid'],
cmtid=argd['cmtid'],
ln=argd['ln'])
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["write_public_note", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Add a note"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
secure_page_p=1)
def save_public_note(self, req, form):
"""Save comment on record in basket"""
argd = wash_urlargd(form, {'bskid': (int, 0),
'recid': (int, 0),
'note_title': (str, ""),
'note_body': (str, ""),
'editor_type': (str, ""),
'of': (str, ''),
'ln': (str, CFG_SITE_LANG),
'reply_to': (str, 0)})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../yourbaskets/save_public_note",
navmenuid = 'yourbaskets')
if isGuestUser(uid):
if not CFG_WEBSESSION_DIFFERENTIATE_BETWEEN_GUESTS:
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourbaskets/save_public_note%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_usebaskets']:
return page_not_authorized(req, "../", \
text = _("You are not authorized to use baskets."))
(body, navtrail) = perform_request_save_public_note(uid=uid,
bskid=argd['bskid'],
recid=argd['recid'],
note_title=argd['note_title'],
note_body=argd['note_body'],
editor_type=argd['editor_type'],
ln=argd['ln'],
reply_to=argd['reply_to'])
# TODO: do not register the event if the save was not successful
# register event in webstat
basket_str = "%s (%d)" % (get_basket_name(argd['bskid']), argd['bskid'])
if user_info['email']:
user_str = "%s (%d)" % (user_info['email'], user_info['uid'])
else:
user_str = ""
try:
register_customevent("baskets", ["save_public_note", basket_str, user_str])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
return page(title = _("Display item and notes"),
body = body,
navtrail = navtrail,
uid = uid,
lastupdated = __lastupdated__,
language = argd['ln'],
req = req,
navmenuid = 'yourbaskets',
of = argd['of'],
navtrail_append_title_p = 0,
secure_page_p=1)
diff --git a/invenio/legacy/webcomment/webinterface.py b/invenio/legacy/webcomment/webinterface.py
index 1e16cc1d6..a5a96e2b1 100644
--- a/invenio/legacy/webcomment/webinterface.py
+++ b/invenio/legacy/webcomment/webinterface.py
@@ -1,930 +1,930 @@
# -*- coding: utf-8 -*-
## Comments and reviews for records.
## This file is part of Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
""" Comments and reviews for records: web interface """
__lastupdated__ = """$Date$"""
__revision__ = """$Id$"""
import cgi
from invenio.modules.comments.api import check_recID_is_in_range, \
perform_request_display_comments_or_remarks, \
perform_request_add_comment_or_remark, \
perform_request_vote, \
perform_request_report, \
subscribe_user_to_discussion, \
unsubscribe_user_from_discussion, \
get_user_subscription_to_discussion, \
check_user_can_attach_file_to_comments, \
check_user_can_view_comments, \
check_user_can_send_comments, \
check_user_can_view_comment, \
query_get_comment, \
toggle_comment_visibility, \
check_comment_belongs_to_record, \
is_comment_deleted, \
perform_display_your_comments
from invenio.config import \
CFG_TMPDIR, \
CFG_SITE_LANG, \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_PREFIX, \
CFG_SITE_NAME, \
CFG_SITE_NAME_INTL, \
CFG_WEBCOMMENT_ALLOW_COMMENTS,\
CFG_WEBCOMMENT_ALLOW_REVIEWS, \
CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS, \
CFG_SITE_RECORD, \
CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE, \
CFG_WEBCOMMENT_MAX_ATTACHED_FILES, \
CFG_ACCESS_CONTROL_LEVEL_SITE
from invenio.legacy.webuser import getUid, page_not_authorized, isGuestUser, collect_user_info
from invenio.legacy.webpage import page, pageheaderonly, pagefooteronly
from invenio.legacy.search_engine import create_navtrail_links, \
guess_primary_collection_of_a_record, \
get_colID
from invenio.utils.url import redirect_to_url, \
make_canonical_urlargd
from invenio.utils.html import get_mathjax_header
from invenio.ext.logging import register_exception
from invenio.base.i18n import gettext_set_language
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
-from invenio.websearchadminlib import get_detailed_page_tabs, get_detailed_page_tabs_counts
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs, get_detailed_page_tabs_counts
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import \
mail_cookie_create_authorize_action, \
mail_cookie_create_common, \
mail_cookie_check_common, \
InvenioWebAccessMailCookieDeletedError, \
InvenioWebAccessMailCookieError
from invenio.modules.comments.config import \
InvenioWebCommentError, \
InvenioWebCommentWarning
import invenio.legacy.template
webstyle_templates = invenio.legacy.template.load('webstyle')
websearch_templates = invenio.legacy.template.load('websearch')
import os
from invenio.utils import apache
-from invenio.bibdocfile import \
+from invenio.legacy.bibdocfile.api import \
stream_file, \
decompose_file, \
propose_next_docname
class WebInterfaceCommentsPages(WebInterfaceDirectory):
"""Defines the set of /comments pages."""
_exports = ['', 'display', 'add', 'vote', 'report', 'index', 'attachments',
'subscribe', 'unsubscribe', 'toggle']
def __init__(self, recid=-1, reviews=0):
self.recid = recid
self.discussion = reviews # 0:comments, 1:reviews
self.attachments = WebInterfaceCommentsFiles(recid, reviews)
def index(self, req, form):
"""
Redirects to display function
"""
return self.display(req, form)
def display(self, req, form):
"""
Display comments (reviews if enabled) associated with record having id recid where recid>0.
This function can also be used to display remarks associated with basket having id recid where recid<-99.
@param ln: language
@param recid: record id, integer
@param do: display order hh = highest helpful score, review only
lh = lowest helpful score, review only
hs = highest star score, review only
ls = lowest star score, review only
od = oldest date
nd = newest date
@param ds: display since all= no filtering by date
nd = n days ago
nw = n weeks ago
nm = n months ago
ny = n years ago
where n is a single digit integer between 0 and 9
@param nb: number of results per page
@param p: results page
@param voted: boolean, active if user voted for a review, see vote function
@param reported: int, active if user reported a certain comment/review, see report function
@param reviews: boolean, enabled for reviews, disabled for comments
@param subscribed: int, 1 if user just subscribed to discussion, -1 if unsubscribed
@return the full html page.
"""
argd = wash_urlargd(form, {'do': (str, "od"),
'ds': (str, "all"),
'nb': (int, 100),
'p': (int, 1),
'voted': (int, -1),
'reported': (int, -1),
'subscribed': (int, 0),
'cmtgrp': (list, ["latest"]) # 'latest' is now a reserved group/round name
})
_ = gettext_set_language(argd['ln'])
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
can_send_comments = False
(auth_code, auth_msg) = check_user_can_send_comments(user_info, self.recid)
if not auth_code:
can_send_comments = True
can_attach_files = False
(auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, self.recid)
if not auth_code and (user_info['email'] != 'guest'):
can_attach_files = True
subscription = get_user_subscription_to_discussion(self.recid, uid)
if subscription == 1:
user_is_subscribed_to_discussion = True
user_can_unsubscribe_from_discussion = True
elif subscription == 2:
user_is_subscribed_to_discussion = True
user_can_unsubscribe_from_discussion = False
else:
user_is_subscribed_to_discussion = False
user_can_unsubscribe_from_discussion = False
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)),
self.recid,
ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/record/%s/%s%s' % (CFG_SITE_URL, self.recid, tab_id, link_ln), \
tab_id in ['comments', 'reviews'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, order) in ordered_tabs_id
if unordered_tabs[tab_id]['visible']]
tabs_counts = get_detailed_page_tabs_counts(self.recid)
citedbynum = tabs_counts['Citations']
references = tabs_counts['References']
discussions = tabs_counts['Discussions']
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'],
citationnum=citedbynum,
referencenum=references,
discussionnum=discussions)
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
#display_comment_rounds = [cmtgrp for cmtgrp in argd['cmtgrp'] if cmtgrp.isdigit() or cmtgrp == "all" or cmtgrp == "-1"]
display_comment_rounds = argd['cmtgrp']
check_warnings = []
(ok, problem) = check_recID_is_in_range(self.recid, check_warnings, argd['ln'])
if ok:
body = perform_request_display_comments_or_remarks(req=req, recID=self.recid,
display_order=argd['do'],
display_since=argd['ds'],
nb_per_page=argd['nb'],
page=argd['p'],
ln=argd['ln'],
voted=argd['voted'],
reported=argd['reported'],
subscribed=argd['subscribed'],
reviews=self.discussion,
uid=uid,
can_send_comments=can_send_comments,
can_attach_files=can_attach_files,
user_is_subscribed_to_discussion=user_is_subscribed_to_discussion,
user_can_unsubscribe_from_discussion=user_can_unsubscribe_from_discussion,
display_comment_rounds=display_comment_rounds
)
title, description, keywords = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
if navtrail:
navtrail += ' &gt; '
navtrail += '<a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
navtrail += ' &gt; <a class="navtrail">%s</a>' % (self.discussion==1 and _("Reviews") or _("Comments"))
mathjaxheader = ''
if CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS:
mathjaxheader = get_mathjax_header(req.is_https())
jqueryheader = '''
<script src="%(CFG_SITE_URL)s/js/jquery.MultiFile.pack.js" type="text/javascript" language="javascript"></script>
''' % {'CFG_SITE_URL': CFG_SITE_URL}
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
metaheaderadd = mathjaxheader + jqueryheader,
req=req,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(lastupdated=__lastupdated__, language=argd['ln'], req=req)
else:
return page(title=_("Record Not Found"),
body=problem,
uid=uid,
verbose=1,
req=req,
language=argd['ln'],
navmenuid='search')
# Return the same page whether we ask for /CFG_SITE_RECORD/123 or /CFG_SITE_RECORD/123/
__call__ = index
def add(self, req, form):
"""
Add a comment (review) to record with id recid where recid>0
Also works for adding a remark to basket with id recid where recid<-99
@param ln: language
@param recid: record id
@param action: 'DISPLAY' to display add form
'SUBMIT' to submit comment once form is filled
'REPLY' to reply to an already existing comment
@param msg: the body of the comment/review or remark
@param score: star score of the review
@param note: title of the review
@param comid: comment id, needed for replying
@param editor_type: the type of editor used for submitting the
comment: 'textarea', 'ckeditor'.
@param subscribe: if set, subscribe user to receive email
notifications when new comments are added to
this discussion
@return the full html page.
"""
argd = wash_urlargd(form, {'action': (str, "DISPLAY"),
'msg': (str, ""),
'note': (str, ''),
'score': (int, 0),
'comid': (int, 0),
'editor_type': (str, ""),
'subscribe': (str, ""),
'cookie': (str, "")
})
_ = gettext_set_language(argd['ln'])
actions = ['DISPLAY', 'REPLY', 'SUBMIT']
uid = getUid(req)
# Is site ready to accept comments?
if uid == -1 or (not CFG_WEBCOMMENT_ALLOW_COMMENTS and not CFG_WEBCOMMENT_ALLOW_REVIEWS):
return page_not_authorized(req, "../comments/add",
navmenuid='search')
# Is user allowed to post comment?
user_info = collect_user_info(req)
(auth_code_1, auth_msg_1) = check_user_can_view_comments(user_info, self.recid)
(auth_code_2, auth_msg_2) = check_user_can_send_comments(user_info, self.recid)
if isGuestUser(uid):
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
# Save the user's values in a cookie, so that these "POST"
# parameters are not lost during the login process
msg_cookie = mail_cookie_create_common('comment_msg',
{'msg': argd['msg'],
'note': argd['note'],
'score': argd['score'],
'editor_type': argd['editor_type'],
'subscribe': argd['subscribe']},
onetime=True)
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri'] + '&cookie=' + msg_cookie}, {})
return redirect_to_url(req, target, norobot=True)
elif (auth_code_1 or auth_code_2):
return page_not_authorized(req, "../", \
text = auth_msg_1 + auth_msg_2)
if argd['comid']:
# If replying to a comment, are we on a record that
# matches the original comment user is replying to?
if not check_comment_belongs_to_record(argd['comid'], self.recid):
return page_not_authorized(req, "../", \
text = _("Specified comment does not belong to this record"))
# Is user trying to reply to a restricted comment? Make
# sure user has access to it. We will then inherit its
# restriction for the new comment
(auth_code, auth_msg) = check_user_can_view_comment(user_info, argd['comid'])
if auth_code:
return page_not_authorized(req, "../", \
text = _("You do not have access to the specified comment"))
# Is user trying to reply to a deleted comment? If so, we
# let submitted comment go (to not lose possibly submitted
# content, if comment is submitted while original is
# deleted), but we "reset" comid to make sure that for
# action 'REPLY' the original comment is not included in
# the reply
if is_comment_deleted(argd['comid']):
argd['comid'] = 0
user_info = collect_user_info(req)
can_attach_files = False
(auth_code, auth_msg) = check_user_can_attach_file_to_comments(user_info, self.recid)
if not auth_code and (user_info['email'] != 'guest'):
can_attach_files = True
warning_msgs = [] # list of warning tuples (warning_text, warning_color)
added_files = {}
if can_attach_files:
# User is allowed to attach files. Process the files
file_too_big = False
formfields = form.get('commentattachment[]', [])
if not hasattr(formfields, "__getitem__"): # A single file was uploaded
formfields = [formfields]
for formfield in formfields[:CFG_WEBCOMMENT_MAX_ATTACHED_FILES]:
if hasattr(formfield, "filename") and formfield.filename:
filename = formfield.filename
dir_to_open = os.path.join(CFG_TMPDIR, 'webcomment', str(uid))
try:
assert(dir_to_open.startswith(CFG_TMPDIR))
except AssertionError:
register_exception(req=req,
prefix='User #%s tried to upload file to forbidden location: %s' \
% (uid, dir_to_open))
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except:
register_exception(req=req, alert_admin=True)
## Before saving the file to disc, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
filename = filename.strip()
if filename != "":
# Check that file does not already exist
n = 1
while os.path.exists(os.path.join(dir_to_open, filename)):
basedir, name, extension = decompose_file(filename)
new_name = propose_next_docname(name)
filename = new_name + extension
fp = open(os.path.join(dir_to_open, filename), "w")
# FIXME: temporary, waiting for wsgi handler to be
# fixed. Once done, read chunk by chunk
## while formfield.file:
## fp.write(formfield.file.read(10240))
fp.write(formfield.file.read())
fp.close()
# Isn't this file too big?
file_size = os.path.getsize(os.path.join(dir_to_open, filename))
if CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE > 0 and \
file_size > CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE:
os.remove(os.path.join(dir_to_open, filename))
# One file is too big: record that,
# dismiss all uploaded files and re-ask to
# upload again
file_too_big = True
try:
raise InvenioWebCommentWarning(_('The size of file \"%s\" (%s) is larger than maximum allowed file size (%s). Select files again.') % (cgi.escape(filename), str(file_size/1024) + 'KB', str(CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE/1024) + 'KB'))
except InvenioWebCommentWarning, exc:
register_exception(stream='warning')
warning_msgs.append((exc.message, ''))
#warning_msgs.append(('WRN_WEBCOMMENT_MAX_FILE_SIZE_REACHED', cgi.escape(filename), str(file_size/1024) + 'KB', str(CFG_WEBCOMMENT_MAX_ATTACHMENT_SIZE/1024) + 'KB'))
else:
added_files[filename] = os.path.join(dir_to_open, filename)
if file_too_big:
# One file was too big: remove all uploaded files
for filepath in added_files.values():
try:
os.remove(filepath)
except OSError:
# File was already removed or does not exist?
pass
client_ip_address = req.remote_ip
check_warnings = []
(ok, problem) = check_recID_is_in_range(self.recid, check_warnings, argd['ln'])
if ok:
title, description, keywords = websearch_templates.tmpl_record_page_header_content(req,
self.recid,
argd['ln'])
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid))
if navtrail:
navtrail += ' &gt; '
navtrail += '<a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += cgi.escape(title)
navtrail += '</a>'
navtrail += ' &gt; <a class="navtrail" href="%s/%s/%s/%s/?ln=%s">%s</a>' % (CFG_SITE_URL,
CFG_SITE_RECORD,
self.recid,
self.discussion==1 and 'reviews' or 'comments',
argd['ln'],
self.discussion==1 and _('Reviews') or _('Comments'))
if argd['action'] not in actions:
argd['action'] = 'DISPLAY'
if not argd['msg']:
# User had to login in-between, so retrieve msg
# from cookie
try:
(kind, cookie_argd) = mail_cookie_check_common(argd['cookie'],
delete=True)
argd.update(cookie_argd)
except InvenioWebAccessMailCookieDeletedError, e:
return redirect_to_url(req, CFG_SITE_SECURE_URL + '/'+ CFG_SITE_RECORD +'/' + \
str(self.recid) + (self.discussion==1 and \
'/reviews' or '/comments'))
except InvenioWebAccessMailCookieError, e:
# Invalid or empty cookie: continue
pass
subscribe = False
if argd['subscribe'] and \
get_user_subscription_to_discussion(self.recid, uid) == 0:
# User is not already subscribed, and asked to subscribe
subscribe = True
body = perform_request_add_comment_or_remark(recID=self.recid,
ln=argd['ln'],
uid=uid,
action=argd['action'],
msg=argd['msg'],
note=argd['note'],
score=argd['score'],
reviews=self.discussion,
comID=argd['comid'],
client_ip_address=client_ip_address,
editor_type=argd['editor_type'],
can_attach_files=can_attach_files,
subscribe=subscribe,
req=req,
attached_files=added_files,
warnings=warning_msgs)
if self.discussion:
title = _("Add Review")
else:
title = _("Add Comment")
jqueryheader = '''
<script src="%(CFG_SITE_URL)s/js/jquery.MultiFile.pack.js" type="text/javascript" language="javascript"></script>
''' % {'CFG_SITE_URL': CFG_SITE_URL}
return page(title=title,
body=body,
navtrail=navtrail,
uid=uid,
language=CFG_SITE_LANG,
verbose=1,
req=req,
navmenuid='search',
metaheaderadd=jqueryheader)
# id not in range
else:
return page(title=_("Record Not Found"),
body=problem,
uid=uid,
verbose=1,
req=req,
navmenuid='search')
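The filename washing above (taking the last component after splitting on Windows-style separators, then the POSIX basename) can be sketched as a standalone helper. This is an illustrative re-implementation under assumed behavior, not an Invenio function; the name `wash_uploaded_filename` is hypothetical:

```python
import os.path

def wash_uploaded_filename(filename):
    # Illustrative sketch: strip UNIX and Windows (e.g. DFS) path
    # components from a user-supplied upload name, keeping only the
    # bare file name, and trim surrounding whitespace.
    return os.path.basename(filename.split('\\')[-1]).strip()
```

Used on e.g. `'C:\\docs\\report.pdf'` this yields `'report.pdf'`, so a crafted path in the upload form cannot escape the attachment directory.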
def vote(self, req, form):
"""
Vote positively or negatively for a comment/review.
@param comid: comment/review id
@param com_value: +1 to vote positively
-1 to vote negatively
@param recid: the id of the record the comment/review is associated with
@param ln: language
@param do: display order hh = highest helpful score, review only
lh = lowest helpful score, review only
hs = highest star score, review only
ls = lowest star score, review only
od = oldest date
nd = newest date
@param ds: display since all= no filtering by date
nd = n days ago
nw = n weeks ago
nm = n months ago
ny = n years ago
where n is a single digit integer between 0 and 9
@param nb: number of results per page
@param p: results page
@param referer: http address of the calling function to redirect to (refresh)
@param reviews: boolean, enabled for reviews, disabled for comments
"""
argd = wash_urlargd(form, {'comid': (int, -1),
'com_value': (int, 0),
'recid': (int, -1),
'do': (str, "od"),
'ds': (str, "all"),
'nb': (int, 100),
'p': (int, 1),
'referer': (str, None)
})
_ = gettext_set_language(argd['ln'])
client_ip_address = req.remote_ip
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
# Check that comment belongs to this recid
if not check_comment_belongs_to_record(argd['comid'], self.recid):
return page_not_authorized(req, "../", \
text = _("Specified comment does not belong to this record"))
# Check that user can access the record
(auth_code, auth_msg) = check_user_can_view_comment(user_info, argd['comid'])
if auth_code:
return page_not_authorized(req, "../", \
text = _("You do not have access to the specified comment"))
# Check that comment is not currently deleted
if is_comment_deleted(argd['comid']):
return page_not_authorized(req, "../", \
text = _("You cannot vote for a deleted comment"),
ln=argd['ln'])
success = perform_request_vote(argd['comid'], client_ip_address, argd['com_value'], uid)
if argd['referer']:
argd['referer'] += "?ln=%s&do=%s&ds=%s&nb=%s&p=%s&voted=%s&" % (
argd['ln'], argd['do'], argd['ds'], argd['nb'], argd['p'], success)
redirect_to_url(req, argd['referer'])
else:
# Note: sent to comments display
referer = "%s/%s/%s/%s/display?ln=%s&voted=1"
referer %= (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, self.discussion == 1 and 'reviews' or 'comments', argd['ln'])
redirect_to_url(req, referer)
def report(self, req, form):
"""
Report a comment/review for inappropriate content
@param comid: comment/review id
@param recid: the id of the record the comment/review is associated with
@param ln: language
@param do: display order hh = highest helpful score, review only
lh = lowest helpful score, review only
hs = highest star score, review only
ls = lowest star score, review only
od = oldest date
nd = newest date
@param ds: display since all= no filtering by date
nd = n days ago
nw = n weeks ago
nm = n months ago
ny = n years ago
where n is a single digit integer between 0 and 9
@param nb: number of results per page
@param p: results page
@param referer: http address of the calling function to redirect to (refresh)
@param reviews: boolean, enabled for reviews, disabled for comments
"""
argd = wash_urlargd(form, {'comid': (int, -1),
'recid': (int, -1),
'do': (str, "od"),
'ds': (str, "all"),
'nb': (int, 100),
'p': (int, 1),
'referer': (str, None)
})
_ = gettext_set_language(argd['ln'])
client_ip_address = req.remote_ip
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid)
if isGuestUser(uid):
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
# Check that comment belongs to this recid
if not check_comment_belongs_to_record(argd['comid'], self.recid):
return page_not_authorized(req, "../", \
text = _("Specified comment does not belong to this record"))
# Check that user can access the record
(auth_code, auth_msg) = check_user_can_view_comment(user_info, argd['comid'])
if auth_code:
return page_not_authorized(req, "../", \
text = _("You do not have access to the specified comment"))
# Check that comment is not currently deleted
if is_comment_deleted(argd['comid']):
return page_not_authorized(req, "../", \
text = _("You cannot report a deleted comment"),
ln=argd['ln'])
success = perform_request_report(argd['comid'], client_ip_address, uid)
if argd['referer']:
argd['referer'] += "?ln=%s&do=%s&ds=%s&nb=%s&p=%s&reported=%s&" % (argd['ln'], argd['do'], argd['ds'], argd['nb'], argd['p'], str(success))
redirect_to_url(req, argd['referer'])
else:
# Note: sent to comments display
referer = "%s/%s/%s/%s/display?ln=%s&reported=1"
referer %= (CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, self.discussion==1 and 'reviews' or 'comments', argd['ln'])
redirect_to_url(req, referer)
def subscribe(self, req, form):
"""
Subscribe current user to receive email notification when new
comments are added to current discussion.
"""
argd = wash_urlargd(form, {'referer': (str, None)})
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid)
if isGuestUser(uid):
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
success = subscribe_user_to_discussion(self.recid, uid)
display_url = "%s/%s/%s/comments/display?subscribed=%s&ln=%s" % \
(CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, str(success), argd['ln'])
redirect_to_url(req, display_url)
def unsubscribe(self, req, form):
"""
Unsubscribe current user from current discussion.
"""
argd = wash_urlargd(form, {'referer': (str, None)})
user_info = collect_user_info(req)
uid = getUid(req)
if isGuestUser(uid):
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
success = unsubscribe_user_from_discussion(self.recid, uid)
display_url = "%s/%s/%s/comments/display?subscribed=%s&ln=%s" % \
(CFG_SITE_SECURE_URL, CFG_SITE_RECORD, self.recid, str(-success), argd['ln'])
redirect_to_url(req, display_url)
def toggle(self, req, form):
"""
Store the visibility of a comment for current user
"""
argd = wash_urlargd(form, {'comid': (int, -1),
'referer': (str, None),
'collapse': (int, 1)})
uid = getUid(req)
if isGuestUser(uid):
# We do not store information for guests
return ''
toggle_comment_visibility(uid, argd['comid'], argd['collapse'], self.recid)
if argd['referer']:
return redirect_to_url(req, CFG_SITE_SECURE_URL + \
(not argd['referer'].startswith('/') and '/' or '') + \
argd['referer'] + '#' + str(argd['comid']))
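The collision loop used when saving attachments (around `decompose_file()` and `propose_next_docname()` above) bumps a taken name until a free one is found. A minimal sketch of the same idea, assuming the usual `name_1`, `name_2`, ... convention; this is a hypothetical re-implementation, not the Invenio helpers themselves:

```python
import os
import re

def propose_free_filename(directory, filename):
    # Illustrative sketch: while the candidate exists on disk, append or
    # increment a numeric suffix on the name part, keeping the extension.
    name, ext = os.path.splitext(filename)
    while os.path.exists(os.path.join(directory, name + ext)):
        m = re.match(r'^(.*)_(\d+)$', name)
        if m:
            name = '%s_%d' % (m.group(1), int(m.group(2)) + 1)
        else:
            name += '_1'
    return name + ext
```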
class WebInterfaceCommentsFiles(WebInterfaceDirectory):
"""Handle <strike>upload and </strike> access to files for comments.
<strike>The upload is currently only available through the Ckeditor.</strike>
"""
#_exports = ['put'] # 'get' is handled by _lookup(..)
def __init__(self, recid=-1, reviews=0):
self.recid = recid
self.discussion = reviews # 0:comments, 1:reviews
def _lookup(self, component, path):
""" This handler is invoked for the dynamic URLs (for getting
<strike>and putting attachments</strike>) Eg:
CFG_SITE_URL/CFG_SITE_RECORD/5953/comments/attachments/get/652/myfile.pdf
"""
if component == 'get' and len(path) > 1:
comid = path[0] # comment ID
file_name = '/'.join(path[1:]) # the filename
def answer_get(req, form):
"""Accessing files attached to comments."""
form['file'] = file_name
form['comid'] = comid
return self._get(req, form)
return answer_get, []
# All other cases: file not found
return None, []
def _get(self, req, form):
"""
Returns a file attached to a comment.
Example:
CFG_SITE_URL/CFG_SITE_RECORD/5953/comments/attachments/get/652/myfile.pdf
where 652 is the comment ID
"""
argd = wash_urlargd(form, {'file': (str, None),
'comid': (int, 0)})
_ = gettext_set_language(argd['ln'])
# Can user view this record, i.e. can user access its
# attachments?
uid = getUid(req)
user_info = collect_user_info(req)
# Check that user can view record, and its comments (protected
# with action "viewcomment")
(auth_code, auth_msg) = check_user_can_view_comments(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg)
# Does comment exist?
if not query_get_comment(argd['comid']):
req.status = apache.HTTP_NOT_FOUND
return page(title=_("Page Not Found"),
body=_('The requested comment could not be found'),
req=req)
# Check that user can view this particular comment, protected
# using its own restriction
(auth_code, auth_msg) = check_user_can_view_comment(user_info, argd['comid'])
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : \
CFG_SITE_SECURE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req, "../", \
text = auth_msg,
ln=argd['ln'])
# Check that comment is not currently deleted
if is_comment_deleted(argd['comid']):
return page_not_authorized(req, "../", \
text = _("You cannot access files of a deleted comment"),
ln=argd['ln'])
if argd['file'] is not None:
# Prepare path to file on disk. Normalize the path so that
# ../ and other dangerous components are removed.
path = os.path.abspath(CFG_PREFIX + '/var/data/comments/' + \
str(self.recid) + '/' + str(argd['comid']) + \
'/' + argd['file'])
# Check that we are really accessing attachments
# directory, for the declared record.
if path.startswith(CFG_PREFIX + '/var/data/comments/' + \
str(self.recid)) and \
os.path.exists(path):
return stream_file(req, path)
# Send error 404 in all other cases
req.status = apache.HTTP_NOT_FOUND
return page(title=_("Page Not Found"),
body=_('The requested file could not be found'),
req=req,
language=argd['ln'])
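The containment check in `_get()` above (normalize with `os.path.abspath`, then verify the result still lies under the comments data directory) can be sketched as a small predicate. A minimal sketch under assumed semantics; `is_path_inside` is a hypothetical name, and the trailing-separator comparison is a slightly stricter variant of the `startswith` test used above:

```python
import os.path

def is_path_inside(base_dir, requested):
    """Return True if `requested` resolves to a location inside `base_dir`."""
    base = os.path.abspath(base_dir)
    target = os.path.abspath(os.path.join(base, requested))
    # os.path.abspath collapses '..' components, so a traversal attempt
    # like '../../etc/passwd' resolves outside `base` and is rejected.
    return target == base or target.startswith(base + os.sep)
```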
class WebInterfaceYourCommentsPages(WebInterfaceDirectory):
"""Defines the set of /yourcomments pages."""
_exports = ['', ]
def index(self, req, form):
"""Index page."""
argd = wash_urlargd(form, {'page': (int, 1),
'format': (str, "rc"),
'order_by': (str, "lcf"),
'per_page': (str, "all"),
})
# TODO: support also "reviews", by adding new option to show/hide them if needed
uid = getUid(req)
# load the right language
_ = gettext_set_language(argd['ln'])
# Is site ready to accept comments?
if not CFG_WEBCOMMENT_ALLOW_COMMENTS or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "%s/yourcomments" % \
(CFG_SITE_SECURE_URL,),
text="Comments are currently disabled on this site",
navmenuid="yourcomments")
elif uid == -1 or isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({
'referer' : "%s/yourcomments%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd(argd, {})),
"ln" : argd['ln']}, {})))
user_info = collect_user_info(req)
if not user_info['precached_sendcomments']:
# Maybe we should still authorize if user submitted
# comments in the past?
return page_not_authorized(req, "../", \
text = _("You are not authorized to use comments."))
return page(title=_("Your Comments"),
body=perform_display_your_comments(user_info,
page_number=argd['page'],
selected_order_by_option=argd['order_by'],
selected_display_number_option=argd['per_page'],
selected_display_format_option=argd['format'],
ln=argd['ln']),
navtrail= """<a class="navtrail" href="%(sitesecureurl)s/youraccount/display?ln=%(ln)s">%(account)s</a>""" % {
'sitesecureurl' : CFG_SITE_SECURE_URL,
'ln': argd['ln'],
'account' : _("Your Account"),
},
description=_("%s View your previously submitted comments") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(argd['ln'], CFG_SITE_NAME),
uid=uid,
language=argd['ln'],
req=req,
lastupdated=__lastupdated__,
navmenuid='youralerts',
secure_page_p=1)
# Return the same page whether we ask for /CFG_SITE_RECORD/123 or /CFG_SITE_RECORD/123/
__call__ = index
diff --git a/invenio/legacy/webjournal/adminlib.py b/invenio/legacy/webjournal/adminlib.py
index fcf3b110a..2bb52ba2c 100644
--- a/invenio/legacy/webjournal/adminlib.py
+++ b/invenio/legacy/webjournal/adminlib.py
@@ -1,967 +1,967 @@
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Invenio WebJournal Administration Interface."""
__revision__ = "$Id$"
import sys
import cPickle
import re
import os
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from invenio.ext.logging import register_exception
from invenio.config import \
CFG_SITE_URL, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_ETCDIR, \
CFG_CACHEDIR, \
CFG_TMPSHAREDDIR, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_SITE_RECORD
from invenio.base.i18n import gettext_set_language
from invenio.ext.email import send_email
from invenio.modules.access.engine import acc_authorize_action
from invenio.webjournal_config import \
InvenioWebJournalJournalIdNotFoundDBError, \
InvenioWebJournalReleaseUpdateError, \
InvenioWebJournalNoJournalOnServerError
from invenio.webjournal_utils import \
get_journals_ids_and_names, \
guess_journal_name, \
get_current_issue, \
get_issue_number_display, \
get_featured_records, \
add_featured_record, \
remove_featured_record, \
clear_cache_for_issue, \
get_next_journal_issues, \
get_release_datetime, \
get_journal_id, \
compare_issues, \
get_journal_info_path, \
get_journal_css_url, \
get_journal_alert_sender_email, \
get_journal_alert_recipient_email, \
get_journal_draft_keyword_to_remove, \
get_journal_categories, \
get_journal_articles, \
get_grouped_issues, \
get_journal_issue_grouping, \
get_journal_languages, \
get_journal_collection_to_refresh_on_release, \
get_journal_index_to_refresh_on_release, \
issue_is_later_than, \
WEBJOURNAL_OPENER
from invenio.legacy.dbquery import run_sql
from invenio.legacy.bibrecord import \
create_record, \
print_rec
from invenio.modules.formatter import format_record
-from invenio.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
from invenio.legacy.search_engine import get_all_collections_of_a_record
import invenio.legacy.template
wjt = invenio.legacy.template.load('webjournal')
def getnavtrail(previous = ''):
"""Get the navtrail"""
navtrail = """<a class="navtrail" href="%s/help/admin">Admin Area</a> """ % (CFG_SITE_URL,)
navtrail = navtrail + previous
return navtrail
def perform_index(ln=CFG_SITE_LANG, journal_name=None, action=None, uid=None):
"""
Index page
Lists the journals and offers options to edit them, delete them,
or add a new journal.
Parameters:
journal_name - the journal affected by action, if any
action - one of ['', 'askDelete', _('Delete'), _('Cancel')]
ln - language
uid - user id
"""
_ = gettext_set_language(ln)
msg = None
if action == 'askDelete' and journal_name is not None:
msg = '''<fieldset style="display:inline;margin-left:auto;margin-right:auto;">
<legend>Delete Journal Configuration</legend><span style="color:#f00">Are you sure you want to delete the configuration of %(journal_name)s?
<form action="%(CFG_SITE_URL)s/admin/webjournal/webjournaladmin.py">
<input type="hidden" name="journal_name" value="%(journal_name)s" />
<input class="formbutton" type="submit" name="action" value="%(delete)s" />
<input class="formbutton" type="submit" name="action" value="%(cancel)s" />
</form></span></fieldset>''' % {'CFG_SITE_URL': CFG_SITE_URL,
'journal_name': journal_name,
'delete': _("Delete"),
'cancel': _("Cancel")}
if action == _("Delete") and journal_name is not None:
# User confirmed and clicked on "Delete" button
remove_journal(journal_name)
journals = get_journals_ids_and_names()
# Only keep journal that user can view or edit
journals = [(journal_info, acc_authorize_action(uid,
'cfgwebjournal',
name=journal_info['journal_name'],
with_editor_rights='yes')[0] == 0) \
for journal_info in journals \
if acc_authorize_action(uid,
'cfgwebjournal',
name=journal_info['journal_name'])[0] == 0]
return wjt.tmpl_admin_index(ln=ln,
journals=journals,
msg=msg)
def perform_administrate(ln=CFG_SITE_LANG, journal_name=None,
as_editor=True):
"""
Administration of a journal
Show the current and next issues/publications, and display links
to more specific administrative pages.
Parameters:
journal_name - the journal to be administered
ln - language
as_editor - True if user can edit the configuration. Read-only mode otherwise
"""
if journal_name is None:
try:
journal_name = guess_journal_name(ln)
except InvenioWebJournalNoJournalOnServerError, e:
return e.user_box()
if not can_read_xml_config(journal_name):
return '<span style="color:#f00">Configuration could not be read. Please check that %s/webjournal/%s/%s-config.xml exists and can be read by the server.</span><br/>' % (CFG_ETCDIR, journal_name, journal_name)
current_issue = get_current_issue(ln, journal_name)
current_publication = get_issue_number_display(current_issue,
journal_name,
ln)
issue_list = get_grouped_issues(journal_name, current_issue)
next_issue_number = get_next_journal_issues(issue_list[-1], journal_name, 1)
return wjt.tmpl_admin_administrate(journal_name,
current_issue,
current_publication,
issue_list,
next_issue_number[0],
ln,
as_editor=as_editor)
def perform_feature_record(journal_name,
recid,
img_url='',
action='',
ln=CFG_SITE_LANG):
"""
Interface to feature a record
Used to list, add and remove featured records of the journal.
Parameters:
journal_name - the journal for which the article is featured
recid - the record affected by 'action'
img_url - the URL to image displayed with given record
(only when action == 'add')
action - One of ['', 'add', 'askremove', _('Remove'), _('Cancel')]
ln - language
"""
_ = gettext_set_language(ln)
if action == 'add':
result = add_featured_record(journal_name, recid, img_url)
if result == 0:
msg ='''<span style="color:#0f0">Successfully featured
<a href="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/%(recid)s">record %(recid)s</a>.
Go to the <a href="%(CFG_SITE_URL)s/journal/%(name)s">%(name)s journal</a> to
see the result.</span>''' % {'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'name': journal_name,
'recid': recid}
elif result == 1:
msg = '''<span style="color:#f00"><a href="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/%(recid)s">record %(recid)s</a> is already featured. Choose another one or remove it first.</span>''' % \
{'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'recid': recid}
else:
msg = '''<span style="color:#f00">Record could not be featured. Check file permission.</span>'''
featured_records = get_featured_records(journal_name)
return wjt.tmpl_admin_feature_record(ln=ln,
journal_name=journal_name,
featured_records=featured_records,
msg=msg)
elif action == 'askremove':
msg = '''<fieldset style="display:inline;margin-left:auto;margin-right:auto;">
<legend>Remove featured record</legend><span style="color:#f00">Are you sure you want to remove <a href="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/%(recid)s">record %(recid)s</a> from the list of featured records?
<form action="%(CFG_SITE_URL)s/admin/webjournal/webjournaladmin.py/feature_record">
<input type="hidden" name="journal_name" value="%(name)s" />
<input type="hidden" name="recid" value="%(recid)s" />
<input class="formbutton" type="submit" name="action" value="%(remove)s" />
<input class="formbutton" type="submit" name="action" value="%(cancel)s" />
</form></span></fieldset>''' % \
{'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'name': journal_name,
'recid': recid,
'cancel': _("Cancel"),
'remove': _("Remove")}
featured_records = get_featured_records(journal_name)
return wjt.tmpl_admin_feature_record(ln=ln,
journal_name=journal_name,
featured_records=featured_records,
msg=msg)
elif action == _("Remove"):
result = remove_featured_record(journal_name, recid)
msg = '''<span style="color:#f00"><a href="%(CFG_SITE_URL)s/%(CFG_SITE_RECORD)s/%(recid)s">Record %(recid)s</a>
has been removed.</span>''' % \
{'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'recid': recid}
featured_records = get_featured_records(journal_name)
return wjt.tmpl_admin_feature_record(ln=ln,
journal_name=journal_name,
featured_records=featured_records,
msg=msg)
else:
msg = '''Here you can choose which records from the %s should
be featured on the journal webpage.''' % CFG_SITE_NAME
featured_records = get_featured_records(journal_name)
return wjt.tmpl_admin_feature_record(ln=ln,
journal_name=journal_name,
featured_records=featured_records,
msg=msg)
def perform_regenerate_issue(issue,
journal_name,
ln=CFG_SITE_LANG,
confirmed_p=False,
publish_draft_articles_p=False):
"""
Clears the cache for the given issue.
Parameters:
journal_name - the journal for which the cache should
be deleted
issue - the issue for which the cache should be
deleted
ln - language
confirmed_p - if True, regenerate. Else ask confirmation
publish_draft_articles_p - should the remaining draft articles in
the issue be made public?
"""
if not confirmed_p:
# Ask user confirmation about the regeneration
current_issue = get_current_issue(ln, journal_name)
issue_released_p = not issue_is_later_than(issue, current_issue)
return wjt.tmpl_admin_regenerate_confirm(ln,
journal_name,
issue,
issue_released_p)
else:
# Regenerate the issue (clear the cache)
success = clear_cache_for_issue(journal_name,
issue)
if publish_draft_articles_p:
current_issue = get_current_issue(ln, journal_name)
if not issue_is_later_than(issue, current_issue):
# This issue is already released: we can safely publish
# the articles. Otherwise we'll refuse to publish the drafts
move_drafts_articles_to_ready(journal_name, issue)
if success:
return wjt.tmpl_admin_regenerate_success(ln,
journal_name,
issue)
else:
return wjt.tmpl_admin_regenerate_error(ln,
journal_name,
issue)
def perform_request_issue_control(journal_name, issues,
action, ln=CFG_SITE_LANG):
"""
Central logic for issue control.
Regenerates the flat files 'current_issue' and 'issue_group' of
the journal that control which issue is currently active for the
journal.
Parameters:
journal_name - the journal affected by 'action'
issues - list of issues affected by 'action' TODO: check
action - One of ['cfg', _('Add'), _('Refresh'),
_('Publish'), _('Update')]
ln - language
"""
_ = gettext_set_language(ln)
out = ''
if action == "cfg" or action == _("Refresh") or action == _("Add"):
# find out if we are in update or release
current_issue = get_current_issue(ln, journal_name)
grouped_issues = get_grouped_issues(journal_name, current_issue)
if current_issue != grouped_issues[-1]:
# The current issue has "pending updates", i.e. is grouped
# with unreleased issues. Propose to update these issues
next_issue = grouped_issues[grouped_issues.index(current_issue) + 1]
out = wjt.tmpl_admin_update_issue(ln,
journal_name,
next_issue,
current_issue)
else:
# Propose a release
next_issues = get_next_journal_issues(current_issue,
journal_name,
n=get_journal_issue_grouping(journal_name))
if action == _("Refresh"):
next_issues += issues
next_issues = list(set(next_issues))# avoid double entries
elif action == _("Add"):
next_issues += issues
next_issues = list(set(next_issues))# avoid double entries
next_issues.sort(compare_issues)
highest_issue_so_far = next_issues[-1]
one_more_issue = get_next_journal_issues(highest_issue_so_far,
journal_name,
1)
next_issues += one_more_issue
next_issues = list(set(next_issues)) # avoid double entries
else:
# get the next issue numbers to publish
next_issues = get_next_journal_issues(current_issue,
journal_name,
n=get_journal_issue_grouping(journal_name))
next_issues.sort(compare_issues)
out = wjt.tmpl_admin_control_issue(ln,
journal_name,
next_issues)
elif action == _("Publish"):
# Publish the given issues (mark them as current issues)
publish_issues = issues
publish_issues = list(set(publish_issues)) # avoid double entries
publish_issues.sort(compare_issues)
if len(publish_issues) == 0:
# User did not select an issue
current_issue = get_current_issue(ln, journal_name)
next_issues = get_next_journal_issues(current_issue,
journal_name,
n=get_journal_issue_grouping(journal_name))
out = '<p style="color:#f00;text-align:center">' + \
_('Please select an issue') + '</p>'
out += wjt.tmpl_admin_control_issue(ln,
journal_name,
next_issues)
return out
try:
release_journal_issue(publish_issues, journal_name, ln)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=None)
return e.user_box()
out = wjt.tmpl_admin_control_issue_success_msg(ln,
publish_issues,
journal_name)
elif action == _("Update"):
try:
try:
update_issue = issues[0]
except IndexError:
raise InvenioWebJournalReleaseUpdateError(ln, journal_name)
except InvenioWebJournalReleaseUpdateError, e:
register_exception(req=None)
return e.user_box()
try:
release_journal_update(update_issue, journal_name, ln)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=None)
return e.user_box()
out = wjt.tmpl_admin_updated_issue_msg(ln,
update_issue,
journal_name)
return out
def perform_request_alert(journal_name, issue,
sent, plain_text, subject, recipients,
html_mail, force, ln=CFG_SITE_LANG):
"""
All the logic for alert emails.
Display a form to edit the email and recipients, with options to
send the email in both HTML and plain text, or in plain text only
if so desired. Also prevents the mistake of sending the alert more
than once for a particular issue.
Parameters:
journal_name - the journal for which the alert is sent
issue - the issue for which the alert is sent
sent - Display interface to edit email if "False"
(string). Else send the email.
plain_text - the text of the mail
subject - the subject of the mail
recipients - the recipients of the mail (string with
comma-separated emails)
html_mail - if 'html', also send email as HTML (copying
from the current issue on the web)
force - if different than "False", the email is sent
even if it has already been sent.
ln - language
"""
# FIXME: more flexible options to choose the language of the alert
languages = get_journal_languages(journal_name)
if languages:
alert_ln = languages[0]
else:
alert_ln = CFG_SITE_LANG
if not get_release_datetime(issue, journal_name, ln):
# Trying to send an alert for an unreleased issue
return wjt.tmpl_admin_alert_unreleased_issue(ln,
journal_name)
if sent == "False":
# Retrieve default message, subject and recipients, and
# display email editor
subject = wjt.tmpl_admin_alert_subject(journal_name,
alert_ln,
issue)
plain_text = wjt.tmpl_admin_alert_plain_text(journal_name,
alert_ln,
issue)
plain_text = plain_text.encode('utf-8')
recipients = get_journal_alert_recipient_email(journal_name)
return wjt.tmpl_admin_alert_interface(ln,
journal_name,
subject,
plain_text,
recipients,
alert_ln)
else:
# User asked to send the mail
if was_alert_sent_for_issue(issue,
journal_name,
ln) != False and force == "False":
# Mmh, email already sent before for this issue. Ask
# confirmation
return wjt.tmpl_admin_alert_was_already_sent(ln,
journal_name,
subject,
plain_text,
recipients,
html_mail,
issue)
html_string = None
if html_mail == "html":
# Also send as HTML: retrieve from current issue
html_file = WEBJOURNAL_OPENER.open('%s/journal/%s?ln=%s'
% (CFG_SITE_URL, journal_name, alert_ln))
html_string = html_file.read()
html_file.close()
html_string = put_css_in_file(html_string, journal_name)
html_string = insert_journal_link(html_string, journal_name, issue, ln)
html_string = wash_alert(html_string)
sender_email = get_journal_alert_sender_email(journal_name)
send_email(sender_email, recipients, subject, plain_text,
html_string, header='', footer='', html_header='',
html_footer='', charset='utf-8')
update_DB_for_alert(issue, journal_name, ln)
return wjt.tmpl_admin_alert_success_msg(ln,
journal_name)
def perform_request_configure(journal_name, xml_config, action, ln=CFG_SITE_LANG):
"""
Add a new journal or configure the settings of an existing journal.
Parameters:
journal_name - the journal to configure, or name of the new journal
xml_config - the xml configuration of the journal (string)
action - One of ['edit', 'editDone', 'add', 'addDone']
ln - language
"""
msg = None
if action == 'edit':
# Read existing config
if journal_name is not None:
if not can_read_xml_config(journal_name):
return '<span style="color:#f00">Configuration could not be read. Please check that %s/webjournal/%s/%s-config.xml exists and can be read by the server.</span><br/>' % (CFG_ETCDIR, journal_name, journal_name)
config_path = '%s/webjournal/%s/%s-config.xml' % (CFG_ETCDIR, journal_name, journal_name)
xml_config = file(config_path).read()
else:
# cannot edit unknown journal...
return '<span style="color:#f00">You must specify a journal name</span>'
if action in ['editDone', 'addDone']:
# Save config
if action == 'addDone':
res = add_journal(journal_name, xml_config)
if res == -1:
msg = '<span style="color:#f00">A journal with that name already exists. Please choose another name.</span>'
action = 'add'
elif res == -2:
msg = '<span style="color:#f00">Configuration could not be written (no permission). Please manually copy your config to %s/webjournal/%s/%s-config.xml</span><br/>' % (CFG_ETCDIR, journal_name, journal_name)
action = 'edit'
elif res == -4:
msg = '<span style="color:#f00">Cache file could not be written (no permission). Please manually create directory %s/webjournal/%s/ and make it writable for your Apache user</span><br/>' % (CFG_CACHEDIR, journal_name)
action = 'edit'
elif res > 0:
msg = '<span style="color:#0f0">Journal successfully added.</span>'
action = 'edit'
else:
msg = '<span style="color:#f00">An error occurred. The journal could not be added</span>'
action = 'edit'
if action == 'add':
# Display a sample config.
xml_config = '''<?xml version="1.0" encoding="UTF-8"?>
<webjournal name="AtlantisTimes">
<view>
<niceName>Atlantis Times</niceName>
<niceURL>%(CFG_SITE_URL)s</niceURL>
<css>
<screen>/img/AtlantisTimes.css</screen>
<print>/img/AtlantisTimes.css</print>
</css>
<format_template>
<index>AtlantisTimes_Index.bft</index>
<detailed>AtlantisTimes_Detailed.bft</detailed>
<search>AtlantisTimes_Search.bft</search>
<popup>AtlantisTimes_Popup.bft</popup>
<contact>AtlantisTimes_Contact.bft</contact>
</format_template>
</view>
<model>
<record>
<rule>News, 980__a:ATLANTISTIMESNEWS or 980__a:ATLANTISTIMESNEWSDRAFT</rule>
<rule>Science, 980__a:ATLANTISTIMESSCIENCE or 980__a:ATLANTISTIMESSCIENCEDRAFT</rule>
<rule>Arts, 980__a:ATLANTISTIMESARTS or 980__a:ATLANTISTIMESARTSDRAFT</rule>
</record>
</model>
<controller>
<issue_grouping>2</issue_grouping>
<issues_per_year>52</issues_per_year>
<hide_unreleased_issues>all</hide_unreleased_issues>
<marc_tags>
<issue_number>773__n</issue_number>
<order_number>773__c</order_number>
</marc_tags>
<alert_sender>%(CFG_SITE_SUPPORT_EMAIL)s</alert_sender>
<alert_recipients>recipients@atlantis.atl</alert_recipients>
<languages>en,fr</languages>
<submission>
<doctype>DEMOJRN</doctype>
<report_number_field>DEMOJRN_RN</report_number_field>
</submission>
<first_issue>02/2009</first_issue>
<draft_keyword>DRAFT</draft_keyword>
</controller>
</webjournal>''' % {'CFG_SITE_URL': CFG_SITE_URL,
'CFG_SITE_SUPPORT_EMAIL': CFG_SITE_SUPPORT_EMAIL}
out = wjt.tmpl_admin_configure_journal(ln=ln,
journal_name=journal_name,
xml_config=xml_config,
action=action,
msg=msg)
return out
######################## ADDING/REMOVING JOURNALS ###############################
def add_journal(journal_name, xml_config):
"""
Add a new journal to the DB. Also create the configuration file
Parameters:
journal_name - the name (used in URLs) of the new journal
xml_config - the xml configuration of the journal (string)
Returns:
the id of the journal if successfully added
-1 if could not be added because journal name already exists
-2 if config could not be saved
-3 if could not be added for other reasons
-4 if the journal info file (cache used when the DB is down) could not be written
"""
try:
get_journal_id(journal_name)
except InvenioWebJournalJournalIdNotFoundDBError:
# Perfect, journal does not exist
res = run_sql("INSERT INTO jrnJOURNAL (name) VALUES(%s)", (journal_name,))
# Also save xml_config
config_dir = '%s/webjournal/%s/' % (CFG_ETCDIR, journal_name)
try:
if not os.path.exists(config_dir):
os.makedirs(config_dir)
xml_config_file = file(config_dir + journal_name + '-config.xml', 'w')
xml_config_file.write(xml_config)
xml_config_file.close()
except Exception:
res = -2
# And save some info in file in case database is down
journal_info_path = get_journal_info_path(journal_name)
journal_info_dir = os.path.dirname(journal_info_path)
if not os.path.exists(journal_info_dir):
try:
os.makedirs(journal_info_dir)
except Exception:
if res <= 0:
res = -4
journal_info_file = open(journal_info_path, 'w')
cPickle.dump({'journal_id': res,
'journal_name': journal_name,
'current_issue':'01/2000'}, journal_info_file)
journal_info_file.close()
return res
return -1
def remove_journal(journal_name):
"""
Remove a journal from the DB. Does not completely remove
everything, in case it was an error from the editor.
Parameters:
journal_name - the journal to remove
Returns:
None (the journal row is simply deleted from jrnJOURNAL)
"""
run_sql("DELETE FROM jrnJOURNAL WHERE name=%s", (journal_name,))
######################## TIME / ISSUE FUNCTIONS ###############################
def release_journal_issue(publish_issues, journal_name, ln=CFG_SITE_LANG):
"""
Releases a new issue.
This sets the current issue in the database to 'publish_issues' for
given 'journal_name'
Parameters:
journal_name - the journal for which we release a new issue
publish_issues - the list of issues that will be considered as
current (there can be several)
ln - language
"""
journal_id = get_journal_id(journal_name, ln)
if len(publish_issues) > 1:
publish_issues.sort(compare_issues)
low_bound = publish_issues[0]
high_bound = publish_issues[-1]
issue_display = '%s-%s/%s' % (low_bound.split("/")[0],
high_bound.split("/")[0],
high_bound.split("/")[1])
# remember convention: if we are going over a new year, take the higher
else:
issue_display = publish_issues[0]
# produce the DB lines
for publish_issue in publish_issues:
move_drafts_articles_to_ready(journal_name, publish_issue)
run_sql("INSERT INTO jrnISSUE (id_jrnJOURNAL, issue_number, issue_display) \
VALUES(%s, %s, %s)", (journal_id,
publish_issue,
issue_display))
# set first issue to published
release_journal_update(publish_issues[0], journal_name, ln)
# update information in file (in case DB is down)
journal_info_path = get_journal_info_path(journal_name)
journal_info_file = open(journal_info_path, 'w')
cPickle.dump({'journal_id': journal_id,
'journal_name': journal_name,
'current_issue': get_current_issue(ln, journal_name)},
journal_info_file)
journal_info_file.close()
def delete_journal_issue(issue, journal_name, ln=CFG_SITE_LANG):
"""
Deletes an issue from the DB.
(Not currently used)
"""
journal_id = get_journal_id(journal_name, ln)
run_sql("DELETE FROM jrnISSUE WHERE issue_number=%s \
AND id_jrnJOURNAL=%s",(issue, journal_id))
# update information in file (in case DB is down)
journal_info_path = get_journal_info_path(journal_name)
journal_info_file = open(journal_info_path, 'w')
cPickle.dump({'journal_id': journal_id,
'journal_name': journal_name,
'current_issue': get_current_issue(ln, journal_name)},
journal_info_file)
journal_info_file.close()
def was_alert_sent_for_issue(issue, journal_name, ln):
"""
Returns False if alert has not already been sent for given journal and
issue, else returns time of last alert, as time tuple
Parameters:
journal_name - the journal for which we want to check last alert
issue - the issue for which we want to check last alert
ln - language
Returns:
time tuple or False. Eg: (2008, 4, 25, 7, 58, 37, 4, 116, -1)
"""
journal_id = get_journal_id(journal_name, ln)
date_announced = run_sql("SELECT date_announced FROM jrnISSUE \
WHERE issue_number=%s \
AND id_jrnJOURNAL=%s", (issue, journal_id))[0][0]
if date_announced is None:
return False
else:
return date_announced.timetuple()
def update_DB_for_alert(issue, journal_name, ln):
"""
Update the 'last sent alert' timestamp for the given journal and
issue.
Parameters:
journal_name - the journal for which we want to update the time
of last alert
issue - the issue for which we want to update the time
of last alert
ln - language
"""
journal_id = get_journal_id(journal_name, ln)
run_sql("UPDATE jrnISSUE set date_announced=NOW() \
WHERE issue_number=%s \
AND id_jrnJOURNAL=%s", (issue,
journal_id))
def release_journal_update(update_issue, journal_name, ln=CFG_SITE_LANG):
"""
Releases an update to a journal.
"""
move_drafts_articles_to_ready(journal_name, update_issue)
journal_id = get_journal_id(journal_name, ln)
run_sql("UPDATE jrnISSUE set date_released=NOW() \
WHERE issue_number=%s \
AND id_jrnJOURNAL=%s", (update_issue,
journal_id))
def move_drafts_articles_to_ready(journal_name, issue):
"""
Move draft articles to their final "collection".
To do so we rely on the convention that an admin-chosen keyword
must be removed from the metadata
"""
protected_datafields = ['100', '245', '246', '520', '590', '700']
keyword_to_remove = get_journal_draft_keyword_to_remove(journal_name)
collections_to_refresh = {}
indexes_to_refresh = get_journal_index_to_refresh_on_release(journal_name)
bibindex_indexes_params = []
if indexes_to_refresh:
bibindex_indexes_params = ['-w', ','.join(indexes_to_refresh)]
categories = get_journal_categories(journal_name, issue)
task_sequence_id = str(bibtask_allocate_sequenceid())
for category in categories:
articles = get_journal_articles(journal_name, issue, category)
for order, recids in articles.iteritems():
for recid in recids:
record_xml = format_record(recid, of='xm')
if not record_xml:
continue
new_record_xml_path = os.path.join(CFG_TMPSHAREDDIR,
'webjournal_publish_' + \
str(recid) + '.xml')
if os.path.exists(new_record_xml_path):
# Do not modify twice
continue
record_struc = create_record(record_xml)
record = record_struc[0]
new_record = update_draft_record_metadata(record,
protected_datafields,
keyword_to_remove)
new_record_xml = print_rec(new_record)
if new_record_xml.find(keyword_to_remove) >= 0:
new_record_xml = new_record_xml.replace(keyword_to_remove, '')
# Write to file
new_record_xml_file = file(new_record_xml_path, 'w')
new_record_xml_file.write(new_record_xml)
new_record_xml_file.close()
# Submit
task_low_level_submission('bibupload',
'WebJournal',
'-c', new_record_xml_path,
'-I', task_sequence_id)
task_low_level_submission('bibindex',
'WebJournal',
'-i', str(recid),
'-I', task_sequence_id,
*bibindex_indexes_params)
for collection in get_all_collections_of_a_record(recid):
collections_to_refresh[collection] = ''
# Refresh collections
collections_to_refresh.update([(c, '') for c in get_journal_collection_to_refresh_on_release(journal_name)])
for collection in collections_to_refresh.keys():
task_low_level_submission('webcoll',
'WebJournal',
'-f', '-P', '2', '-p', '1', '-c', collection,
'-I', task_sequence_id)
def update_draft_record_metadata(record, protected_datafields, keyword_to_remove):
"""
Returns a new record with fields that should be modified in order
for this draft record to be considered as 'ready': keep only
controlfield 001 and non-protected fields that contain the
'keyword_to_remove'.
Parameters:
record - a single record (as a BibRecord structure)
protected_datafields - *list* tags that should not be part of the
returned record
keyword_to_remove - *str* keyword that should be considered
when checking if a field should be part of
the returned record.
"""
new_record = {}
for tag, field in record.iteritems():
if tag in protected_datafields:
continue
elif not keyword_to_remove in str(field) and \
not tag == '001':
continue
else:
# Keep
new_record[tag] = field
return new_record
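For reference, the keep/drop rule implemented above can be exercised standalone. The snippet below is an illustrative sketch: it mirrors the filtering logic on a toy dict of tag -> raw field string, whereas the real function receives BibRecord structures; all tag values are made-up sample data.

```python
# Standalone sketch of the draft-filtering rule: keep tag '001' plus any
# unprotected field whose content mentions the draft keyword.
def filter_draft_fields(record, protected, keyword):
    kept = {}
    for tag, field in record.items():
        if tag in protected:
            continue  # protected tags are never copied over
        if keyword not in str(field) and tag != '001':
            continue  # unrelated, unprotected fields are dropped
        kept[tag] = field
    return kept

sample = {'001': 'rec-1',
          '245': 'Some title DRAFT',       # protected: dropped despite keyword
          '980': 'ATLANTISTIMESNEWSDRAFT', # unprotected + keyword: kept
          '700': 'An author'}              # unprotected, no keyword: dropped
filter_draft_fields(sample, ['100', '245', '700'], 'DRAFT')
# -> {'001': 'rec-1', '980': 'ATLANTISTIMESNEWSDRAFT'}
```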
######################## XML CONFIG ###############################
def can_read_xml_config(journal_name):
"""
Check that the configuration XML for the given journal name
exists and can be read.
"""
config_path = '%s/webjournal/%s/%s-config.xml' % \
(CFG_ETCDIR, journal_name, journal_name)
try:
file(config_path).read()
except IOError:
return False
return True
######################## EMAIL HELPER FUNCTIONS ###############################
CFG_WEBJOURNAL_ALERT_WASH_PATTERN = re.compile('<\!--\s*START_NOT_FOR_ALERT\s*-->.*?<\!--\s*END_NOT_FOR_ALERT\s*-->', re.MULTILINE | re.DOTALL)
def wash_alert(html_string):
"""
Remove from alert any content in-between tags <!--START_NOT_FOR_ALERT--> and
<!--END_NOT_FOR_ALERT-->
@param html_string: the HTML newsletter
"""
return CFG_WEBJOURNAL_ALERT_WASH_PATTERN.sub('', html_string)
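A minimal check of the wash pattern above: everything between the START/END_NOT_FOR_ALERT markers, markers included, is stripped from the newsletter HTML. The page content below is made-up sample data.

```python
import re

# Same shape as CFG_WEBJOURNAL_ALERT_WASH_PATTERN above.
pattern = re.compile(
    r'<!--\s*START_NOT_FOR_ALERT\s*-->.*?<!--\s*END_NOT_FOR_ALERT\s*-->',
    re.MULTILINE | re.DOTALL)

page = ('<p>Read us</p>'
        '<!-- START_NOT_FOR_ALERT --><nav>menu</nav><!-- END_NOT_FOR_ALERT -->'
        '<p>Story</p>')
pattern.sub('', page)
# -> '<p>Read us</p><p>Story</p>'
```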
def insert_journal_link(html_string, journal_name, issue, ln):
"""
Insert a warning regarding HTML formatting inside mail client and
link to journal page just after the body of the page.
@param html_string: the HTML newsletter
@param journal_name: the journal name
@param issue: journal issue for which the alert is sent (in the form number/year)
@param ln: language
"""
def replace_body(match_obj):
"Replace body with itself + header message"
header = wjt.tmpl_admin_alert_header_html(journal_name, ln, issue)
return match_obj.group() + header
return re.sub('<body.*?>', replace_body, html_string, 1)
def put_css_in_file(html_message, journal_name):
"""
Retrieve the CSS of the journal and insert/inline it in the <head>
section of the given html_message. (Used for HTML alert emails)
Parameters:
journal_name - the journal name
html_message - the html message (string) in which the CSS
should be inserted
Returns:
the HTML message with its CSS inlined
"""
css_path = get_journal_css_url(journal_name)
if not css_path:
return
css_file = WEBJOURNAL_OPENER.open(css_path)
css = css_file.read()
css = make_full_paths_in_css(css, journal_name)
html_parted = html_message.split("</head>")
if len(html_parted) > 1:
html = '%s<style type="text/css">%s</style></head>%s' % (html_parted[0],
css,
html_parted[1])
else:
html_parted = html_message.split("<html>")
if len(html_parted) > 1:
html = '%s<html><head><style type="text/css">%s</style></head>%s' % (html_parted[0],
css,
html_parted[1])
else:
return
return html
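The `<head>` splitting strategy used above can be sketched in isolation: inline the CSS right before `</head>` when that marker exists, fall back to injecting a head right after `<html>`, and give up (return None) when neither marker is found. The HTML and CSS strings are toy examples.

```python
# Sketch of put_css_in_file's splitting logic, without the network fetch.
def inline_css(html, css):
    parts = html.split("</head>")
    if len(parts) > 1:
        return '%s<style type="text/css">%s</style></head>%s' % (parts[0], css, parts[1])
    parts = html.split("<html>")
    if len(parts) > 1:
        return '%s<html><head><style type="text/css">%s</style></head>%s' % (parts[0], css, parts[1])
    return None  # neither marker found: give up, as the real helper does

inline_css('<html><head></head><body/></html>', 'p{color:red}')
# -> '<html><head><style type="text/css">p{color:red}</style></head><body/></html>'
```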
def make_full_paths_in_css(css, journal_name):
"""
Update the URLs in a CSS from relative to absolute URLs, so that the
URLs are accessible from anywhere (Used for HTML alert emails)
Parameters:
journal_name - the journal name
css - a cascading stylesheet (string)
Returns:
(str) the given css with relative paths converted to absolute paths
"""
url_pattern = re.compile('''url\(["']?\s*(?P<url>\S*)\s*["']?\)''',
re.DOTALL)
url_iter = url_pattern.finditer(css)
rel_to_full_path = {}
for url in url_iter:
url_string = url.group("url")
url_string = url_string.replace('"', "")
url_string = url_string.replace("'", "")
if not url_string.startswith("http://"):
rel_to_full_path[url_string] = '"%s/img/webjournal_%s/%s"' % \
(CFG_SITE_URL,
journal_name,
url_string)
for url in rel_to_full_path.keys():
css = css.replace(url, rel_to_full_path[url])
return css
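The url() rewriting performed above can be sketched standalone. The site URL and journal name below are hypothetical sample values; note the sketch uses a startswith check so that already-absolute http:// references are left untouched (the original slice comparison `url_string[:6] != "http://"` could never be equal, since "http://" is seven characters).

```python
import re

# Same shape as the url() pattern used above, with a lazy inner group.
URL_RE = re.compile(r'''url\(["']?\s*(?P<url>\S*?)\s*["']?\)''', re.DOTALL)

def absolutize(css, site_url, journal_name):
    def repl(match):
        url = match.group('url').strip('"\'')
        if url.startswith('http://'):
            return match.group()  # already absolute: keep as-is
        return 'url("%s/img/webjournal_%s/%s")' % (site_url, journal_name, url)
    return URL_RE.sub(repl, css)

absolutize('body { background: url(bg.png); }',
           'http://demo.example.org', 'AtlantisTimes')
# -> 'body { background: url("http://demo.example.org/img/webjournal_AtlantisTimes/bg.png"); }'
```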
diff --git a/invenio/legacy/webjournal/utils.py b/invenio/legacy/webjournal/utils.py
index 45fbce393..ec4aac5ee 100644
--- a/invenio/legacy/webjournal/utils.py
+++ b/invenio/legacy/webjournal/utils.py
@@ -1,1809 +1,1809 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Various utilities for WebJournal, e.g. config parser, etc.
"""
import time
import datetime
import calendar
import re
import os
import cPickle
import math
import urllib
from MySQLdb import OperationalError
from xml.dom import minidom
from urlparse import urlparse
from invenio.config import \
CFG_ETCDIR, \
CFG_SITE_URL, \
CFG_CACHEDIR, \
CFG_SITE_LANG, \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_DEVEL_SITE, \
CFG_CERN_SITE
from invenio.legacy.dbquery import run_sql
from invenio.modules.formatter.engine import BibFormatObject
from invenio.legacy.search_engine import search_pattern, record_exists
from invenio.base.i18n import gettext_set_language
from invenio.ext.logging import register_exception
from invenio.utils.url import make_invenio_opener
WEBJOURNAL_OPENER = make_invenio_opener('WebJournal')
########################### REGULAR EXPRESSIONS ######################
header_pattern = re.compile('<p\s*(align=justify)??>\s*<strong>(?P<header>.*?)</strong>\s*</p>')
header_pattern2 = re.compile('<p\s*(class="articleHeader").*?>(?P<header>.*?)</p>')
para_pattern = re.compile('<p.*?>(?P<paragraph>.+?)</p>', re.DOTALL)
img_pattern = re.compile('<img.*?src=("|\')?(?P<image>\S+?)("|\'|\s).*?/>', re.DOTALL)
image_pattern = re.compile(r'''
(<a\s*href=["']?(?P<hyperlink>\S*)["']?>)?# get the link location for the image
\s*# after each tag we can have arbitrary whitespaces
<center># the image is always centered
\s*
<img\s*(class=["']imageScale["'])*?\s*src=(?P<image>\S*)\s*border=1\s*(/)?># getting the image itself
\s*
</center>
\s*
(</a>)?
(<br />|<br/>)*# the caption can be separated by any nr of line breaks
(
<b>
\s*
<i>
\s*
<center>(?P<caption>.*?)</center># getting the caption
\s*
</i>
\s*
</b>
)?''', re.DOTALL | re.VERBOSE | re.IGNORECASE )
#'
############################## FEATURED RECORDS ######################
def get_featured_records(journal_name):
"""
Returns the 'featured' records i.e. records chosen to be displayed
with an image on the main page, in the widgets section, for the
given journal.
parameter:
journal_name - (str) the name of the journal for which we want
to get the featured records
returns:
list of tuples (recid, img_url)
"""
try:
feature_file = open('%s/webjournal/%s/featured_record' % \
(CFG_ETCDIR, journal_name))
except:
return []
records = feature_file.readlines()
return [(record.split('---', 1)[0], record.split('---', 1)[1]) \
for record in records if "---" in record]
def add_featured_record(journal_name, recid, img_url):
"""
Adds the given record to the list of featured records of the given
journal.
parameters:
journal_name - (str) the name of the journal to which the record
should be added.
recid - (int) the record id of the record to be featured.
img_url - (str) a url to an image icon displayed along the
featured record.
returns:
0 if everything went ok
1 if record is already in the list
2 if other problems
"""
# Check that record is not already there
featured_records = get_featured_records(journal_name)
for featured_recid, featured_img in featured_records:
if featured_recid == str(recid):
return 1
try:
fptr = open('%s/webjournal/%s/featured_record'
% (CFG_ETCDIR, journal_name), "a")
fptr.write(str(recid) + '---' + img_url + '\n')
fptr.close()
except:
return 2
return 0
def remove_featured_record(journal_name, recid):
"""
Removes the given record from the list of featured records of the
given journal.
parameters:
journal_name - (str) the name of the journal from which the
record should be removed.
recid - (int) the record id of the record to remove.
returns:
0 if everything went ok
1 if there was a problem writing the file
"""
featured_records = get_featured_records(journal_name)
try:
fptr = open('%s/webjournal/%s/featured_record'
% (CFG_ETCDIR, journal_name), "w")
for featured_recid, featured_img in featured_records:
if str(featured_recid) != str(recid):
fptr.write(str(featured_recid) + '---' + featured_img + \
'\n')
fptr.close()
except:
return 1
return 0
############################ ARTICLES RELATED ########################
def get_order_dict_from_recid_list(recids, journal_name, issue_number,
newest_first=False,
newest_only=False):
"""
Returns the ordered list of input recids, for given
'issue_number'.
Since there might be several articles at the same position, the
returned structure is a dictionary with keys being order number
indicated in record metadata, and values being list of recids for
this order number (recids for one position are ordered from
highest to lowest recid).
Eg: {'1': [2390, 2386, 2385],
'3': [2388],
'2': [2389],
'4': [2387]}
Parameters:
recids - a list of all recid's that should be brought
into order
journal_name - the name of the journal
issue_number - *str* the issue_number for which we are
deriving the order
newest_first - *bool* if True, new articles should be placed
at beginning of the list. If so, their
position/order will be negative integers
newest_only - *bool* if only new articles should be returned
Returns:
ordered_records: a dictionary with the recids ordered by
keys
"""
ordered_records = {}
ordered_new_records = {}
records_without_defined_order = []
new_records_without_defined_order = []
for record in recids:
temp_rec = BibFormatObject(record)
articles_info = temp_rec.fields('773__')
for article_info in articles_info:
if article_info.get('n', '') == issue_number or \
'0' + article_info.get('n', '') == issue_number:
if article_info.has_key('c') and \
article_info['c'].isdigit():
order_number = int(article_info.get('c', ''))
if (newest_first or newest_only) and \
is_new_article(journal_name, issue_number, record):
if ordered_new_records.has_key(order_number):
ordered_new_records[order_number].append(record)
else:
ordered_new_records[order_number] = [record]
elif not newest_only:
if ordered_records.has_key(order_number):
ordered_records[order_number].append(record)
else:
ordered_records[order_number] = [record]
else:
# No order? No problem! Append it at the end.
if newest_first and is_new_article(journal_name, issue_number, record):
new_records_without_defined_order.append(record)
elif not newest_only:
records_without_defined_order.append(record)
# Append records without order at the end of the list
if records_without_defined_order:
if ordered_records:
ordered_records[max(ordered_records.keys()) + 1] = records_without_defined_order
else:
ordered_records[1] = records_without_defined_order
# Append new records without order at the end of the list of new
# records
if new_records_without_defined_order:
if ordered_new_records:
ordered_new_records[max(ordered_new_records.keys()) + 1] = new_records_without_defined_order
else:
ordered_new_records[1] = new_records_without_defined_order
# Append new records at the beginning of the list of 'old'
# records. To do so, use negative integers
if ordered_new_records:
highest_new_record_order = max(ordered_new_records.keys())
for order, new_records in ordered_new_records.iteritems():
ordered_records[- highest_new_record_order + order - 1] = new_records
for (order, records) in ordered_records.iteritems():
# Reverse so that if there are several articles at same
# positon, newest appear first
records.reverse()
return ordered_records
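The negative-key convention used above for "new" articles can be seen on a toy example (the recids and order numbers are made up): new articles keep their relative 773__c order but are mapped to negative keys, so they sort before every previously published article.

```python
# Toy illustration of the ordering structure built above: keys are order
# numbers read from 773__c, values are lists of recids.
ordered = {1: [2390, 2386], 2: [2389]}   # 'old' articles, by order number
new = {1: [2401], 2: [2400]}             # 'new' articles, by order number

highest = max(new)
for order, recids in new.items():
    # same formula as above: -highest + order - 1
    ordered[-highest + order - 1] = recids

sorted(ordered)   # -> [-2, -1, 1, 2]: new articles come first
```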
def get_journal_articles(journal_name, issue, category,
newest_first=False, newest_only=False):
"""
Returns the recids in given category and journal, for given issue
number. The returned recids are grouped according to their 773__c
field.
Example of returned value:
{'1': [2390, 2386, 2385],
'3': [2388],
'2': [2389],
'4': [2387]}
Parameters:
journal_name - *str* the name of the journal (as used in URLs)
issue - *str* the issue. Eg: "08/2007"
category - *str* the name of the category
newest_first - *bool* if True, new articles should be placed
at beginning of the list. If so, their
position/order will be negative integers
newest_only - *bool* if only new articles should be returned
"""
use_cache = True
current_issue = get_current_issue(CFG_SITE_LANG, journal_name)
if issue_is_later_than(issue, current_issue):
# If we are working on unreleased issue, do not use caching
# mechanism
use_cache = False
if use_cache:
cached_articles = _get_cached_journal_articles(journal_name, issue, category)
if cached_articles is not None:
ordered_articles = get_order_dict_from_recid_list(cached_articles,
journal_name,
issue,
newest_first,
newest_only)
return ordered_articles
# Retrieve the list of rules that map Category -> Search Pattern.
# Keep only the rule matching our category
config_strings = get_xml_from_config(["record/rule"], journal_name)
category_to_search_pattern_rules = config_strings["record/rule"]
try:
matching_rule = [rule.split(',', 1) for rule in \
category_to_search_pattern_rules \
if rule.split(',')[0] == category]
except:
return []
recids_issue = search_pattern(p='773__n:%s -980:DELETED' % issue)
recids_rule = search_pattern(p=matching_rule[0][1])
if issue[0] == '0':
# search for 09/ and 9/
recids_issue.union_update(search_pattern(p='773__n:%s -980:DELETED' % issue.lstrip('0')))
recids_rule.intersection_update(recids_issue)
recids = [recid for recid in recids_rule if record_exists(recid) == 1]
if use_cache:
_cache_journal_articles(journal_name, issue, category, recids)
ordered_articles = get_order_dict_from_recid_list(recids,
journal_name,
issue,
newest_first,
newest_only)
return ordered_articles
def _cache_journal_articles(journal_name, issue, category, articles):
"""
Caches given articles IDs.
"""
journal_cache_path = get_journal_article_cache_path(journal_name,
issue)
try:
journal_cache_file = open(journal_cache_path, 'r')
journal_info = cPickle.load(journal_cache_file)
journal_cache_file.close()
except (cPickle.PickleError, IOError, EOFError, ValueError):
journal_info = {}
if not journal_info.has_key('journal_articles'):
journal_info['journal_articles'] = {}
journal_info['journal_articles'][category] = articles
# Create cache directory if it does not exist
journal_cache_dir = os.path.dirname(journal_cache_path)
if not os.path.exists(journal_cache_dir):
try:
os.makedirs(journal_cache_dir)
except:
return False
journal_cache_file = open(journal_cache_path, 'w')
cPickle.dump(journal_info, journal_cache_file)
journal_cache_file.close()
return True
def _get_cached_journal_articles(journal_name, issue, category):
"""
Retrieve the articles IDs cached for this journal.
Returns None if the cache does not exist or is more than 5 minutes old.
"""
# Check if our cache is more or less up-to-date (not more than 5
# minutes old)
try:
journal_cache_path = get_journal_article_cache_path(journal_name,
issue)
last_update = os.path.getctime(journal_cache_path)
except Exception, e :
return None
now = time.time()
if (last_update + 5*60) < now:
return None
# Get from cache
try:
journal_cache_file = open(journal_cache_path, 'r')
journal_info = cPickle.load(journal_cache_file)
journal_articles = journal_info.get('journal_articles', {}).get(category, None)
journal_cache_file.close()
except (cPickle.PickleError, IOError, EOFError, ValueError):
journal_articles = None
return journal_articles
def is_new_article(journal_name, issue, recid):
"""
Check if given article should be considered as new or not.
New articles are articles that have never appeared in issues
older than the given one.
"""
article_found_in_older_issue = False
temp_rec = BibFormatObject(recid)
publication_blocks = temp_rec.fields('773__')
for publication_block in publication_blocks:
this_issue_number, this_issue_year = issue.split('/')
issue_number, issue_year = publication_block.get('n', '/').split('/', 1)
if int(issue_year) < int(this_issue_year):
# Found an older issue
article_found_in_older_issue = True
break
elif int(issue_year) == int(this_issue_year) and \
int(issue_number) < int(this_issue_number):
# Found an older issue
article_found_in_older_issue = True
break
return not article_found_in_older_issue
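The year-then-number comparison above can be sketched on plain "number/year" issue strings (sample issues below are hypothetical; the real function reads them from the record's 773__ blocks):

```python
# Sketch of the "new article" test: an article is new iff none of the issues
# it was published in is older than the issue at hand.
def is_new(published_issues, this_issue):
    this_number, this_year = [int(p) for p in this_issue.split('/')]
    for issue in published_issues:
        number, year = [int(p) for p in issue.split('/', 1)]
        if year < this_year or (year == this_year and number < this_number):
            return False  # already appeared in an older issue
    return True

is_new(['08/2009'], '08/2009')             # -> True (first appearance)
is_new(['06/2009', '08/2009'], '08/2009')  # -> False (seen in 06/2009)
```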
############################ CATEGORIES RELATED ######################
def get_journal_categories(journal_name, issue=None):
"""
List the categories for the given journal and issue.
Returns categories in same order as in config file.
Parameters:
journal_name - *str* the name of the journal (as used in URLs)
issue - *str* the issue. Eg:'08/2007'. If None, consider
all categories defined in journal config
"""
categories = []
current_issue = get_current_issue(CFG_SITE_LANG, journal_name)
config_strings = get_xml_from_config(["record/rule"], journal_name)
all_categories = [rule.split(',')[0] for rule in \
config_strings["record/rule"]]
if issue is None:
return all_categories
for category in all_categories:
recids = get_journal_articles(journal_name,
issue,
category)
if len(recids.keys()) > 0:
categories.append(category)
return categories
def get_category_query(journal_name, category):
"""
Returns the category definition for the given category and journal name
Parameters:
journal_name - *str* the name of the journal (as used in URLs)
category - *str* a category name, as found in the XML config
"""
config_strings = get_xml_from_config(["record/rule"], journal_name)
category_to_search_pattern_rules = config_strings["record/rule"]
try:
matching_rule = [rule.split(',', 1)[1].strip() for rule in \
category_to_search_pattern_rules \
if rule.split(',')[0] == category]
except:
return None
return matching_rule[0]
######################### JOURNAL CONFIG VARS ######################
cached_parsed_xml_config = {}
def get_xml_from_config(nodes, journal_name):
"""
Returns values from the journal configuration file.
The needed values can be specified by node name, or by a hierarchy
of nodes names using '/' as character to mean 'descendant of'.
Eg. 'record/rule' to get all the values of 'rule' tags inside the
'record' node
Returns a dictionary with a key for each query and a list of
strings (innerXml) results for each key.
Has a special field "config_fetching_error" that returns an error when
something has gone wrong.
"""
# Get and open the config file
results = {}
if cached_parsed_xml_config.has_key(journal_name):
config_file = cached_parsed_xml_config[journal_name]
else:
config_path = '%s/webjournal/%s/%s-config.xml' % \
(CFG_ETCDIR, journal_name, journal_name)
config_file = minidom.Document
try:
config_file = minidom.parse("%s" % config_path)
except:
# todo: raise exception "error: no config file found"
results["config_fetching_error"] = "could not find config file"
return results
else:
cached_parsed_xml_config[journal_name] = config_file
for node_path in nodes:
node = config_file
for node_path_component in node_path.split('/'):
# pylint: disable=E1103
# The node variable can be rewritten in the loop and therefore
# its type can change.
if node != config_file and node.length > 0:
# We have a NodeList object: consider only first child
node = node.item(0)
# pylint: enable=E1103
try:
node = node.getElementsByTagName(node_path_component)
except:
# WARNING, config did not have such value
node = []
break
results[node_path] = []
for result in node:
try:
result_string = result.firstChild.toxml(encoding="utf-8")
except:
# WARNING, config did not have such value
continue
results[node_path].append(result_string)
return results
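# Illustration (hypothetical config snippet, not taken from any real journal):
# for a config file containing
#   <config><record><rule>News, collection:News</rule></record></config>
# calling get_xml_from_config(["record/rule"], journal_name) would return
# {'record/rule': ['News, collection:News']}: one list entry per matching
# node, holding that node's inner XML.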
def get_journal_issue_field(journal_name):
"""
Returns the MARC field in which this journal expects to find
the issue number. Read this from the journal config file
Parameters:
journal_name - *str* the name of the journal (as used in URLs)
"""
config_strings = get_xml_from_config(["issue_number"], journal_name)
try:
issue_field = config_strings["issue_number"][0]
except:
issue_field = '773__n'
return issue_field
def get_journal_css_url(journal_name, type='screen'):
"""
Returns URL to this journal's CSS.
Parameters:
journal_name - *str* the name of the journal (as used in URLs)
type - *str* 'screen' or 'print', depending on the kind
of CSS
"""
config_strings = get_xml_from_config([type], journal_name)
css_path = ''
try:
css_path = config_strings[type][0]
except Exception:
register_exception(req=None,
suffix="No css file for journal %s. Is this right?" % \
journal_name)
return CFG_SITE_URL + '/' + css_path
def get_journal_submission_params(journal_name):
"""
Returns the (doctype, identifier element, identifier field) for
the submission of articles in this journal, so that it is possible
to build direct submission links.
Parameter:
journal_name - *str* the name of the journal (as used in URLs)
"""
doctype = ''
identifier_field = ''
identifier_element = ''
config_strings = get_xml_from_config(["submission/doctype"], journal_name)
if config_strings.get('submission/doctype', ''):
doctype = config_strings['submission/doctype'][0]
config_strings = get_xml_from_config(["submission/identifier_element"], journal_name)
if config_strings.get('submission/identifier_element', ''):
identifier_element = config_strings['submission/identifier_element'][0]
config_strings = get_xml_from_config(["submission/identifier_field"], journal_name)
if config_strings.get('submission/identifier_field', ''):
identifier_field = config_strings['submission/identifier_field'][0]
else:
identifier_field = '037__a'
return (doctype, identifier_element, identifier_field)
def get_journal_draft_keyword_to_remove(journal_name):
"""
Returns the keyword that should be removed from the article
metadata in order to move the article from Draft to Ready
"""
config_strings = get_xml_from_config(["draft_keyword"], journal_name)
if config_strings.get('draft_keyword', ''):
return config_strings['draft_keyword'][0]
return ''
def get_journal_alert_sender_email(journal_name):
"""
Returns the email address that should be used as sender of the alert
email.
If not specified, use CFG_SITE_SUPPORT_EMAIL
"""
config_strings = get_xml_from_config(["alert_sender"], journal_name)
if config_strings.get('alert_sender', ''):
return config_strings['alert_sender'][0]
return CFG_SITE_SUPPORT_EMAIL
def get_journal_alert_recipient_email(journal_name):
"""
Returns the default email addresses of the recipients of the alert
email, as a string of comma-separated emails.
"""
if CFG_DEVEL_SITE:
# To be on the safe side, do not return the default alert recipients.
return ''
config_strings = get_xml_from_config(["alert_recipients"], journal_name)
if config_strings.get('alert_recipients', ''):
return config_strings['alert_recipients'][0]
return ''
def get_journal_collection_to_refresh_on_release(journal_name):
"""
Returns the list of collections to update (WebColl) upon release of
an issue.
"""
from invenio.legacy.search_engine import collection_reclist_cache
config_strings = get_xml_from_config(["update_on_release/collection"], journal_name)
return [coll for coll in config_strings.get('update_on_release/collection', []) if \
collection_reclist_cache.cache.has_key(coll)]
def get_journal_index_to_refresh_on_release(journal_name):
"""
Returns the list of indexes to update (BibIndex) upon release of
an issue.
"""
from invenio.legacy.bibindex.engine import get_index_id_from_index_name
config_strings = get_xml_from_config(["update_on_release/index"], journal_name)
return [index for index in config_strings.get('update_on_release/index', []) if \
get_index_id_from_index_name(index) != '']
def get_journal_template(template, journal_name, ln=CFG_SITE_LANG):
"""
Returns the journal templates name for the given template type
Raise an exception if template cannot be found.
"""
from invenio.webjournal_config import \
InvenioWebJournalTemplateNotFoundError
config_strings = get_xml_from_config([template], journal_name)
try:
index_page_template = 'webjournal' + os.sep + \
config_strings[template][0]
except:
raise InvenioWebJournalTemplateNotFoundError(ln,
journal_name,
template)
return index_page_template
def get_journal_name_intl(journal_name, ln=CFG_SITE_LANG):
"""
Returns the nice name of the journal, translated if possible
"""
_ = gettext_set_language(ln)
config_strings = get_xml_from_config(["niceName"], journal_name)
if config_strings.get('niceName', ''):
return _(config_strings['niceName'][0])
return ''
def get_journal_languages(journal_name):
"""
Returns the list of languages defined for this journal
"""
config_strings = get_xml_from_config(["languages"], journal_name)
if config_strings.get('languages', ''):
return [ln.strip() for ln in \
config_strings['languages'][0].split(',')]
return []
def get_journal_issue_grouping(journal_name):
"""
Returns the number of issues that are typically released at the
same time.
This is used if, every two weeks, you release an issue that should
contain the issues of the next 2 weeks (eg. at week 16, you release
an issue named '16-17/2009').
This number should help the admin interface guess how to
release the next issue (can be overridden by user).
"""
config_strings = get_xml_from_config(["issue_grouping"], journal_name)
if config_strings.get('issue_grouping', ''):
issue_grouping = config_strings['issue_grouping'][0]
if issue_grouping.isdigit() and int(issue_grouping) > 0:
return int(issue_grouping)
return 1
def get_journal_nb_issues_per_year(journal_name):
"""
Returns the default number of issues per year for this journal.
This number should help the admin interface guess the next
issue number (can be overridden by user).
"""
config_strings = get_xml_from_config(["issues_per_year"], journal_name)
if config_strings.get('issues_per_year', ''):
issues_per_year = config_strings['issues_per_year'][0]
if issues_per_year.isdigit() and int(issues_per_year) > 0:
return int(issues_per_year)
return 52
def get_journal_preferred_language(journal_name, ln):
"""
Returns the most adequate language to display the journal, given a
language.
"""
languages = get_journal_languages(journal_name)
if ln in languages:
return ln
elif CFG_SITE_LANG in languages:
return CFG_SITE_LANG
elif languages:
return languages[0]
else:
return CFG_SITE_LANG
def get_unreleased_issue_hiding_mode(journal_name):
"""
Returns how unreleased issues should be treated. Can be one of the
following string values:
'future' - only future unreleased issues are hidden. Past
unreleased ones can be viewed
'all' - any unreleased issue (past and future) has to be
hidden
'none' - no unreleased issue is hidden
"""
config_strings = get_xml_from_config(["hide_unreleased_issues"], journal_name)
if config_strings.get('hide_unreleased_issues', ''):
hide_unreleased_issues = config_strings['hide_unreleased_issues'][0]
if hide_unreleased_issues in ['future', 'all', 'none']:
return hide_unreleased_issues
return 'all'
def get_first_issue_from_config(journal_name):
"""
Returns the first issue as defined in the config. This should only
be useful when no issue has been released yet.
If not specified, returns the issue made of current week number
and year.
"""
config_strings = get_xml_from_config(["first_issue"], journal_name)
if config_strings.has_key('first_issue'):
return config_strings['first_issue'][0]
return time.strftime("%W/%Y", time.localtime())
######################## TIME / ISSUE FUNCTIONS ######################
def get_current_issue(ln, journal_name):
"""
Returns the current issue of a journal as a string.
Current issue is the latest released issue.
"""
journal_id = get_journal_id(journal_name, ln)
try:
current_issue = run_sql("""SELECT issue_number
FROM jrnISSUE
WHERE date_released <= NOW()
AND id_jrnJOURNAL=%s
ORDER BY date_released DESC
LIMIT 1""",
(journal_id,))[0][0]
except:
# start the first journal ever
current_issue = get_first_issue_from_config(journal_name)
run_sql("""INSERT INTO jrnISSUE (id_jrnJOURNAL, issue_number, issue_display)
VALUES(%s, %s, %s)""",
(journal_id,
current_issue,
current_issue))
return current_issue
def get_all_released_issues(journal_name):
"""
Returns the list of released issues, ordered by release date.
Note that it only includes the issues that are considered as
released in the DB: it will not, for example, include articles that
have been imported into the system but not yet released.
"""
journal_id = get_journal_id(journal_name)
res = run_sql("""SELECT issue_number
FROM jrnISSUE
WHERE id_jrnJOURNAL = %s
AND UNIX_TIMESTAMP(date_released) != 0
ORDER BY date_released DESC""",
(journal_id,))
if res:
return [row[0] for row in res]
else:
return []
def get_next_journal_issues(current_issue_number, journal_name, n=2):
"""
This function suggests the 'n' next issue numbers
"""
number, year = current_issue_number.split('/', 1)
number = int(number)
year = int(year)
number_issues_per_year = get_journal_nb_issues_per_year(journal_name)
next_issues = [make_issue_number(journal_name,
((number - 1 + i) % (number_issues_per_year)) + 1,
year + ((number - 1 + i) / number_issues_per_year)) \
for i in range(1, n + 1)]
return next_issues
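# Example of the wrap-around logic above (assuming the default of 52 issues
# per year): starting from issue '51/2009' with n=2, the suggested next
# issues are '52/2009' and then '01/2010'; the issue number rolls over to 1
# and the year is incremented once issue 52 has been passed. (The '/' above
# relies on Python 2 integer division.)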
def get_grouped_issues(journal_name, issue_number):
"""
Returns all the issues grouped with a given one.
Issues are sorted from the oldest to newest one.
"""
grouped_issues = []
journal_id = get_journal_id(journal_name, CFG_SITE_LANG)
issue_display = get_issue_number_display(issue_number, journal_name)
res = run_sql("""SELECT issue_number
FROM jrnISSUE
WHERE id_jrnJOURNAL=%s AND issue_display=%s""",
(journal_id,
issue_display))
if res:
grouped_issues = [row[0] for row in res]
grouped_issues.sort(compare_issues)
return grouped_issues
def compare_issues(issue1, issue2):
"""
Comparison function for issues.
Returns:
-1 if issue1 is older than issue2
0 if issues are equal
1 if issue1 is newer than issue2
"""
issue1_number, issue1_year = issue1.split('/', 1)
issue2_number, issue2_year = issue2.split('/', 1)
if int(issue1_year) == int(issue2_year):
return cmp(int(issue1_number), int(issue2_number))
else:
return cmp(int(issue1_year), int(issue2_year))
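# Example (hypothetical issue strings): under the ordering above,
# ['10/2009', '02/2010', '03/2009'] sorts to ['03/2009', '10/2009', '02/2010'];
# the year is compared first, then the issue number within the year. Note that
# cmp() exists in Python 2 only; the same ordering can be expressed with a
# sort key of (year, number).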
def issue_is_later_than(issue1, issue2):
"""
Returns true if issue1 is later than issue2
"""
issue_number1, issue_year1 = issue1.split('/', 1)
issue_number2, issue_year2 = issue2.split('/', 1)
if int(issue_year1) > int(issue_year2):
return True
elif int(issue_year1) == int(issue_year2):
return int(issue_number1) > int(issue_number2)
else:
return False
def get_issue_number_display(issue_number, journal_name,
ln=CFG_SITE_LANG):
"""
Returns the display string for a given issue number.
"""
journal_id = get_journal_id(journal_name, ln)
issue_display = run_sql("""SELECT issue_display
FROM jrnISSUE
WHERE issue_number=%s
AND id_jrnJOURNAL=%s""",
(issue_number, journal_id))
if issue_display:
return issue_display[0][0]
else:
# Not yet released...
return issue_number
def make_issue_number(journal_name, number, year, for_url_p=False):
"""
Creates a normalized issue number representation with given issue
number (as int or str) and year (as int or str).
Reverse the year and number if for_url_p is True
"""
number_issues_per_year = get_journal_nb_issues_per_year(journal_name)
precision = len(str(number_issues_per_year))
number = int(str(number))
year = int(str(year))
if for_url_p:
return ("%i/%0" + str(precision) + "i") % \
(year, number)
else:
return ("%0" + str(precision) + "i/%i") % \
(number, year)
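# Example (assuming the default of 52 issues per year, i.e. a zero-padding
# width of len('52') == 2): make_issue_number(journal_name, 5, 2009) gives
# '05/2009', and with for_url_p=True it gives '2009/05' (year first, as used
# in URLs).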
def get_release_datetime(issue, journal_name, ln=CFG_SITE_LANG):
"""
Gets the date at which an issue was released from the DB.
Returns None if issue has not yet been released.
See issue_to_datetime() to get the *theoretical* release time of an
issue.
"""
journal_id = get_journal_id(journal_name, ln)
try:
release_date = run_sql("""SELECT date_released
FROM jrnISSUE
WHERE issue_number=%s
AND id_jrnJOURNAL=%s""",
(issue, journal_id))[0][0]
except:
return None
if release_date:
return release_date
else:
return None
def get_announcement_datetime(issue, journal_name, ln=CFG_SITE_LANG):
"""
Get the date at which an issue was announced through the alert system.
Return None if not announced
"""
journal_id = get_journal_id(journal_name, ln)
try:
announce_date = run_sql("""SELECT date_announced
FROM jrnISSUE
WHERE issue_number=%s
AND id_jrnJOURNAL=%s""",
(issue, journal_id))[0][0]
except:
return None
if announce_date:
return announce_date
else:
return None
def datetime_to_issue(issue_datetime, journal_name):
"""
Returns the issue corresponding to the given datetime object.
If issue_datetime is too far in the future or in the past, gives
the best possible matching issue, or None if it does not seem to
exist.
Parameters:
issue_datetime - *datetime* date of the issue to be retrieved
journal_name - *str* the name of the journal (as used in URLs)
"""
issue_number = None
journal_id = get_journal_id(journal_name)
# Try to discover how many days an issue is valid
nb_issues_per_year = get_journal_nb_issues_per_year(journal_name)
this_year_number_of_days = 365
if calendar.isleap(issue_datetime.year):
this_year_number_of_days = 366
issue_day_lifetime = math.ceil(float(this_year_number_of_days)/nb_issues_per_year)
res = run_sql("""SELECT issue_number, date_released
FROM jrnISSUE
WHERE date_released < %s
AND id_jrnJOURNAL = %s
ORDER BY date_released DESC LIMIT 1""",
(issue_datetime, journal_id))
if res and res[0][1]:
issue_number = res[0][0]
issue_release_date = res[0][1]
# Check that the result is not too far in the future:
if issue_release_date + datetime.timedelta(issue_day_lifetime) < issue_datetime:
# In principle, the latest issue will no longer be valid
# at that time
return None
else:
# Mmh, are we too far in the past? This can happen in the case
# of articles that have been imported in the system but never
# considered as 'released' in the database. So we should still
# try to approximate/match an issue:
if round(issue_day_lifetime) in [6, 7, 8]:
# Weekly issues. We can use this information to better
# match the issue number
issue_nb = int(issue_datetime.strftime('%W')) # = week number
else:
# Compute the number of days since beginning of year, and
# divide by the lifetime of an issue: we get the
# approximate issue_number
issue_nb = math.ceil((int(issue_datetime.strftime('%j')) / issue_day_lifetime))
issue_number = ("%0" + str(len(str(nb_issues_per_year)))+ "i/%i") % (issue_nb, issue_datetime.year)
# Now check if this issue exists in the system for this
# journal
if not get_journal_categories(journal_name, issue_number):
# This issue did not exist
return None
return issue_number
DAILY = 1
WEEKLY = 2
MONTHLY = 3
def issue_to_datetime(issue_number, journal_name, granularity=None):
"""
Returns the *theoretical* date of release for given issue: useful
if you release on Friday, but the issue date of the journal
should correspond to the next Monday.
This will correspond to the next day/week/month, depending on the
number of issues per year (or the 'granularity' if specified) and
the release time (if close to the end of a period defined by the
granularity, consider next period since release is made a bit in
advance).
See get_release_datetime() for the *real* release time of an issue.
THIS FUNCTION SHOULD ONLY BE USED FOR INFORMATIVE DISPLAY PURPOSES,
AS IT GIVES APPROXIMATE RESULTS. Do not use it to make decisions.
Parameters:
issue_number - *str* issue number to consider
journal_name - *str* the name of the journal (as used in URLs)
granularity - *int* the granularity to consider
"""
# If we have released, we can use this information. Otherwise we
# have to approximate.
issue_date = get_release_datetime(issue_number, journal_name)
if not issue_date:
# Approximate release date
number, year = issue_number.split('/')
number = int(number)
year = int(year)
nb_issues_per_year = get_journal_nb_issues_per_year(journal_name)
this_year_number_of_days = 365
if calendar.isleap(year):
this_year_number_of_days = 366
issue_day_lifetime = float(this_year_number_of_days)/nb_issues_per_year
# Compute from beginning of the year
issue_date = datetime.datetime(year, 1, 1) + \
datetime.timedelta(days=int(round((number - 1) * issue_day_lifetime)))
# Okay, but if last release is not too far in the past, better
# compute from the release.
current_issue = get_current_issue(CFG_SITE_LANG, journal_name)
current_issue_time = get_release_datetime(current_issue, journal_name)
if current_issue_time.year == issue_date.year:
current_issue_number, current_issue_year = current_issue.split('/')
current_issue_number = int(current_issue_number)
# Compute from last release
issue_date = current_issue_time + \
datetime.timedelta(days=int((number - current_issue_number) * issue_day_lifetime))
# If granularity is not specified, deduce from config
if granularity is None:
nb_issues_per_year = get_journal_nb_issues_per_year(journal_name)
if nb_issues_per_year > 250:
granularity = DAILY
elif nb_issues_per_year > 40:
granularity = WEEKLY
else:
granularity = MONTHLY
# Now we can adapt the date to match the granularity
if granularity == DAILY:
if issue_date.hour >= 15:
# If released after 3pm, consider it is the issue of the next
# day
issue_date = issue_date + datetime.timedelta(days=1)
elif granularity == WEEKLY:
(year, week_nb, day_nb) = issue_date.isocalendar()
if day_nb > 4:
# If released on Fri, Sat or Sun, consider that it is next
# week's issue.
issue_date = issue_date + datetime.timedelta(weeks=1)
# Get first day of the week
issue_date = issue_date - datetime.timedelta(days=issue_date.weekday())
else:
if issue_date.day > 22:
# If released last week of the month, consider release for
# next month
issue_date = issue_date.replace(month=issue_date.month+1)
date_string = issue_date.strftime("%Y %m 1")
issue_date = datetime.datetime(*(time.strptime(date_string, "%Y %m %d")[0:6]))
return issue_date
def get_number_of_articles_for_issue(issue, journal_name, ln=CFG_SITE_LANG):
"""
Function that returns a dictionary with all categories and the
number of articles in each category.
"""
all_articles = {}
categories = get_journal_categories(journal_name, issue)
for category in categories:
all_articles[category] = len(get_journal_articles(journal_name, issue, category))
return all_articles
########################## JOURNAL RELATED ###########################
def get_journal_info_path(journal_name):
"""
Returns the path to the info file of the given journal. The info
file should be used to get information about a journal when database
is not available.
Returns None if path cannot be determined
"""
# We must make sure we don't try to read outside of webjournal
# cache dir
info_path = os.path.abspath("%s/webjournal/%s/info.dat" % \
(CFG_CACHEDIR, journal_name))
if info_path.startswith(CFG_CACHEDIR + '/webjournal/'):
return info_path
else:
return None
def get_journal_article_cache_path(journal_name, issue):
"""
Returns the path to cache file of the articles of a given issue
Returns None if path cannot be determined
"""
# We must make sure we don't try to read outside of webjournal
# cache dir
issue_number, year = issue.replace('/', '_').split('_', 1)
cache_path = os.path.abspath("%s/webjournal/%s/%s/%s/articles_cache.dat" % \
(CFG_CACHEDIR, journal_name,
year, issue_number))
if cache_path.startswith(CFG_CACHEDIR + '/webjournal/'):
return cache_path
else:
return None
def get_journal_id(journal_name, ln=CFG_SITE_LANG):
"""
Get the id for this journal from the DB. If DB is down, try to get
from cache.
"""
journal_id = None
from invenio.webjournal_config import InvenioWebJournalJournalIdNotFoundDBError
if CFG_ACCESS_CONTROL_LEVEL_SITE == 2:
# do not connect to the database as the site is closed for
# maintenance:
journal_info_path = get_journal_info_path(journal_name)
try:
journal_info_file = open(journal_info_path, 'r')
journal_info = cPickle.load(journal_info_file)
journal_id = journal_info.get('journal_id', None)
except cPickle.PickleError, e:
journal_id = None
except IOError:
journal_id = None
except ValueError:
journal_id = None
else:
try:
res = run_sql("SELECT id FROM jrnJOURNAL WHERE name=%s",
(journal_name,))
if len(res) > 0:
journal_id = res[0][0]
except OperationalError, e:
# Cannot connect to database. Try to read from cache
journal_info_path = get_journal_info_path(journal_name)
try:
journal_info_file = open(journal_info_path, 'r')
journal_info = cPickle.load(journal_info_file)
journal_id = journal_info['journal_id']
except cPickle.PickleError, e:
journal_id = None
except IOError:
journal_id = None
except ValueError:
journal_id = None
if journal_id is None:
raise InvenioWebJournalJournalIdNotFoundDBError(ln, journal_name)
return journal_id
def guess_journal_name(ln, journal_name=None):
"""
Tries to guess what journal a user was looking for on the server if
no journal name was provided, or if the given journal name does not
match the case of the original journal.
"""
from invenio.webjournal_config import InvenioWebJournalNoJournalOnServerError
from invenio.webjournal_config import InvenioWebJournalNoNameError
journals_id_and_names = get_journals_ids_and_names()
if len(journals_id_and_names) == 0:
raise InvenioWebJournalNoJournalOnServerError(ln)
elif not journal_name and \
journals_id_and_names[0].has_key('journal_name'):
return journals_id_and_names[0]['journal_name']
elif len(journals_id_and_names) > 0:
possible_journal_names = [journal_id_and_name['journal_name'] for journal_id_and_name \
in journals_id_and_names \
if journal_id_and_name.get('journal_name', '').lower() == journal_name.lower()]
if possible_journal_names:
return possible_journal_names[0]
else:
raise InvenioWebJournalNoNameError(ln)
else:
raise InvenioWebJournalNoNameError(ln)
def get_journals_ids_and_names():
"""
Returns the list of existing journals IDs and names. Try to read
from the DB, or from cache if DB is not accessible.
"""
journals = []
if CFG_ACCESS_CONTROL_LEVEL_SITE == 2:
# do not connect to the database as the site is closed for
# maintenance:
files = os.listdir("%s/webjournal" % CFG_CACHEDIR)
info_files = [path + os.sep + 'info.dat' for path in files if \
os.path.isdir(path) and \
os.path.exists(path + os.sep + 'info.dat')]
for info_file in info_files:
try:
journal_info_file = open(info_file, 'r')
journal_info = cPickle.load(journal_info_file)
journal_id = journal_info.get('journal_id', None)
journal_name = journal_info.get('journal_name', None)
current_issue = journal_info.get('current_issue', None)
if journal_id is not None and \
journal_name is not None:
journals.append({'journal_id': journal_id,
'journal_name': journal_name,
'current_issue': current_issue})
except cPickle.PickleError, e:
# Well, can't do anything...
continue
except IOError:
# Well, can't do anything...
continue
except ValueError:
continue
else:
try:
res = run_sql("SELECT id, name FROM jrnJOURNAL ORDER BY id")
for journal_id, journal_name in res:
journals.append({'journal_id': journal_id,
'journal_name': journal_name})
except OperationalError, e:
# Cannot connect to database. Try to read from cache
files = os.listdir("%s/webjournal" % CFG_CACHEDIR)
info_files = [path + os.sep + 'info.dat' for path in files if \
os.path.isdir(path) and \
os.path.exists(path + os.sep + 'info.dat')]
for info_file in info_files:
try:
journal_info_file = open(info_file, 'r')
journal_info = cPickle.load(journal_info_file)
journal_id = journal_info.get('journal_id', None)
journal_name = journal_info.get('journal_name', None)
current_issue = journal_info.get('current_issue', None)
if journal_id is not None and \
journal_name is not None:
journals.append({'journal_id': journal_id,
'journal_name': journal_name,
'current_issue': current_issue})
except cPickle.PickleError, e:
# Well, can't do anything...
continue
except IOError:
# Well, can't do anything...
continue
except ValueError:
continue
return journals
def parse_url_string(uri):
"""
Centralized function to parse any url string given in
webjournal. Useful to retrieve current category, journal,
etc. from within format elements
The webjournal interface handler should already have cleaned the
URI beforehand, so that the journal name exists, the issue number is
correct, etc. The only remaining problem might be due to the
capitalization of journal name in contact, search and popup pages,
so clean the journal name. Note that language is also as returned
from the URL, which might need to be filtered to match available
languages (WebJournal elements can rely on bfo.lang to retrieve
washed language)
returns:
args: all arguments in dict form
"""
args = {'journal_name' : '',
'issue_year' : '',
'issue_number' : None,
'issue' : None,
'category' : '',
'recid' : -1,
'verbose' : 0,
'ln' : CFG_SITE_LANG,
'archive_year' : None,
'archive_search': ''}
if not uri.startswith('/journal'):
# Mmh, incorrect context. Still, keep language if available
url_params = urlparse(uri)[4]
args['ln'] = dict([part.split('=') for part in url_params.split('&') \
if len(part.split('=')) == 2]).get('ln', CFG_SITE_LANG)
return args
# Take everything after journal and before first question mark
split_uri = uri.split('journal', 1)
second_part = split_uri[1]
split_uri = second_part.split('?')
uri_middle_part = split_uri[0]
uri_arguments = ''
if len(split_uri) > 1:
uri_arguments = split_uri[1]
arg_list = uri_arguments.split("&")
args['ln'] = CFG_SITE_LANG
args['verbose'] = 0
for arg_pair in arg_list:
arg_and_value = arg_pair.split('=')
if len(arg_and_value) == 2:
if arg_and_value[0] == 'ln':
args['ln'] = arg_and_value[1]
elif arg_and_value[0] == 'verbose' and \
arg_and_value[1].isdigit():
args['verbose'] = int(arg_and_value[1])
elif arg_and_value[0] == 'archive_year' and \
arg_and_value[1].isdigit():
args['archive_year'] = int(arg_and_value[1])
elif arg_and_value[0] == 'archive_search':
args['archive_search'] = arg_and_value[1]
elif arg_and_value[0] == 'name':
args['journal_name'] = guess_journal_name(args['ln'],
arg_and_value[1])
arg_list = uri_middle_part.split("/")
if len(arg_list) > 1 and arg_list[1] not in ['search', 'contact', 'popup']:
args['journal_name'] = urllib.unquote(arg_list[1])
elif arg_list[1] not in ['search', 'contact', 'popup']:
args['journal_name'] = guess_journal_name(args['ln'],
args['journal_name'])
cur_issue = get_current_issue(args['ln'], args['journal_name'])
if len(arg_list) > 2:
try:
args['issue_year'] = int(urllib.unquote(arg_list[2]))
except:
args['issue_year'] = int(cur_issue.split('/')[1])
else:
args['issue'] = cur_issue
args['issue_year'] = int(cur_issue.split('/')[1])
args['issue_number'] = int(cur_issue.split('/')[0])
if len(arg_list) > 3:
try:
args['issue_number'] = int(urllib.unquote(arg_list[3]))
except:
args['issue_number'] = int(cur_issue.split('/')[0])
args['issue'] = make_issue_number(args['journal_name'],
args['issue_number'],
args['issue_year'])
if len(arg_list) > 4:
args['category'] = urllib.unquote(arg_list[4])
if len(arg_list) > 5:
try:
args['recid'] = int(urllib.unquote(arg_list[5]))
except:
pass
args['ln'] = get_journal_preferred_language(args['journal_name'],
args['ln'])
# FIXME : wash arguments?
return args
def make_journal_url(current_uri, custom_parameters=None):
"""
Create a URL, using the current URI and overriding values
with the given custom_parameters
Parameters:
current_uri - *str* the current full URI
custom_parameters - *dict* a dictionary of parameters that
should override those of current_uri
"""
if not custom_parameters:
custom_parameters = {}
default_params = parse_url_string(current_uri)
for key, value in custom_parameters.iteritems():
# Override default params with custom params
default_params[key] = str(value)
uri = CFG_SITE_URL + '/journal/'
if default_params['journal_name']:
uri += urllib.quote(default_params['journal_name']) + '/'
if default_params['issue_year'] and default_params['issue_number']:
uri += make_issue_number(default_params['journal_name'],
default_params['issue_number'],
default_params['issue_year'],
for_url_p=True) + '/'
if default_params['category']:
uri += urllib.quote(default_params['category'])
if default_params['recid'] and \
default_params['recid'] != -1:
uri += '/' + str(default_params['recid'])
printed_question_mark = False
if default_params['ln']:
uri += '?ln=' + default_params['ln']
printed_question_mark = True
if default_params['verbose'] != 0:
if printed_question_mark:
uri += '&amp;verbose=' + str(default_params['verbose'])
else:
uri += '?verbose=' + str(default_params['verbose'])
return uri
############################ HTML CACHING FUNCTIONS ############################
def cache_index_page(html, journal_name, category, issue, ln):
"""
Caches the index page main area of a Bulletin
(right hand menu cannot be cached)
@return: tuple (path to cache file (or None), message)
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
category = category.replace(" ", "")
cache_path = os.path.abspath('%s/webjournal/%s/%s/%s/index_%s_%s.html' % \
(CFG_CACHEDIR, journal_name,
year, issue_number, category,
ln))
if not cache_path.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop caching
return (None, 'Trying to cache at wrong location: %s' % cache_path)
cache_path_dir = os.path.dirname(cache_path)
try:
if not os.path.isdir(cache_path_dir):
os.makedirs(cache_path_dir)
cached_file = open(cache_path, "w")
cached_file.write(html)
cached_file.close()
except Exception, e:
register_exception(req=None,
prefix="Could not store index page cache",
alert_admin=True)
return (None, e)
return (cache_path, '')
def get_index_page_from_cache(journal_name, category, issue, ln):
"""
Function to get an index page from the cache.
Returns False if not in cache.
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
category = category.replace(" ", "")
cache_path = os.path.abspath('%s/webjournal/%s/%s/%s/index_%s_%s.html' % \
(CFG_CACHEDIR, journal_name,
year, issue_number, category, ln))
if not cache_path.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop reading cache
return False
try:
cached_file = open(cache_path).read()
except:
return False
return cached_file
def cache_article_page(html, journal_name, category, recid, issue, ln):
"""
Caches an article view of a journal.
If cache cannot be written, a warning is reported to the admin.
@return: tuple (path to cache file (or None), message)
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
category = category.replace(" ", "")
cache_path = os.path.abspath('%s/webjournal/%s/%s/%s/article_%s_%s_%s.html' % \
(CFG_CACHEDIR, journal_name,
year, issue_number, category, recid, ln))
if not cache_path.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop caching
return (None, 'Trying to cache at wrong location: %s' % cache_path)
cache_path_dir = os.path.dirname(cache_path)
try:
if not os.path.isdir(cache_path_dir):
os.makedirs(cache_path_dir)
cached_file = open(cache_path, "w")
cached_file.write(html)
cached_file.close()
except Exception, e:
register_exception(req=None,
prefix="Could not store article cache",
alert_admin=True)
return (None, e)
    return (cache_path, '')
NOT_FOR_ALERT_COMMENTS_RE = re.compile(r'<!--\s*START_NOT_FOR_ALERT\s*-->.*?<!--\s*END_NOT_FOR_ALERT\s*-->', re.IGNORECASE | re.DOTALL)
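# Illustrative note (not part of the original module): the regular expression
# above matches placeholder spans of the form
#   <!-- START_NOT_FOR_ALERT --> ... <!-- END_NOT_FOR_ALERT -->
# On CERN sites, get_article_page_from_cache() below substitutes the first
# such span in a cached article with a freshly rendered toolbar, so the
# toolbar stays current even though the rest of the page is served from
# cache.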
def get_article_page_from_cache(journal_name, category, recid, issue, ln, bfo=None):
"""
Gets an article view of a journal from cache.
False if not in cache.
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
category = category.replace(" ", "")
cache_path = os.path.abspath('%s/webjournal/%s/%s/%s/article_%s_%s_%s.html' % \
(CFG_CACHEDIR, journal_name,
year, issue_number, category, recid, ln))
if not cache_path.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop reading cache
return False
try:
cached_file = open(cache_path).read()
except:
return False
if CFG_CERN_SITE and bfo:
try:
from invenio.modules.formatter.format_elements import bfe_webjournal_cern_toolbar
cached_file = NOT_FOR_ALERT_COMMENTS_RE.sub(bfe_webjournal_cern_toolbar.format_element(bfo), cached_file, 1)
except ImportError, e:
pass
return cached_file
def clear_cache_for_article(journal_name, category, recid, issue):
"""
Resets the cache for an article (e.g. after an article has been
modified)
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
category = category.replace(" ", "")
cache_path = os.path.abspath('%s/webjournal/%s/' %
(CFG_CACHEDIR, journal_name))
if not cache_path.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop deleting cache
return False
# try to delete the article cached file
try:
os.remove('%s/webjournal/%s/%s/%s/article_%s_%s_en.html' %
(CFG_CACHEDIR, journal_name, year, issue_number, category, recid))
except:
pass
try:
os.remove('%s/webjournal/%s/%s/%s/article_%s_%s_fr.html' %
(CFG_CACHEDIR, journal_name, year, issue_number, category, recid))
except:
pass
# delete the index page for the category
try:
os.remove('%s/webjournal/%s/%s/%s/index_%s_en.html'
% (CFG_CACHEDIR, journal_name, year, issue_number, category))
except:
pass
try:
os.remove('%s/webjournal/%s/%s/%s/index_%s_fr.html'
% (CFG_CACHEDIR, journal_name, year, issue_number, category))
except:
pass
try:
path = get_journal_article_cache_path(journal_name, issue)
os.remove(path)
except:
pass
return True
def clear_cache_for_issue(journal_name, issue):
"""
clears the cache of a whole issue.
"""
issue = issue.replace("/", "_")
issue_number, year = issue.split("_", 1)
cache_path_dir = os.path.abspath('%s/webjournal/%s/%s/%s/' % \
(CFG_CACHEDIR, journal_name,
year, issue_number))
if not cache_path_dir.startswith(CFG_CACHEDIR + '/webjournal'):
# Mmh, not accessing correct path. Stop deleting cache
return False
all_cached_files = os.listdir(cache_path_dir)
for cached_file in all_cached_files:
try:
os.remove(cache_path_dir + '/' + cached_file)
except:
return False
return True
######################### CERN SPECIFIC FUNCTIONS #################
def get_recid_from_legacy_number(issue_number, category, number):
"""
Returns the recid based on the issue number, category and
'number'.
This is used to support URLs using the now deprecated 'number'
argument. The function tries to reproduce the behaviour of the
old way of doing, even keeping some of its 'problems' (so that we
reach the same article as before with a given number)..
Returns the recid as int, or -1 if not found
"""
recids = []
if issue_number[0] == "0":
alternative_issue_number = issue_number[1:]
recids = list(search_pattern(p='65017a:"%s" and 773__n:%s' %
(category, issue_number)))
recids.extend(list(search_pattern(p='65017a:"%s" and 773__n:%s' %
(category, alternative_issue_number))))
else:
        recids = list(search_pattern(p='65017a:"%s" and 773__n:%s' %
                                     (category, issue_number)))
# Now must order the records and pick the one at index 'number'.
# But we have to take into account that there can be multiple
# records at position 1, and that these additional records should
# be numbered with negative numbers:
# 1, 1, 1, 2, 3 -> 1, -1, -2, 2, 3...
negative_index_records = {}
positive_index_records = {}
    # Fill in the 'negative_index_records' and 'positive_index_records'
    # dictionaries with the following loop
for recid in recids:
bfo = BibFormatObject(recid)
order = [subfield['c'] for subfield in bfo.fields('773__') if \
issue_number in subfield.get('n', '')]
if len(order) > 0:
# If several orders are defined for the same article and
# the same issue, keep the first one
order = order[0]
if order.isdigit():
# Order must be an int. Otherwise skip
order = int(order)
if order == 1 and positive_index_records.has_key(1):
# This is then a negative number for this record
index = (len(negative_index_records.keys()) > 0 and \
min(negative_index_records.keys()) -1) or 0
negative_index_records[index] = recid
else:
# Positive number for this record
if not positive_index_records.has_key(order):
positive_index_records[order] = recid
else:
# We make the assumption that we cannot have
# twice the same position for two
# articles. Previous WebJournal module was not
# clear about that. Just drop this record
# (better than crashing or looping forever..)
pass
recid_to_return = -1
# Ok, we can finally pick the recid corresponding to 'number'
if number <= 0:
negative_indexes = negative_index_records.keys()
negative_indexes.sort()
negative_indexes.reverse()
if len(negative_indexes) > abs(number):
recid_to_return = negative_index_records[negative_indexes[abs(number)]]
else:
if positive_index_records.has_key(number):
recid_to_return = positive_index_records[number]
return recid_to_return
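# Illustrative note (not part of the original module): with hypothetical
# records r1..r5 whose 773__c order values are [1, 1, 1, 2, 3], the loop
# above yields positive_index_records = {1: r1, 2: r4, 3: r5} and
# negative_index_records = {0: r2, -1: r3}. A positive 'number' is looked up
# directly (number=2 gives r4), while number<=0 indexes the sorted, reversed
# negative keys: number=0 gives r2 and number=-1 gives r3, reproducing the
# legacy 1, -1, -2, 2, 3 numbering described in the comment above.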
def is_recid_in_released_issue(recid):
"""
Returns True if recid is part of the latest issue of the given
journal.
WARNING: the function does not check that the article does not
belong to the draft collection of the record. This is wanted, in
order to workaround the time needed for a record to go from the
draft collection to the final collection
"""
bfo = BibFormatObject(recid)
journal_name = ''
journal_names = [journal_name for journal_name in bfo.fields('773__t') if journal_name]
if journal_names:
journal_name = journal_names[0]
else:
return False
existing_journal_names = [o['journal_name'] for o in get_journals_ids_and_names()]
if not journal_name in existing_journal_names:
# Try to remove whitespace
journal_name = journal_name.replace(' ', '')
if not journal_name in existing_journal_names:
# Journal name unknown from WebJournal
return False
config_strings = get_xml_from_config(["draft_image_access_policy"], journal_name)
if config_strings['draft_image_access_policy'] and \
config_strings['draft_image_access_policy'][0] != 'allow':
# The journal does not want to optimize access to images
return False
article_issues = bfo.fields('773__n')
current_issue = get_current_issue(CFG_SITE_LANG, journal_name)
for article_issue in article_issues:
# Check each issue until a released one is found
if get_release_datetime(article_issue, journal_name):
# Release date exists, issue has been released
return True
else:
# Unreleased issue. Do we still allow based on journal config?
unreleased_issues_mode = get_unreleased_issue_hiding_mode(journal_name)
if (unreleased_issues_mode == 'none' or \
(unreleased_issues_mode == 'future' and \
not issue_is_later_than(article_issue, current_issue))):
return True
return False
diff --git a/invenio/legacy/webjournal/webinterface.py b/invenio/legacy/webjournal/webinterface.py
index 7934086b6..4e58aadd5 100644
--- a/invenio/legacy/webjournal/webinterface.py
+++ b/invenio/legacy/webjournal/webinterface.py
@@ -1,552 +1,552 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebJournal Web Interface."""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
import urllib
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.modules.access.engine import acc_authorize_action
from invenio.config import \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_SITE_LANG, \
CFG_CERN_SITE
from invenio.legacy.webuser import getUid
from invenio.utils.url import redirect_to_url
from invenio.ext.logging import register_exception
from invenio.webjournal_config import \
InvenioWebJournalNoJournalOnServerError, \
InvenioWebJournalNoNameError, \
InvenioWebJournalNoCurrentIssueError, \
InvenioWebJournalIssueNumberBadlyFormedError, \
InvenioWebJournalArchiveDateWronglyFormedError, \
InvenioWebJournalJournalIdNotFoundDBError, \
InvenioWebJournalNoArticleNumberError, \
InvenioWebJournalNoPopupRecordError, \
InvenioWebJournalIssueNotFoundDBError, \
InvenioWebJournalNoCategoryError
from invenio.webjournal_utils import \
get_current_issue, \
get_recid_from_legacy_number, \
get_journal_categories
from invenio.webjournal_washer import \
wash_category, \
wash_issue_number, \
wash_journal_name, \
wash_journal_language, \
wash_article_number, \
wash_popup_record, \
wash_archive_date
from invenio.webjournal import \
perform_request_index, \
perform_request_article, \
perform_request_contact, \
perform_request_popup, \
perform_request_search
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
import invenio.legacy.template
webjournal_templates = invenio.legacy.template.load('webjournal')
class WebInterfaceJournalPages(WebInterfaceDirectory):
"""Defines the set of /journal pages."""
journal_name = None
journal_issue_year = None
journal_issue_number = None
category = None
article_id = None
_exports = ['popup', 'search', 'contact']
def _lookup(self, component, path):
""" This handler is invoked for the dynamic URLs """
if component in ['article', 'issue_control', 'edit_article', 'alert',
'feature_record', 'regenerate', 'administrate'] and \
CFG_CERN_SITE:
return WebInterfaceJournalPagesLegacy(), [component]
return self, []
def __call__(self, req, form):
""" Maybe resolve the final / of a directory """
path = req.uri[1:].split('/')
journal_name = None
journal_issue_year = None
journal_issue_number = None
specific_category = None
category = None
article_id = None
if len(path) > 1:
journal_name = path[1]
if len(path) > 2 and path[2].isdigit():
journal_issue_year = path[2]
elif len(path) > 2 and not path[2].isdigit():
specific_category = urllib.unquote(path[2])
if len(path) > 3 and path[3].isdigit():
journal_issue_number = path[3]
if len(path) > 4:
category = urllib.unquote(path[4])
if len(path) > 5 and path[5].isdigit():
article_id = int(path[5])
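        # Illustrative note (not part of the original module): for a
        # hypothetical request URI such as
        #   /journal/AtlantisTimes/2009/03/News/5
        # the path components above resolve to journal_name='AtlantisTimes',
        # journal_issue_year='2009', journal_issue_number='03',
        # category='News' and article_id=5.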
        ## Support for legacy journal/[empty]?(args*) URLs. These
        ## parameters are present only in that case
argd = wash_urlargd(form, {'name': (str, ""),
'issue': (str, ""),
'category': (str, ""),
'ln': (str, CFG_SITE_LANG),
'number': (int, None),
'verbose': (int, 0)}
)
if 'name' in form.keys() or \
'issue' in form.keys() or \
'category' in form.keys():
ln = wash_journal_language(argd['ln'])
try:
journal_name = wash_journal_name(ln, argd['name'])
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
try:
issue = wash_issue_number(ln, journal_name,
argd['issue'])
issue_year = issue.split('/')[1]
issue_number = issue.split('/')[0]
except InvenioWebJournalIssueNumberBadlyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
category = wash_category(ln, argd['category'], journal_name, issue).replace(' ', '%20')
redirect_to_url(req, CFG_SITE_URL + '/journal/%(name)s/%(issue_year)s/%(issue_number)s/%(category)s/?ln=%(ln)s' % \
{'name': journal_name,
'issue_year': issue_year,
'issue_number': issue_number,
'category': category,
'ln': ln})
## End support for legacy urls
# Check that given journal name exists and that it is written
# with correct casing.
redirect_p = False
try:
washed_journal_name = wash_journal_name(argd['ln'], journal_name)
if washed_journal_name != journal_name:
redirect_p = True
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
# If some parameters are missing, deduce them and
# redirect
if not journal_issue_year or \
not journal_issue_number or \
not category or \
redirect_p or \
specific_category:
if not journal_issue_year or not journal_issue_number:
journal_issue = get_current_issue(argd['ln'], washed_journal_name)
journal_issue_year = journal_issue.split('/')[1]
journal_issue_number = journal_issue.split('/')[0]
if not category or specific_category:
categories = get_journal_categories(washed_journal_name,
journal_issue_number + \
'/' + journal_issue_year)
if not categories:
# Mmh, it seems that this issue has no
# category. Ok get all of them regardless of the
# issue
categories = get_journal_categories(washed_journal_name)
if not categories:
# Mmh we really have no category!
try:
raise InvenioWebJournalIssueNotFoundDBError(argd['ln'],
journal_name,
'')
except InvenioWebJournalIssueNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
if not category:
category = categories[0].replace(' ', '%20')
if specific_category:
category = specific_category.replace(' ', '%20')
redirect_to_url(req, CFG_SITE_URL + '/journal/%(name)s/%(issue_year)s/%(issue_number)s/%(category)s/?ln=%(ln)s' % \
{'name': washed_journal_name,
'issue_year': journal_issue_year,
'issue_number': journal_issue_number,
'category': category,
'ln': argd['ln']})
journal_issue = ""
if journal_issue_year is not None and \
journal_issue_number is not None:
journal_issue = journal_issue_number + '/' + \
journal_issue_year
try:
journal_name = washed_journal_name
issue = wash_issue_number(argd['ln'], journal_name, journal_issue)
category = wash_category(argd['ln'], category, journal_name, issue)
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalIssueNumberBadlyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoCategoryError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
editor = False
if acc_authorize_action(getUid(req), 'cfgwebjournal',
name="%s" % journal_name)[0] == 0:
editor = True
if article_id is None:
html = perform_request_index(req,
journal_name,
journal_issue,
argd['ln'],
category,
editor,
verbose=argd['verbose'])
else:
html = perform_request_article(req,
journal_name,
journal_issue,
argd['ln'],
category,
article_id,
editor,
verbose=argd['verbose'])
# register event in webstat
try:
register_customevent("journals", ["display", journal_name, journal_issue, category, argd['ln'], article_id])
except:
register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
return html
def contact(self, req, form):
"""
Display contact information for the journal.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'ln': (str, ""),
'verbose': (int, 0)
})
try:
ln = wash_journal_language(argd['ln'])
washed_journal_name = wash_journal_name(ln, argd['name'])
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
html = perform_request_contact(req, ln, washed_journal_name,
verbose=argd['verbose'])
return html
def popup(self, req, form):
"""
        Simple pass-through function that serves as a checker for popups.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'record': (str, ""),
'ln': (str, "")
})
try:
ln = wash_journal_language(argd['ln'])
washed_journal_name = wash_journal_name(ln, argd['name'])
record = wash_popup_record(ln, argd['record'], washed_journal_name)
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalNoPopupRecordError, e:
register_exception(req=req)
return e.user_box(req)
html = perform_request_popup(req, ln, washed_journal_name, record)
return html
def search(self, req, form):
"""
Display search interface
"""
argd = wash_urlargd(form, {'name': (str, ""),
'issue': (str, ""),
'archive_year': (str, ""),
'archive_issue': (str, ""),
'archive_select': (str, "False"),
'archive_date': (str, ""),
'archive_search': (str, "False"),
'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0)})
try:
# FIXME: if journal_name is empty, redirect
ln = wash_journal_language(argd['ln'])
washed_journal_name = wash_journal_name(ln, argd['name'])
archive_issue = wash_issue_number(ln, washed_journal_name,
argd['archive_issue'])
archive_date = wash_archive_date(ln, washed_journal_name,
argd['archive_date'])
archive_select = argd['archive_select']
archive_search = argd['archive_search']
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalNoCurrentIssueError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalIssueNumberBadlyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalArchiveDateWronglyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
html = perform_request_search(req=req,
journal_name=washed_journal_name,
ln=ln,
archive_issue=archive_issue,
archive_select=archive_select,
archive_date=archive_date,
archive_search=archive_search,
verbose=argd['verbose'])
return html
index = __call__
class WebInterfaceJournalPagesLegacy(WebInterfaceDirectory):
"""Defines the set of /journal pages."""
_exports = ['', 'article', 'issue_control', 'edit_article', 'alert',
'feature_record', 'regenerate', 'administrate']
def index(self, req, form):
"""
Index page.
Washes all the parameters and stores them in journal_defaults dict
for subsequent format_elements.
Passes on to logic function and eventually returns HTML.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'issue': (str, ""),
'category': (str, ""),
'ln': (str, "")}
)
try:
ln = wash_journal_language(argd['ln'])
journal_name = wash_journal_name(ln, argd['name'])
issue_number = wash_issue_number(ln, journal_name,
argd['issue'])
category = wash_category(ln, argd['category'], journal_name, issue_number)
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalNoCurrentIssueError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalIssueNumberBadlyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoCategoryError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
# the journal_defaults will be used by format elements that have no
# direct access to the params here, no more checking needed
req.journal_defaults = {"name": journal_name,
"issue": issue_number,
"ln": ln,
"category": category}
html = perform_request_index(req, journal_name, issue_number, ln,
category)
return html
def article(self, req, form):
"""
Article page.
Washes all the parameters and stores them in journal_defaults dict
for subsequent format_elements.
Passes on to logic function and eventually returns HTML.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'issue': (str, ""),
'category': (str, ""),
'number': (str, ""),
'ln': (str, ""),
}
)
try:
ln = wash_journal_language(argd['ln'])
journal_name = wash_journal_name(ln, argd['name'])
issue = wash_issue_number(ln, journal_name,
argd['issue'])
issue_year = issue.split('/')[1]
issue_number = issue.split('/')[0]
category = wash_category(ln, argd['category'], journal_name, issue_number)
number = wash_article_number(ln, argd['number'], journal_name)
recid = get_recid_from_legacy_number(issue, category, int(number))
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
except InvenioWebJournalNoCurrentIssueError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalIssueNumberBadlyFormedError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoArticleNumberError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoCategoryError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalJournalIdNotFoundDBError, e:
register_exception(req=req)
return e.user_box(req)
if recid != -1:
# Found a corresponding record
redirect_to_url(req, CFG_SITE_URL + \
'/journal/' + journal_name + '/' + issue_year + \
'/' + issue_number + '/' + category + \
'/' + str(recid) + '?ln=' + ln)
else:
# Corresponding record not found. Display index
redirect_to_url(req, CFG_SITE_URL + \
'/journal/' + journal_name + '/' + issue_year + \
'/' + issue_number + '/' + category + \
'?ln=' + ln)
def administrate(self, req, form):
"""Index page."""
argd = wash_urlargd(form, {'name': (str, ""),
'ln': (str, "")
})
try:
ln = wash_journal_language(argd['ln'])
journal_name = wash_journal_name(ln, argd['name'])
except InvenioWebJournalNoJournalOnServerError, e:
register_exception(req=req)
return e.user_box(req)
except InvenioWebJournalNoNameError, e:
return e.user_box(req)
redirect_to_url(req, CFG_SITE_SECURE_URL + \
'/admin/webjournal/webjournaladmin.py/administrate?journal_name=' + \
journal_name + '&ln=' + ln)
def feature_record(self, req, form):
"""
Interface to feature a record. Will be saved in a flat file.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'recid': (str, "init"),
'url': (str, "init"),
'ln': (str, "")})
redirect_to_url(req, CFG_SITE_SECURE_URL + \
'/admin/webjournal/webjournaladmin.py/feature_record?journal_name=' + \
argd['name'] + '&ln=' + argd['ln'] + '&recid='+ argd['recid'] + '&url='+ argd['url'])
def regenerate(self, req, form):
"""
Clears the cache for the issue given.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'issue': (str, ""),
'ln': (str, "")})
redirect_to_url(req, CFG_SITE_SECURE_URL + \
'/admin/webjournal/webjournaladmin.py/regenerate?journal_name=' + \
argd['name'] + '&ln=' + argd['ln'] + '&issue=' + argd['issue'])
def alert(self, req, form):
"""
Alert system.
Sends an email alert, in HTML/PlainText or only PlainText to a mailing
list to alert for new journal releases.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'sent': (str, "False"),
'plainText': (str, ''),
'htmlMail': (str, ""),
'recipients': (str, ""),
'subject': (str, ""),
'ln': (str, ""),
'issue': (str, ""),
'force': (str, "False")})
redirect_to_url(req, CFG_SITE_SECURE_URL + \
'/admin/webjournal/webjournaladmin.py/alert?journal_name=' + \
argd['name'] + '&ln=' + argd['ln'] + '&issue=' + argd['issue'] + \
'&sent=' + argd['sent'] + '&plainText=' + argd['plainText'] + \
'&htmlMail=' + argd['htmlMail'] + '&recipients=' + argd['recipients'] + \
'&force=' + argd['force'] + '&subject=' + argd['subject'])
def issue_control(self, req, form):
"""
        Page that allows full control over creating, backtracking,
        adding to, and removing from issues.
"""
argd = wash_urlargd(form, {'name': (str, ""),
'add': (str, ""),
'action_publish': (str, "cfg"),
'issue_number': (list, []),
'ln': (str, "")})
redirect_to_url(req, CFG_SITE_SECURE_URL + \
'/admin/webjournal/webjournaladmin.py/issue_control?journal_name=' + \
argd['name'] + '&ln=' + argd['ln'] + '&issue=' + argd['issue_number'] + \
'&action=' + argd['action_publish'])
diff --git a/invenio/legacy/weblinkback/adminlib.py b/invenio/legacy/weblinkback/adminlib.py
index a9aab1094..3a40f8b18 100644
--- a/invenio/legacy/weblinkback/adminlib.py
+++ b/invenio/legacy/weblinkback/adminlib.py
@@ -1,227 +1,227 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Administrative Lib"""
from invenio.config import CFG_SITE_LANG, CFG_SITE_URL
from invenio.utils.url import wash_url_argument
from invenio.base.i18n import gettext_set_language, wash_language
from invenio.legacy.webuser import collect_user_info
-from invenio.weblinkback_dblayer import get_all_linkbacks, \
+from invenio.legacy.weblinkback.db_layer import get_all_linkbacks, \
approve_linkback,\
reject_linkback, \
remove_url, \
add_url_to_list, \
url_exists, \
get_urls,\
get_url_title
-from invenio.weblinkback_config import CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME, \
CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION, \
CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_ACTION_RETURN_CODE
from invenio.legacy.bibrank.adminlib import addadminbox, \
tupletotable
from invenio.utils.date import convert_datetext_to_dategui
from invenio.modules.formatter import format_record
import cgi
import urllib
import invenio.legacy.template
weblinkback_templates = invenio.legacy.template.load('weblinkback')
def get_navtrail(previous = '', ln=CFG_SITE_LANG):
"""Get the navtrail"""
previous = wash_url_argument(previous, 'str')
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail = """<a class="navtrail" href="%s/help/admin">%s</a> """ % (CFG_SITE_URL, _("Admin Area"))
navtrail = navtrail + previous
return navtrail
def perform_request_index(ln=CFG_SITE_LANG):
"""
Display main admin page
"""
return weblinkback_templates.tmpl_admin_index(ln)
def perform_request_display_list(return_code, url_field_value, ln=CFG_SITE_LANG):
"""
Display a list
@param return_code: might indicate errors from a previous action, of CFG_WEBLINKBACK_ACTION_RETURN_CODE
@param url_field_value: value of the url text field
"""
_ = gettext_set_language(ln)
urls = get_urls()
entries = []
for url in urls:
entries.append(('<a href="%s">%s</a>' % (cgi.escape(url[0]), cgi.escape(url[0])),
url[1].lower(),
'<a href="moderatelist?url=%s&action=%s&ln=%s">%s</a>' % (urllib.quote(url[0]), CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['DELETE'], ln, CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['DELETE'].lower())))
header = ['URL', 'List', '']
error_message = ""
if return_code != CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK']:
error_message = _("Unknown error")
if return_code == CFG_WEBLINKBACK_ACTION_RETURN_CODE['DUPLICATE']:
error_message = _("The URL already exists in one of the lists")
elif return_code == CFG_WEBLINKBACK_ACTION_RETURN_CODE['INVALID_ACTION']:
error_message = _("Invalid action")
elif return_code == CFG_WEBLINKBACK_ACTION_RETURN_CODE['BAD_INPUT']:
error_message = _("Invalid URL, might contain spaces")
error_message_html = ""
if error_message != "":
error_message_html = "<dt><b><font color=red>" + error_message + "</font></b></dt>" + "<br>"
out = """
<dl>
%(error_message)s
<dt>%(whitelist)s</dt>
<dd>%(whitelistText)s</dd>
<dt>%(blacklist)s</dt>
<dd>%(blacklistText)s</dd>
<dt>%(explanation)s</dt>
</dl>
<table class="admin_wvar" cellspacing="0">
<tr><td>
<form action='moderatelist'>
URL:
<input type="text" name="url" value="%(url)s" />
<input type="hidden" name="action" value="%(action)s" />
<select name="listtype" size="1">
<option value=whitelist>whitelist</option>
<option value=blacklist>blacklist</option>
</select>
<input type="submit" class="adminbutton" value="%(buttonText)s">
</form>
</td></tr></table>
""" % {'whitelist': _('Whitelist'),
'whitelistText': _('linkback requests from these URLs will be approved automatically.'),
'blacklist': _('Blacklist'),
'blacklistText': _('linkback requests from these URLs will be refused automatically, no data will be saved.'),
'explanation': _('All URLs in these lists are checked for containment (infix) in any linkback request URL. A whitelist match has precedence over a blacklist match.'),
'url': cgi.escape(url_field_value),
'action': CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['INSERT'],
'buttonText': _('Add URL'),
'error_message': error_message_html}
if entries:
out += tupletotable(header=header, tuple=entries, highlight_rows_p=True,
alternate_row_colors_p=True)
else:
out += "<i>%s</i>" % _('There are no URLs in both lists.')
return addadminbox('<b>%s</b>'% _("Reduce the amount of future pending linkback requests"), [out])
def perform_moderate_url(req, url, action, list_type):
"""
Perform a url action
    @param url: the url to insert or delete
    @param action: CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['INSERT'] or CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['DELETE']
    @param list_type: of CFG_WEBLINKBACK_LIST_TYPE
    @return: (string, CFG_WEBLINKBACK_ACTION_RETURN_CODE); the string is the url if the return code is CFG_WEBLINKBACK_ACTION_RETURN_CODE['BAD_INPUT']
"""
if url == '' or ' ' in url:
return (url, CFG_WEBLINKBACK_ACTION_RETURN_CODE['BAD_INPUT'])
elif action == CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['INSERT']:
if url_exists(url):
return ('', CFG_WEBLINKBACK_ACTION_RETURN_CODE['DUPLICATE'])
else:
add_url_to_list(url, list_type, collect_user_info(req))
elif action == CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['DELETE']:
remove_url(url)
else:
return ('', CFG_WEBLINKBACK_ACTION_RETURN_CODE['INVALID_ACTION'])
return ('', CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK'])
def perform_request_display_linkbacks(status, return_code, ln=CFG_SITE_LANG):
"""
Display linkbacks
@param status: of CFG_WEBLINKBACK_STATUS, currently only CFG_WEBLINKBACK_STATUS['PENDING'] is supported
"""
_ = gettext_set_language(ln)
if status == CFG_WEBLINKBACK_STATUS['PENDING']:
linkbacks = get_all_linkbacks(status=status, order=CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME['DESC'])
entries = []
for (linkbackid, origin_url, recid, additional_properties, linkback_type, linkback_status, insert_time) in linkbacks: # pylint: disable=W0612
moderation_prefix = '<a href="moderatelinkback?action=%%s&linkbackid=%s&ln=%s">%%s</a>' % (linkbackid, ln)
entries.append((linkback_type,
format_record(recID=recid, of='hs', ln=ln),
'<a href="%s">%s</a>' % (cgi.escape(origin_url), cgi.escape(get_url_title(origin_url))),
convert_datetext_to_dategui(str(insert_time)),
moderation_prefix % (CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['APPROVE'], 'Approve') + " / " + moderation_prefix % (CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['REJECT'], 'Reject')))
header = ['Linkback type', 'Record', 'Origin', 'Submitted on', '']
error_message = ""
if return_code != CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK']:
error_message = _("Unknown error")
if return_code == CFG_WEBLINKBACK_ACTION_RETURN_CODE['INVALID_ACTION']:
error_message = _("Invalid action")
error_message_html = ""
if error_message != "":
error_message_html = "<dt><b><font color=red>" + error_message + "</font></b></dt>" + "<br>"
out = """
<dl>
%(error_message)s
<dt>%(heading)s</dt>
<dd>%(description)s</dd>
</dl>
""" % {'heading': _("Pending linkbacks"),
'description': _("These linkbacks are not visible to users; they must be approved or rejected."),
'error_message': error_message_html}
if entries:
out += tupletotable(header=header, tuple=entries, highlight_rows_p=True,
alternate_row_colors_p=True)
else:
out += "<i>There are no %s linkbacks.</i>" % status.lower()
return addadminbox('<b>%s</b>'% _("Reduce the amount of currently pending linkback requests"), [out])
else:
return "<i>%s</i>" % _('Currently only pending linkbacks are supported.')
def perform_moderate_linkback(req, linkbackid, action):
"""
Moderate linkbacks
@param linkbackid: linkback id
@param action: of CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION
@return CFG_WEBLINKBACK_ACTION_RETURN_CODE
"""
if action == CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['APPROVE']:
approve_linkback(linkbackid, collect_user_info(req))
elif action == CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION['REJECT']:
reject_linkback(linkbackid, collect_user_info(req))
else:
return CFG_WEBLINKBACK_ACTION_RETURN_CODE['INVALID_ACTION']
return CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK']
diff --git a/invenio/legacy/weblinkback/api.py b/invenio/legacy/weblinkback/api.py
index 2bcac0362..aff6a843f 100644
--- a/invenio/legacy/weblinkback/api.py
+++ b/invenio/legacy/weblinkback/api.py
@@ -1,334 +1,334 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Handling Linkbacks"""
from invenio.config import CFG_SITE_URL, \
CFG_SITE_RECORD, \
CFG_SITE_ADMIN_EMAIL, \
CFG_SITE_LANG
-from invenio.weblinkback_config import CFG_WEBLINKBACK_TYPE, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_TYPE, \
CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME, \
CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME, \
CFG_WEBLINKBACK_LIST_TYPE, \
CFG_WEBLINKBACK_TRACKBACK_SUBSCRIPTION_ERROR_MESSAGE, \
CFG_WEBLINKBACK_PAGE_TITLE_STATUS, \
CFG_WEBLINKBACK_BROKEN_COUNT, \
CFG_WEBLINKBACK_LATEST_FACTOR, \
CFG_WEBLINKBACK_MAX_LINKBACKS_IN_EMAIL
-from invenio.weblinkback_dblayer import create_linkback, \
+from invenio.legacy.weblinkback.db_layer import create_linkback, \
get_url_list, \
get_all_linkbacks, \
get_approved_latest_added_linkbacks, \
approve_linkback, \
get_urls_and_titles, \
update_url_title, \
set_url_broken, \
increment_broken_count, \
remove_linkback
from invenio.legacy.search_engine import check_user_can_view_record, \
guess_primary_collection_of_a_record
from invenio.modules.access.engine import acc_authorize_action, \
acc_get_authorized_emails
from invenio.legacy.webuser import collect_user_info
from invenio.ext.email import send_email
from invenio.utils.url import get_title_of_page
def check_user_can_view_linkbacks(user_info, recid):
"""
Check if the user is authorized to view linkbacks for a given recid.
Returns the same type as acc_authorize_action
"""
# check if the user can view the record itself
(auth_code, auth_msg) = check_user_can_view_record(user_info, recid)
if auth_code:
return (auth_code, auth_msg)
# check if user can view the linkbacks
record_primary_collection = guess_primary_collection_of_a_record(recid)
return acc_authorize_action(user_info, 'viewlinkbacks', authorized_if_no_roles=True, collection=record_primary_collection)
def generate_redirect_url(recid, ln=CFG_SITE_LANG, action = None):
"""
Get redirect URL for an action
@param action: the action, must be defined in weblinkback_webinterface.py
@return "CFG_SITE_URL/CFG_SITE_RECORD/recid/linkbacks/action?ln=%s" if action != None,
otherwise CFG_SITE_URL/CFG_SITE_RECORD/recid/linkbacks?ln=%s
"""
result = "%s/%s/%s/linkbacks" % (CFG_SITE_URL, CFG_SITE_RECORD, recid)
if action != None:
return result + "/%s?ln=%s" % (action, ln)
else:
return result + "?ln=%s" % ln
def split_in_days(linkbacks):
"""
Split linkbacks in days
@param linkbacks: a list of this format: [(linkback_id,
origin_url,
recid,
additional_properties,
type,
status,
insert_time)]
in ascending or descending order by insert_time
@return a list of lists of linkbacks
"""
result = []
same_day_list = []
previous_date = None
current_date = None
for i in range(len(linkbacks)):
current_linkback = linkbacks[i]
previous_date = None
if i > 0:
previous_date = current_date
else:
previous_date = current_linkback[6]
current_date = current_linkback[6]
# same day --> same group
if (current_date.year == previous_date.year and
current_date.month == previous_date.month and
current_date.day == previous_date.day):
same_day_list.append(current_linkback)
else:
result.append(same_day_list)
same_day_list = []
same_day_list.append(current_linkback)
# add last group if non-empty
if same_day_list:
result.append(same_day_list)
return result
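The day-grouping performed by split_in_days can be exercised standalone; the sketch below reimplements the same idea, and the 7-tuples are dummies with only the insert_time slot (index 6) populated meaningfully:

```python
from datetime import datetime

def group_by_day(linkbacks):
    # Same idea as split_in_days: consecutive entries sharing the
    # year/month/day of insert_time (index 6) land in one sublist.
    result, same_day = [], []
    previous = None
    for lb in linkbacks:
        current = lb[6]
        if previous is not None and \
           (current.year, current.month, current.day) != \
           (previous.year, previous.month, previous.day):
            result.append(same_day)
            same_day = []
        same_day.append(lb)
        previous = current
    if same_day:
        result.append(same_day)
    return result

def dummy(day):
    # Dummy linkback tuple; only insert_time (index 6) matters here.
    return (0, 'http://blog.example/p', 1, '', 'TRACKBACK', 'APPROVED',
            datetime(2011, 6, day))

groups = group_by_day([dummy(1), dummy(1), dummy(2)])
```
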
def create_trackback(recid, url, title, excerpt, blog_name, blog_id, source, user_info):
"""
Create a trackback
@param recid
"""
# copy optional arguments
argument_copy = {}
if title != CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME:
argument_copy['title'] = title
if excerpt != CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME:
argument_copy['excerpt'] = excerpt
if blog_name != CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME:
argument_copy['blog_name'] = blog_name
if blog_id != CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME:
argument_copy['id'] = blog_id
if source != CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME:
argument_copy['source'] = source
additional_properties = ""
if len(argument_copy) > 0:
additional_properties = argument_copy
return create_linkback(url, recid, additional_properties, CFG_WEBLINKBACK_TYPE['TRACKBACK'], user_info)
def send_pending_linkbacks_notification(linkback_type):
"""
Send notification emails to all linkback moderators for all pending linkbacks
@param linkback_type: of CFG_WEBLINKBACK_TYPE
"""
pending_linkbacks = get_all_linkbacks(linkback_type=CFG_WEBLINKBACK_TYPE['TRACKBACK'], status=CFG_WEBLINKBACK_STATUS['PENDING'])
if pending_linkbacks:
pending_count = len(pending_linkbacks)
cutoff_text = ''
if pending_count > CFG_WEBLINKBACK_MAX_LINKBACKS_IN_EMAIL:
cutoff_text = ' (Printing only the first %s requests)' % CFG_WEBLINKBACK_MAX_LINKBACKS_IN_EMAIL
content = """There are %(count)s new %(linkback_type)s requests which you should approve or reject%(cutoff)s:
""" % {'count': pending_count,
'linkback_type': linkback_type,
'cutoff': cutoff_text}
for pending_linkback in pending_linkbacks[0:CFG_WEBLINKBACK_MAX_LINKBACKS_IN_EMAIL]:
content += """
For %(recordURL)s from %(origin_url)s.
""" % {'recordURL': generate_redirect_url(pending_linkback[2]),
'origin_url': pending_linkback[1]}
for email in acc_get_authorized_emails('moderatelinkbacks'):
send_email(CFG_SITE_ADMIN_EMAIL, email, 'Pending ' + linkback_type + ' requests', content)
def infix_exists_for_url_in_list(url, list_type):
"""
Check whether any URL in a list is contained (as an infix) in the given URL
@param url: URL to test
@param list_type: of CFG_WEBLINKBACK_LIST_TYPE
@return True or False
"""
urls = get_url_list(list_type)
for current_url in urls:
if current_url in url:
return True
return False
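The containment semantics are worth noting: each configured list entry is tested as a substring (infix) of the incoming URL, not the other way round. A minimal sketch, with list contents hard-coded purely for illustration:

```python
def infix_in_list(url, url_list):
    # True if any configured entry occurs anywhere inside the URL,
    # matching infix_exists_for_url_in_list's containment check.
    return any(entry in url for entry in url_list)

blacklist = ['spam-domain.example']
```
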
def get_latest_linkbacks_to_accessible_records(rg, linkbacks, user_info):
result = []
for linkback in linkbacks:
(auth_code, auth_msg) = check_user_can_view_record(user_info, linkback[2]) # pylint: disable=W0612
if not auth_code:
result.append(linkback)
if len(result) == rg:
break
return result
def perform_request_display_record_linbacks(req, recid, show_admin, weblinkback_templates, ln): # pylint: disable=W0613
"""
Display linkbacks of a record
@param recid
@param show_admin: if True, show admin parts to approve/reject pending linkback requests
@param weblinkback_templates: template object reference
"""
out = weblinkback_templates.tmpl_linkbacks_general(recid=recid,
ln=ln)
if show_admin:
pending_linkbacks = get_all_linkbacks(recid, CFG_WEBLINKBACK_STATUS['PENDING'], CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME['DESC'])
out += weblinkback_templates.tmpl_linkbacks_admin(pending_linkbacks=pending_linkbacks,
recid=recid,
ln=ln)
approved_linkbacks = get_all_linkbacks(recid, CFG_WEBLINKBACK_STATUS['APPROVED'], CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME['DESC'])
out += weblinkback_templates.tmpl_linkbacks(approved_linkbacks=approved_linkbacks,
ln=ln)
return out
def perform_request_display_approved_latest_added_linkbacks_to_accessible_records(rg, ln, user_info, weblinkback_templates):
"""
Display approved latest added linkbacks to accessible records
@param rg: count of linkbacks to display
@param weblinkback_templates: template object reference
"""
latest_linkbacks = get_approved_latest_added_linkbacks(rg * CFG_WEBLINKBACK_LATEST_FACTOR)
latest_linkbacks = get_latest_linkbacks_to_accessible_records(rg, latest_linkbacks, user_info)
latest_linkbacks_in_days = split_in_days(latest_linkbacks)
out = weblinkback_templates.tmpl_get_latest_linkbacks_top(rg, ln)
out += '<br>'
out += weblinkback_templates.tmpl_get_latest_linkbacks(latest_linkbacks_in_days, ln)
return out
def perform_sendtrackback(recid, url, title, excerpt, blog_name, blog_id, source, current_user):
"""
Send trackback
@param recid: recid
"""
# assume unsuccessful request
status = 400
xml_response = '<response>'
xml_error_response = """<error>1</error>
<message>%s</message>
"""
blacklist_match = infix_exists_for_url_in_list(url, CFG_WEBLINKBACK_LIST_TYPE['BLACKLIST'])
whitelist_match = infix_exists_for_url_in_list(url, CFG_WEBLINKBACK_LIST_TYPE['WHITELIST'])
# faulty request, url argument not set
if url in (CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME, None, ''):
xml_response += xml_error_response % CFG_WEBLINKBACK_TRACKBACK_SUBSCRIPTION_ERROR_MESSAGE['BAD_ARGUMENT']
# request refused: whitelist match has precedence over blacklist match
elif blacklist_match and not whitelist_match:
xml_response += xml_error_response % CFG_WEBLINKBACK_TRACKBACK_SUBSCRIPTION_ERROR_MESSAGE['BLACKLIST']
# request accepted: will be either approved automatically or pending
else:
status = 200
linkback_id = create_trackback(recid, url, title, excerpt, blog_name, blog_id, source, current_user)
# approve request automatically from url in whitelist
if whitelist_match:
approve_linkback(linkback_id, current_user)
xml_response += '</response>'
return xml_response, status
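The moderation outcome above (bad argument, blacklist rejection unless whitelisted, otherwise accepted with auto-approval on a whitelist hit) can be condensed into a pure decision function; the returned status names are illustrative, not the module's constants:

```python
def trackback_decision(url, blacklist, whitelist):
    # A whitelist match has precedence over a blacklist match,
    # as in perform_sendtrackback.
    if url in (None, ''):
        return 'BAD_ARGUMENT'
    black = any(entry in url for entry in blacklist)
    white = any(entry in url for entry in whitelist)
    if black and not white:
        return 'REJECTED'
    return 'APPROVED' if white else 'PENDING'
```
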
def perform_sendtrackback_disabled():
status = 404
xml_response = """<response>
<error>1</error>
<message>Trackback facility disabled</message>
</response>"""
return xml_response, status
def update_linkbacks(mode):
"""
Update titles of pages that link to the instance
@param mode: 1 update page titles of new linkbacks
2 update page titles of old linkbacks
3 update manually set page titles
4 detect and disable broken linkbacks
"""
if mode in (1, 2, 3):
if mode == 1:
urls_and_titles = get_urls_and_titles(CFG_WEBLINKBACK_PAGE_TITLE_STATUS['NEW'])
elif mode == 2:
urls_and_titles = get_urls_and_titles(CFG_WEBLINKBACK_PAGE_TITLE_STATUS['OLD'])
elif mode == 3:
urls_and_titles = get_urls_and_titles(CFG_WEBLINKBACK_PAGE_TITLE_STATUS['MANUALLY_SET'])
for (url, title, manual_set, broken_count) in urls_and_titles: # pylint: disable=W0612
new_title = get_title_of_page(url)
# Only accept valid titles
if new_title != None:
update_url_title(url, new_title)
elif mode == 4:
urls_and_titles = get_urls_and_titles()
for (url, title, manual_set, broken_count) in urls_and_titles: # pylint: disable=W0612
new_title = get_title_of_page(url)
# Broken one detected
if new_title == None:
increment_broken_count(url)
if broken_count + 1 == CFG_WEBLINKBACK_BROKEN_COUNT:
set_url_broken(url)
def delete_linkbacks_on_blacklist():
"""
Delete all rejected, broken and pending linkbacks whose URL is in the blacklist
"""
linkbacks = list(get_all_linkbacks(status=CFG_WEBLINKBACK_STATUS['PENDING']))
linkbacks.extend(list(get_all_linkbacks(status=CFG_WEBLINKBACK_STATUS['REJECTED'])))
linkbacks.extend(list(get_all_linkbacks(status=CFG_WEBLINKBACK_STATUS['BROKEN'])))
for linkback in linkbacks:
if infix_exists_for_url_in_list(linkback[1], CFG_WEBLINKBACK_LIST_TYPE['BLACKLIST']):
remove_linkback(linkback[0])
diff --git a/invenio/legacy/weblinkback/db_layer.py b/invenio/legacy/weblinkback/db_layer.py
index 0e5a33ddc..b4931b8e2 100644
--- a/invenio/legacy/weblinkback/db_layer.py
+++ b/invenio/legacy/weblinkback/db_layer.py
@@ -1,419 +1,419 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Database Layer"""
from invenio.legacy.dbquery import run_sql
-from invenio.weblinkback_config import CFG_WEBLINKBACK_STATUS, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME, \
CFG_WEBLINKBACK_DEFAULT_USER, \
CFG_WEBLINKBACK_PAGE_TITLE_STATUS
from invenio.utils.text import xml_entities_to_utf8
def get_all_linkbacks(recid=None, status=None, order=CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME["ASC"], linkback_type=None):
"""
Get all linkbacks
@param recid: of one record, of all if None
@param status: with a certain status, of all if None
@param order: order by insertion time either "ASC" or "DESC"
@param linkback_type: of a certain type, of all if None
@return [(linkback_id,
origin_url,
recid,
additional_properties,
linkback_type,
linkback_status,
insert_time)]
in order by id
"""
header_sql = """SELECT id,
origin_url,
id_bibrec,
additional_properties,
type,
status,
insert_time
FROM lnkENTRY"""
conditions = []
order_sql = "ORDER by id %s" % order
params = []
def add_condition(column, value):
if value:
if not conditions:
conditions.append('WHERE %s=%%s' % column)
else:
conditions.append('AND %s=%%s' % column)
params.append(value)
add_condition('id_bibrec', recid)
add_condition('status', status)
add_condition('type', linkback_type)
return run_sql(header_sql + ' ' + ' '.join(conditions) + ' ' + order_sql, tuple(params))
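The incremental WHERE/AND construction used above can be isolated; `build_conditions` is a hypothetical helper returning the SQL fragment and its bound parameters:

```python
def build_conditions(recid=None, status=None, linkback_type=None):
    # The first condition gets WHERE, later ones get AND, mirroring
    # get_all_linkbacks' add_condition closure.
    conditions, params = [], []
    for column, value in (('id_bibrec', recid),
                          ('status', status),
                          ('type', linkback_type)):
        if value:
            keyword = 'WHERE' if not conditions else 'AND'
            conditions.append('%s %s=%%s' % (keyword, column))
            params.append(value)
    return ' '.join(conditions), tuple(params)
```
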
def approve_linkback(linkbackid, user_info):
"""
Approve linkback
@param linkbackid: linkback id
@param user_info: user info
"""
update_linkback_status(linkbackid, CFG_WEBLINKBACK_STATUS['APPROVED'], user_info)
def reject_linkback(linkbackid, user_info):
"""
Reject linkback
@param linkbackid: linkback id
@param user_info: user info
"""
update_linkback_status(linkbackid, CFG_WEBLINKBACK_STATUS['REJECTED'], user_info)
def update_linkback_status(linkbackid, new_status, user_info = None):
"""
Update status of a linkback
@param linkbackid: linkback id
@param new_status: new status
@param user_info: user info
"""
if user_info == None:
user_info = {}
user_info['uid'] = CFG_WEBLINKBACK_DEFAULT_USER
run_sql("""UPDATE lnkENTRY
SET status=%s
WHERE id=%s
""", (new_status, linkbackid))
logid = run_sql("""INSERT INTO lnkLOG (id_user, action, log_time)
VALUES
(%s, %s, NOW());
SELECT LAST_INSERT_ID();
""", (user_info['uid'], new_status))
run_sql("""INSERT INTO lnkENTRYLOG (id_lnkENTRY , id_lnkLOG)
VALUES
(%s, %s);
""", (linkbackid, logid))
def create_linkback(origin_url, recid, additional_properties, linkback_type, user_info):
"""
Create linkback
@param origin_url: origin URL,
@param recid: recid
@param additional_properties: additional properties
@param linkback_type: linkback type
@param user_info: user info
@return id of the created linkback
"""
linkbackid = run_sql("""INSERT INTO lnkENTRY (origin_url, id_bibrec, additional_properties, type, status, insert_time)
VALUES
(%s, %s, %s, %s, %s, NOW());
SELECT LAST_INSERT_ID();
""", (origin_url, recid, str(additional_properties), linkback_type, CFG_WEBLINKBACK_STATUS['PENDING']))
logid = run_sql("""INSERT INTO lnkLOG (id_user, action, log_time)
VALUES
(%s, %s, NOW());
SELECT LAST_INSERT_ID();
""", (user_info['uid'], CFG_WEBLINKBACK_STATUS['INSERTED']))
run_sql("""INSERT INTO lnkENTRYLOG (id_lnkENTRY, id_lnkLOG)
VALUES
(%s, %s);
""", (linkbackid, logid))
# add url title entry if necessary
if len(run_sql("""SELECT url
FROM lnkENTRYURLTITLE
WHERE url=%s
""", (origin_url, ))) == 0:
manual_set_title = 0
title = ""
if additional_properties != "" and 'title' in additional_properties.keys():
manual_set_title = 1
title = additional_properties['title']
run_sql("""INSERT INTO lnkENTRYURLTITLE (url, title, manual_set)
VALUES
(%s, %s, %s)
""", (origin_url, title, manual_set_title))
return linkbackid
def get_approved_latest_added_linkbacks(count):
"""
Get approved latest added linkbacks
@param count: count of the linkbacks
@return [(linkback_id,
origin_url,
recid,
additional_properties,
type,
status,
insert_time)]
in descending order by insert_time
"""
return run_sql("""SELECT id,
origin_url,
id_bibrec,
additional_properties,
type,
status,
insert_time
FROM lnkENTRY
WHERE status=%s
ORDER BY insert_time DESC
LIMIT %s
""", (CFG_WEBLINKBACK_STATUS['APPROVED'], count))
def get_url_list(list_type):
"""
@param list_type: of CFG_WEBLINKBACK_LIST_TYPE
@return (url0, ..., urln) in ascending order by url
"""
result = run_sql("""SELECT url
FROM lnkADMINURL
WHERE list=%s
ORDER by url ASC
""", (list_type, ))
return tuple(url[0] for (url) in result)
def get_urls():
"""
Get all URLs and the corresponding listType
@return ((url, CFG_WEBLINKBACK_LIST_TYPE), ..., (url, CFG_WEBLINKBACK_LIST_TYPE)) in ascending order by url
"""
return run_sql("""SELECT url, list
FROM lnkADMINURL
ORDER by url ASC
""")
def url_exists(url, list_type=None):
"""
Check if url exists
@param url
@param list_type: specific list of CFG_WEBLINKBACK_LIST_TYPE, all if None
@return True or False
"""
header_sql = """SELECT url
FROM lnkADMINURL
WHERE url=%s
"""
optional_sql = " AND list=%s"
result = None
if list_type:
result = run_sql(header_sql + optional_sql, (url, list_type))
else:
result = run_sql(header_sql, (url, ))
if result != ():
return True
else:
return False
def add_url_to_list(url, list_type, user_info):
"""
Add a URL to a list
@param url: unique URL string for all lists
@param list_type: of CFG_WEBLINKBACK_LIST_TYPE
@param user_info: user info
@return id of the created url
"""
urlid = run_sql("""INSERT INTO lnkADMINURL (url, list)
VALUES
(%s, %s);
SELECT LAST_INSERT_ID();
""", (url, list_type))
logid = run_sql("""INSERT INTO lnkLOG (id_user, action, log_time)
VALUES
(%s, %s, NOW());
SELECT LAST_INSERT_ID();
""", (user_info['uid'], CFG_WEBLINKBACK_STATUS['INSERTED']))
run_sql("""INSERT INTO lnkADMINURLLOG (id_lnkADMINURL, id_lnkLOG)
VALUES
(%s, %s);
""", (urlid, logid))
return urlid
def remove_url(url):
"""
Remove a URL from its list
@param url: unique URL string for all lists
"""
# get ids
urlid = run_sql("""SELECT id
FROM lnkADMINURL
WHERE url=%s
""", (url, ))[0][0]
logids = run_sql("""SELECT log.id
FROM lnkLOG log
JOIN lnkADMINURLLOG url_log
ON log.id=url_log.id_lnkLOG
WHERE url_log.id_lnkADMINURL=%s
""", (urlid, ))
# delete url and url log
run_sql("""DELETE FROM lnkADMINURL
WHERE id=%s;
DELETE FROM lnkADMINURLLOG
WHERE id_lnkADMINURL=%s
""", (urlid, urlid))
# delete log
for logid in logids:
run_sql("""DELETE FROM lnkLOG
WHERE id=%s
""", (logid[0], ))
def get_urls_and_titles(title_status=None):
"""
Get URLs and their corresponding title
@param title_status: of CFG_WEBLINKBACK_PAGE_TITLE_STATUS or None
@return ((url, title, manual_set, broken_count), ...); all non-broken rows of the table if None
"""
top_query = """SELECT url, title, manual_set, broken_count
FROM lnkENTRYURLTITLE
WHERE
"""
where_sql = ""
if title_status == CFG_WEBLINKBACK_PAGE_TITLE_STATUS['NEW']:
where_sql = " title='' AND manual_set=0 AND"
elif title_status == CFG_WEBLINKBACK_PAGE_TITLE_STATUS['OLD']:
where_sql = " title<>'' AND manual_set=0 AND"
elif title_status == CFG_WEBLINKBACK_PAGE_TITLE_STATUS['MANUALLY_SET']:
where_sql = " manual_set=1 AND"
where_sql += " broken=0"
return run_sql(top_query + where_sql)
def update_url_title(url, title):
"""
Update the corresponding title of a URL
@param url: URL
@param title: new title
"""
run_sql("""UPDATE lnkENTRYURLTITLE
SET title=%s,
manual_set=0,
broken_count=0,
broken=0
WHERE url=%s
""", (title, url))
def remove_url_title(url):
"""
Remove URL title
@param url: URL
"""
run_sql("""DELETE FROM lnkENTRYURLTITLE
WHERE url=%s
""", (url, ))
def set_url_broken(url):
"""
Set URL broken
@param url: URL
"""
linkbackids = run_sql("""SELECT id
FROM lnkENTRY
WHERE origin_url=%s
""", (url, ))
run_sql("""UPDATE lnkENTRYURLTITLE
SET title=%s,
broken=1
WHERE url=%s
""", (CFG_WEBLINKBACK_STATUS['BROKEN'], url))
# update all linkbacks
for linkbackid in linkbackids:
update_linkback_status(linkbackid[0], CFG_WEBLINKBACK_STATUS['BROKEN'])
def get_url_title(url):
"""
Get URL title or URL if title does not exist (empty string)
@param url: URL
@return title, or the URL itself if a title does not exist (empty string) or is broken
"""
title = run_sql("""SELECT title
FROM lnkENTRYURLTITLE
WHERE url=%s and title<>"" and broken=0
""", (url, ))
res = url
if len(title) != 0:
res = title[0][0]
return xml_entities_to_utf8(res)
def increment_broken_count(url):
"""
Increment the broken count of a URL
@param url: URL
"""
run_sql("""UPDATE lnkENTRYURLTITLE
SET broken_count=broken_count+1
WHERE url=%s
""", (url, ))
def remove_linkback(linkbackid):
"""
Remove a linkback from the database
@param linkbackid: linkback id
"""
# get ids
logids = run_sql("""SELECT log.id
FROM lnkLOG log
JOIN lnkENTRYLOG entry_log
ON log.id=entry_log.id_lnkLOG
WHERE entry_log.id_lnkENTRY=%s
""", (linkbackid, ))
# delete linkback entry and entry log
run_sql("""DELETE FROM lnkENTRY
WHERE id=%s;
DELETE FROM lnkENTRYLOG
WHERE id_lnkENTRY=%s
""", (linkbackid, linkbackid))
# delete log
for logid in logids:
run_sql("""DELETE FROM lnkLOG
WHERE id=%s
""", (logid[0], ))
diff --git a/invenio/legacy/weblinkback/templates.py b/invenio/legacy/weblinkback/templates.py
index b6783c5ea..d1d34ee54 100644
--- a/invenio/legacy/weblinkback/templates.py
+++ b/invenio/legacy/weblinkback/templates.py
@@ -1,287 +1,287 @@
# -*- coding: utf-8 -*-
## Comments and reviews for records.
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Web Templates"""
-from invenio.weblinkback_dblayer import get_all_linkbacks, \
+from invenio.legacy.weblinkback.db_layer import get_all_linkbacks, \
get_url_title
-from invenio.weblinkback_config import CFG_WEBLINKBACK_STATUS, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_LATEST_COUNT_VALUES, \
CFG_WEBLINKBACK_ACTION_RETURN_CODE
-from invenio.weblinkback import generate_redirect_url
+from invenio.legacy.weblinkback.api import generate_redirect_url
from invenio.base.i18n import gettext_set_language
from invenio.utils.date import convert_datetext_to_dategui
from invenio.config import CFG_SITE_RECORD, \
CFG_SITE_URL, \
CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS
from invenio.utils.html import get_mathjax_header
from invenio.modules.formatter import format_record
import cgi
class Template:
def tmpl_linkbacks_general(self, recid, ln):
"""
Display general linkback information
"""
_ = gettext_set_language(ln)
url = get_trackback_url(recid)
out = '<h4>'
out += _("Trackback URL: ")
out += '<a href="%s" onclick="return false" rel="nofollow">%s</a>' % (url, url)
out += '</h4>'
out += '<div class="comment-subscribe">Trackbacks are used in blog systems to refer to external content. Please copy and paste this trackback URL to the appropriate field of your blog post if you want to refer to this record.</div>'
out += '<br/>'
return out
def tmpl_linkbacks(self, approved_linkbacks, ln):
"""
Display the approved linkbacks of a record
@param approved_linkbacks: approved linkbacks to display
"""
_ = gettext_set_language(ln)
out = self.tmpl_linkbacks_count(approved_linkbacks, ln)
if approved_linkbacks:
out += self.tmpl_linkback_tuple(approved_linkbacks, ln)
return out
def tmpl_linkbacks_admin(self, pending_linkbacks, recid, ln):
"""
Display the pending linkbacks of a record and admin approve/reject features
@param pending_linkbacks: pending linkbacks
"""
_ = gettext_set_language(ln)
out = ''
out += self.tmpl_linkbacks_count(pending_linkbacks, ln, _('to review'))
out += self.tmpl_linkback_tuple_admin(url_approve_prefix=generate_redirect_url(recid, ln, 'approve'),
url_reject_prefix=generate_redirect_url(recid, ln, 'reject'),
linkbacks=pending_linkbacks,
ln=ln)
return out
def tmpl_linkbacks_count(self, linkbacks, ln, additional_text = ''):
"""
Display the count of linkbacks plus an additional text in a grey field
@param linkbacks: collection of linkbacks
@param additional_text: additional text to be display
"""
_ = gettext_set_language(ln)
middle_text = ""
if additional_text != "":
middle_text = " " + additional_text
return self.tmpl_heading(cgi.escape(_('Linkbacks%s: %s')) % (middle_text, len(linkbacks)))
def tmpl_heading(self, text):
"""
Display a text in a grey field
@param text: text
"""
return '''
<table><tr>
<td>
<table><tr><td class="blocknote">%s</td></tr></table>
</td>
</tr></table>
''' % text
def tmpl_linkback_tuple(self, linkbacks, ln):
"""
Display a linkback
@param linkbacks: collection of linkbacks: [(linkback_id,
origin_url,
recid,
additional_properties,
type,
status,
insert_time)]
"""
_ = gettext_set_language(ln)
out = '<table width="95%" style="display: inline";>'
for current_linkback in linkbacks:
url = current_linkback[1]
out += '''<tr><td><font class="rankscoreinfo"><a>(%(type)s)&nbsp;</a></font><small>&nbsp;<a href="%(origin_url)s" target="_blank">%(page_title)s</a>&nbsp;%(submit_date)s</small></td></tr>''' % {
'type': current_linkback[4],
'origin_url': cgi.escape(url),
'page_title': cgi.escape(get_url_title(url)),
'submit_date': '(submitted on <i>' + convert_datetext_to_dategui(str(current_linkback[6])) + '</i>)'}
out += '</table>'
return out
def tmpl_linkback_tuple_admin(self, url_approve_prefix, url_reject_prefix, linkbacks, ln):
"""
Display linkbacks with admin approve/reject features
@param linkbacks: collection of linkbacks: [(linkback_id,
origin_url,
recid,
additional_properties,
type,
status,
insert_time)]
"""
_ = gettext_set_language(ln)
out = ''
for current_linkback in linkbacks:
linkbackid = current_linkback[0]
url = current_linkback[1]
out += '<div style="margin-bottom:20px;background:#F9F9F9;border:1px solid #DDD">'
out += '<div style="background-color:#EEE;padding:2px;font-size:small">&nbsp;%s</div>' % (_('Submitted on') + ' <i>' + convert_datetext_to_dategui(str(current_linkback[6])) + '</i>:')
out += '<br />'
out += '<blockquote>'
out += '''<font class="rankscoreinfo"><a>(%(type)s)&nbsp;</a></font><small>&nbsp;<a href="%(origin_url)s" target="_blank">%(page_title)s</a></small>''' % {
'type': current_linkback[4],
'origin_url': cgi.escape(url),
'page_title': cgi.escape(get_url_title(url))}
out += '</blockquote>'
out += '<br />'
out += '<div style="float:right">'
out += '<small>'
out += '''<a style="color:#8B0000;" href="%s&linkbackid=%s">%s</a>''' % (url_approve_prefix, linkbackid, _("Approve"))
out += '&nbsp;|&nbsp;'
out += '''<a style="color:#8B0000;" href="%s&linkbackid=%s">%s</a>''' % (url_reject_prefix, linkbackid, _("Reject"))
out += '</small>'
out += '</div>'
out += '</div>'
return out
def tmpl_get_mathjaxheader_jqueryheader(self):
mathjaxheader = ''
if CFG_WEBCOMMENT_USE_MATHJAX_IN_COMMENTS:
mathjaxheader = get_mathjax_header()
jqueryheader = '''
<script src="%(CFG_SITE_URL)s/js/jquery.min.js" type="text/javascript" language="javascript"></script>
<script src="%(CFG_SITE_URL)s/js/jquery.MultiFile.pack.js" type="text/javascript" language="javascript"></script>
''' % {'CFG_SITE_URL': CFG_SITE_URL}
return (mathjaxheader, jqueryheader)
def tmpl_get_latest_linkbacks_top(self, current_value, ln):
"""
Top elements to select the count of approved latest added linkbacks to display
@param current_value: current value option will be selected if it exists
"""
_ = gettext_set_language(ln)
result = """<form action='linkbacks' style='form { display: inline; }'><b>%s</b>
<select name="rg" size="1">
""" % _("View last")
for i in range(len(CFG_WEBLINKBACK_LATEST_COUNT_VALUES)):
latest_count_string = str(CFG_WEBLINKBACK_LATEST_COUNT_VALUES[i])
if CFG_WEBLINKBACK_LATEST_COUNT_VALUES[i] == current_value:
result += '<option SELECTED>' + latest_count_string + '</option>'
else:
result += '<option value=' + latest_count_string + '>' + latest_count_string + '</option>'
result += """ </select> <b>linkbacks</b>
<input type="submit" class="adminbutton" value="%s">
</form>
""" % _("Refresh")
return result
def tmpl_get_latest_linkbacks(self, latest_linkbacks, ln):
"""
Display approved latest added linkbacks, grouped by day
@param latest_linkbacks: a list of lists of linkbacks
"""
result = ''
for i in range(len(latest_linkbacks)):
day_group = latest_linkbacks[i]
date = day_group[0][6]
date_day_month = convert_datetext_to_dategui(str(date))[:6]
result += self.tmpl_heading(date_day_month)
for j in range(len(day_group)):
current_linkback = day_group[j]
link_type = current_linkback[4]
url = str(current_linkback[1])
recordid = current_linkback[2]
result += '<font class="rankscoreinfo"><a>(%s)&nbsp;</a></font>' % link_type
result += '<small>'
result += '<a href="%s">%s</a> links to ' % (cgi.escape(url), cgi.escape(get_url_title(url)))
result += format_record(recID=recordid, of='hs', ln=ln)
result += '</small>'
result += '<br>'
result += '<br>'
return result
def tmpl_admin_index(self, ln):
"""
Index page of admin interface
"""
_ = gettext_set_language(ln)
out = '<ol>'
pending_linkback_count = len(get_all_linkbacks(status=CFG_WEBLINKBACK_STATUS['PENDING']))
stat_pending_text = ""
if pending_linkback_count > 0:
stat_pending_text = ' <span class="moreinfo"> ('
if pending_linkback_count == 1:
stat_pending_text += "%s pending linkback request" % pending_linkback_count
elif pending_linkback_count > 1:
stat_pending_text += "%s pending linkback requests" % pending_linkback_count
stat_pending_text += ')</span>'
out += '<li><a href="%(siteURL)s/admin/weblinkback/weblinkbackadmin.py/linkbacks?ln=%(ln)s&amp;status=%(status)s">%(label)s</a>%(stat)s</li>' % \
{'siteURL': CFG_SITE_URL,
'ln': ln,
'status': CFG_WEBLINKBACK_STATUS['PENDING'],
'label': _("Pending Linkbacks"),
'stat': stat_pending_text}
out += '<li><a href="%(siteURL)s/linkbacks?ln=%(ln)s">%(label)s</a></li>' % \
{'siteURL': CFG_SITE_URL,
'ln': ln,
'label': _("Recent Linkbacks")}
out += '<li><a href="%(siteURL)s/admin/weblinkback/weblinkbackadmin.py/lists?ln=%(ln)s&amp;returncode=%(returnCode)s">%(label)s</a></li>' % \
{'siteURL': CFG_SITE_URL,
'ln': ln,
'returnCode': CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK'],
'label': _("Linkback Whitelist/Blacklist Manager")}
out += '</ol>'
from invenio.legacy.bibrank.adminlib import addadminbox
return addadminbox('<b>%s</b>' % _("Menu"), [out])
def get_trackback_url(recid):
return '%s/%s/%s/linkbacks/sendtrackback' % (CFG_SITE_URL, CFG_SITE_RECORD, recid)
def get_trackback_auto_discovery_tag(recid):
return '<link rel="trackback" type="application/x-www-form-urlencoded" href="%s" />' \
% cgi.escape(get_trackback_url(recid), True)
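A minimal, self-contained sketch of the two trackback helpers above: `site_url` and `site_record` are stand-ins for CFG_SITE_URL and CFG_SITE_RECORD, and `html.escape` replaces the `cgi.escape` used by this Python 2 codebase.

```python
from html import escape  # stand-in for Python 2's cgi.escape


def trackback_url(site_url, site_record, recid):
    """Build the trackback endpoint URL, as get_trackback_url does."""
    return '%s/%s/%s/linkbacks/sendtrackback' % (site_url, site_record, recid)


def trackback_autodiscovery_tag(site_url, site_record, recid):
    """Build the <link rel="trackback"> tag, attribute-escaping the URL."""
    url = escape(trackback_url(site_url, site_record, recid), quote=True)
    return ('<link rel="trackback" '
            'type="application/x-www-form-urlencoded" href="%s" />' % url)
```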
diff --git a/invenio/legacy/weblinkback/web/admin/weblinkbackadmin.py b/invenio/legacy/weblinkback/web/admin/weblinkbackadmin.py
index a2c4017fa..1b0e436c6 100644
--- a/invenio/legacy/weblinkback/web/admin/weblinkbackadmin.py
+++ b/invenio/legacy/weblinkback/web/admin/weblinkbackadmin.py
@@ -1,163 +1,163 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Administrative Interface"""
from invenio.base.i18n import wash_language, gettext_set_language
from invenio.modules.access.engine import acc_authorize_action
from invenio.legacy.webpage import page
from invenio.config import CFG_SITE_URL, CFG_SITE_LANG
from invenio.legacy.webuser import getUid, page_not_authorized, collect_user_info
-from invenio.weblinkbackadminlib import get_navtrail, \
+from invenio.legacy.weblinkback.adminlib import get_navtrail, \
perform_request_index, \
perform_request_display_list, \
perform_request_display_linkbacks, \
perform_moderate_linkback, \
perform_moderate_url
-from invenio.weblinkback_config import CFG_WEBLINKBACK_STATUS, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_ACTION_RETURN_CODE
def index(req, ln=CFG_SITE_LANG):
"""
Menu of admin options
@param ln: language
"""
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail_previous_links = get_navtrail()
navtrail_previous_links +=' &gt; <a class="navtrail" href="%s/admin/weblinkback/weblinkbackadmin.py/">' % CFG_SITE_URL
navtrail_previous_links += _("WebLinkback Admin") + '</a>'
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'cfgweblinkback')
if auth_code:
return page_not_authorized(req=req, text=auth_msg, navtrail=navtrail_previous_links)
else:
return page(title=_("WebLinkback Admin"),
body=perform_request_index(ln=ln),
uid=uid,
language=ln,
navtrail = navtrail_previous_links,
req=req)
def lists(req, urlfieldvalue='', returncode=CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK'], ln=CFG_SITE_LANG):
"""
Display whitelist and blacklist
@param urlfieldvalue: value of the url input field
@param returncode: may indicate errors from a previous action; one of CFG_WEBLINKBACK_ACTION_RETURN_CODE
@param ln: language
"""
# is passed as a string, must be an integer
return_code = int(returncode)
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail_previous_links = get_navtrail()
navtrail_previous_links +=' &gt; <a class="navtrail" href="%s/admin/weblinkback/weblinkbackadmin.py/">' % CFG_SITE_URL
navtrail_previous_links += _("WebLinkback Admin") + '</a>'
uid = getUid(req)
userInfo = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(userInfo, 'cfgweblinkback')
if auth_code:
return page_not_authorized(req=req, text=auth_msg, navtrail=navtrail_previous_links)
else:
return page(title=_("Linkback Whitelist/Blacklist Manager"),
body=perform_request_display_list(return_code=return_code, url_field_value=urlfieldvalue, ln=ln),
uid=uid,
language=ln,
navtrail = navtrail_previous_links,
req=req)
def moderatelist(req, url, action, listtype=None, ln=CFG_SITE_LANG):
"""
Add URL to list
@param url: url
@param listtype: of CFG_WEBLINKBACK_LIST_TYPE
"""
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail_previous_links = get_navtrail()
navtrail_previous_links +=' &gt; <a class="navtrail" href="%s/admin/weblinkback/weblinkbackadmin.py/">' % CFG_SITE_URL
navtrail_previous_links += _("WebLinkback Admin") + '</a>'
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'cfgweblinkback')
if auth_code:
return page_not_authorized(req=req, text=auth_msg, navtrail=navtrail_previous_links)
else:
url_field_value, return_code = perform_moderate_url(req=req, url=url, action=action, list_type=listtype)
return lists(req=req,
urlfieldvalue=url_field_value,
returncode=return_code,
ln=ln)
def linkbacks(req, status, returncode=CFG_WEBLINKBACK_ACTION_RETURN_CODE['OK'], ln=CFG_SITE_LANG):
"""
Display linkbacks
@param ln: language
@param status: of CFG_WEBLINKBACK_STATUS, currently only CFG_WEBLINKBACK_STATUS['PENDING'] is supported
"""
return_code = int(returncode)
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail_previous_links = get_navtrail()
navtrail_previous_links +=' &gt; <a class="navtrail" href="%s/admin/weblinkback/weblinkbackadmin.py/">' % CFG_SITE_URL
navtrail_previous_links += _("WebLinkback Admin") + '</a>'
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'cfgweblinkback')
if auth_code:
return page_not_authorized(req=req, text=auth_msg, navtrail=navtrail_previous_links)
else:
return page(title=_("Pending Linkbacks"),
body=perform_request_display_linkbacks(return_code=return_code, status=status, ln=ln),
uid=uid,
language=ln,
navtrail = navtrail_previous_links,
req=req)
def moderatelinkback(req, action, linkbackid, ln=CFG_SITE_LANG):
"""
Moderate linkbacks
@param linkbackid: linkback id
@param action: of CFG_WEBLINKBACK_ADMIN_MODERATION_ACTION
"""
ln = wash_language(ln)
_ = gettext_set_language(ln)
navtrail_previous_links = get_navtrail()
navtrail_previous_links +=' &gt; <a class="navtrail" href="%s/admin/weblinkback/weblinkbackadmin.py/">' % CFG_SITE_URL
navtrail_previous_links += _("WebLinkback Admin") + '</a>'
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'cfgweblinkback')
if auth_code:
return page_not_authorized(req=req, text=auth_msg, navtrail=navtrail_previous_links)
else:
return_code = perform_moderate_linkback(req=req, linkbackid=linkbackid, action=action)
return linkbacks(req=req,
status=CFG_WEBLINKBACK_STATUS['PENDING'],
returncode=return_code,
ln=ln)
diff --git a/invenio/legacy/weblinkback/webinterface.py b/invenio/legacy/weblinkback/webinterface.py
index 703ad855a..995d3d217 100644
--- a/invenio/legacy/weblinkback/webinterface.py
+++ b/invenio/legacy/weblinkback/webinterface.py
@@ -1,262 +1,262 @@
# -*- coding: utf-8 -*-
## Comments and reviews for records.
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Web Interface"""
from invenio.base.i18n import gettext_set_language
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.legacy.webuser import getUid, collect_user_info, page_not_authorized
-from invenio.weblinkback import check_user_can_view_linkbacks, \
+from invenio.legacy.weblinkback.api import check_user_can_view_linkbacks, \
perform_sendtrackback, \
perform_request_display_record_linbacks, \
perform_request_display_approved_latest_added_linkbacks_to_accessible_records, \
perform_sendtrackback_disabled
-from invenio.weblinkback_dblayer import approve_linkback, \
+from invenio.legacy.weblinkback.db_layer import approve_linkback, \
reject_linkback
-from invenio.weblinkback_config import CFG_WEBLINKBACK_LATEST_COUNT_DEFAULT, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_LATEST_COUNT_DEFAULT, \
CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME
from invenio.utils.url import redirect_to_url, make_canonical_urlargd
from invenio.config import CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_SITE_LANG, \
CFG_SITE_RECORD, \
CFG_WEBLINKBACK_TRACKBACK_ENABLED
from invenio.legacy.search_engine import guess_primary_collection_of_a_record, \
create_navtrail_links, \
get_colID
from invenio.legacy.webpage import pageheaderonly, pagefooteronly
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
from invenio.modules.access.engine import acc_authorize_action
import invenio.legacy.template
webstyle_templates = invenio.legacy.template.load('webstyle')
websearch_templates = invenio.legacy.template.load('websearch')
weblinkback_templates = invenio.legacy.template.load('weblinkback')
class WebInterfaceRecordLinkbacksPages(WebInterfaceDirectory):
"""Define the set of record/number/linkbacks pages."""
_exports = ['', 'display', 'index', 'approve', 'reject', 'sendtrackback']
def __init__(self, recid = -1):
self.recid = recid
def index(self, req, form):
"""
Redirect to display function
"""
return self.display(req, form)
def display(self, req, form):
"""
Display the linkbacks of a record and admin approve/reject features
"""
argd = wash_urlargd(form, {})
_ = gettext_set_language(argd['ln'])
# Check authorization
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_linkbacks(user_info, self.recid)
if auth_code and user_info['email'] == 'guest':
# Ask to login
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'ln': argd['ln'],
'referer': CFG_SITE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req,
referer="../",
uid=uid,
text=auth_msg,
ln=argd['ln'])
show_admin = False
(auth_code, auth_msg) = acc_authorize_action(req, 'moderatelinkbacks', collection = guess_primary_collection_of_a_record(self.recid))
if not auth_code:
show_admin = True
body = perform_request_display_record_linbacks(req, self.recid, show_admin, weblinkback_templates=weblinkback_templates, ln=argd['ln'])
title = websearch_templates.tmpl_record_page_header_content(req, self.recid, argd['ln'])[0]
# navigation, tabs, top and bottom part
navtrail = create_navtrail_links(cc=guess_primary_collection_of_a_record(self.recid), ln=argd['ln'])
if navtrail:
navtrail += ' &gt; '
navtrail += '<a class="navtrail" href="%s/%s/%s?ln=%s">'% (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, argd['ln'])
navtrail += title
navtrail += '</a>'
navtrail += ' &gt; <a class="navtrail">Linkbacks</a>'
mathjaxheader, jqueryheader = weblinkback_templates.tmpl_get_mathjaxheader_jqueryheader()
unordered_tabs = get_detailed_page_tabs(get_colID(guess_primary_collection_of_a_record(self.recid)),
self.recid,
ln=argd['ln'])
ordered_tabs_id = [(tab_id, values['order']) for (tab_id, values) in unordered_tabs.iteritems()]
ordered_tabs_id.sort(lambda x, y: cmp(x[1], y[1]))
link_ln = ''
if argd['ln'] != CFG_SITE_LANG:
link_ln = '?ln=%s' % argd['ln']
tabs = [(unordered_tabs[tab_id]['label'], \
'%s/%s/%s/%s%s' % (CFG_SITE_URL, CFG_SITE_RECORD, self.recid, tab_id, link_ln), \
tab_id in ['linkbacks'],
unordered_tabs[tab_id]['enabled']) \
for (tab_id, values) in ordered_tabs_id
if unordered_tabs[tab_id]['visible'] == True]
top = webstyle_templates.detailed_record_container_top(self.recid,
tabs,
argd['ln'])
bottom = webstyle_templates.detailed_record_container_bottom(self.recid,
tabs,
argd['ln'])
return pageheaderonly(title=title,
navtrail=navtrail,
uid=uid,
verbose=1,
metaheaderadd = mathjaxheader + jqueryheader,
req=req,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
top + body + bottom + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(language=argd['ln'], req=req)
# Return the same page whether we ask for /CFG_SITE_RECORD/123/linkbacks or /CFG_SITE_RECORD/123/linkbacks/
__call__ = index
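The tab ordering in `display` relies on Python 2's `cmp`-based sort; the same ordering can be sketched with a sort key. The dict shape here is an assumption, keeping only the 'order' field from what `get_detailed_page_tabs` returns.

```python
def order_tab_ids(unordered_tabs):
    """Return tab ids sorted by their 'order' value, lowest first.

    `unordered_tabs` mirrors the dict used above: tab id -> properties
    dict carrying at least an 'order' key; sorted() with a key function
    replaces the Python 2 cmp/sort pair.
    """
    ordered = sorted(unordered_tabs.items(), key=lambda item: item[1]['order'])
    return [tab_id for tab_id, _values in ordered]
```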
def approve(self, req, form):
"""
Approve a linkback
"""
argd = wash_urlargd(form, {'linkbackid': (int, -1)})
authorization = self.check_authorization_moderatelinkbacks(req, argd)
if not authorization:
approve_linkback(argd['linkbackid'], collect_user_info(req))
return self.display(req, form)
else:
return authorization
def reject(self, req, form):
"""
Reject a linkback
"""
argd = wash_urlargd(form, {'linkbackid': (int, -1)})
authorization = self.check_authorization_moderatelinkbacks(req, argd)
if not authorization:
reject_linkback(argd['linkbackid'], collect_user_info(req))
return self.display(req, form)
else:
return authorization
def check_authorization_moderatelinkbacks(self, req, argd):
"""
Check if the user is authorized to moderate linkbacks
@return: None if authorized, a login redirect if guest, otherwise page_not_authorized
"""
# Check authorization
uid = getUid(req)
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(req, 'moderatelinkbacks', collection = guess_primary_collection_of_a_record(self.recid))
if auth_code and user_info['email'] == 'guest':
# Ask to login
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'ln': argd['ln'],
'referer': CFG_SITE_URL + user_info['uri']}, {})
return redirect_to_url(req, target)
elif auth_code:
return page_not_authorized(req,
referer="../",
uid=uid,
text=auth_msg,
ln=argd['ln'])
def sendtrackback(self, req, form):
"""
Send a new trackback
"""
if CFG_WEBLINKBACK_TRACKBACK_ENABLED:
argd = wash_urlargd(form, {'url': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'title': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'excerpt': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'blog_name': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'id': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'source': (str, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
})
perform_sendtrackback(req, self.recid, argd['url'], argd['title'], argd['excerpt'], argd['blog_name'], argd['id'], argd['source'], argd['ln'])
else:
perform_sendtrackback_disabled(req)
class WebInterfaceRecentLinkbacksPages(WebInterfaceDirectory):
"""Define the set of global /linkbacks pages."""
_exports = ['', 'display', 'index']
def index(self, req, form):
"""
Redirect to display function
"""
return self.display(req, form)
def display(self, req, form):
"""
Display approved latest added linkbacks of the invenio instance
"""
argd = wash_urlargd(form, {'rg': (int, CFG_WEBLINKBACK_LATEST_COUNT_DEFAULT)})
# count must be positive
if argd['rg'] < 0:
argd['rg'] = -argd['rg']
_ = gettext_set_language(argd['ln'])
user_info = collect_user_info(req)
body = perform_request_display_approved_latest_added_linkbacks_to_accessible_records(argd['rg'], argd['ln'], user_info, weblinkback_templates=weblinkback_templates)
navtrail = 'Recent Linkbacks'
mathjaxheader, jqueryheader = weblinkback_templates.tmpl_get_mathjaxheader_jqueryheader()
return pageheaderonly(title=navtrail,
navtrail=navtrail,
verbose=1,
metaheaderadd = mathjaxheader + jqueryheader,
req=req,
language=argd['ln'],
navmenuid='search',
navtrail_append_title_p=0) + \
websearch_templates.tmpl_search_pagestart(argd['ln']) + \
body + \
websearch_templates.tmpl_search_pageend(argd['ln']) + \
pagefooteronly(language=argd['ln'], req=req)
# Return the same page whether we ask for /linkbacks or /linkbacks/
__call__ = index
diff --git a/invenio/legacy/websearch/adminlib.py b/invenio/legacy/websearch/adminlib.py
index a0777fad7..5706a69c5 100644
--- a/invenio/legacy/websearch/adminlib.py
+++ b/invenio/legacy/websearch/adminlib.py
@@ -1,3535 +1,3535 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Invenio WebSearch Administrator Interface."""
__revision__ = "$Id$"
import cgi
import random
import time
import sys
from invenio.utils.date import strftime
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from invenio.config import \
CFG_CACHEDIR, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_SITE_URL,\
CFG_WEBCOMMENT_ALLOW_COMMENTS, \
CFG_WEBSEARCH_SHOW_COMMENT_COUNT, \
CFG_WEBCOMMENT_ALLOW_REVIEWS, \
CFG_WEBSEARCH_SHOW_REVIEW_COUNT, \
CFG_BIBRANK_SHOW_CITATION_LINKS, \
CFG_INSPIRE_SITE, \
CFG_CERN_SITE
from invenio.legacy.bibrank.adminlib import \
write_outcome, \
modify_translations, \
get_def_name, \
get_name, \
get_languages, \
addadminbox, \
tupletotable, \
createhiddenform
from invenio.legacy.dbquery import \
run_sql, \
get_table_update_time
from invenio.legacy.websearch_external_collections import \
external_collections_dictionary, \
external_collection_sort_engine_by_name, \
external_collection_get_state, \
external_collection_get_update_state_list, \
external_collection_apply_changes
from invenio.legacy.websearch_external_collections.websearch_external_collections_utils import \
get_collection_descendants
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_EXTERNAL_COLLECTION_STATES_NAME
#from invenio.modules.formatter.format_elements import bfe_references
#from invenio.modules.formatter.engine import BibFormatObject
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.base.i18n import gettext_set_language
#from invenio.legacy.bibrank.citation_searcher import get_cited_by
from invenio.modules.access.control import acc_get_action_id
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.ext.logging import register_exception
from invenio.intbitset import intbitset
from invenio.legacy.bibrank.citation_searcher import get_cited_by_count
from invenio.legacy.bibrecord import record_get_field_instances
def getnavtrail(previous = ''):
"""Get the navtrail"""
navtrail = """<a class="navtrail" href="%s/help/admin">Admin Area</a> """ % (CFG_SITE_URL,)
navtrail = navtrail + previous
return navtrail
def fix_collection_scores():
"""
Re-calculate and re-normalize the scores of the collection relationships.
"""
for id_dad in intbitset(run_sql("SELECT id_dad FROM collection_collection")):
for index, id_son in enumerate(run_sql("SELECT id_son FROM collection_collection WHERE id_dad=%s ORDER BY score DESC", (id_dad, ))):
run_sql("UPDATE collection_collection SET score=%s WHERE id_dad=%s AND id_son=%s", (index * 10 + 10, id_dad, id_son[0]))
def perform_modifytranslations(colID, ln, sel_type='', trans=[], confirm=-1, callback='yes'):
"""Modify the translations of a collection
sel_type - the nametype to modify
trans - the translations in the same order as the languages from get_languages()"""
output = ''
subtitle = ''
sitelangs = get_languages()
if type(trans) is str:
trans = [trans]
if confirm in ["2", 2] and colID:
finresult = modify_translations(colID, sitelangs, sel_type, trans, "collection")
col_dict = dict(get_def_name('', "collection"))
if colID and col_dict.has_key(int(colID)):
colID = int(colID)
subtitle = """<a name="3">3. Modify translations for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a href="%s/help/admin/websearch-admin-guide#3.3">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
if sel_type == '':
sel_type = get_col_nametypes()[0][0]
header = ['Language', 'Translation']
actions = []
types = get_col_nametypes()
if len(types) > 1:
text = """
<span class="adminlabel">Name type</span>
<select name="sel_type" class="admin_w200">
"""
for (key, value) in types:
text += """<option value="%s" %s>%s""" % (key, key == sel_type and 'selected="selected"' or '', value)
trans_names = get_name(colID, ln, key, "collection")
if trans_names and trans_names[0][0]:
text += ": %s" % trans_names[0][0]
text += "</option>"
text += """</select>"""
output += createhiddenform(action="modifytranslations#3",
text=text,
button="Select",
colID=colID,
ln=ln,
confirm=0)
if confirm in [-1, "-1", 0, "0"]:
trans = []
for (key, value) in sitelangs:
try:
trans_names = get_name(colID, key, sel_type, "collection")
trans.append(trans_names[0][0])
except StandardError, e:
trans.append('')
for nr in range(0, len(sitelangs)):
actions.append(["%s" % (sitelangs[nr][1],)])
actions[-1].append('<input type="text" name="trans" size="30" value="%s"/>' % trans[nr])
text = tupletotable(header=header, tuple=actions)
output += createhiddenform(action="modifytranslations#3",
text=text,
button="Modify",
colID=colID,
sel_type=sel_type,
ln=ln,
confirm=2)
if sel_type and len(trans) and confirm in ["2", 2]:
output += write_outcome(finresult)
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_modifytranslations", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyrankmethods(colID, ln, func='', rnkID='', confirm=0, callback='yes'):
"""Modify which rank methods is visible to the collection
func - remove or add rank method
rnkID - the id of the rank method."""
output = ""
subtitle = ""
col_dict = dict(get_def_name('', "collection"))
rnk_dict = dict(get_def_name('', "rnkMETHOD"))
if colID and col_dict.has_key(int(colID)):
colID = int(colID)
if func in ["0", 0] and confirm in ["1", 1]:
finresult = attach_rnk_col(colID, rnkID)
elif func in ["1", 1] and confirm in ["1", 1]:
finresult = detach_rnk_col(colID, rnkID)
subtitle = """<a name="9">9. Modify rank options for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.9">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """
<dl>
<dt>The rank methods enabled for the collection '%s' are:</dt>
""" % col_dict[colID]
rnkmethods = get_col_rnk(colID, ln)
output += """<dd>"""
if not rnkmethods:
output += """No rank methods"""
else:
for id, name in rnkmethods:
output += """%s, """ % name
output += """</dd>
</dl>
"""
rnk_list = get_def_name('', "rnkMETHOD")
rnk_dict_in_col = dict(get_col_rnk(colID, ln))
rnk_list = filter(lambda x: not rnk_dict_in_col.has_key(x[0]), rnk_list)
if rnk_list:
text = """
<span class="adminlabel">Enable:</span>
<select name="rnkID" class="admin_w200">
<option value="-1">- select rank method -</option>
"""
for (id, name) in rnk_list:
text += """<option value="%s" %s>%s</option>""" % (id, (func in ["0", 0] and confirm in ["0", 0] and int(rnkID) == int(id)) and 'selected="selected"' or '' , name)
text += """</select>"""
output += createhiddenform(action="modifyrankmethods#9",
text=text,
button="Enable",
colID=colID,
ln=ln,
func=0,
confirm=1)
if confirm in ["1", 1] and func in ["0", 0] and int(rnkID) != -1:
output += write_outcome(finresult)
elif confirm not in ["0", 0] and func in ["0", 0]:
output += """<b><span class="info">Please select a rank method.</span></b>"""
coll_list = get_col_rnk(colID, ln)
if coll_list:
text = """
<span class="adminlabel">Disable:</span>
<select name="rnkID" class="admin_w200">
<option value="-1">- select rank method-</option>
"""
for (id, name) in coll_list:
text += """<option value="%s" %s>%s</option>""" % (id, (func in ["1", 1] and confirm in ["0", 0] and int(rnkID) == int(id)) and 'selected="selected"' or '' , name)
text += """</select>"""
output += createhiddenform(action="modifyrankmethods#9",
text=text,
button="Disable",
colID=colID,
ln=ln,
func=1,
confirm=1)
if confirm in ["1", 1] and func in ["1", 1] and int(rnkID) != -1:
output += write_outcome(finresult)
elif confirm not in ["0", 0] and func in ["1", 1]:
output += """<b><span class="info">Please select a rank method.</span></b>"""
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_modifyrankmethods", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addcollectiontotree(colID, ln, add_dad='', add_son='', rtype='', mtype='', callback='yes', confirm=-1):
"""Form to add a collection to the tree.
add_dad - the dad to add the collection to
add_son - the collection to add
rtype - add it as a regular or virtual
mtype - add it to the regular or virtual tree."""
output = ""
output2 = ""
subtitle = """Attach collection to tree&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#2.2">?</a>]</small>""" % (CFG_SITE_URL)
col_dict = dict(get_def_name('', "collection"))
if confirm not in [-1, "-1"] and not (add_son and add_dad and rtype):
output2 += """<b><span class="info">All fields must be filled.</span></b><br /><br />
"""
elif add_son and add_dad and rtype:
add_son = int(add_son)
add_dad = int(add_dad)
if confirm not in [-1, "-1"]:
if add_son == add_dad:
output2 += """<b><span class="info">Cannot add a collection as a pointer to itself.</span></b><br /><br />
"""
elif check_col(add_dad, add_son):
res = add_col_dad_son(add_dad, add_son, rtype)
output2 += write_outcome(res)
if res[0] == 1:
output2 += """<b><span class="info"><br /> The collection will appear on your website after the next webcoll run. You can either run it manually or wait until bibsched does it for you.</span></b><br /><br />
"""
else:
output2 += """<b><span class="info">Cannot add the collection '%s' as a %s subcollection of '%s' since it will either create a loop, or the association already exists.</span></b><br /><br />
""" % (col_dict[add_son], (rtype=="r" and 'regular' or 'virtual'), col_dict[add_dad])
add_son = ''
add_dad = ''
rtype = ''
tree = get_col_tree(colID)
col_list = col_dict.items()
col_list.sort(compare_on_val)
output = show_coll_not_in_tree(colID, ln, col_dict)
text = """
<span class="adminlabel">Attach collection:</span>
<select name="add_son" class="admin_w200">
<option value="">- select collection -</option>
"""
for (id, name) in col_list:
if id != colID:
text += """<option value="%s" %s>%s</option>""" % (id, str(id)==str(add_son) and 'selected="selected"' or '', name)
text += """
</select><br />
<span class="adminlabel">to parent collection:</span>
<select name="add_dad" class="admin_w200">
<option value="">- select parent collection -</option>
"""
for (id, name) in col_list:
text += """<option value="%s" %s>%s</option>
""" % (id, str(id)==add_dad and 'selected="selected"' or '', name)
text += """</select><br />
"""
text += """
<span class="adminlabel">with relationship:</span>
<select name="rtype" class="admin_w200">
<option value="">- select relationship -</option>
<option value="r" %s>Regular (Narrow by...)</option>
<option value="v" %s>Virtual (Focus on...)</option>
</select>
""" % ((rtype=="r" and 'selected="selected"' or ''), (rtype=="v" and 'selected="selected"' or ''))
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/addcollectiontotree" % CFG_SITE_URL,
text=text,
button="Add",
colID=colID,
ln=ln,
confirm=1)
output += output2
#output += perform_showtree(colID, ln)
body = [output]
if callback:
return perform_index(colID, ln, mtype="perform_addcollectiontotree", content=addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_addcollection(colID, ln, colNAME='', dbquery='', callback="yes", confirm=-1):
"""form to add a new collection.
colNAME - the name of the new collection
dbquery - the dbquery of the new collection"""
output = ""
subtitle = """Create new collection&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#2.1">?</a>]</small>""" % (CFG_SITE_URL)
text = """
<span class="adminlabel">Default name</span>
<input class="admin_w200" type="text" name="colNAME" value="%s" /><br />
""" % colNAME
output = createhiddenform(action="%s/admin/websearch/websearchadmin.py/addcollection" % CFG_SITE_URL,
text=text,
colID=colID,
ln=ln,
button="Add collection",
confirm=1)
if colNAME and confirm in ["1", 1]:
res = add_col(colNAME, '')
output += write_outcome(res)
if res[0] == 1:
output += perform_addcollectiontotree(colID=colID, ln=ln, add_son=res[1], callback='')
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please give the collection a name.</span></b>"""
body = [output]
if callback:
return perform_index(colID, ln=ln, mtype="perform_addcollection", content=addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifydbquery(colID, ln, dbquery='', callback='yes', confirm=-1):
"""form to modify the dbquery of the collection.
dbquery - the dbquery of the collection."""
subtitle = ''
output = ""
col_dict = dict(get_def_name('', "collection"))
if colID and col_dict.has_key(int(colID)):
colID = int(colID)
subtitle = """<a name="1">1. Modify collection query for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.1">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
if confirm == -1:
res = run_sql("SELECT dbquery FROM collection WHERE id=%s" % colID)
dbquery = res[0][0]
if not dbquery:
dbquery = ''
reg_sons = len(get_col_tree(colID, 'r'))
vir_sons = len(get_col_tree(colID, 'v'))
if reg_sons > 1:
if dbquery:
output += "Warning: This collection got subcollections, and should because of this not have a collection query, for further explanation, check the WebSearch Guide<br />"
elif reg_sons <= 1:
if not dbquery:
output += "Warning: This collection does not have any subcollections, and should because of this have a collection query, for further explanation, check the WebSearch Guide<br />"
text = """
<span class="adminlabel">Query</span>
<input class="admin_w200" type="text" name="dbquery" value="%s" /><br />
""" % cgi.escape(dbquery, 1)
output += createhiddenform(action="modifydbquery",
text=text,
button="Modify",
colID=colID,
ln=ln,
confirm=1)
if confirm in ["1", 1]:
res = modify_dbquery(colID, dbquery)
if res:
if dbquery == "":
text = """<b><span class="info">Query removed for this collection.</span></b>"""
else:
text = """<b><span class="info">Query set for this collection.</span></b>"""
else:
text = """<b><span class="info">Sorry, could not change query.</span></b>"""
output += text
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_modifydbquery", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
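The consistency rule checked above (a collection with regular subcollections should not carry its own query, while a leaf collection should define one) can be sketched as a standalone check. This is a minimal, hypothetical helper in Python 3 for illustration, not part of the Invenio API; the thresholds mirror the `reg_sons` comparisons in `perform_modifydbquery`:

```python
def dbquery_warnings(num_regular_sons, dbquery):
    """Return the admin warnings implied by the rule above.

    num_regular_sons -- size of the regular subtree (includes the
                        collection itself, as get_col_tree does)
    dbquery          -- the collection's own search query, or ''
    """
    warnings = []
    if num_regular_sons > 1 and dbquery:
        # has children: the subcollections define the content
        warnings.append("has subcollections but also a collection query")
    elif num_regular_sons <= 1 and not dbquery:
        # leaf collection: needs a query to select its records
        warnings.append("is a leaf but has no collection query")
    return warnings
```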
def perform_modifycollectiontree(colID, ln, move_up='', move_down='', move_from='', move_to='', delete='', rtype='', callback='yes', confirm=0):
"""to modify the collection tree: move a collection up and down, delete a collection, or change the father of the collection.
colID - the main collection of the tree, the root
move_up - move this collection up (is not the collection id, but the place in the tree)
move_down - move this collection down (is not the collection id, but the place in the tree)
move_from - move this collection from the current position (is not the collection id, but the place in the tree)
move_to - move the move_from collection and set this as its father. (is not the collection id, but the place in the tree)
delete - delete this collection from the tree (is not the collection id, but the place in the tree)
rtype - the type of the collection in the tree, regular or virtual"""
colID = int(colID)
tree = get_col_tree(colID, rtype)
col_dict = dict(get_def_name('', "collection"))
subtitle = """Modify collection tree: %s&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#2.3">?</a>]&nbsp;&nbsp;&nbsp;<a href="%s/admin/websearch/websearchadmin.py/showtree?colID=%s&amp;ln=%s">Printer friendly version</a></small>""" % (col_dict[colID], CFG_SITE_URL, CFG_SITE_URL, colID, ln)
fin_output = ""
output = ""
try:
if move_up:
move_up = int(move_up)
switch = find_last(tree, move_up)
if switch and switch_col_treescore(tree[move_up], tree[switch]):
output += """<b><span class="info">Moved the %s collection '%s' up and '%s' down.</span></b><br /><br />
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[move_up][0]], col_dict[tree[switch][0]])
else:
output += """<b><span class="info">Could not move the %s collection '%s' up and '%s' down.</span></b><br /><br />
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[move_up][0]], col_dict[tree[switch][0]])
elif move_down:
move_down = int(move_down)
switch = find_next(tree, move_down)
if switch and switch_col_treescore(tree[move_down], tree[switch]):
output += """<b><span class="info">Moved the %s collection '%s' down and '%s' up.</span></b><br /><br />
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[move_down][0]], col_dict[tree[switch][0]])
else:
output += """<b><span class="info">Could not move the %s collection '%s' down and '%s' up.</span></b><br /><br />
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[move_down][0]], col_dict[tree[switch][0]])
elif delete:
delete = int(delete)
if confirm in [0, "0"]:
if col_dict[tree[delete][0]] != col_dict[tree[delete][3]]:
text = """<b>Do you want to remove the %s collection '%s' and its subcollections in the %s collection '%s'?</b>
""" % ((tree[delete][4]=="r" and 'regular' or 'virtual'), col_dict[tree[delete][0]], (rtype=="r" and 'regular' or 'virtual'), col_dict[tree[delete][3]])
else:
text = """<b>Do you want to remove all subcollections of the %s collection '%s'?</b>
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[delete][3]])
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/modifycollectiontree#tree" % CFG_SITE_URL,
text=text,
button="Confirm",
colID=colID,
delete=delete,
rtype=rtype,
ln=ln,
confirm=1)
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/index?mtype=perform_modifycollectiontree#tree" % CFG_SITE_URL,
text="<b>To cancel</b>",
button="Cancel",
colID=colID,
ln=ln)
else:
if remove_col_subcol(tree[delete][0], tree[delete][3], rtype):
if col_dict[tree[delete][0]] != col_dict[tree[delete][3]]:
output += """<b><span class="info">Removed the %s collection '%s' and its subcollections in subdirectory '%s'.</span></b><br /><br />
""" % ((tree[delete][4]=="r" and 'regular' or 'virtual'), col_dict[tree[delete][0]], col_dict[tree[delete][3]])
else:
output += """<b><span class="info">Removed the subcollections of the %s collection '%s'.</span></b><br /><br />
""" % ((rtype=="r" and 'regular' or 'virtual'), col_dict[tree[delete][3]])
else:
output += """<b><span class="info">Could not remove the collection from the tree.</span></b><br /><br />
"""
delete = ''
elif move_from and not move_to:
move_from_rtype = move_from[0]
move_from_id = int(move_from[1:len(move_from)])
text = """<b>Select collection to place the %s collection '%s' under.</b><br /><br />
""" % ((move_from_rtype=="r" and 'regular' or 'virtual'), col_dict[tree[move_from_id][0]])
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/index?mtype=perform_modifycollectiontree#tree" % CFG_SITE_URL,
text=text,
button="Cancel",
colID=colID,
ln=ln)
elif move_from and move_to:
move_from_rtype = move_from[0]
move_from_id = int(move_from[1:len(move_from)])
move_to_rtype = move_to[0]
move_to_id = int(move_to[1:len(move_to)])
tree_from = get_col_tree(colID, move_from_rtype)
tree_to = get_col_tree(colID, move_to_rtype)
if confirm in [0, '0']:
if move_from_id == move_to_id and move_from_rtype == move_to_rtype:
output += """<b><span class="info">Cannot move to itself.</span></b><br /><br />
"""
elif tree_from[move_from_id][3] == tree_to[move_to_id][0] and move_from_rtype==move_to_rtype:
output += """<b><span class="info">The collection is already there.</span></b><br /><br />
"""
elif check_col(tree_to[move_to_id][0], tree_from[move_from_id][0]) or (tree_to[move_to_id][0] == 1 and tree_from[move_from_id][3] == tree_to[move_to_id][0] and move_from_rtype != move_to_rtype):
text = """<b>Move %s collection '%s' to the %s collection '%s'.</b>
""" % ((tree_from[move_from_id][4]=="r" and 'regular' or 'virtual'), col_dict[tree_from[move_from_id][0]], (tree_to[move_to_id][4]=="r" and 'regular' or 'virtual'), col_dict[tree_to[move_to_id][0]])
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/modifycollectiontree#tree" % CFG_SITE_URL,
text=text,
button="Confirm",
colID=colID,
move_from=move_from,
move_to=move_to,
ln=ln,
rtype=rtype,
confirm=1)
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/index?mtype=perform_modifycollectiontree#tree" % CFG_SITE_URL,
text="""<b>To cancel</b>""",
button="Cancel",
colID=colID,
ln=ln)
else:
output += """<b><span class="info">Cannot move the collection '%s' and set it as a subcollection of '%s' since it will create a loop.</span></b><br /><br />
""" % (col_dict[tree_from[move_from_id][0]], col_dict[tree_to[move_to_id][0]])
else:
if (move_to_id != 0 and move_col_tree(tree_from[move_from_id], tree_to[move_to_id])) or (move_to_id == 0 and move_col_tree(tree_from[move_from_id], tree_to[move_to_id], move_to_rtype)):
output += """<b><span class="info">Moved %s collection '%s' to the %s collection '%s'.</span></b><br /><br />
""" % ((move_from_rtype=="r" and 'regular' or 'virtual'), col_dict[tree_from[move_from_id][0]], (move_to_rtype=="r" and 'regular' or 'virtual'), col_dict[tree_to[move_to_id][0]])
else:
output += """<b><span class="info">Could not move %s collection '%s' to the %s collection '%s'.</span></b><br /><br />
""" % ((move_from_rtype=="r" and 'regular' or 'virtual'), col_dict[tree_from[move_from_id][0]], (move_to_rtype=="r" and 'regular' or 'virtual'), col_dict[tree_to[move_to_id][0]])
move_from = ''
move_to = ''
else:
output += """
"""
except StandardError:
register_exception()
return """<b><span class="info">An error occurred.</span></b>
"""
output += """<table border ="0" width="100%">
<tr><td width="50%">
<b>Narrow by collection:</b>
</td><td width="50%">
<b>Focus on...:</b>
</td></tr><tr><td valign="top">
"""
tree = get_col_tree(colID, 'r')
output += create_colltree(tree, col_dict, colID, ln, move_from, move_to, 'r', "yes")
output += """</td><td valign="top">
"""
tree = get_col_tree(colID, 'v')
output += create_colltree(tree, col_dict, colID, ln, move_from, move_to, 'v', "yes")
output += """</td>
</tr>
</table>
"""
body = [output]
if callback:
return perform_index(colID, ln, mtype="perform_modifycollectiontree", content=addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
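Moving a collection up or down in the tree above amounts to swapping its tree score with the adjacent sibling found by `find_last`/`find_next`, after which the tree is re-read in score order. The following is a hypothetical, standalone Python 3 sketch of that pattern (a plain list of pairs, not the Invenio data model):

```python
def move_up(siblings, index):
    """Swap the tree score of siblings[index] with its previous
    sibling and return the list re-sorted by score, mirroring the
    find_last/switch_col_treescore pattern.

    siblings -- list of (collection_id, score) pairs ordered by score
    """
    if index <= 0 or index >= len(siblings):
        return siblings  # nothing above to swap with
    ids = [cid for cid, _ in siblings]
    scores = [score for _, score in siblings]
    # swap only the scores; the re-sort swaps the display order
    scores[index - 1], scores[index] = scores[index], scores[index - 1]
    return sorted(zip(ids, scores), key=lambda pair: pair[1])
```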
def perform_showtree(colID, ln):
"""create the collection tree/hierarchy"""
col_dict = dict(get_def_name('', "collection"))
subtitle = "Collection tree: %s" % col_dict[int(colID)]
output = """<table border ="0" width="100%">
<tr><td width="50%">
<b>Narrow by collection:</b>
</td><td width="50%">
<b>Focus on...:</b>
</td></tr><tr><td valign="top">
"""
tree = get_col_tree(colID, 'r')
output += create_colltree(tree, col_dict, colID, ln, '', '', 'r', '')
output += """</td><td valign="top">
"""
tree = get_col_tree(colID, 'v')
output += create_colltree(tree, col_dict, colID, ln, '', '', 'v', '')
output += """</td>
</tr>
</table>
"""
body = [output]
return addadminbox(subtitle, body)
def perform_addportalbox(colID, ln, title='', body='', callback='yes', confirm=-1):
"""form to add a new portalbox
title - the title of the portalbox
body - the body of the portalbox"""
col_dict = dict(get_def_name('', "collection"))
colID = int(colID)
subtitle = """<a name="5.1"></a>Create new portalbox"""
text = """
<span class="adminlabel">Title</span>
<textarea cols="50" rows="1" class="admin_wvar" type="text" name="title">%s</textarea><br />
<span class="adminlabel">Body</span>
<textarea cols="50" rows="10" class="admin_wvar" type="text" name="body">%s</textarea><br />
""" % (cgi.escape(title), cgi.escape(body))
output = createhiddenform(action="addportalbox#5.1",
text=text,
button="Add",
colID=colID,
ln=ln,
confirm=1)
if body and confirm in [1, "1"]:
res = add_pbx(title, body)
output += write_outcome(res)
if res[0] == 1:
output += """<b><span class="info"><a href="addexistingportalbox?colID=%s&amp;ln=%s&amp;pbxID=%s#5">Add portalbox to collection</a></span></b>""" % (colID, ln, res[1])
elif confirm not in [-1, "-1"]:
output += """<b><span class="info">Body field must be filled.</span></b>
"""
body = [output]
return perform_showportalboxes(colID, ln, content=addadminbox(subtitle, body))
def perform_addexistingportalbox(colID, ln, pbxID=-1, score=0, position='', sel_ln='', callback='yes', confirm=-1):
"""form to add an existing portalbox to a collection.
colID - the collection to add the portalbox to
pbxID - the portalbox to add
score - the importance of the portalbox.
position - the position of the portalbox on the page
sel_ln - the language of the portalbox"""
subtitle = """<a name="5.2"></a>Add existing portalbox to collection"""
output = ""
colID = int(colID)
res = get_pbx()
pos = get_pbx_pos()
lang = dict(get_languages())
col_dict = dict(get_def_name('', "collection"))
pbx_dict = dict(map(lambda x: (x[0], x[1]), res))
col_pbx = get_col_pbx(colID)
col_pbx = dict(map(lambda x: (x[0], x[5]), col_pbx))
if len(res) > 0:
text = """
<span class="adminlabel">Portalbox</span>
<select name="pbxID" class="admin_w200">
<option value="-1">- Select portalbox -</option>
"""
for (id, t_title, t_body) in res:
text += """<option value="%s" %s>%s - %s...</option>\n""" % \
(id, id == int(pbxID) and 'selected="selected"' or '',
t_title[:40], cgi.escape(t_body[0:40 - min(40, len(t_title))]))
text += """</select><br />
<span class="adminlabel">Language</span>
<select name="sel_ln" class="admin_w200">
<option value="">- Select language -</option>
"""
listlang = lang.items()
listlang.sort()
for (key, name) in listlang:
text += """<option value="%s" %s>%s</option>
""" % (key, key == sel_ln and 'selected="selected"' or '', name)
text += """</select><br />
<span class="adminlabel">Position</span>
<select name="position" class="admin_w200">
<option value="">- Select position -</option>
"""
listpos = pos.items()
listpos.sort()
for (key, name) in listpos:
text += """<option value="%s" %s>%s</option>""" % (key, key==position and 'selected="selected"' or '', name)
text += "</select>"
output += createhiddenform(action="addexistingportalbox#5.2",
text=text,
button="Add",
colID=colID,
ln=ln,
confirm=1)
else:
output = """No existing portalboxes to add, please create a new one.
"""
if pbxID > -1 and position and sel_ln and confirm in [1, "1"]:
pbxID = int(pbxID)
res = add_col_pbx(colID, pbxID, sel_ln, position, '')
output += write_outcome(res)
elif pbxID > -1 and confirm not in [-1, "-1"]:
output += """<b><span class="info">All fields must be filled.</span></b>
"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showportalboxes(colID, ln, content=output)
def perform_deleteportalbox(colID, ln, pbxID=-1, callback='yes', confirm=-1):
"""form to delete a portalbox which is not in use.
colID - the current collection.
pbxID - the id of the portalbox"""
subtitle = """<a name="5.3"></a>Delete an unused portalbox"""
output = ""
colID = int(colID)
if pbxID not in [-1, "-1"] and confirm in [1, "1"]:
ares = get_pbx()
pbx_dict = dict(map(lambda x: (x[0], x[1]), ares))
if pbx_dict.has_key(int(pbxID)):
pname = pbx_dict[int(pbxID)]
ares = delete_pbx(int(pbxID))
else:
return """<b><span class="info">This portalbox does not exist.</span></b>"""
res = get_pbx()
col_dict = dict(get_def_name('', "collection"))
pbx_dict = dict(map(lambda x: (x[0], x[1]), res))
col_pbx = get_col_pbx()
col_pbx = dict(map(lambda x: (x[0], x[5]), col_pbx))
if len(res) > 0:
text = """
<span class="adminlabel">Portalbox</span>
<select name="pbxID" class="admin_w200">
"""
text += """<option value="-1">- Select portalbox -</option>"""
for (id, t_title, t_body) in res:
if not col_pbx.has_key(id):
text += """<option value="%s" %s>%s - %s...</option>""" % (id, id == int(pbxID) and 'selected="selected"' or '', t_title, cgi.escape(t_body[0:10]))
text += """</select><br />"""
output += createhiddenform(action="deleteportalbox#5.3",
text=text,
button="Delete",
colID=colID,
ln=ln,
confirm=1)
if pbxID not in [-1, "-1"]:
pbxID = int(pbxID)
if confirm in [1, "1"]:
output += write_outcome(ares)
elif confirm not in [-1, "-1"]:
output += """<b><span class="info">Choose a portalbox to delete.</span></b>
"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showportalboxes(colID, ln, content=output)
def perform_modifyportalbox(colID, ln, pbxID=-1, score='', position='', sel_ln='', title='', body='', callback='yes', confirm=-1):
"""form to modify a portalbox in a collection, or change the portalbox itself.
colID - the id of the collection.
pbxID - the portalbox to change
score - the score of the portalbox connected to colID which should be changed.
position - the position of the portalbox in collection colID to change."""
subtitle = ""
output = ""
colID = int(colID)
res = get_pbx()
pos = get_pbx_pos()
lang = dict(get_languages())
col_dict = dict(get_def_name('', "collection"))
pbx_dict = dict(map(lambda x: (x[0], x[1]), res))
col_pbx = get_col_pbx(colID)
col_pbx = dict(map(lambda x: (x[0], x[5]), col_pbx))
if pbxID not in [-1, "-1"]:
pbxID = int(pbxID)
subtitle = """<a name="5.4"></a>Modify portalbox '%s' for this collection""" % pbx_dict[pbxID]
col_pbx = get_col_pbx(colID)
if not (score and position) and not (body and title):
for (id_pbx, id_collection, tln, score, position, title, body) in col_pbx:
if id_pbx == pbxID:
break
output += """Collection (presentation) specific values (changes apply only to this collection).<br />"""
text = """
<span class="adminlabel">Position</span>
<select name="position" class="admin_w200">
"""
listpos = pos.items()
listpos.sort()
for (key, name) in listpos:
text += """<option value="%s" %s>%s</option>""" % (key, key==position and 'selected="selected"' or '', name)
text += """</select><br />"""
output += createhiddenform(action="modifyportalbox#5.4",
text=text,
button="Modify",
colID=colID,
pbxID=pbxID,
score=score,
title=title,
body=cgi.escape(body, 1),
sel_ln=sel_ln,
ln=ln,
confirm=3)
if pbxID > -1 and score and position and confirm in [3, "3"]:
pbxID = int(pbxID)
res = modify_pbx(colID, pbxID, sel_ln, score, position, '', '')
res2 = get_pbx()
pbx_dict = dict(map(lambda x: (x[0], x[1]), res2))
output += write_outcome(res)
output += """<br />Portalbox (content) specific values (any changes appear everywhere the portalbox is used)."""
text = """
<span class="adminlabel">Title</span>
<textarea cols="50" rows="1" class="admin_wvar" type="text" name="title">%s</textarea><br />
""" % cgi.escape(title)
text += """
<span class="adminlabel">Body</span>
<textarea cols="50" rows="10" class="admin_wvar" type="text" name="body">%s</textarea><br />
""" % cgi.escape(body)
output += createhiddenform(action="modifyportalbox#5.4",
text=text,
button="Modify",
colID=colID,
pbxID=pbxID,
sel_ln=sel_ln,
score=score,
position=position,
ln=ln,
confirm=4)
if pbxID > -1 and confirm in [4, "4"]:
pbxID = int(pbxID)
res = modify_pbx(colID, pbxID, sel_ln, '', '', title, body)
output += write_outcome(res)
else:
output = """No portalbox to modify."""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showportalboxes(colID, ln, content=output)
def perform_switchpbxscore(colID, id_1, id_2, sel_ln, ln):
"""Switch the score of id_1 and id_2 in collection_portalbox.
colID - the current collection
id_1/id_2 - the ids to change the score for.
sel_ln - the language of the portalbox"""
output = ""
res = get_pbx()
pbx_dict = dict(map(lambda x: (x[0], x[1]), res))
res = switch_pbx_score(colID, id_1, id_2, sel_ln)
output += write_outcome(res)
return perform_showportalboxes(colID, ln, content=output)
def perform_showportalboxes(colID, ln, callback='yes', content='', confirm=-1):
"""show the portalboxes of this collection.
colID - the portalboxes to show the collection for."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
subtitle = """<a name="5">5. Modify portalboxes for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.5">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = ""
pos = get_pbx_pos()
output = """<dl>
<dt>Portalbox actions (not related to this collection)</dt>
<dd><a href="addportalbox?colID=%s&amp;ln=%s#5.1">Create new portalbox</a></dd>
<dd><a href="deleteportalbox?colID=%s&amp;ln=%s#5.3">Delete an unused portalbox</a></dd>
<dt>Collection specific actions</dt>
<dd><a href="addexistingportalbox?colID=%s&amp;ln=%s#5.2">Add existing portalbox to collection</a></dd>
</dl>
""" % (colID, ln, colID, ln, colID, ln)
header = ['Position', 'Language', '', 'Title', 'Actions']
actions = []
sitelangs = get_languages()
lang = dict(sitelangs)
pos_list = pos.items()
pos_list.sort()
if len(get_col_pbx(colID)) > 0:
for (key, value) in sitelangs:
for (pos_key, pos_value) in pos_list:
res = get_col_pbx(colID, key, pos_key)
i = 0
for (pbxID, colID_pbx, tln, score, position, title, body) in res:
move = """<table cellspacing="1" cellpadding="0" border="0"><tr><td>"""
if i != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchpbxscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;sel_ln=%s&amp;rand=%s#5"><img border="0" src="%s/img/smallup.gif" title="Move portalbox up" alt="up" /></a>""" % (CFG_SITE_URL, colID, ln, pbxID, res[i - 1][0], tln, random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
move += "</td><td>"
i += 1
if i != len(res):
move += """<a href="%s/admin/websearch/websearchadmin.py/switchpbxscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;sel_ln=%s&amp;rand=%s#5"><img border="0" src="%s/img/smalldown.gif" title="Move portalbox down" alt="down" /></a>""" % (CFG_SITE_URL, colID, ln, pbxID, res[i][0], tln, random.randint(0, 1000), CFG_SITE_URL)
move += """</td></tr></table>"""
actions.append(["%s" % (i==1 and pos[position] or ''), "%s" % (i==1 and lang[tln] or ''), move, "%s" % title])
for col in [(('Modify', 'modifyportalbox'), ('Remove', 'removeportalbox'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;pbxID=%s&amp;sel_ln=%s#5.4">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, pbxID, tln, col[0][0]))
for (label, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;pbxID=%s&amp;sel_ln=%s#5.5">%s</a>' % (CFG_SITE_URL, function, colID, ln, pbxID, tln, label)
output += tupletotable(header=header, tuple=actions)
else:
output += """No portalboxes exist for this collection."""
output += content
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_showportalboxes", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_removeportalbox(colID, ln, pbxID='', sel_ln='', callback='yes', confirm=0):
"""form to remove a portalbox from a collection.
colID - the current collection, remove the portalbox from this collection.
sel_ln - remove the portalbox with this language
pbxID - remove the portalbox with this id"""
subtitle = """<a name="5.5"></a>Remove portalbox"""
output = ""
col_dict = dict(get_def_name('', "collection"))
res = get_pbx()
pbx_dict = dict(map(lambda x: (x[0], x[1]), res))
if colID and pbxID and sel_ln:
colID = int(colID)
pbxID = int(pbxID)
if confirm in ["0", 0]:
text = """Do you want to remove the portalbox '%s' from the collection '%s'?""" % (pbx_dict[pbxID], col_dict[colID])
output += createhiddenform(action="removeportalbox#5.5",
text=text,
button="Confirm",
colID=colID,
pbxID=pbxID,
sel_ln=sel_ln,
confirm=1)
elif confirm in ["1", 1]:
res = remove_pbx(colID, pbxID, sel_ln)
output += write_outcome(res)
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showportalboxes(colID, ln, content=output)
def perform_switchfmtscore(colID, type, id_1, id_2, ln):
"""Switch the score of id_1 and id_2 in the table type.
colID - the current collection
id_1/id_2 - the ids to change the score for.
type - like "format" """
fmt_dict = dict(get_def_name('', "format"))
res = switch_score(colID, id_1, id_2, type)
output = write_outcome(res)
return perform_showoutputformats(colID, ln, content=output)
def perform_switchfldscore(colID, id_1, id_2, fmeth, ln):
"""Switch the score of id_1 and id_2 in collection_field_fieldvalue.
colID - the current collection
id_1/id_2 - the ids to change the score for.
fld_dict = dict(get_def_name('', "field"))
res = switch_fld_score(colID, id_1, id_2)
output = write_outcome(res)
if fmeth == "soo":
return perform_showsortoptions(colID, ln, content=output)
elif fmeth == "sew":
return perform_showsearchfields(colID, ln, content=output)
elif fmeth == "seo":
return perform_showsearchoptions(colID, ln, content=output)
def perform_switchfldvaluescore(colID, id_1, id_fldvalue_1, id_fldvalue_2, ln):
"""Switch the score of id_1 and id_2 in collection_field_fieldvalue.
colID - the current collection
id_1/id_2 - the ids to change the score for.
name_1 = run_sql("SELECT name from fieldvalue where id=%s", (id_fldvalue_1, ))[0][0]
name_2 = run_sql("SELECT name from fieldvalue where id=%s", (id_fldvalue_2, ))[0][0]
res = switch_fld_value_score(colID, id_1, id_fldvalue_1, id_fldvalue_2)
output = write_outcome(res)
return perform_modifyfield(colID, fldID=id_1, ln=ln, content=output)
def perform_addnewfieldvalue(colID, fldID, ln, name='', value='', callback="yes", confirm=-1):
"""form to add a new fieldvalue.
name - the name of the new fieldvalue
value - the value of the new fieldvalue
"""
output = ""
subtitle = """<a name="7.4"></a>Add new value"""
text = """
<span class="adminlabel">Display name</span>
<input class="admin_w200" type="text" name="name" value="%s" /><br />
<span class="adminlabel">Search value</span>
<input class="admin_w200" type="text" name="value" value="%s" /><br />
""" % (name, value)
output = createhiddenform(action="%s/admin/websearch/websearchadmin.py/addnewfieldvalue" % CFG_SITE_URL,
text=text,
colID=colID,
fldID=fldID,
ln=ln,
button="Add",
confirm=1)
if name and value and confirm in ["1", 1]:
res = add_fldv(name, value)
output += write_outcome(res)
if res[0] == 1:
res = add_col_fld(colID, fldID, 'seo', res[1])
if res[0] == 0:
output += "<br />" + write_outcome(res)
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please fill in name and value.</span></b>
"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfield(colID, fldID=fldID, ln=ln, content=output)
def perform_modifyfieldvalue(colID, fldID, fldvID, ln, name='', value='', callback="yes", confirm=-1):
"""form to modify a fieldvalue.
name - the name of the fieldvalue
value - the value of the fieldvalue
"""
if confirm in [-1, "-1"]:
res = get_fld_value(fldvID)
(id, name, value) = res[0]
output = ""
subtitle = """<a name="7.4"></a>Modify existing value"""
output = """<dl>
<dt><b><span class="info">Warning: Modifications done below will also affect all other places where the modified data is used.</span></b></dt>
</dl>"""
text = """
<span class="adminlabel">Display name</span>
<input class="admin_w200" type="text" name="name" value="%s" /><br />
<span class="adminlabel">Search value</span>
<input class="admin_w200" type="text" name="value" value="%s" /><br />
""" % (name, value)
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/modifyfieldvalue" % CFG_SITE_URL,
text=text,
colID=colID,
fldID=fldID,
fldvID=fldvID,
ln=ln,
button="Update",
confirm=1)
output += createhiddenform(action="%s/admin/websearch/websearchadmin.py/modifyfieldvalue" % CFG_SITE_URL,
text="Delete value and all associations",
colID=colID,
fldID=fldID,
fldvID=fldvID,
ln=ln,
button="Delete",
confirm=2)
if name and value and confirm in ["1", 1]:
res = update_fldv(fldvID, name, value)
output += write_outcome(res)
elif confirm in ["2", 2]:
res = delete_fldv(fldvID)
output += write_outcome(res)
elif confirm not in ["-1", -1]:
output += """<b><span class="info">Please fill in name and value.</span></b>"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfield(colID, fldID=fldID, ln=ln, content=output)
def perform_removefield(colID, ln, fldID='', fldvID='', fmeth='', callback='yes', confirm=0):
"""form to remove a field from a collection.
colID - the current collection, remove the field from this collection.
sel_ln - remove the field with this language
fldID - remove the field with this id"""
if fmeth == "soo":
field = "sort option"
elif fmeth == "sew":
field = "search field"
elif fmeth == "seo":
field = "search option"
else:
field = "field"
subtitle = """<a name="6.4"></a><a name="7.4"></a><a name="8.4"></a>Remove %s""" % field
output = ""
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
res = get_fld_value()
fldv_dict = dict(map(lambda x: (x[0], x[1]), res))
if colID and fldID:
colID = int(colID)
fldID = int(fldID)
if fldvID and fldvID != "None":
fldvID = int(fldvID)
if confirm in ["0", 0]:
text = """Do you want to remove the %s '%s' %s from the collection '%s'?""" % (field, fld_dict[fldID], (fldvID not in["", "None"] and "with value '%s'" % fldv_dict[fldvID] or ''), col_dict[colID])
output += createhiddenform(action="removefield#6.5",
text=text,
button="Confirm",
colID=colID,
fldID=fldID,
fldvID=fldvID,
fmeth=fmeth,
confirm=1)
elif confirm in ["1", 1]:
res = remove_fld(colID, fldID, fldvID)
output += write_outcome(res)
body = [output]
output = "<br />" + addadminbox(subtitle, body)
if fmeth == "soo":
return perform_showsortoptions(colID, ln, content=output)
elif fmeth == "sew":
return perform_showsearchfields(colID, ln, content=output)
elif fmeth == "seo":
return perform_showsearchoptions(colID, ln, content=output)
def perform_removefieldvalue(colID, ln, fldID='', fldvID='', fmeth='', callback='yes', confirm=0):
"""form to remove a field from a collection.
colID - the current collection, remove the field from this collection.
sel_ln - remove the field with this language
fldID - remove the field with this id"""
subtitle = """<a name="7.4"></a>Remove value"""
output = ""
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
res = get_fld_value()
fldv_dict = dict(map(lambda x: (x[0], x[1]), res))
if colID and fldID:
colID = int(colID)
fldID = int(fldID)
if fldvID and fldvID != "None":
fldvID = int(fldvID)
if confirm in ["0", 0]:
text = """Do you want to remove the value '%s' from the search option '%s'?""" % (fldv_dict[fldvID], fld_dict[fldID])
output += createhiddenform(action="removefieldvalue#7.4",
text=text,
button="Confirm",
colID=colID,
fldID=fldID,
fldvID=fldvID,
fmeth=fmeth,
confirm=1)
elif confirm in ["1", 1]:
res = remove_fld(colID, fldID, fldvID)
output += write_outcome(res)
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfield(colID, fldID=fldID, ln=ln, content=output)
def perform_rearrangefieldvalue(colID, fldID, ln, callback='yes', confirm=-1):
"""rearrange the field values alphabetically
colID - the collection
fldID - the field to rearrange the fieldvalue for
"""
subtitle = "Order values alphabetically"
output = ""
col_fldv = get_col_fld(colID, 'seo', fldID)
col_fldv = dict(map(lambda x: (x[1], x[0]), col_fldv))
fldv_names = get_fld_value()
fldv_names = map(lambda x: (x[0], x[1]), fldv_names)
if not col_fldv.has_key(None):
vscore = len(col_fldv)
for (fldvID, name) in fldv_names:
if col_fldv.has_key(fldvID):
run_sql("UPDATE collection_field_fieldvalue SET score_fieldvalue=%s WHERE id_collection=%s and id_field=%s and id_fieldvalue=%s", (vscore, colID, fldID, fldvID))
vscore -= 1
output += write_outcome((1, ""))
else:
output += write_outcome((0, (0, "No values to order")))
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfield(colID, fldID, ln, content=output)
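The alphabetical reordering above works by assigning descending scores, so that rows ordered by score from highest to lowest come out alphabetically by display name. A minimal, hypothetical Python 3 sketch of that scoring scheme (the dict-based interface is invented for illustration and is not the Invenio API):

```python
def alphabetical_scores(values):
    """Assign descending scores so that ordering rows by score
    (highest first) yields alphabetical order by display name,
    mirroring the UPDATE loop in perform_rearrangefieldvalue.

    values -- maps a fieldvalue id to its display name
    Returns a {fieldvalue_id: score} mapping.
    """
    ordered = sorted(values.items(), key=lambda item: item[1])
    top = len(ordered)
    # alphabetically first name gets the highest score
    return {vid: top - rank for rank, (vid, _name) in enumerate(ordered)}
```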
def perform_rearrangefield(colID, ln, fmeth, callback='yes', confirm=-1):
"""rearrange the fields alphabetically
colID - the collection
"""
subtitle = "Order fields alphabetically"
output = ""
col_fld = dict(map(lambda x: (x[0], x[1]), get_col_fld(colID, fmeth)))
fld_names = get_def_name('', "field")
if len(col_fld) > 0:
score = len(col_fld)
for (fldID, name) in fld_names:
if col_fld.has_key(fldID):
run_sql("UPDATE collection_field_fieldvalue SET score=%s WHERE id_collection=%s and id_field=%s", (score, colID, fldID))
score -= 1
output += write_outcome((1, ""))
else:
output += write_outcome((0, (0, "No fields to order")))
body = [output]
output = "<br />" + addadminbox(subtitle, body)
if fmeth == "soo":
return perform_showsortoptions(colID, ln, content=output)
elif fmeth == "sew":
return perform_showsearchfields(colID, ln, content=output)
elif fmeth == "seo":
return perform_showsearchoptions(colID, ln, content=output)
def perform_addexistingfieldvalue(colID, fldID, fldvID=-1, ln=CFG_SITE_LANG, callback='yes', confirm=-1):
"""form to add an existing fieldvalue to a field.
colID - the collection
fldID - the field to add the fieldvalue to
fldvID - the fieldvalue to add"""
subtitle = """<a name="7.4"></a>Add existing value to search option"""
output = ""
if fldvID not in [-1, "-1"] and confirm in [1, "1"]:
fldvID = int(fldvID)
ares = add_col_fld(colID, fldID, 'seo', fldvID)
colID = int(colID)
fldID = int(fldID)
lang = dict(get_languages())
res = get_def_name('', "field")
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(res)
col_fld = dict(map(lambda x: (x[0], x[1]), get_col_fld(colID, 'seo')))
fld_value = get_fld_value()
fldv_dict = dict(map(lambda x: (x[0], x[1]), fld_value))
text = """
<span class="adminlabel">Value</span>
<select name="fldvID" class="admin_w200">
<option value="-1">- Select value -</option>
"""
res = run_sql("SELECT id,name,value FROM fieldvalue ORDER BY name")
for (id, name, value) in res:
text += """<option value="%s" %s>%s - %s</option>
""" % (id, id == int(fldvID) and 'selected="selected"' or '', name, value)
text += """</select><br />"""
output += createhiddenform(action="addexistingfieldvalue#7.4",
text=text,
button="Add",
colID=colID,
fldID=fldID,
ln=ln,
confirm=1)
if fldvID not in [-1, "-1"] and confirm in [1, "1"]:
output += write_outcome(ares)
elif confirm in [1, "1"]:
output += """<b><span class="info">Select a value to add and try again.</span></b>"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_modifyfield(colID, fldID, ln, content=output)
def perform_addexistingfield(colID, ln, fldID=-1, fldvID=-1, fmeth='', callback='yes', confirm=-1):
"""form to add an existing field to a collection.
colID - the collection to add the field to
fldID - the field to add
fmeth - the field method type ('soo' for sort options, 'sew' for search fields, 'seo' for search options)"""
subtitle = """<a name="6.2"></a><a name="7.2"></a><a name="8.2"></a>Add existing field to collection"""
output = ""
if fldID not in [-1, "-1"] and confirm in [1, "1"]:
fldID = int(fldID)
ares = add_col_fld(colID, fldID, fmeth, fldvID)
colID = int(colID)
lang = dict(get_languages())
res = get_def_name('', "field")
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(res)
col_fld = dict(map(lambda x: (x[0], x[1]), get_col_fld(colID, fmeth)))
fld_value = get_fld_value()
fldv_dict = dict(map(lambda x: (x[0], x[1]), fld_value))
if fldvID:
fldvID = int(fldvID)
text = """
<span class="adminlabel">Field</span>
<select name="fldID" class="admin_w200">
<option value="-1">- Select field -</option>
"""
for (id, var) in res:
if fmeth == 'seo' or (fmeth != 'seo' and not col_fld.has_key(id)):
text += """<option value="%s" %s>%s</option>
""" % (id, '', fld_dict[id])
text += """</select><br />"""
output += createhiddenform(action="addexistingfield#6.2",
text=text,
button="Add",
colID=colID,
fmeth=fmeth,
ln=ln,
confirm=1)
if fldID not in [-1, "-1"] and confirm in [1, "1"]:
output += write_outcome(ares)
elif fldID in [-1, "-1"] and confirm not in [-1, "-1"]:
output += """<b><span class="info">Select a field.</span></b>
"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
if fmeth == "soo":
return perform_showsortoptions(colID, ln, content=output)
elif fmeth == "sew":
return perform_showsearchfields(colID, ln, content=output)
elif fmeth == "seo":
return perform_showsearchoptions(colID, ln, content=output)
def perform_showsortoptions(colID, ln, callback='yes', content='', confirm=-1):
"""show the sort fields of this collection.."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
fld_type = get_sort_nametypes()
subtitle = """<a name="8">8. Modify sort options for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.8">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """<dl>
<dt>Field actions (not related to this collection)</dt>
<dd>Go to the BibIndex interface to modify the available sort options</dd>
<dt>Collection specific actions</dt>
<dd><a href="addexistingfield?colID=%s&amp;ln=%s&amp;fmeth=soo#8.2">Add sort option to collection</a></dd>
<dd><a href="rearrangefield?colID=%s&amp;ln=%s&amp;fmeth=soo#8.2">Order sort options alphabetically</a></dd>
</dl>
""" % (colID, ln, colID, ln)
header = ['', 'Sort option', 'Actions']
actions = []
sitelangs = get_languages()
lang = dict(sitelangs)
fld_type_list = fld_type.items()
if len(get_col_fld(colID, 'soo')) > 0:
res = get_col_fld(colID, 'soo')
i = 0
for (fldID, fldvID, stype, score, score_fieldvalue) in res:
move = """<table cellspacing="1" cellpadding="0" border="0"><tr><td>"""
if i != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=soo&amp;rand=%s#8"><img border="0" src="%s/img/smallup.gif" title="Move up"></a>""" % (CFG_SITE_URL, colID, ln, fldID, res[i - 1][0], random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;&nbsp;"
move += "</td><td>"
i += 1
if i != len(res):
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=soo&amp;rand=%s#8"><img border="0" src="%s/img/smalldown.gif" title="Move down"></a>""" % (CFG_SITE_URL, colID, ln, fldID, res[i][0], random.randint(0, 1000), CFG_SITE_URL)
move += """</td></tr></table>"""
actions.append([move, fld_dict[int(fldID)]])
for col in [(('Remove sort option', 'removefield'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fmeth=soo#8.4">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, fldID, col[0][0]))
for (str, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fmeth=soo#8.5">%s</a>' % (CFG_SITE_URL, function, colID, ln, fldID, str)
output += tupletotable(header=header, tuple=actions)
else:
output += """No sort options exists for this collection"""
output += content
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_showsortoptions", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_showsearchfields(colID, ln, callback='yes', content='', confirm=-1):
"""show the search fields of this collection.."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
fld_type = get_sort_nametypes()
subtitle = """<a name="6">6. Modify search fields for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.6">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """<dl>
<dt>Field actions (not related to this collection)</dt>
<dd>Go to the BibIndex interface to modify the available search fields</dd>
<dt>Collection specific actions</dt>
<dd><a href="addexistingfield?colID=%s&amp;ln=%s&amp;fmeth=sew#6.2">Add search field to collection</a></dd>
<dd><a href="rearrangefield?colID=%s&amp;ln=%s&amp;fmeth=sew#6.2">Order search fields alphabetically</a></dd>
</dl>
""" % (colID, ln, colID, ln)
header = ['', 'Search field', 'Actions']
actions = []
sitelangs = get_languages()
lang = dict(sitelangs)
fld_type_list = fld_type.items()
if len(get_col_fld(colID, 'sew')) > 0:
res = get_col_fld(colID, 'sew')
i = 0
for (fldID, fldvID, stype, score, score_fieldvalue) in res:
move = """<table cellspacing="1" cellpadding="0" border="0"><tr><td>"""
if i != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=sew&amp;rand=%s#6"><img border="0" src="%s/img/smallup.gif" title="Move up"></a>""" % (CFG_SITE_URL, colID, ln, fldID, res[i - 1][0], random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
move += "</td><td>"
i += 1
if i != len(res):
move += '<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=sew&amp;rand=%s#6"><img border="0" src="%s/img/smalldown.gif" title="Move down"></a>' % (CFG_SITE_URL, colID, ln, fldID, res[i][0], random.randint(0, 1000), CFG_SITE_URL)
move += """</td></tr></table>"""
actions.append([move, fld_dict[int(fldID)]])
for col in [(('Remove search field', 'removefield'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fmeth=sew#6.4">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, fldID, col[0][0]))
for (str, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s#6.5">%s</a>' % (CFG_SITE_URL, function, colID, ln, fldID, str)
output += tupletotable(header=header, tuple=actions)
else:
output += """No search fields exists for this collection"""
output += content
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_showsearchfields", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_showsearchoptions(colID, ln, callback='yes', content='', confirm=-1):
"""show the sort and search options of this collection.."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
fld_type = get_sort_nametypes()
subtitle = """<a name="7">7. Modify search options for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.7">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """<dl>
<dt>Field actions (not related to this collection)</dt>
<dd>Go to the BibIndex interface to modify the available search options</dd>
<dt>Collection specific actions</dt>
<dd><a href="addexistingfield?colID=%s&amp;ln=%s&amp;fmeth=seo#7.2">Add search option to collection</a></dd>
<dd><a href="rearrangefield?colID=%s&amp;ln=%s&amp;fmeth=seo#7.2">Order search options alphabetically</a></dd>
</dl>
""" % (colID, ln, colID, ln)
header = ['', 'Search option', 'Actions']
actions = []
sitelangs = get_languages()
lang = dict(sitelangs)
fld_type_list = fld_type.items()
fld_distinct = run_sql("SELECT distinct(id_field) FROM collection_field_fieldvalue WHERE type='seo' AND id_collection=%s ORDER by score desc", (colID, ))
if len(fld_distinct) > 0:
i = 0
for (id) in fld_distinct:
fldID = id[0]
col_fld = get_col_fld(colID, 'seo', fldID)
move = ""
if i != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=seo&amp;rand=%s#7"><img border="0" src="%s/img/smallup.gif" title="Move up"></a>""" % (CFG_SITE_URL, colID, ln, fldID, fld_distinct[i - 1][0], random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
i += 1
if i != len(fld_distinct):
move += '<a href="%s/admin/websearch/websearchadmin.py/switchfldscore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_2=%s&amp;fmeth=seo&amp;rand=%s#7"><img border="0" src="%s/img/smalldown.gif" title="Move down"></a>' % (CFG_SITE_URL, colID, ln, fldID, fld_distinct[i][0], random.randint(0, 1000), CFG_SITE_URL)
actions.append([move, "%s" % fld_dict[fldID]])
for col in [(('Modify values', 'modifyfield'), ('Remove search option', 'removefield'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s#7.3">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, fldID, col[0][0]))
for (str, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fmeth=seo#7.3">%s</a>' % (CFG_SITE_URL, function, colID, ln, fldID, str)
output += tupletotable(header=header, tuple=actions)
else:
output += """No search options exists for this collection"""
output += content
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_showsearchoptions", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyfield(colID, fldID, fldvID='', ln=CFG_SITE_LANG, content='', callback='yes', confirm=0):
"""Modify the fieldvalues for a field"""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
fld_dict = dict(get_def_name('', "field"))
fld_type = get_sort_nametypes()
fldID = int(fldID)
subtitle = """<a name="7.3">Modify values for field '%s'</a>""" % (fld_dict[fldID])
output = """<dl>
<dt>Value specific actions</dt>
<dd><a href="addexistingfieldvalue?colID=%s&amp;ln=%s&amp;fldID=%s#7.4">Add existing value to search option</a></dd>
<dd><a href="addnewfieldvalue?colID=%s&amp;ln=%s&amp;fldID=%s#7.4">Add new value to search option</a></dd>
<dd><a href="rearrangefieldvalue?colID=%s&amp;ln=%s&amp;fldID=%s#7.4">Order values alphabetically</a></dd>
</dl>
""" % (colID, ln, fldID, colID, ln, fldID, colID, ln, fldID)
header = ['', 'Value name', 'Actions']
actions = []
sitelangs = get_languages()
lang = dict(sitelangs)
fld_type_list = fld_type.items()
col_fld = list(get_col_fld(colID, 'seo', fldID))
if len(col_fld) == 1 and col_fld[0][1] is None:
output += """<b><span class="info">No values added for this search option yet</span></b>"""
else:
j = 0
for (fldID, fldvID, stype, score, score_fieldvalue) in col_fld:
fieldvalue = get_fld_value(fldvID)
move = ""
if j != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldvaluescore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_fldvalue_1=%s&amp;id_fldvalue_2=%s&amp;rand=%s#7.3"><img border="0" src="%s/img/smallup.gif" title="Move up"></a>""" % (CFG_SITE_URL, colID, ln, fldID, fldvID, col_fld[j - 1][1], random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
j += 1
if j != len(col_fld):
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfldvaluescore?colID=%s&amp;ln=%s&amp;id_1=%s&amp;id_fldvalue_1=%s&amp;id_fldvalue_2=%s&amp;rand=%s#7.3"><img border="0" src="%s/img/smalldown.gif" title="Move down"></a>""" % (CFG_SITE_URL, colID, ln, fldID, fldvID, col_fld[j][1], random.randint(0, 1000), CFG_SITE_URL)
if fieldvalue[0][1] != fieldvalue[0][2] and fldvID is not None:
actions.append([move, "%s - %s" % (fieldvalue[0][1], fieldvalue[0][2])])
elif fldvID is not None:
actions.append([move, "%s" % fieldvalue[0][1]])
move = ''
for col in [(('Modify value', 'modifyfieldvalue'), ('Remove value', 'removefieldvalue'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fldvID=%s&amp;fmeth=seo#7.4">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, fldID, fldvID, col[0][0]))
for (str, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fldID=%s&amp;fldvID=%s#7.4">%s</a>' % (CFG_SITE_URL, function, colID, ln, fldID, fldvID, str)
output += tupletotable(header=header, tuple=actions)
output += content
body = [output]
output = "<br />" + addadminbox(subtitle, body)
if len(col_fld) == 0:
output = content
return perform_showsearchoptions(colID, ln, content=output)
def perform_showoutputformats(colID, ln, callback='yes', content='', confirm=-1):
"""shows the outputformats of the current collection
colID - the collection id."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
subtitle = """<a name="10">10. Modify output formats for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.10">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """
<dl>
<dt>Output format actions (not specific to the chosen collection)</dt>
<dd>Go to the BibFormat interface to modify</dd>
<dt>Collection specific actions</dt>
<dd><a href="addexistingoutputformat?colID=%s&amp;ln=%s#10.2">Add existing output format to collection</a></dd>
</dl>
""" % (colID, ln)
header = ['', 'Code', 'Output format', 'Actions']
actions = []
col_fmt = get_col_fmt(colID)
fmt_dict = dict(get_def_name('', "format"))
i = 0
if len(col_fmt) > 0:
for (id_format, colID_fld, code, score) in col_fmt:
move = """<table cellspacing="1" cellpadding="0" border="0"><tr><td>"""
if i != 0:
move += """<a href="%s/admin/websearch/websearchadmin.py/switchfmtscore?colID=%s&amp;ln=%s&amp;type=format&amp;id_1=%s&amp;id_2=%s&amp;rand=%s#10"><img border="0" src="%s/img/smallup.gif" title="Move format up"></a>""" % (CFG_SITE_URL, colID, ln, id_format, col_fmt[i - 1][0], random.randint(0, 1000), CFG_SITE_URL)
else:
move += "&nbsp;&nbsp;&nbsp;"
move += "</td><td>"
i += 1
if i != len(col_fmt):
move += '<a href="%s/admin/websearch/websearchadmin.py/switchfmtscore?colID=%s&amp;ln=%s&amp;type=format&amp;id_1=%s&amp;id_2=%s&amp;rand=%s#10"><img border="0" src="%s/img/smalldown.gif" title="Move format down"></a>' % (CFG_SITE_URL, colID, ln, id_format, col_fmt[i][0], random.randint(0, 1000), CFG_SITE_URL)
move += """</td></tr></table>"""
actions.append([move, code, fmt_dict[int(id_format)]])
for col in [(('Remove', 'removeoutputformat'),)]:
actions[-1].append('<a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fmtID=%s#10">%s</a>' % (CFG_SITE_URL, col[0][1], colID, ln, id_format, col[0][0]))
for (str, function) in col[1:]:
actions[-1][-1] += ' / <a href="%s/admin/websearch/websearchadmin.py/%s?colID=%s&amp;ln=%s&amp;fmtID=%s#10">%s</a>' % (CFG_SITE_URL, function, colID, ln, id_format, str)
output += tupletotable(header=header, tuple=actions)
else:
output += """No output formats exists for this collection"""
output += content
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_showoutputformats", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def external_collections_build_select(colID, external_collection):
output = '<select name="state" class="admin_w200">'
if external_collection.parser:
max_state = 4
else:
max_state = 2
num_selected = external_collection_get_state(external_collection, colID)
for num in range(max_state):
state_name = CFG_EXTERNAL_COLLECTION_STATES_NAME[num]
if num == num_selected:
selected = ' selected'
else:
selected = ''
output += '<option value="%(num)d"%(selected)s>%(state_name)s</option>' % {'num': num, 'selected': selected, 'state_name': state_name}
output += '</select>\n'
return output
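# Minimal sketch of the <select> construction performed above, decoupled from
# the external-collection objects (this helper is hypothetical, for
# illustration only):
def _build_state_select(state_names, selected_index):
    """Render an HTML <select> with one <option> per state name."""
    output = '<select name="state" class="admin_w200">'
    for num, name in enumerate(state_names):
        selected = ' selected' if num == selected_index else ''
        output += '<option value="%d"%s>%s</option>' % (num, selected, name)
    return output + '</select>'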
def perform_manage_external_collections(colID, ln, callback='yes', content='', confirm=-1):
"""Show the interface to configure external collections to the user."""
colID = int(colID)
subtitle = """<a name="11">11. Configuration of related external collections</a>
&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.11">?</a>]</small>""" % CFG_SITE_URL
output = '<form action="update_external_collections" method="POST"><input type="hidden" name="colID" value="%(colID)d">' % {'colID': colID}
table_header = ['External collection', 'Mode', 'Apply also to daughter collections?']
table_content = []
external_collections = external_collection_sort_engine_by_name(external_collections_dictionary.values())
for external_collection in external_collections:
collection_name = external_collection.name
select = external_collections_build_select(colID, external_collection)
recurse = '<input type=checkbox name="recurse" value="%(collection_name)s">' % {'collection_name': collection_name}
table_content.append([collection_name, select, recurse])
output += tupletotable(header=table_header, tuple=table_content)
output += '<input class="adminbutton" type="submit" value="Modify"/>'
output += '</form>'
return addadminbox(subtitle, [output])
def perform_update_external_collections(colID, ln, state_list, recurse_list):
colID = int(colID)
changes = []
output = ""
if not state_list:
return 'Warning: no state found.<br />' + perform_manage_external_collections(colID, ln)
external_collections = external_collection_sort_engine_by_name(external_collections_dictionary.values())
if len(external_collections) != len(state_list):
return 'Warning: the number of submitted states does not match the number of external collections!<br />' + perform_manage_external_collections(colID, ln)
for (external_collection, state) in zip(external_collections, state_list):
state = int(state)
collection_name = external_collection.name
recurse = recurse_list and collection_name in recurse_list
oldstate = external_collection_get_state(external_collection, colID)
if oldstate != state or recurse:
changes += external_collection_get_update_state_list(external_collection, colID, state, recurse)
external_collection_apply_changes(changes)
return output + '<br /><br />' + perform_manage_external_collections(colID, ln)
def perform_showdetailedrecordoptions(colID, ln, callback='yes', content='', confirm=-1):
"""Show the interface to configure detailed record page to the user."""
colID = int(colID)
subtitle = """<a name="12">12. Configuration of detailed record page</a>
&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.12">?</a>]</small>""" % CFG_SITE_URL
output = '''<form action="update_detailed_record_options" method="post">
<table><tr><td>
<input type="hidden" name="colID" value="%(colID)d">
<dl>
<dt><b>Show tabs:</b></dt>
<dd>
''' % {'colID': colID}
for (tab_id, tab_info) in get_detailed_page_tabs(colID).iteritems():
if tab_id == 'comments' and \
not CFG_WEBCOMMENT_ALLOW_REVIEWS and \
not CFG_WEBCOMMENT_ALLOW_COMMENTS:
continue
check = ''
output += '''<input type="checkbox" id="id%(tabid)s" name="tabs" value="%(tabid)s" %(check)s />
<label for="id%(tabid)s">&nbsp;%(label)s</label><br />
''' % {'tabid':tab_id,
'check':((tab_info['visible'] and 'checked="checked"') or ''),
'label':tab_info['label']}
output += '</dd></dl></td><td>'
output += '</td></tr></table><input class="adminbutton" type="submit" value="Modify"/>'
output += '''<input type="checkbox" id="recurse" name="recurse" value="1" />
<label for="recurse">&nbsp;Also apply to subcollections</label>'''
output += '</form>'
return addadminbox(subtitle, [output])
def perform_update_detailed_record_options(colID, ln, tabs, recurse):
"""Update the preferences for the tab to show/hide in the detailed record page."""
colID = int(colID)
changes = []
output = '<b><span class="info">Operation successfully completed.</span></b>'
if '' in tabs:
tabs.remove('')
tabs.append('metadata')
def update_settings(colID, tabs, recurse):
run_sql("DELETE FROM collectiondetailedrecordpagetabs WHERE id_collection=%s", (colID, ))
run_sql("REPLACE INTO collectiondetailedrecordpagetabs" + \
" SET id_collection=%s, tabs=%s", (colID, ';'.join(tabs)))
## for enabled_tab in tabs:
## run_sql("REPLACE INTO collectiondetailedrecordpagetabs" + \
## " SET id_collection='%s', tabs='%s'" % (colID, ';'.join(tabs)))
if recurse:
for descendant_id in get_collection_descendants(colID):
update_settings(descendant_id, tabs, recurse)
update_settings(colID, tabs, recurse)
## for colID in colIDs:
## run_sql("DELETE FROM collectiondetailedrecordpagetabs WHERE id_collection='%s'" % colID)
## for enabled_tab in tabs:
## run_sql("REPLACE INTO collectiondetailedrecordpagetabs" + \
## " SET id_collection='%s', tabs='%s'" % (colID, ';'.join(tabs)))
#if callback:
return perform_editcollection(colID, ln, "perform_modifytranslations",
'<br /><br />' + output + '<br /><br />' + \
perform_showdetailedrecordoptions(colID, ln))
#else:
# return addadminbox(subtitle, body)
#return output + '<br /><br />' + perform_showdetailedrecordoptions(colID, ln)
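# Sketch of the tab-list normalisation done in
# perform_update_detailed_record_options above (hypothetical helper): the
# empty entry submitted by the form is dropped and the always-visible
# 'metadata' tab is re-appended before the list is serialised with ';'.
def _normalise_tabs(tabs):
    tabs = list(tabs)
    if '' in tabs:
        tabs.remove('')
        tabs.append('metadata')
    return ';'.join(tabs)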
def perform_addexistingoutputformat(colID, ln, fmtID=-1, callback='yes', confirm=-1):
"""form to add an existing output format to a collection.
colID - the collection the format should be added to
fmtID - the format to add."""
subtitle = """<a name="10.2"></a>Add existing output format to collection"""
output = ""
if fmtID not in [-1, "-1"] and confirm in [1, "1"]:
ares = add_col_fmt(colID, fmtID)
colID = int(colID)
res = get_def_name('', "format")
fmt_dict = dict(res)
col_dict = dict(get_def_name('', "collection"))
col_fmt = get_col_fmt(colID)
col_fmt = dict(map(lambda x: (x[0], x[2]), col_fmt))
if len(res) > 0:
text = """
<span class="adminlabel">Output format</span>
<select name="fmtID" class="admin_w200">
<option value="-1">- Select output format -</option>
"""
for (id, name) in res:
if not col_fmt.has_key(id):
text += """<option value="%s" %s>%s</option>
""" % (id, id == int(fmtID) and 'selected="selected"' or '', name)
text += """</select><br />
"""
output += createhiddenform(action="addexistingoutputformat#10.2",
text=text,
button="Add",
colID=colID,
ln=ln,
confirm=1)
else:
output = """No existing output formats to add, please create a new one."""
if fmtID not in [-1, "-1"] and confirm in [1, "1"]:
output += write_outcome(ares)
elif fmtID in [-1, "-1"] and confirm not in [-1, "-1"]:
output += """<b><span class="info">Please select output format.</span></b>"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showoutputformats(colID, ln, content=output)
def perform_deleteoutputformat(colID, ln, fmtID=-1, callback='yes', confirm=-1):
"""form to delete an output format not in use.
colID - the collection id of the current collection.
fmtID - the format id to delete."""
subtitle = """<a name="10.3"></a>Delete an unused output format"""
output = """
<dl>
<dd>Deleting an output format will also delete its associated translations.</dd>
</dl>
"""
colID = int(colID)
if fmtID not in [-1, "-1"] and confirm in [1, "1"]:
fmt_dict = dict(get_def_name('', "format"))
old_colNAME = fmt_dict[int(fmtID)]
ares = delete_fmt(int(fmtID))
res = get_def_name('', "format")
fmt_dict = dict(res)
col_dict = dict(get_def_name('', "collection"))
col_fmt = get_col_fmt()
col_fmt = dict(map(lambda x: (x[0], x[2]), col_fmt))
if len(res) > 0:
text = """
<span class="adminlabel">Output format</span>
<select name="fmtID" class="admin_w200">
"""
text += """<option value="-1">- Select output format -"""
for (id, name) in res:
if not col_fmt.has_key(id):
text += """<option value="%s" %s>%s""" % (id, id == int(fmtID) and 'selected="selected"' or '', name)
text += "</option>"
text += """</select><br />"""
output += createhiddenform(action="deleteoutputformat#10.3",
text=text,
button="Delete",
colID=colID,
ln=ln,
confirm=0)
if fmtID not in [-1, "-1"]:
fmtID = int(fmtID)
if confirm in [0, "0"]:
text = """<b>Do you want to delete the output format '%s'.</b>
""" % fmt_dict[fmtID]
output += createhiddenform(action="deleteoutputformat#10.3",
text=text,
button="Confirm",
colID=colID,
fmtID=fmtID,
ln=ln,
confirm=1)
elif confirm in [1, "1"]:
output += write_outcome(ares)
elif confirm not in [-1, "-1"]:
output += """<b><span class="info">Choose a output format to delete.</span></b>
"""
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showoutputformats(colID, ln, content=output)
def perform_removeoutputformat(colID, ln, fmtID='', callback='yes', confirm=0):
"""form to remove an output format from a collection.
colID - the collection id of the current collection.
fmtID - the format id.
"""
subtitle = """<a name="10.5"></a>Remove output format"""
output = ""
col_dict = dict(get_def_name('', "collection"))
fmt_dict = dict(get_def_name('', "format"))
if colID and fmtID:
colID = int(colID)
fmtID = int(fmtID)
if confirm in ["0", 0]:
text = """Do you want to remove the output format '%s' from the collection '%s'.""" % (fmt_dict[fmtID], col_dict[colID])
output += createhiddenform(action="removeoutputformat#10.5",
text=text,
button="Confirm",
colID=colID,
fmtID=fmtID,
confirm=1)
elif confirm in ["1", 1]:
res = remove_fmt(colID, fmtID)
output += write_outcome(res)
body = [output]
output = "<br />" + addadminbox(subtitle, body)
return perform_showoutputformats(colID, ln, content=output)
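# Sketch of the two-step confirm flow that the remove/delete forms above
# share (hypothetical helper, names are illustrative): confirm=0 renders a
# confirmation form, confirm=1 performs the action, anything else is a no-op.
def _confirm_step(confirm):
    """Map the 'confirm' request value to the action these forms take."""
    if confirm in [0, "0"]:
        return 'ask'
    elif confirm in [1, "1"]:
        return 'execute'
    return 'noop'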
def perform_index(colID=1, ln=CFG_SITE_LANG, mtype='', content='', confirm=0):
"""The index method, calling methods to show the collection tree, create new collections and add collections to tree.
"""
subtitle = "Overview"
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
output = ""
fin_output = ""
if not col_dict.has_key(1):
res = add_col(CFG_SITE_NAME, '')
if res:
fin_output += """<b><span class="info">Created root collection.</span></b><br />"""
else:
return "Cannot create root collection, please check database."
if CFG_SITE_NAME != run_sql("SELECT name from collection WHERE id=1")[0][0]:
res = run_sql("update collection set name=%s where id=1", (CFG_SITE_NAME, ))
if res:
fin_output += """<b><span class="info">The name of the root collection has been modified to be the same as the %(sitename)s installation name given prior to installing %(sitename)s.</span><b><br />""" % {'sitename' : CFG_SITE_NAME}
else:
return "Error renaming root collection."
fin_output += """
<table>
<tr>
<td>0.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_showall">Show all</a></small></td>
<td>1.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_addcollection">Create new collection</a></small></td>
<td>2.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_addcollectiontotree">Attach collection to tree</a></small></td>
<td>3.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_modifycollectiontree">Modify collection tree</a></small></td>
<td>4.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_checkwebcollstatus">Webcoll Status</a></small></td>
</tr><tr>
<td>5.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_checkcollectionstatus">Collection Status</a></small></td>
<td>6.&nbsp;<small><a href="%s/admin/websearch/websearchadmin.py?colID=%s&amp;ln=%s&amp;mtype=perform_checkexternalcollections">Check external collections</a></small></td>
<td>7.&nbsp;<small><a href="%s/help/admin/websearch-admin-guide?ln=%s">Guide</a></small></td>
</tr>
</table>
""" % (CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, colID, ln, CFG_SITE_URL, ln)
if mtype == "":
fin_output += """<br /><br /><b><span class="info">To manage the collections, select an item from the menu.</span><b><br />"""
if mtype == "perform_addcollection" and content:
fin_output += content
elif mtype == "perform_addcollection" or mtype == "perform_showall":
fin_output += perform_addcollection(colID=colID, ln=ln, callback='')
fin_output += "<br />"
if mtype == "perform_addcollectiontotree" and content:
fin_output += content
elif mtype == "perform_addcollectiontotree" or mtype == "perform_showall":
fin_output += perform_addcollectiontotree(colID=colID, ln=ln, callback='')
fin_output += "<br />"
if mtype == "perform_modifycollectiontree" and content:
fin_output += content
elif mtype == "perform_modifycollectiontree" or mtype == "perform_showall":
fin_output += perform_modifycollectiontree(colID=colID, ln=ln, callback='')
fin_output += "<br />"
if mtype == "perform_checkwebcollstatus" and content:
fin_output += content
elif mtype == "perform_checkwebcollstatus" or mtype == "perform_showall":
fin_output += perform_checkwebcollstatus(colID, ln, callback='')
if mtype == "perform_checkcollectionstatus" and content:
fin_output += content
elif mtype == "perform_checkcollectionstatus" or mtype == "perform_showall":
fin_output += perform_checkcollectionstatus(colID, ln, callback='')
if mtype == "perform_checkexternalcollections" and content:
fin_output += content
elif mtype == "perform_checkexternalcollections" or mtype == "perform_showall":
fin_output += perform_checkexternalcollections(colID, ln, callback='')
body = [fin_output]
return addadminbox('<b>Menu</b>', body)
def show_coll_not_in_tree(colID, ln, col_dict):
"""Returns collections not in tree"""
tree = get_col_tree(colID)
in_tree = {}
output = "These collections are not in the tree, and should be added:<br />"
for (id, up, down, dad, reltype) in tree:
in_tree[id] = 1
in_tree[dad] = 1
res = run_sql("SELECT id from collection")
if len(res) != len(in_tree):
for id in res:
if not in_tree.has_key(id[0]):
output += """<a href="%s/admin/websearch/websearchadmin.py/editcollection?colID=%s&amp;ln=%s" title="Edit collection">%s</a> ,
""" % (CFG_SITE_URL, id[0], ln, col_dict[id[0]])
output += "<br /><br />"
else:
output = ""
return output
def create_colltree(tree, col_dict, colID, ln, move_from='', move_to='', rtype='', edit=''):
"""Creates the presentation of the collection tree, with the buttons for modifying it.
tree - the tree to present, from get_col_tree()
col_dict - the name of the collections in a dictionary
colID - the collection id to start with
move_from - if a collection to be moved has been chosen
move_to - the collection which should be set as father of move_from
rtype - the type of the tree, regular or virtual
edit - if the method should output the edit buttons."""
if move_from:
move_from_rtype = move_from[0]
move_from_id = int(move_from[1:len(move_from)])
tree_from = get_col_tree(colID, move_from_rtype)
tree_to = get_col_tree(colID, rtype)
tables = 0
tstack = []
i = 0
text = """
<table border ="0" cellspacing="0" cellpadding="0">"""
for i in range(0, len(tree)):
id_son = tree[i][0]
up = tree[i][1]
down = tree[i][2]
dad = tree[i][3]
reltype = tree[i][4]
tmove_from = ""
j = i
while j > 0:
j = j - 1
try:
if tstack[j][1] == dad:
table = tstack[j][2]
for k in range(0, tables - table):
tables = tables - 1
text += """</table></td></tr>
"""
break
except StandardError, e:
pass
text += """<tr><td>
"""
if i > 0 and tree[i][1] == 0:
tables = tables + 1
text += """</td><td></td><td></td><td></td><td><table border="0" cellspacing="0" cellpadding="0"><tr><td>
"""
if i == 0:
tstack.append((id_son, dad, 1))
else:
tstack.append((id_son, dad, tables))
if up == 1 and edit:
text += """<a href="%s/admin/websearch/websearchadmin.py/modifycollectiontree?colID=%s&amp;ln=%s&amp;move_up=%s&amp;rtype=%s#%s"><img border="0" src="%s/img/smallup.gif" title="Move collection up"></a>""" % (CFG_SITE_URL, colID, ln, i, rtype, tree[i][0], CFG_SITE_URL)
else:
text += """&nbsp;"""
text += "</td><td>"
if down == 1 and edit:
text += """<a href="%s/admin/websearch/websearchadmin.py/modifycollectiontree?colID=%s&amp;ln=%s&amp;move_down=%s&amp;rtype=%s#%s"><img border="0" src="%s/img/smalldown.gif" title="Move collection down"></a>""" % (CFG_SITE_URL, colID, ln, i, rtype, tree[i][0], CFG_SITE_URL)
else:
text += """&nbsp;"""
text += "</td><td>"
if edit:
if move_from and move_to:
tmove_from = move_from
move_from = ''
if not (move_from == "" and i == 0) and not (move_from != "" and int(move_from[1:len(move_from)]) == i and rtype == move_from[0]):
check = "true"
if move_from:
#if tree_from[move_from_id][0] == tree_to[i][0] or not check_col(tree_to[i][0], tree_from[move_from_id][0]):
# check = ''
#elif not check_col(tree_to[i][0], tree_from[move_from_id][0]):
# check = ''
#if not check and (tree_to[i][0] == 1 and tree_from[move_from_id][3] == tree_to[i][0] and move_from_rtype != rtype):
# check = "true"
if check:
text += """<a href="%s/admin/websearch/websearchadmin.py/modifycollectiontree?colID=%s&amp;ln=%s&amp;move_from=%s&amp;move_to=%s%s&amp;rtype=%s#tree"><img border="0" src="%s/img/move_to.gif" title="Move '%s' to '%s'"></a>
""" % (CFG_SITE_URL, colID, ln, move_from, rtype, i, rtype, CFG_SITE_URL, col_dict[tree_from[int(move_from[1:len(move_from)])][0]], col_dict[tree_to[i][0]])
else:
try:
text += """<a href="%s/admin/websearch/websearchadmin.py/modifycollectiontree?colID=%s&amp;ln=%s&amp;move_from=%s%s&amp;rtype=%s#%s"><img border="0" src="%s/img/move_from.gif" title="Move '%s' from this location."></a>""" % (CFG_SITE_URL, colID, ln, rtype, i, rtype, tree[i][0], CFG_SITE_URL, col_dict[tree[i][0]])
except KeyError:
pass
else:
text += """<img border="0" src="%s/img/white_field.gif">
""" % CFG_SITE_URL
else:
text += """<img border="0" src="%s/img/white_field.gif">
""" % CFG_SITE_URL
text += """
</td>
<td>"""
if edit:
try:
text += """<a href="%s/admin/websearch/websearchadmin.py/modifycollectiontree?colID=%s&amp;ln=%s&amp;delete=%s&amp;rtype=%s#%s"><img border="0" src="%s/img/iconcross.gif" title="Remove colletion from tree"></a>""" % (CFG_SITE_URL, colID, ln, i, rtype, tree[i][0], CFG_SITE_URL)
except KeyError:
pass
elif i != 0:
text += """<img border="0" src="%s/img/white_field.gif">
""" % CFG_SITE_URL
text += """</td><td>
"""
if tmove_from:
move_from = tmove_from
try:
text += """<a name="%s"></a>%s<a href="%s/admin/websearch/websearchadmin.py/editcollection?colID=%s&amp;ln=%s" title="Edit collection">%s</a>%s%s%s""" % (tree[i][0], (reltype=="v" and '<i>' or ''), CFG_SITE_URL, tree[i][0], ln, col_dict[id_son], (move_to=="%s%s" %(rtype, i) and '&nbsp;<img border="0" src="%s/img/move_to.gif">' % CFG_SITE_URL or ''), (move_from=="%s%s" % (rtype, i) and '&nbsp;<img border="0" src="%s/img/move_from.gif">' % CFG_SITE_URL or ''), (reltype=="v" and '</i>' or ''))
except KeyError:
pass
text += """</td></tr>
"""
while tables > 0:
text += """</table></td></tr>
"""
tables = tables - 1
text += """</table>"""
return text
def perform_deletecollection(colID, ln, confirm=-1, callback='yes'):
"""form to delete a collection
colID - id of collection
"""
subtitle =''
output = """
<span class="warning">
<strong>
<dl>
<dt>WARNING:</dt>
<dd>When deleting a collection, you also delete all data related to the collection, such as translations, relations to other collections, and information about which rank methods to use.
<br />For more information, please go to the <a title="See guide" href="%s/help/admin/websearch-admin-guide">WebSearch guide</a> and read the section regarding deleting a collection.</dd>
</dl>
</strong>
</span>
""" % CFG_SITE_URL
col_dict = dict(get_def_name('', "collection"))
if colID != 1 and colID and col_dict.has_key(int(colID)):
colID = int(colID)
subtitle = """<a name="4">4. Delete collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.4">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
res = run_sql("SELECT id_dad,id_son,type,score from collection_collection WHERE id_dad=%s", (colID, ))
res2 = run_sql("SELECT id_dad,id_son,type,score from collection_collection WHERE id_son=%s", (colID, ))
if not res and not res2:
if confirm in ["-1", -1]:
text = """Do you want to delete this collection."""
output += createhiddenform(action="deletecollection#4",
text=text,
colID=colID,
button="Delete",
confirm=0)
elif confirm in ["0", 0]:
text = """Are you sure you want to delete this collection."""
output += createhiddenform(action="deletecollection#4",
text=text,
colID=colID,
button="Confirm",
confirm=1)
elif confirm in ["1", 1]:
result = delete_col(colID)
if not result:
raise Exception
else:
output = """<b><span class="info">Can not delete a collection that is a part of the collection tree, remove collection from the tree and try again.</span></b>"""
else:
subtitle = """4. Delete collection"""
output = """<b><span class="info">Not possible to delete the root collection</span></b>"""
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_deletecollection", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_editcollection(colID=1, ln=CFG_SITE_LANG, mtype='', content=''):
"""interface to modify a collection. this method is calling other methods which again is calling this and sending back the output of the method.
if callback, the method will call perform_editcollection, if not, it will just return its output.
colID - id of the collection
mtype - the method that called this method.
content - the output from that method."""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
if not col_dict.has_key(colID):
return """<b><span class="info">Collection deleted.</span></b>
"""
fin_output = """
<table>
<tr>
<td><b>Menu</b></td>
</tr>
<tr>
<td>0.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s">Show all</a></small></td>
<td>1.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_modifydbquery">Modify collection query</a></small></td>
<td>2.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_modifyrestricted">Modify access restrictions</a></small></td>
<td>3.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_modifytranslations">Modify translations</a></small></td>
<td>4.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_deletecollection">Delete collection</a></small></td>
</tr><tr>
<td>5.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showportalboxes">Modify portalboxes</a></small></td>
<td>6.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showsearchfields#6">Modify search fields</a></small></td>
<td>7.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showsearchoptions#7">Modify search options</a></small></td>
<td>8.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showsortoptions#8">Modify sort options</a></small></td>
<td>9.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_modifyrankmethods#9">Modify rank options</a></small></td>
</tr><tr>
<td>10.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showoutputformats#10">Modify output formats</a></small></td>
<td>11.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_manage_external_collections#11">Configuration of related external collections</a></small></td>
<td>12.&nbsp;<small><a href="editcollection?colID=%s&amp;ln=%s&amp;mtype=perform_showdetailedrecordoptions#12">Detailed record page options</a></small></td>
</tr>
</table>
""" % (colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln, colID, ln)
if mtype == "perform_modifydbquery" and content:
fin_output += content
elif mtype == "perform_modifydbquery" or not mtype:
fin_output += perform_modifydbquery(colID, ln, callback='')
if mtype == "perform_modifyrestricted" and content:
fin_output += content
elif mtype == "perform_modifyrestricted" or not mtype:
fin_output += perform_modifyrestricted(colID, ln, callback='')
if mtype == "perform_modifytranslations" and content:
fin_output += content
elif mtype == "perform_modifytranslations" or not mtype:
fin_output += perform_modifytranslations(colID, ln, callback='')
if mtype == "perform_deletecollection" and content:
fin_output += content
elif mtype == "perform_deletecollection" or not mtype:
fin_output += perform_deletecollection(colID, ln, callback='')
if mtype == "perform_showportalboxes" and content:
fin_output += content
elif mtype == "perform_showportalboxes" or not mtype:
fin_output += perform_showportalboxes(colID, ln, callback='')
if mtype == "perform_showsearchfields" and content:
fin_output += content
elif mtype == "perform_showsearchfields" or not mtype:
fin_output += perform_showsearchfields(colID, ln, callback='')
if mtype == "perform_showsearchoptions" and content:
fin_output += content
elif mtype == "perform_showsearchoptions" or not mtype:
fin_output += perform_showsearchoptions(colID, ln, callback='')
if mtype == "perform_showsortoptions" and content:
fin_output += content
elif mtype == "perform_showsortoptions" or not mtype:
fin_output += perform_showsortoptions(colID, ln, callback='')
if mtype == "perform_modifyrankmethods" and content:
fin_output += content
elif mtype == "perform_modifyrankmethods" or not mtype:
fin_output += perform_modifyrankmethods(colID, ln, callback='')
if mtype == "perform_showoutputformats" and content:
fin_output += content
elif mtype == "perform_showoutputformats" or not mtype:
fin_output += perform_showoutputformats(colID, ln, callback='')
if mtype == "perform_manage_external_collections" and content:
fin_output += content
elif mtype == "perform_manage_external_collections" or not mtype:
fin_output += perform_manage_external_collections(colID, ln, callback='')
if mtype == "perform_showdetailedrecordoptions" and content:
fin_output += content
elif mtype == "perform_showdetailedrecordoptions" or not mtype:
fin_output += perform_showdetailedrecordoptions(colID, ln, callback='')
return addadminbox("Overview of edit options for collection '%s'" % col_dict[colID], [fin_output])
def perform_checkwebcollstatus(colID, ln, confirm=0, callback='yes'):
"""Check status of the collection tables with respect to the webcoll cache."""
subtitle = """<a name="11"></a>Webcoll Status&nbsp;&nbsp;&nbsp;[<a href="%s/help/admin/websearch-admin-guide#5">?</a>]""" % CFG_SITE_URL
output = ""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
output += """<br /><b>Last updates:</b><br />"""
collection_table_update_time = ""
collection_web_update_time = ""
collection_table_update_time = get_table_update_time('collection')
output += "Collection table last updated: %s<br />" % collection_table_update_time
try:
file = open("%s/collections/last_updated" % CFG_CACHEDIR)
collection_web_update_time = file.readline().strip()
output += "Collection cache last updated: %s<br />" % collection_web_update_time
file.close()
except IOError:
pass
# reformat collection_web_update_time to the format suitable for comparisons
try:
collection_web_update_time = strftime("%Y-%m-%d %H:%M:%S",
time.strptime(collection_web_update_time, "%d %b %Y %H:%M:%S"))
except ValueError, e:
pass
if collection_table_update_time > collection_web_update_time:
output += """<br /><b><span class="info">Warning: The collections have been modified since last time Webcoll was executed, to process the changes, Webcoll must be executed.</span></b><br />"""
header = ['ID', 'Name', 'Time', 'Status', 'Progress']
actions = []
output += """<br /><b>Last BibSched tasks:</b><br />"""
res = run_sql("select id, proc, host, user, runtime, sleeptime, arguments, status, progress from schTASK where proc='webcoll' and runtime< now() ORDER by runtime")
if len(res) > 0:
(id, proc, host, user, runtime, sleeptime, arguments, status, progress) = res[len(res) - 1]
webcoll_update_time = runtime
actions.append([id, proc, runtime, (status !="" and status or ''), (progress !="" and progress or '')])
else:
actions.append(['', 'webcoll', '', '', 'Not executed yet'])
res = run_sql("select id, proc, host, user, runtime, sleeptime, arguments, status, progress from schTASK where proc='bibindex' and runtime< now() ORDER by runtime")
if len(res) > 0:
(id, proc, host, user, runtime, sleeptime, arguments, status, progress) = res[len(res) - 1]
actions.append([id, proc, runtime, (status !="" and status or ''), (progress !="" and progress or '')])
else:
actions.append(['', 'bibindex', '', '', 'Not executed yet'])
output += tupletotable(header=header, tuple=actions)
output += """<br /><b>Next scheduled BibSched run:</b><br />"""
actions = []
res = run_sql("select id, proc, host, user, runtime, sleeptime, arguments, status, progress from schTASK where proc='webcoll' and runtime > now() ORDER by runtime")
webcoll_future = ""
if len(res) > 0:
(id, proc, host, user, runtime, sleeptime, arguments, status, progress) = res[0]
webcoll_update_time = runtime
actions.append([id, proc, runtime, (status !="" and status or ''), (progress !="" and progress or '')])
webcoll_future = "yes"
else:
actions.append(['', 'webcoll', '', '', 'Not scheduled'])
res = run_sql("select id, proc, host, user, runtime, sleeptime, arguments, status, progress from schTASK where proc='bibindex' and runtime > now() ORDER by runtime")
bibindex_future = ""
if len(res) > 0:
(id, proc, host, user, runtime, sleeptime, arguments, status, progress) = res[0]
actions.append([id, proc, runtime, (status !="" and status or ''), (progress !="" and progress or '')])
bibindex_future = "yes"
else:
actions.append(['', 'bibindex', '', '', 'Not scheduled'])
output += tupletotable(header=header, tuple=actions)
if webcoll_future == "":
output += """<br /><b><span class="info">Warning: Webcoll is not scheduled for a future run by bibsched, any updates to the collection will not be processed.</span></b><br />"""
if bibindex_future == "":
output += """<br /><b><span class="info">Warning: Bibindex is not scheduled for a future run by bibsched, any updates to the records will not be processed.</span></b><br />"""
body = [output]
if callback:
return perform_index(colID, ln, "perform_checkwebcollstatus", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_modifyrestricted(colID, ln, rest='', callback='yes', confirm=-1):
"""modify which apache group is allowed to access the collection.
rest - the groupname"""
subtitle = ''
output = ""
col_dict = dict(get_def_name('', "collection"))
action_id = acc_get_action_id(VIEWRESTRCOLL)
if colID and col_dict.has_key(int(colID)):
colID = int(colID)
subtitle = """<a name="2">2. Modify access restrictions for collection '%s'</a>&nbsp;&nbsp;&nbsp;<small>[<a title="See guide" href="%s/help/admin/websearch-admin-guide#3.2">?</a>]</small>""" % (col_dict[colID], CFG_SITE_URL)
output = """<p>Please note that Invenio versions greater than <em>0.92.1</em> manage collection restriction via the standard
<strong><a href="/admin/webaccess/webaccessadmin.py/showactiondetails?id_action=%i">WebAccess Admin Interface</a></strong> (action '%s').</p>
""" % (action_id, VIEWRESTRCOLL)
body = [output]
if callback:
return perform_editcollection(colID, ln, "perform_modifyrestricted", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_checkcollectionstatus(colID, ln, confirm=0, callback='yes'):
"""Check the configuration of the collections."""
from invenio.legacy.search_engine import collection_restricted_p, restricted_collection_cache
subtitle = """<a name="11"></a>Collection Status&nbsp;&nbsp;&nbsp;[<a href="%s/help/admin/websearch-admin-guide#6">?</a>]""" % CFG_SITE_URL
output = ""
colID = int(colID)
col_dict = dict(get_def_name('', "collection"))
collections = run_sql("SELECT id, name, dbquery, nbrecs FROM collection "
"ORDER BY id")
header = ['ID', 'Name','Query', 'Subcollections', 'Restricted', 'Hosted',
'I18N', 'Status', 'Number of records']
rnk_list = get_def_name('', "rnkMETHOD")
actions = []
restricted_collection_cache.recreate_cache_if_needed()
for (id, name, dbquery, nbrecs) in collections:
reg_sons = col_has_son(id, 'r')
vir_sons = col_has_son(id, 'v')
status = ""
hosted = ""
if str(dbquery).startswith("hostedcollection:"): hosted = """<b><span class="info">Yes</span></b>"""
else: hosted = """<b><span class="info">No</span></b>"""
langs = run_sql("SELECT ln from collectionname where id_collection=%s", (id, ))
i8n = ""
if len(langs) > 0:
i8n = ", ".join(str(lang[0]) for lang in langs)
else:
i8n = """<b><span class="info">None</span></b>"""
if reg_sons and dbquery:
status = """<b><span class="warning">1:Conflict</span></b>"""
elif not dbquery and not reg_sons:
status = """<b><span class="warning">2:Empty</span></b>"""
if (reg_sons or vir_sons):
subs = """<b><span class="info">Yes</span></b>"""
else:
subs = """<b><span class="info">No</span></b>"""
if dbquery is None:
dbquery = """<b><span class="info">No</span></b>"""
restricted = collection_restricted_p(name, recreate_cache_if_needed=False)
if restricted:
restricted = """<b><span class="warning">Yes</span></b>"""
if status:
status += """<b><span class="warning">,3:Restricted</span></b>"""
else:
status += """<b><span class="warning">3:Restricted</span></b>"""
else:
restricted = """<b><span class="info">No</span></b>"""
if status == "":
status = """<b><span class="info">OK</span></b>"""
actions.append([id, """<a href="%s/admin/websearch/websearchadmin.py/editcollection?colID=%s&amp;ln=%s">%s</a>""" % (CFG_SITE_URL, id, ln, name), dbquery, subs, restricted, hosted, i8n, status, nbrecs])
output += tupletotable(header=header, tuple=actions)
body = [output]
if callback:
return perform_index(colID, ln, "perform_checkcollectionstatus", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
def perform_checkexternalcollections(colID, ln, icl=None, update="", confirm=0, callback='yes'):
"""Check the external collections for inconsistencies."""
subtitle = """<a name="7"></a>Check external collections&nbsp;&nbsp;&nbsp;[<a href="%s/help/admin/websearch-admin-guide#7">?</a>]""" % CFG_SITE_URL
output = ""
colID = int(colID)
if icl:
if update == "add":
# icl : the "inconsistent list" comes as a string, it has to be converted back into a list
import ast
icl = ast.literal_eval(icl)  # safely parse the repr()'d list; avoids eval() on request-supplied input
#icl = icl[1:-1].split(',')
for collection in icl:
#collection = str(collection[1:-1])
query_select = "SELECT name FROM externalcollection WHERE name like '%(name)s';" % {'name': collection}
results_select = run_sql(query_select)
if not results_select:
query_insert = "INSERT INTO externalcollection (name) VALUES ('%(name)s');" % {'name': collection}
run_sql(query_insert)
output += """<br /><span class=info>New collection \"%s\" has been added to the database table \"externalcollection\".</span><br />""" % (collection)
else:
output += """<br /><span class=info>Collection \"%s\" has already been added to the database table \"externalcollection\" or was already there.</span><br />""" % (collection)
elif update == "del":
# icl : the "inconsistent list" comes as a string, it has to be converted back into a list
import ast
icl = ast.literal_eval(icl)  # safely parse the repr()'d list; avoids eval() on request-supplied input
#icl = icl[1:-1].split(',')
for collection in icl:
#collection = str(collection[1:-1])
query_select = "SELECT id FROM externalcollection WHERE name like '%(name)s';" % {'name': collection}
results_select = run_sql(query_select)
if results_select:
query_delete = "DELETE FROM externalcollection WHERE id like '%(id)s';" % {'id': results_select[0][0]}
query_delete_states = "DELETE FROM collection_externalcollection WHERE id_externalcollection like '%(id)s';" % {'id': results_select[0][0]}
run_sql(query_delete)
run_sql(query_delete_states)
output += """<br /><span class=info>Collection \"%s\" has been deleted from the database table \"externalcollection\".</span><br />""" % (collection)
else:
output += """<br /><span class=info>Collection \"%s\" has already been delete from the database table \"externalcollection\" or was never there.</span><br />""" % (collection)
external_collections_file = []
external_collections_db = []
for coll in external_collections_dictionary.values():
external_collections_file.append(coll.name)
external_collections_file.sort()
query = """SELECT name from externalcollection"""
results = run_sql(query)
for result in results:
external_collections_db.append(result[0])
external_collections_db.sort()
number_file = len(external_collections_file)
number_db = len(external_collections_db)
if external_collections_file == external_collections_db:
output += """<br /><span class="info">External collections are consistent.</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections""" % {
"number_db" : number_db,
"number_file" : number_file}
elif len(external_collections_file) > len(external_collections_db):
external_collections_diff = list(set(external_collections_file) - set(external_collections_db))
external_collections_db.extend(external_collections_diff)
external_collections_db.sort()
if external_collections_file == external_collections_db:
output += """<br /><span class="warning">There is an inconsistency:</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections
&nbsp;(<span class="warning">missing: %(diff)s</span>)<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections
<br /><br /><a href="%(site_url)s/admin/websearch/websearchadmin.py/checkexternalcollections?colID=%(colID)s&amp;icl=%(diff)s&amp;update=add&amp;ln=%(ln)s">
Click here</a> to update your database adding the missing collections. If the problem persists please check your configuration manually.""" % {
"number_db" : number_db,
"number_file" : number_file,
"diff" : external_collections_diff,
"site_url" : CFG_SITE_URL,
"colID" : colID,
"ln" : ln}
else:
output += """<br /><span class="warning">There is an inconsistency:</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections
<br /><br /><span class="warning">The external collections do not match.</span>
<br />To fix the problem please check your configuration manually.""" % {
"number_db" : number_db,
"number_file" : number_file}
elif len(external_collections_file) < len(external_collections_db):
external_collections_diff = list(set(external_collections_db) - set(external_collections_file))
external_collections_file.extend(external_collections_diff)
external_collections_file.sort()
if external_collections_file == external_collections_db:
output += """<br /><span class="warning">There is an inconsistency:</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections
&nbsp;(<span class="warning">extra: %(diff)s</span>)<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections
<br /><br /><a href="%(site_url)s/admin/websearch/websearchadmin.py/checkexternalcollections?colID=%(colID)s&amp;icl=%(diff)s&amp;update=del&amp;ln=%(ln)s">
Click here</a> to force remove the extra collections from your database (warning: use with caution!). If the problem persists please check your configuration manually.""" % {
"number_db" : number_db,
"number_file" : number_file,
"diff" : external_collections_diff,
"site_url" : CFG_SITE_URL,
"colID" : colID,
"ln" : ln}
else:
output += """<br /><span class="warning">There is an inconsistency:</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections
<br /><br /><span class="warning">The external collections do not match.</span>
<br />To fix the problem please check your configuration manually.""" % {
"number_db" : number_db,
"number_file" : number_file}
else:
output += """<br /><span class="warning">There is an inconsistency:</span><br /><br />
&nbsp;&nbsp;&nbsp;- database table \"externalcollection\" has %(number_db)s collections<br />
&nbsp;&nbsp;&nbsp;- configuration file \"websearch_external_collections_config.py\" has %(number_file)s collections
<br /><br /><span class="warning">The number of external collections is the same but the collections do not match.</span>
<br />To fix the problem please check your configuration manually.""" % {
"number_db" : number_db,
"number_file" : number_file}
body = [output]
if callback:
return perform_index(colID, ln, "perform_checkexternalcollections", addadminbox(subtitle, body))
else:
return addadminbox(subtitle, body)
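The consistency check above boils down to comparing two sorted name lists; the "missing" and "extra" entries that drive the add/del links are plain set differences. A minimal standalone sketch (the collection names used in the test are made up):

```python
def diff_collections(file_names, db_names):
    """Return (missing_in_db, extra_in_db) for two collection name lists,
    mirroring the update=add / update=del cases of
    perform_checkexternalcollections()."""
    # Names present in the configuration file but absent from the database
    missing_in_db = sorted(set(file_names) - set(db_names))
    # Names present in the database but absent from the configuration file
    extra_in_db = sorted(set(db_names) - set(file_names))
    return missing_in_db, extra_in_db
```

When both results are empty the report is "consistent"; a non-empty first element corresponds to the "update=add" link, a non-empty second to "update=del".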
def col_has_son(colID, rtype='r'):
"""Return True if the collection has at least one son."""
return run_sql("SELECT id_son FROM collection_collection WHERE id_dad=%s and type=%s LIMIT 1", (colID, rtype)) != ()
def get_col_tree(colID, rtype=''):
"""Returns a presentation of the tree as a list. TODO: Add loop detection
colID - startpoint for the tree
rtype - get regular or virtual part of the tree"""
try:
colID = int(colID)
stack = [colID]
ssize = 0
tree = [(colID, 0, 0, colID, 'r')]
while len(stack) > 0:
ccolID = stack.pop()
if ccolID == colID and rtype:
res = run_sql("SELECT id_son, score, type FROM collection_collection WHERE id_dad=%s AND type=%s ORDER BY score ASC,id_son", (ccolID, rtype))
else:
res = run_sql("SELECT id_son, score, type FROM collection_collection WHERE id_dad=%s ORDER BY score ASC,id_son", (ccolID, ))
ssize += 1
ntree = []
for i in range(0, len(res)):
id_son = res[i][0]
score = res[i][1]
rtype = res[i][2]
stack.append(id_son)
if i == (len(res) - 1):
up = 0
else:
up = 1
if i == 0:
down = 0
else:
down = 1
ntree.insert(0, (id_son, up, down, ccolID, rtype))
tree = tree[0:ssize] + ntree + tree[ssize:len(tree)]
return tree
except StandardError, e:
register_exception()
return ()
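The flattening done by get_col_tree() can be hard to follow through the SQL and stack bookkeeping. A simplified in-memory sketch of the same idea (an explicit `children` map stands in for the collection_collection table, and the `rtype` column is omitted; the ids used in the test are hypothetical):

```python
def flatten_tree(children, root):
    """Sketch of get_col_tree()'s flattening: return (id, up, down, dad)
    tuples, keeping its sibling convention (the last sibling gets up=0,
    the first sibling gets down=0)."""
    out = [(root, 0, 0, root)]  # the root row, as in get_col_tree()
    stack = [root]
    while stack:
        dad = stack.pop()
        sons = children.get(dad, [])
        for i, son in enumerate(sons):
            up = 0 if i == len(sons) - 1 else 1
            down = 0 if i == 0 else 1
            out.append((son, up, down, dad))
            stack.append(son)
    return out
```

Unlike the real function, this appends each son block at the end instead of splicing it in right after its dad, but the up/down flags and the (id, dad) bookkeeping follow the same rules.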
def add_col_dad_son(add_dad, add_son, rtype):
"""Add a son to a collection (dad)
add_dad - add to this collection id
add_son - add this collection id
rtype - either regular or virtual"""
try:
res = run_sql("SELECT score FROM collection_collection WHERE id_dad=%s ORDER BY score ASC", (add_dad, ))
highscore = 0
for score in res:
if int(score[0]) > highscore:
highscore = int(score[0])
highscore += 1
res = run_sql("INSERT INTO collection_collection(id_dad,id_son,score,type) values(%s,%s,%s,%s)", (add_dad, add_son, highscore, rtype))
return (1, highscore)
except StandardError, e:
register_exception()
return (0, e)
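The score bookkeeping above (a new son is appended after its siblings by giving it the highest existing score plus one) can be sketched without the database; the input scores in the test are illustrative:

```python
def next_score(existing_scores):
    """Mirror add_col_dad_son()'s highscore computation: one more than the
    highest existing sibling score, or 1 for the first son."""
    highscore = 0
    for score in existing_scores:
        # run_sql() returns scores as strings/longs, hence the int() cast
        if int(score) > highscore:
            highscore = int(score)
    return highscore + 1
```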
def compare_on_val(first, second):
"""Compare the two values"""
return cmp(first[1], second[1])
def get_col_fld(colID=-1, type = '', id_field=''):
"""Returns either all portalboxes associated with a collection, or based on either colID or language or both.
colID - collection id
ln - language id"""
sql = "SELECT id_field,id_fieldvalue,type,score,score_fieldvalue FROM collection_field_fieldvalue, field WHERE id_field=field.id"
params = []
if colID > -1:
sql += " AND id_collection=%s"
params.append(colID)
if id_field:
sql += " AND id_field=%s"
params.append(id_field)
if type:
sql += " AND type=%s"
params.append(type)
sql += " ORDER BY type, score desc, score_fieldvalue desc"
res = run_sql(sql, tuple(params))
return res
def get_col_pbx(colID=-1, ln='', position = ''):
"""Returns either all portalboxes associated with a collection, or based on either colID or language or both.
colID - collection id
ln - language id"""
sql = "SELECT id_portalbox, id_collection, ln, score, position, title, body FROM collection_portalbox, portalbox WHERE id_portalbox = portalbox.id"
params = []
if colID > -1:
sql += " AND id_collection=%s"
params.append(colID)
if ln:
sql += " AND ln=%s"
params.append(ln)
if position:
sql += " AND position=%s"
params.append(position)
sql += " ORDER BY position, ln, score desc"
res = run_sql(sql, tuple(params))
return res
def get_col_fmt(colID=-1):
"""Returns all formats currently associated with a collection, or for one specific collection
colID - the id of the collection"""
if colID not in [-1, "-1"]:
res = run_sql("SELECT id_format, id_collection, code, score FROM collection_format, format WHERE id_format = format.id AND id_collection=%s ORDER BY score desc", (colID, ))
else:
res = run_sql("SELECT id_format, id_collection, code, score FROM collection_format, format WHERE id_format = format.id ORDER BY score desc")
return res
def get_col_rnk(colID, ln):
""" Returns a list of the rank methods the given collection is attached to
colID - id from collection"""
try:
res1 = dict(run_sql("SELECT id_rnkMETHOD, '' FROM collection_rnkMETHOD WHERE id_collection=%s", (colID, )))
res2 = get_def_name('', "rnkMETHOD")
result = filter(lambda x: res1.has_key(x[0]), res2)
return result
except StandardError, e:
return ()
def get_pbx():
"""Returns all portalboxes"""
res = run_sql("SELECT id, title, body FROM portalbox ORDER by title,body")
return res
def get_fld_value(fldvID = ''):
"""Returns fieldvalue"""
sql = "SELECT id, name, value FROM fieldvalue"
params = []
if fldvID:
sql += " WHERE id=%s"
params.append(fldvID)
sql += " ORDER BY name"
res = run_sql(sql, tuple(params))
return res
def get_pbx_pos():
"""Returns a list of all the positions for a portalbox"""
position = {}
position["rt"] = "Right Top"
position["lt"] = "Left Top"
position["te"] = "Title Epilog"
position["tp"] = "Title Prolog"
position["ne"] = "Narrow by coll epilog"
position["np"] = "Narrow by coll prolog"
return position
def get_sort_nametypes():
"""Return a list of the various translationnames for the fields"""
type = {}
type['soo'] = 'Sort options'
type['seo'] = 'Search options'
type['sew'] = 'Search within'
return type
def get_fmt_nametypes():
"""Return a list of the various translationnames for the output formats"""
type = []
type.append(('ln', 'Long name'))
return type
def get_fld_nametypes():
"""Return a list of the various translationnames for the fields"""
type = []
type.append(('ln', 'Long name'))
return type
def get_col_nametypes():
"""Return a list of the various translationnames for the collections"""
type = []
type.append(('ln', 'Long name'))
return type
def find_last(tree, start_son):
"""Find the previous collection in the tree with the same father as start_son"""
id_dad = tree[start_son][3]
while start_son > 0:
start_son -= 1
if tree[start_son][3] == id_dad:
return start_son
def find_next(tree, start_son):
"""Find the next collection in the tree with the same father as start_son"""
id_dad = tree[start_son][3]
while start_son < len(tree) - 1:
start_son += 1
if tree[start_son][3] == id_dad:
return start_son
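Both helpers above scan the flattened tree for the nearest row sharing the same dad (column 3 of each tuple). The forward scan can be sketched with explicit bounds, which also makes the "no further sibling" case return None instead of running off the end (the tree rows in the test are hypothetical):

```python
def next_sibling(tree, pos):
    """Sketch of find_next(): index of the next row in `tree` whose dad
    (column 3) matches tree[pos]'s dad, or None if pos is the last sibling."""
    dad = tree[pos][3]
    for j in range(pos + 1, len(tree)):
        if tree[j][3] == dad:
            return j
    return None
```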
def remove_col_subcol(id_son, id_dad, type):
"""Remove a collection as a son of another collection in the tree, if collection isn't used elsewhere in the tree, remove all registered sons of the id_son.
id_son - collection id of son to remove
id_dad - the id of the dad"""
try:
if id_son != id_dad:
tree = get_col_tree(id_son)
run_sql("DELETE FROM collection_collection WHERE id_son=%s and id_dad=%s", (id_son, id_dad))
else:
tree = get_col_tree(id_son, type)
run_sql("DELETE FROM collection_collection WHERE id_son=%s and id_dad=%s and type=%s", (id_son, id_dad, type))
if not run_sql("SELECT id_dad,id_son,type,score from collection_collection WHERE id_son=%s and type=%s", (id_son, type)):
for (id, up, down, dad, rtype) in tree:
run_sql("DELETE FROM collection_collection WHERE id_son=%s and id_dad=%s", (id, dad))
return (1, "")
except StandardError, e:
return (0, e)
def check_col(add_dad, add_son):
"""Check if the collection can be placed as a son of the dad without causing loops.
add_dad - collection id
add_son - collection id"""
try:
stack = [add_dad]
res = run_sql("SELECT id_dad FROM collection_collection WHERE id_dad=%s AND id_son=%s", (add_dad, add_son))
if res:
raise StandardError
while len(stack) > 0:
colID = stack.pop()
res = run_sql("SELECT id_dad FROM collection_collection WHERE id_son=%s", (colID, ))
for id in res:
if int(id[0]) == int(add_son):
# raise StandardError  # this was the original but it didn't work
return(0)
else:
stack.append(id[0])
return (1, "")
except StandardError, e:
return (0, e)
def attach_rnk_col(colID, rnkID):
"""attach rank method to collection
rnkID - id from rnkMETHOD table
colID - id of collection, as in collection table """
try:
res = run_sql("INSERT INTO collection_rnkMETHOD(id_collection, id_rnkMETHOD) values (%s,%s)", (colID, rnkID))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def detach_rnk_col(colID, rnkID):
"""detach rank method from collection
rnkID - id from rnkMETHOD table
colID - id of collection, as in collection table """
try:
res = run_sql("DELETE FROM collection_rnkMETHOD WHERE id_collection=%s AND id_rnkMETHOD=%s", (colID, rnkID))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def switch_col_treescore(col_1, col_2):
try:
res1 = run_sql("SELECT score FROM collection_collection WHERE id_dad=%s and id_son=%s", (col_1[3], col_1[0]))
res2 = run_sql("SELECT score FROM collection_collection WHERE id_dad=%s and id_son=%s", (col_2[3], col_2[0]))
res = run_sql("UPDATE collection_collection SET score=%s WHERE id_dad=%s and id_son=%s", (res2[0][0], col_1[3], col_1[0]))
res = run_sql("UPDATE collection_collection SET score=%s WHERE id_dad=%s and id_son=%s", (res1[0][0], col_2[3], col_2[0]))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def move_col_tree(col_from, col_to, move_to_rtype=''):
"""Move a collection from one point in the tree to another. becomes a son of the endpoint.
col_from - move this collection from current point
col_to - and set it as a son of this collection.
move_to_rtype - either virtual or regular collection"""
try:
res = run_sql("SELECT score FROM collection_collection WHERE id_dad=%s ORDER BY score asc", (col_to[0], ))
highscore = 0
for score in res:
if int(score[0]) > highscore:
highscore = int(score[0])
highscore += 1
if not move_to_rtype:
move_to_rtype = col_from[4]
res = run_sql("DELETE FROM collection_collection WHERE id_son=%s and id_dad=%s", (col_from[0], col_from[3]))
res = run_sql("INSERT INTO collection_collection(id_dad,id_son,score,type) values(%s,%s,%s,%s)", (col_to[0], col_from[0], highscore, move_to_rtype))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def remove_pbx(colID, pbxID, ln):
"""Removes a portalbox from the collection given.
colID - the collection the portalbox is connected to
pbxID - the portalbox which should be removed from the collection.
ln - the language of the portalbox to be removed"""
try:
res = run_sql("DELETE FROM collection_portalbox WHERE id_collection=%s AND id_portalbox=%s AND ln=%s", (colID, pbxID, ln))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def remove_fmt(colID, fmtID):
"""Removes a format from the collection given.
colID - the collection the format is connected to
fmtID - the format which should be removed from the collection."""
try:
res = run_sql("DELETE FROM collection_format WHERE id_collection=%s AND id_format=%s", (colID, fmtID))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def remove_fld(colID, fldID, fldvID=''):
"""Removes a field from the collection given.
colID - the collection the field is connected to
fldID - the field which should be removed from the collection."""
try:
sql = "DELETE FROM collection_field_fieldvalue WHERE id_collection=%s AND id_field=%s"
params = [colID, fldID]
if fldvID:
if fldvID != "None":
sql += " AND id_fieldvalue=%s"
params.append(fldvID)
else:
sql += " AND id_fieldvalue is NULL"
res = run_sql(sql, tuple(params))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def delete_fldv(fldvID):
"""Deletes all data for the given fieldvalue
fldvID - delete all data in the tables associated with fieldvalue and this id"""
try:
res = run_sql("DELETE FROM collection_field_fieldvalue WHERE id_fieldvalue=%s", (fldvID, ))
res = run_sql("DELETE FROM fieldvalue WHERE id=%s", (fldvID, ))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def delete_pbx(pbxID):
"""Deletes all data for the given portalbox
pbxID - delete all data in the tables associated with portalbox and this id """
try:
res = run_sql("DELETE FROM collection_portalbox WHERE id_portalbox=%s", (pbxID, ))
res = run_sql("DELETE FROM portalbox WHERE id=%s", (pbxID, ))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def delete_fmt(fmtID):
"""Deletes all data for the given format
fmtID - delete all data in the tables associated with format and this id """
try:
res = run_sql("DELETE FROM format WHERE id=%s", (fmtID, ))
res = run_sql("DELETE FROM collection_format WHERE id_format=%s", (fmtID, ))
res = run_sql("DELETE FROM formatname WHERE id_format=%s", (fmtID, ))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def delete_col(colID):
"""Deletes all data for the given collection
colID - delete all data in the tables associated with collection and this id """
try:
res = run_sql("DELETE FROM collection WHERE id=%s", (colID, ))
res = run_sql("DELETE FROM collectionname WHERE id_collection=%s", (colID, ))
res = run_sql("DELETE FROM collection_rnkMETHOD WHERE id_collection=%s", (colID, ))
res = run_sql("DELETE FROM collection_collection WHERE id_dad=%s", (colID, ))
res = run_sql("DELETE FROM collection_collection WHERE id_son=%s", (colID, ))
res = run_sql("DELETE FROM collection_portalbox WHERE id_collection=%s", (colID, ))
res = run_sql("DELETE FROM collection_format WHERE id_collection=%s", (colID, ))
res = run_sql("DELETE FROM collection_field_fieldvalue WHERE id_collection=%s", (colID, ))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def add_fmt(code, name, rtype):
"""Add a new output format. Returns the id of the format.
code - the code for the format, max 6 chars.
name - the default name for the default language of the format.
rtype - the default nametype"""
try:
res = run_sql("INSERT INTO format (code, name) values (%s,%s)", (code, name))
fmtID = run_sql("SELECT id FROM format WHERE code=%s", (code,))
res = run_sql("INSERT INTO formatname(id_format, type, ln, value) VALUES (%s,%s,%s,%s)",
(fmtID[0][0], rtype, CFG_SITE_LANG, name))
return (1, fmtID)
except StandardError, e:
register_exception()
return (0, e)
def update_fldv(fldvID, name, value):
"""Modify existing fieldvalue
fldvID - id of fieldvalue to modify
value - the value of the fieldvalue
name - the name of the fieldvalue."""
try:
res = run_sql("UPDATE fieldvalue set name=%s where id=%s", (name, fldvID))
res = run_sql("UPDATE fieldvalue set value=%s where id=%s", (value, fldvID))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def add_fldv(name, value):
"""Add a new fieldvalue, returns id of fieldvalue
value - the value of the fieldvalue
name - the name of the fieldvalue."""
try:
res = run_sql("SELECT id FROM fieldvalue WHERE name=%s and value=%s", (name, value))
if not res:
res = run_sql("INSERT INTO fieldvalue (name, value) values (%s,%s)", (name, value))
res = run_sql("SELECT id FROM fieldvalue WHERE name=%s and value=%s", (name, value))
if res:
return (1, res[0][0])
else:
raise StandardError
except StandardError, e:
register_exception()
return (0, e)
def add_pbx(title, body):
try:
res = run_sql("INSERT INTO portalbox (title, body) values (%s,%s)", (title, body))
res = run_sql("SELECT id FROM portalbox WHERE title=%s AND body=%s", (title, body))
if res:
return (1, res[0][0])
else:
raise StandardError
except StandardError, e:
register_exception()
return (0, e)
def add_col(colNAME, dbquery=None):
"""Adds a new collection to collection table
colNAME - the default name for the collection, saved to collection and collectionname
dbquery - query related to the collection"""
# BTW, sometimes '' is passed instead of None, so change it to None
if not dbquery:
dbquery = None
try:
rtype = get_col_nametypes()[0][0]
colID = run_sql("SELECT id FROM collection WHERE id=1")
if colID:
res = run_sql("INSERT INTO collection (name,dbquery) VALUES (%s,%s)",
(colNAME,dbquery))
else:
res = run_sql("INSERT INTO collection (id,name,dbquery) VALUES (1,%s,%s)",
(colNAME,dbquery))
colID = run_sql("SELECT id FROM collection WHERE name=%s", (colNAME,))
res = run_sql("INSERT INTO collectionname(id_collection, type, ln, value) VALUES (%s,%s,%s,%s)",
(colID[0][0], rtype, CFG_SITE_LANG, colNAME))
if colID:
return (1, colID[0][0])
else:
raise StandardError
except StandardError, e:
register_exception()
return (0, e)
def add_col_pbx(colID, pbxID, ln, position, score=''):
"""add a portalbox to the collection.
colID - the id of the collection involved
pbxID - the portalbox to add
ln - which language the portalbox is for
score - decides which portalbox is the most important
position - position on page the portalbox should appear."""
try:
if score:
res = run_sql("INSERT INTO collection_portalbox(id_portalbox, id_collection, ln, score, position) values (%s,%s,'%s',%s,%s)", (pbxID, colID, ln, score, position))
else:
res = run_sql("SELECT score FROM collection_portalbox WHERE id_collection=%s and ln=%s and position=%s ORDER BY score desc, ln, position", (colID, ln, position))
if res:
score = int(res[0][0])
else:
score = 0
res = run_sql("INSERT INTO collection_portalbox(id_portalbox, id_collection, ln, score, position) values (%s,%s,%s,%s,%s)", (pbxID, colID, ln, (score + 1), position))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def add_col_fmt(colID, fmtID, score=''):
"""Add a output format to the collection.
colID - the id of the collection involved
fmtID - the id of the format.
score - the score of the format, decides sorting, if not given, place the format on top"""
try:
if score:
res = run_sql("INSERT INTO collection_format(id_format, id_collection, score) values (%s,%s,%s)", (fmtID, colID, score))
else:
res = run_sql("SELECT score FROM collection_format WHERE id_collection=%s ORDER BY score desc", (colID, ))
if res:
score = int(res[0][0])
else:
score = 0
res = run_sql("INSERT INTO collection_format(id_format, id_collection, score) values (%s,%s,%s)", (fmtID, colID, (score + 1)))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def add_col_fld(colID, fldID, type, fldvID=''):
"""Add a sort/search/field to the collection.
colID - the id of the collection involved
fldID - the id of the field.
fldvID - the id of the fieldvalue.
type - which type, seo, sew...
score - the score of the format, decides sorting, if not given, place the format on top"""
try:
if fldvID and fldvID not in [-1, "-1"]:
run_sql("DELETE FROM collection_field_fieldvalue WHERE id_collection=%s AND id_field=%s and type=%s and id_fieldvalue is NULL", (colID, fldID, type))
res = run_sql("SELECT score FROM collection_field_fieldvalue WHERE id_collection=%s AND id_field=%s and type=%s ORDER BY score desc", (colID, fldID, type))
if res:
score = int(res[0][0])
res = run_sql("SELECT score_fieldvalue FROM collection_field_fieldvalue WHERE id_collection=%s AND id_field=%s and type=%s ORDER BY score_fieldvalue desc", (colID, fldID, type))
else:
res = run_sql("SELECT score FROM collection_field_fieldvalue WHERE id_collection=%s and type=%s ORDER BY score desc", (colID, type))
if res:
score = int(res[0][0]) + 1
else:
score = 1
res = run_sql("SELECT id_collection,id_field,id_fieldvalue,type,score,score_fieldvalue FROM collection_field_fieldvalue where id_field=%s and id_collection=%s and type=%s and id_fieldvalue=%s", (fldID, colID, type, fldvID))
if not res:
run_sql("UPDATE collection_field_fieldvalue SET score_fieldvalue=score_fieldvalue+1 WHERE id_field=%s AND id_collection=%s and type=%s", (fldID, colID, type))
res = run_sql("INSERT INTO collection_field_fieldvalue(id_field, id_fieldvalue, id_collection, type, score, score_fieldvalue) values (%s,%s,%s,%s,%s,%s)", (fldID, fldvID, colID, type, score, 1))
else:
return (0, (1, "Already exists"))
else:
res = run_sql("SELECT id_collection,id_field,id_fieldvalue,type,score,score_fieldvalue FROM collection_field_fieldvalue WHERE id_collection=%s AND type=%s and id_field=%s and id_fieldvalue is NULL", (colID, type, fldID))
if res:
return (0, (1, "Already exists"))
else:
run_sql("UPDATE collection_field_fieldvalue SET score=score+1")
res = run_sql("INSERT INTO collection_field_fieldvalue(id_field, id_collection, type, score,score_fieldvalue) values (%s,%s,%s,%s, 0)", (fldID, colID, type, 1))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def modify_dbquery(colID, dbquery=None):
"""Modify the dbquery of an collection.
colID - the id of the collection involved
dbquery - the new dbquery"""
# BTW, sometimes '' is passed instead of None, so change it to None
if not dbquery:
dbquery = None
try:
res = run_sql("UPDATE collection SET dbquery=%s WHERE id=%s", (dbquery, colID))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def modify_pbx(colID, pbxID, sel_ln, score='', position='', title='', body=''):
"""Modify a portalbox
colID - the id of the collection involved
pbxID - the id of the portalbox that should be modified
sel_ln - the language of the portalbox that should be modified
title - the title
body - the content
score - if several portalboxes in one position, who should appear on top.
position - position on page"""
try:
if title:
res = run_sql("UPDATE portalbox SET title=%s WHERE id=%s", (title, pbxID))
if body:
res = run_sql("UPDATE portalbox SET body=%s WHERE id=%s", (body, pbxID))
if score:
res = run_sql("UPDATE collection_portalbox SET score=%s WHERE id_collection=%s and id_portalbox=%s and ln=%s", (score, colID, pbxID, sel_ln))
if position:
res = run_sql("UPDATE collection_portalbox SET position=%s WHERE id_collection=%s and id_portalbox=%s and ln=%s", (position, colID, pbxID, sel_ln))
return (1, "")
except Exception, e:
register_exception()
return (0, e)
def switch_fld_score(colID, id_1, id_2):
"""Switch the scores of id_1 and id_2 in collection_field_fieldvalue
colID - collection the id_1 or id_2 is connected to
id_1/id_2 - id field from tables like format..portalbox...
table - name of the table"""
try:
res1 = run_sql("SELECT score FROM collection_field_fieldvalue WHERE id_collection=%s and id_field=%s", (colID, id_1))
res2 = run_sql("SELECT score FROM collection_field_fieldvalue WHERE id_collection=%s and id_field=%s", (colID, id_2))
if res1[0][0] == res2[0][0]:
return (0, (1, "Cannot rearrange the selected fields, either rearrange by name or use the mySQL client to fix the problem."))
else:
res = run_sql("UPDATE collection_field_fieldvalue SET score=%s WHERE id_collection=%s and id_field=%s", (res2[0][0], colID, id_1))
res = run_sql("UPDATE collection_field_fieldvalue SET score=%s WHERE id_collection=%s and id_field=%s", (res1[0][0], colID, id_2))
return (1, "")
except StandardError, e:
register_exception()
return (0, e)
def switch_fld_value_score(colID, id_1, fldvID_1, fldvID_2):
"""Switch the scores of two field_value
colID - collection the id_1 or id_2 is connected to
id_1/id_2 - id field from tables like format..portalbox...
table - name of the table"""
try:
res1 = run_sql("SELECT score_fieldvalue FROM collection_field_fieldvalue WHERE id_collection=%s and id_field=%s and id_fieldvalue=%s", (colID, id_1, fldvID_1))
res2 = run_sql("SELECT score_fieldvalue FROM collection_field_fieldvalue WHERE id_collection=%s and id_field=%s and id_fieldvalue=%s", (colID, id_1, fldvID_2))
if res1[0][0] == res2[0][0]:
return (0, (1, "Cannot rearrange the selected fields, either rearrange by name or use the mySQL client to fix the problem."))
else:
res = run_sql("UPDATE collection_field_fieldvalue SET score_fieldvalue=%s WHERE id_collection=%s and id_field=%s and id_fieldvalue=%s", (res2[0][0], colID, id_1, fldvID_1))
res = run_sql("UPDATE collection_field_fieldvalue SET score_fieldvalue=%s WHERE id_collection=%s and id_field=%s and id_fieldvalue=%s", (res1[0][0], colID, id_1, fldvID_2))
return (1, "")
except Exception, e:
register_exception()
return (0, e)
def switch_pbx_score(colID, id_1, id_2, sel_ln):
"""Switch the scores of id_1 and id_2 in the table given by the argument.
colID - collection the id_1 or id_2 is connected to
id_1/id_2 - id field from tables like format..portalbox...
table - name of the table"""
try:
res1 = run_sql("SELECT score FROM collection_portalbox WHERE id_collection=%s and id_portalbox=%s and ln=%s", (colID, id_1, sel_ln))
res2 = run_sql("SELECT score FROM collection_portalbox WHERE id_collection=%s and id_portalbox=%s and ln=%s", (colID, id_2, sel_ln))
if res1[0][0] == res2[0][0]:
return (0, (1, "Cannot rearrange the selected fields, either rearrange by name or use the mySQL client to fix the problem."))
res = run_sql("UPDATE collection_portalbox SET score=%s WHERE id_collection=%s and id_portalbox=%s and ln=%s", (res2[0][0], colID, id_1, sel_ln))
res = run_sql("UPDATE collection_portalbox SET score=%s WHERE id_collection=%s and id_portalbox=%s and ln=%s", (res1[0][0], colID, id_2, sel_ln))
return (1, "")
except Exception, e:
register_exception()
return (0, e)
def switch_score(colID, id_1, id_2, table):
"""Switch the scores of id_1 and id_2 in the table given by the argument.
colID - collection the id_1 or id_2 is connected to
id_1/id_2 - id field from tables like format..portalbox...
table - name of the table"""
try:
res1 = run_sql("SELECT score FROM collection_%s WHERE id_collection=%%s and id_%s=%%s" % (table, table), (colID, id_1))
res2 = run_sql("SELECT score FROM collection_%s WHERE id_collection=%%s and id_%s=%%s" % (table, table), (colID, id_2))
if res1[0][0] == res2[0][0]:
return (0, (1, "Cannot rearrange the selected fields, either rearrange by name or use the mySQL client to fix the problem."))
res = run_sql("UPDATE collection_%s SET score=%%s WHERE id_collection=%%s and id_%s=%%s" % (table, table), (res2[0][0], colID, id_1))
res = run_sql("UPDATE collection_%s SET score=%%s WHERE id_collection=%%s and id_%s=%%s" % (table, table), (res1[0][0], colID, id_2))
return (1, "")
except Exception, e:
register_exception()
return (0, e)
def get_detailed_page_tabs(colID=None, recID=None, ln=CFG_SITE_LANG):
"""
Return the complete list of tabs to be displayed in the
detailed record pages.
The returned structure is a dict with
- key: last component of the URL that leads to the detailed record tab: http://www.../CFG_SITE_RECORD/74/key
- values: a dictionary with the following keys:
- label: *string* label to be printed as tab (not localized here)
- visible: *boolean* if False, the tab should not be shown
- enabled: *boolean* if False, the tab should be disabled
- order: *int* position of the tab in the list of tabs
- ln: language of the tab labels
returns dict
"""
_ = gettext_set_language(ln)
tabs = {'metadata' : {'label': _('Information'), 'visible': False, 'enabled': True, 'order': 1},
'references': {'label': _('References'), 'visible': False, 'enabled': True, 'order': 2},
'citations' : {'label': _('Citations'), 'visible': False, 'enabled': True, 'order': 3},
'keywords' : {'label': _('Keywords'), 'visible': False, 'enabled': True, 'order': 4},
'comments' : {'label': _('Comments'), 'visible': False, 'enabled': True, 'order': 5},
'reviews' : {'label': _('Reviews'), 'visible': False, 'enabled': True, 'order': 6},
'usage' : {'label': _('Usage statistics'), 'visible': False, 'enabled': True, 'order': 7},
'files' : {'label': _('Files'), 'visible': False, 'enabled': True, 'order': 8},
'plots' : {'label': _('Plots'), 'visible': False, 'enabled': True, 'order': 9},
'holdings' : {'label': _('Holdings'), 'visible': False, 'enabled': True, 'order': 10},
'linkbacks' : {'label': _('Linkbacks'), 'visible': False, 'enabled': True, 'order': 11},
}
res = run_sql("SELECT tabs FROM collectiondetailedrecordpagetabs " + \
"WHERE id_collection=%s", (colID, ))
if len(res) > 0:
tabs_state = res[0][0].split(';')
for tab_state in tabs_state:
if tabs.has_key(tab_state):
tabs[tab_state]['visible'] = True
else:
# no preference set for this collection.
# assume all tabs are displayed
for key in tabs.keys():
tabs[key]['visible'] = True
if not CFG_WEBCOMMENT_ALLOW_COMMENTS:
tabs['comments']['visible'] = False
tabs['comments']['enabled'] = False
if not CFG_WEBCOMMENT_ALLOW_REVIEWS:
tabs['reviews']['visible'] = False
tabs['reviews']['enabled'] = False
if recID is not None:
# Disable references if no references found
#bfo = BibFormatObject(recID)
#if bfe_references.format_element(bfo, '', '') == '':
# tabs['references']['enabled'] = False
## FIXME: the above was commented out because bfe_references
## may be too slow. And we do not really need this anyway
## because we can disable tabs in WebSearch Admin on a
## collection-by-collection basis. If we need this, then we
## should probably call bfo.fields('999') here that should be
## much faster than calling bfe_references.
# Disable citations if not citations found
#if len(get_cited_by(recID)) == 0:
# tabs['citations']['enabled'] = False
## FIXME: the above was commented out because get_cited_by()
## may be too slow. And we do not really need this anyway
## because we can disable tabs in WebSearch Admin on a
## collection-by-collection basis.
# Disable Files tab if no file found except for Plots:
disable_files_tab_p = True
for abibdoc in BibRecDocs(recID).list_bibdocs():
abibdoc_type = abibdoc.get_type()
if abibdoc_type == 'Plot':
continue # ignore attached plots
else:
if CFG_INSPIRE_SITE and not \
abibdoc_type in ('', 'INSPIRE-PUBLIC', 'Supplementary Material'):
# ignore non-empty, non-INSPIRE-PUBLIC, non-suppl doctypes for INSPIRE
continue
# okay, we found at least one non-Plot file:
disable_files_tab_p = False
break
if disable_files_tab_p:
tabs['files']['enabled'] = False
# Disable holdings tab if collection != Books
collection = run_sql("""select name from collection where id=%s""", (colID, ))
if collection[0][0] != 'Books':
tabs['holdings']['enabled'] = False
# Disable Plots tab if no docfile of doctype Plot found
brd = BibRecDocs(recID)
if len(brd.list_bibdocs('Plot')) == 0:
tabs['plots']['enabled'] = False
if CFG_CERN_SITE:
from invenio.legacy.search_engine import get_collection_reclist
if recID in get_collection_reclist("Books & Proceedings"):
tabs['holdings']['visible'] = True
tabs['holdings']['enabled'] = True
tabs[''] = tabs['metadata']
del tabs['metadata']
return tabs
def get_detailed_page_tabs_counts(recID):
"""
Returns the number of citations, references and comments/reviews
that have to be shown on the corresponding tabs in the
detailed record pages
@param recID: record id
@return: dictionary with following keys
'Citations': number of citations to be shown in the "Citations" tab
'References': number of references to be shown in the "References" tab
'Comments': number of comments to be shown in the "Comments" tab
'Reviews': number of reviews to be shown in the "Reviews" tab
"""
num_comments = 0  # number of comments
num_reviews = 0  # number of reviews
tabs_counts = {'Citations' : 0,
'References' : -1,
'Discussions' : 0,
'Comments' : 0,
'Reviews' : 0
}
from invenio.legacy.search_engine import get_field_tags, get_record
if CFG_BIBRANK_SHOW_CITATION_LINKS:
tabs_counts['Citations'] = get_cited_by_count(recID)
if not CFG_CERN_SITE:  # FIXME: should be replaced by something like CFG_SHOW_REFERENCES
reftag = ""
reftags = get_field_tags("reference")
if reftags:
reftag = reftags[0]
tmprec = get_record(recID)
if reftag and len(reftag) > 4:
tabs_counts['References'] = len(record_get_field_instances(tmprec, reftag[0:3], reftag[3], reftag[4]))
# obtain number of comments/reviews
from invenio.webcommentadminlib import get_nb_reviews, get_nb_comments
if CFG_WEBCOMMENT_ALLOW_COMMENTS and CFG_WEBSEARCH_SHOW_COMMENT_COUNT:
num_comments = get_nb_comments(recID, count_deleted=False)
if CFG_WEBCOMMENT_ALLOW_REVIEWS and CFG_WEBSEARCH_SHOW_REVIEW_COUNT:
num_reviews = get_nb_reviews(recID, count_deleted=False)
if num_comments:
tabs_counts['Comments'] = num_comments
tabs_counts['Discussions'] += num_comments
if num_reviews:
tabs_counts['Reviews'] = num_reviews
tabs_counts['Discussions'] += num_reviews
return tabs_counts
diff --git a/invenio/legacy/websearch/scripts/webcoll.py b/invenio/legacy/websearch/scripts/webcoll.py
index 940fbff4b..ebb90ce7e 100644
--- a/invenio/legacy/websearch/scripts/webcoll.py
+++ b/invenio/legacy/websearch/scripts/webcoll.py
@@ -1,60 +1,60 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from invenio.base.factory import with_app_context
@with_app_context()
def main():
"""Main that construct all the bibtask."""
- from invenio.bibtask import task_init
- from invenio.websearch_webcoll import (
+ from invenio.legacy.bibsched.bibtask import task_init
+ from invenio.legacy.websearch.webcoll import (
task_submit_elaborate_specific_parameter, task_submit_check_options,
task_run_core, __revision__)
task_init(authorization_action="runwebcoll",
authorization_msg="WebColl Task Submission",
description="""Description:
webcoll updates the collection cache (record universe for a
given collection plus web page elements) based on invenio.conf and DB
configuration parameters. If the collection name is passed as an argument,
only this collection's cache will be updated. If the recursive option is
set as well, the collection's descendants will also be updated.\n""",
help_specific_usage=" -c, --collection\t Update cache for the given "
"collection only. [all]\n"
" -r, --recursive\t Update cache for the given collection and all its\n"
"\t\t\t descendants (to be used in combination with -c). [no]\n"
" -q, --quick\t\t Skip webpage cache update for those collections whose\n"
"\t\t\t reclist was not changed. Note: if you use this option, it is advised\n"
"\t\t\t to schedule, e.g. a nightly 'webcoll --force'. [no]\n"
" -f, --force\t\t Force update even if cache is up to date. [no]\n"
" -p, --part\t\t Update only certain cache parts (1=reclist,"
" 2=webpage). [both]\n"
" -l, --language\t Update pages in only certain language"
" (e.g. fr,it,...). [all]\n",
version=__revision__,
specific_params=("c:rqfp:l:", [
"collection=",
"recursive",
"quick",
"force",
"part=",
"language="
]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=task_submit_check_options,
task_run_fnc=task_run_core)
diff --git a/invenio/legacy/websearch/templates.py b/invenio/legacy/websearch/templates.py
index 8b39a3056..e1586b86a 100644
--- a/invenio/legacy/websearch/templates.py
+++ b/invenio/legacy/websearch/templates.py
@@ -1,4625 +1,4596 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
__revision__ = "$Id$"
import time
import cgi
import string
import re
import locale
from urllib import quote, urlencode
from xml.sax.saxutils import escape as xml_escape
from invenio.config import \
CFG_WEBSEARCH_LIGHTSEARCH_PATTERN_BOX_WIDTH, \
CFG_WEBSEARCH_SIMPLESEARCH_PATTERN_BOX_WIDTH, \
CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH, \
CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD, \
CFG_WEBSEARCH_USE_ALEPH_SYSNOS, \
CFG_WEBSEARCH_SPLIT_BY_COLLECTION, \
CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS, \
CFG_BIBRANK_SHOW_READING_STATS, \
CFG_BIBRANK_SHOW_DOWNLOAD_STATS, \
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS, \
CFG_BIBRANK_SHOW_CITATION_LINKS, \
CFG_BIBRANK_SHOW_CITATION_STATS, \
CFG_BIBRANK_SHOW_CITATION_GRAPHS, \
CFG_WEBSEARCH_RSS_TTL, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_SITE_NAME_INTL, \
CFG_VERSION, \
CFG_SITE_URL, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_SITE_ADMIN_EMAIL, \
CFG_CERN_SITE, \
CFG_INSPIRE_SITE, \
CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, \
CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES, \
CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS, \
CFG_BIBINDEX_CHARS_PUNCTUATION, \
CFG_WEBCOMMENT_ALLOW_COMMENTS, \
CFG_WEBCOMMENT_ALLOW_REVIEWS, \
CFG_WEBSEARCH_WILDCARD_LIMIT, \
CFG_WEBSEARCH_SHOW_COMMENT_COUNT, \
CFG_WEBSEARCH_SHOW_REVIEW_COUNT, \
CFG_SITE_RECORD, \
CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT
from invenio.legacy.dbquery import run_sql
from invenio.base.i18n import gettext_set_language
from invenio.base.globals import cfg
from invenio.utils.url import make_canonical_urlargd, drop_default_urlargd, create_html_link, create_url
from invenio.utils.html import nmtoken_from_string
from invenio.ext.legacy.handler import wash_urlargd
from invenio.legacy.bibrank.citation_searcher import get_cited_by_count
from invenio.legacy.webuser import session_param_get
from invenio.intbitset import intbitset
from invenio.legacy.websearch_external_collections import external_collection_get_state, get_external_collection_engine
from invenio.legacy.websearch_external_collections.websearch_external_collections_utils import get_collection_id
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_EXTERNAL_COLLECTION_MAXRESULTS
from invenio.legacy.bibrecord import get_fieldvalues
_RE_PUNCTUATION = re.compile(CFG_BIBINDEX_CHARS_PUNCTUATION)
_RE_SPACES = re.compile(r"\s+")
class Template:
# This dictionary maps Invenio language code to locale codes (ISO 639)
tmpl_localemap = {
'bg': 'bg_BG',
'ar': 'ar_AR',
'ca': 'ca_ES',
'de': 'de_DE',
'el': 'el_GR',
'en': 'en_US',
'es': 'es_ES',
'pt': 'pt_BR',
'fa': 'fa_IR',
'fr': 'fr_FR',
'it': 'it_IT',
'ka': 'ka_GE',
'lt': 'lt_LT',
'ro': 'ro_RO',
'ru': 'ru_RU',
'rw': 'rw_RW',
'sk': 'sk_SK',
'cs': 'cs_CZ',
'no': 'no_NO',
'sv': 'sv_SE',
'uk': 'uk_UA',
'ja': 'ja_JA',
'pl': 'pl_PL',
'hr': 'hr_HR',
'zh_CN': 'zh_CN',
'zh_TW': 'zh_TW',
'hu': 'hu_HU',
'af': 'af_ZA',
'gl': 'gl_ES'
}
tmpl_default_locale = "en_US" # which locale to use by default, useful in case of failure
# Type of the allowed parameters for the web interface for search results
- search_results_default_urlargd = {
- 'cc': (str, CFG_SITE_NAME),
- 'c': (list, []),
- 'p': (str, ""), 'f': (str, ""),
- 'rg': (int, CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS),
- 'sf': (str, ""),
- 'so': (str, "d"),
- 'sp': (str, ""),
- 'rm': (str, ""),
- 'of': (str, "hb"),
- 'ot': (list, []),
- 'em': (str,""),
- 'aas': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
- 'as': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
- 'p1': (str, ""), 'f1': (str, ""), 'm1': (str, ""), 'op1':(str, ""),
- 'p2': (str, ""), 'f2': (str, ""), 'm2': (str, ""), 'op2':(str, ""),
- 'p3': (str, ""), 'f3': (str, ""), 'm3': (str, ""),
- 'sc': (int, 0),
- 'jrec': (int, 0),
- 'recid': (int, -1), 'recidb': (int, -1), 'sysno': (str, ""),
- 'id': (int, -1), 'idb': (int, -1), 'sysnb': (str, ""),
- 'action': (str, "search"),
- 'action_search': (str, ""),
- 'action_browse': (str, ""),
- 'd1': (str, ""),
- 'd1y': (int, 0), 'd1m': (int, 0), 'd1d': (int, 0),
- 'd2': (str, ""),
- 'd2y': (int, 0), 'd2m': (int, 0), 'd2d': (int, 0),
- 'dt': (str, ""),
- 'ap': (int, 1),
- 'verbose': (int, 0),
- 'ec': (list, []),
- 'wl': (int, CFG_WEBSEARCH_WILDCARD_LIMIT),
- }
+ @property
+ def search_results_default_urlargd(self):
+ from invenio.modules.search.washers import \
+ search_results_default_urlargd
+ return search_results_default_urlargd
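The diff above replaces the module-level dict with a property whose import happens on first access; a minimal sketch of that lazy-import pattern (hypothetical class and attribute names, with a stand-in for the washers module):

```python
class LazyDefaults:
    @property
    def search_defaults(self):
        # Deferring the import to first attribute access avoids a circular
        # import at module load time; the cost moves to the first access.
        from collections import OrderedDict  # stands in for the washers module
        return OrderedDict([('of', 'hb'), ('rg', 10)])

d = LazyDefaults()
print(d.search_defaults['of'])  # -> hb
```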
# ...and for search interfaces
search_interface_default_urlargd = {
'aas': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
'as': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
'verbose': (int, 0),
'em' : (str, "")}
# ...and for RSS feeds
rss_default_urlargd = {'c' : (list, []),
'cc' : (str, ""),
'p' : (str, ""),
'f' : (str, ""),
'p1' : (str, ""),
'f1' : (str, ""),
'm1' : (str, ""),
'op1': (str, ""),
'p2' : (str, ""),
'f2' : (str, ""),
'm2' : (str, ""),
'op2': (str, ""),
'p3' : (str, ""),
'f3' : (str, ""),
'm3' : (str, ""),
'wl' : (int, CFG_WEBSEARCH_WILDCARD_LIMIT)}
tmpl_openurl_accepted_args = {
'id' : (list, []),
'genre' : (str, ''),
'aulast' : (str, ''),
'aufirst' : (str, ''),
'auinit' : (str, ''),
'auinit1' : (str, ''),
'auinitm' : (str, ''),
'issn' : (str, ''),
'eissn' : (str, ''),
'coden' : (str, ''),
'isbn' : (str, ''),
'sici' : (str, ''),
'bici' : (str, ''),
'title' : (str, ''),
'stitle' : (str, ''),
'atitle' : (str, ''),
'volume' : (str, ''),
'part' : (str, ''),
'issue' : (str, ''),
'spage' : (str, ''),
'epage' : (str, ''),
'pages' : (str, ''),
'artnum' : (str, ''),
'date' : (str, ''),
'ssn' : (str, ''),
'quarter' : (str, ''),
'url_ver' : (str, ''),
'ctx_ver' : (str, ''),
'rft_val_fmt' : (str, ''),
'rft_id' : (list, []),
'rft.atitle' : (str, ''),
'rft.title' : (str, ''),
'rft.jtitle' : (str, ''),
'rft.stitle' : (str, ''),
'rft.date' : (str, ''),
'rft.volume' : (str, ''),
'rft.issue' : (str, ''),
'rft.spage' : (str, ''),
'rft.epage' : (str, ''),
'rft.pages' : (str, ''),
'rft.artnumber' : (str, ''),
'rft.issn' : (str, ''),
'rft.eissn' : (str, ''),
'rft.aulast' : (str, ''),
'rft.aufirst' : (str, ''),
'rft.auinit' : (str, ''),
'rft.auinit1' : (str, ''),
'rft.auinitm' : (str, ''),
'rft.ausuffix' : (str, ''),
'rft.au' : (list, []),
'rft.aucorp' : (str, ''),
'rft.isbn' : (str, ''),
'rft.coden' : (str, ''),
'rft.sici' : (str, ''),
'rft.genre' : (str, 'unknown'),
'rft.chron' : (str, ''),
'rft.ssn' : (str, ''),
'rft.quarter' : (str, ''),
'rft.part' : (str, ''),
'rft.btitle' : (str, ''),
'rft.place' : (str, ''),
'rft.pub' : (str, ''),
'rft.edition' : (str, ''),
'rft.tpages' : (str, ''),
'rft.series' : (str, ''),
}
tmpl_opensearch_rss_url_syntax = "%(CFG_SITE_URL)s/rss?p={searchTerms}&amp;jrec={startIndex}&amp;rg={count}&amp;ln={language}" % {'CFG_SITE_URL': CFG_SITE_URL}
tmpl_opensearch_html_url_syntax = "%(CFG_SITE_URL)s/search?p={searchTerms}&amp;jrec={startIndex}&amp;rg={count}&amp;ln={language}" % {'CFG_SITE_URL': CFG_SITE_URL}
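The OpenSearch URL templates above use `{param}` placeholders that the consuming client substitutes (the `&amp;` entities exist only because the template is embedded in XML); a sketch of that substitution with hypothetical values:

```python
# OpenSearch URL template, as a plain string (outside the XML description
# document the '&amp;' entities are just '&').
template = ('https://example.org/search?p={searchTerms}'
            '&jrec={startIndex}&rg={count}&ln={language}')

# A client replaces each placeholder with the concrete request values.
url = (template.replace('{searchTerms}', 'higgs')
               .replace('{startIndex}', '1')
               .replace('{count}', '25')
               .replace('{language}', 'en'))
print(url)
# -> https://example.org/search?p=higgs&jrec=1&rg=25&ln=en
```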
def tmpl_openurl2invenio(self, openurl_data):
""" Return an Invenio url corresponding to a search with the data
included in the openurl form map.
"""
def isbn_to_isbn13_isbn10(isbn):
isbn = isbn.replace(' ', '').replace('-', '')
if len(isbn) == 10 and isbn.isdigit():
## We already have isbn10
return ('', isbn)
if len(isbn) != 13 and isbn.isdigit():
return ('', '')
isbn13, isbn10 = isbn, isbn[3:-1]
checksum = 0
weight = 10
for char in isbn10:
checksum += int(char) * weight
weight -= 1
checksum = 11 - (checksum % 11)
if checksum == 10:
isbn10 += 'X'
elif checksum == 11:
isbn10 += '0'
else:
isbn10 += str(checksum)
return (isbn13, isbn10)
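The ISBN-13 to ISBN-10 conversion above can be sketched as a standalone function (hypothetical helper name; assumes a well-formed 13-digit, 978-prefixed input):

```python
def isbn13_to_isbn10(isbn13):
    # Drop the '978' prefix and the ISBN-13 check digit, keeping 9 digits.
    body = isbn13[3:-1]
    # ISBN-10 checksum: weights 10..2 over the 9 remaining digits.
    total = sum(int(ch) * w for ch, w in zip(body, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 10:
        return body + 'X'
    if check == 11:
        return body + '0'
    return body + str(check)

# Example: the ISBN-13 of "The C Programming Language", 2nd ed.
print(isbn13_to_isbn10('9780131103627'))  # -> 0131103628
```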
from invenio.legacy.search_engine import perform_request_search
doi = ''
pmid = ''
bibcode = ''
oai = ''
issn = ''
isbn = ''
for elem in openurl_data['id']:
if elem.startswith('doi:'):
doi = elem[len('doi:'):]
elif elem.startswith('pmid:'):
pmid = elem[len('pmid:'):]
elif elem.startswith('bibcode:'):
bibcode = elem[len('bibcode:'):]
elif elem.startswith('oai:'):
oai = elem[len('oai:'):]
for elem in openurl_data['rft_id']:
if elem.startswith('info:doi/'):
doi = elem[len('info:doi/'):]
elif elem.startswith('info:pmid/'):
pmid = elem[len('info:pmid/'):]
elif elem.startswith('info:bibcode/'):
bibcode = elem[len('info:bibcode/'):]
elif elem.startswith('info:oai/'):
oai = elem[len('info:oai/'):]
elif elem.startswith('urn:ISBN:'):
isbn = elem[len('urn:ISBN:'):]
elif elem.startswith('urn:ISSN:'):
issn = elem[len('urn:ISSN:'):]
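The prefix scan over the two identifier loops above can be sketched in isolation (hypothetical helper; prefixes taken from the loops above):

```python
def parse_openurl_ids(elems):
    # Map each identifier URI prefix to the key it carries.
    prefixes = {'info:doi/': 'doi', 'info:pmid/': 'pmid',
                'info:bibcode/': 'bibcode', 'info:oai/': 'oai',
                'urn:ISBN:': 'isbn', 'urn:ISSN:': 'issn'}
    found = {}
    for elem in elems:
        for prefix, key in prefixes.items():
            if elem.startswith(prefix):
                # Keep everything after the matched prefix.
                found[key] = elem[len(prefix):]
                break
    return found

print(parse_openurl_ids(['info:doi/10.1234/abc', 'urn:ISSN:1234-5679']))
# -> {'doi': '10.1234/abc', 'issn': '1234-5679'}
```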
## Building author query
aulast = openurl_data['rft.aulast'] or openurl_data['aulast']
aufirst = openurl_data['rft.aufirst'] or openurl_data['aufirst']
auinit = openurl_data['rft.auinit'] or \
openurl_data['auinit'] or \
(openurl_data['rft.auinit1'] + ' ' + openurl_data['rft.auinitm']).strip() or \
(openurl_data['auinit1'] + ' ' + openurl_data['auinitm']).strip() or aufirst[:1]
auinit = auinit.upper()
if aulast and aufirst:
author_query = 'author:"%s, %s" or author:"%s, %s"' % (aulast, aufirst, aulast, auinit)
elif aulast and auinit:
author_query = 'author:"%s, %s"' % (aulast, auinit)
else:
author_query = ''
## Building title query
title = openurl_data['rft.atitle'] or \
openurl_data['atitle'] or \
openurl_data['rft.btitle'] or \
openurl_data['rft.title'] or \
openurl_data['title']
if title:
title_query = 'title:"%s"' % title
title_query_cleaned = 'title:"%s"' % _RE_SPACES.sub(' ', _RE_PUNCTUATION.sub(' ', title))
else:
title_query = ''
## Building journal query
jtitle = openurl_data['rft.stitle'] or \
openurl_data['stitle'] or \
openurl_data['rft.jtitle'] or \
openurl_data['title']
if jtitle:
journal_query = 'journal:"%s"' % jtitle
else:
journal_query = ''
## Building isbn query
isbn = isbn or openurl_data['rft.isbn'] or \
openurl_data['isbn']
isbn13, isbn10 = isbn_to_isbn13_isbn10(isbn)
if isbn13:
isbn_query = 'isbn:"%s" or isbn:"%s"' % (isbn13, isbn10)
elif isbn10:
isbn_query = 'isbn:"%s"' % isbn10
else:
isbn_query = ''
## Building issn query
issn = issn or openurl_data['rft.eissn'] or \
openurl_data['eissn'] or \
openurl_data['rft.issn'] or \
openurl_data['issn']
if issn:
issn_query = 'issn:"%s"' % issn
else:
issn_query = ''
## Building coden query
coden = openurl_data['rft.coden'] or openurl_data['coden']
if coden:
coden_query = 'coden:"%s"' % coden
else:
coden_query = ''
## Building doi query
if False: #doi: #FIXME Temporarily disabled until the doi field is properly set up
doi_query = 'doi:"%s"' % doi
else:
doi_query = ''
## Trying possible searches
if doi_query:
if perform_request_search(p=doi_query):
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : doi_query,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hd'}))
if isbn_query:
if perform_request_search(p=isbn_query):
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : isbn_query,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hd'}))
if coden_query:
if perform_request_search(p=coden_query):
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : coden_query,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hd'}))
if author_query and title_query:
if perform_request_search(p='%s and %s' % (title_query, author_query)):
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : '%s and %s' % (title_query, author_query),
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hd'}))
if title_query:
result = len(perform_request_search(p=title_query))
if result == 1:
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : title_query,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hd'}))
elif result > 1:
return '%s/search?%s' % (CFG_SITE_URL, urlencode({
'p' : title_query,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hb'}))
## Nothing worked, let's return a search that the user can improve
if author_query and title_query:
return '%s/search%s' % (CFG_SITE_URL, make_canonical_urlargd({
'p' : '%s and %s' % (title_query_cleaned, author_query),
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hb'}, {}))
elif title_query:
return '%s/search%s' % (CFG_SITE_URL, make_canonical_urlargd({
'p' : title_query_cleaned,
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hb'}, {}))
else:
## Too little information was provided.
return '%s/search%s' % (CFG_SITE_URL, make_canonical_urlargd({
'p' : 'recid:-1',
'sc' : CFG_WEBSEARCH_SPLIT_BY_COLLECTION,
'of' : 'hb'}, {}))
def tmpl_opensearch_description(self, ln):
""" Returns the OpenSearch description file of this site.
"""
_ = gettext_set_language(ln)
return """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/"
xmlns:moz="http://www.mozilla.org/2006/browser/search/">
<ShortName>%(short_name)s</ShortName>
<LongName>%(long_name)s</LongName>
<Description>%(description)s</Description>
<InputEncoding>UTF-8</InputEncoding>
<OutputEncoding>UTF-8</OutputEncoding>
<Language>*</Language>
<Contact>%(CFG_SITE_ADMIN_EMAIL)s</Contact>
<Query role="example" searchTerms="a" />
<Developer>Powered by Invenio</Developer>
<Url type="text/html" indexOffset="1" rel="results" template="%(html_search_syntax)s" />
<Url type="application/rss+xml" indexOffset="1" rel="results" template="%(rss_search_syntax)s" />
<Url type="application/opensearchdescription+xml" rel="self" template="%(CFG_SITE_URL)s/opensearchdescription" />
<moz:SearchForm>%(CFG_SITE_URL)s</moz:SearchForm>
</OpenSearchDescription>""" % \
{'CFG_SITE_URL': CFG_SITE_URL,
'short_name': CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME)[:16],
'long_name': CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME),
'description': (_("Search on %(x_CFG_SITE_NAME_INTL)s") % \
{'x_CFG_SITE_NAME_INTL': CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME)})[:1024],
'CFG_SITE_ADMIN_EMAIL': CFG_SITE_ADMIN_EMAIL,
'rss_search_syntax': self.tmpl_opensearch_rss_url_syntax,
'html_search_syntax': self.tmpl_opensearch_html_url_syntax
}
def build_search_url(self, known_parameters={}, **kargs):
""" Helper for generating a canonical search
URL. 'known_parameters' is the dict of query parameters you
inherit from your current query. You can then pass keyword
arguments to modify this query.
build_search_url(known_parameters, of="xm")
The generated URL is absolute.
"""
parameters = {}
parameters.update(known_parameters)
parameters.update(kargs)
# Now, we only have the arguments which have _not_ their default value
parameters = drop_default_urlargd(parameters, self.search_results_default_urlargd)
# Treat `as' argument specially:
if 'aas' in parameters:
parameters['as'] = parameters['aas']
del parameters['aas']
# Asking for a recid? Return a /CFG_SITE_RECORD/<recid> URL
if 'recid' in parameters:
target = "%s/%s/%s" % (CFG_SITE_URL, CFG_SITE_RECORD, parameters['recid'])
del parameters['recid']
target += make_canonical_urlargd(parameters, self.search_results_default_urlargd)
return target
return "%s/search%s" % (CFG_SITE_URL, make_canonical_urlargd(parameters, self.search_results_default_urlargd))
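A minimal standalone sketch of the drop-defaults idea behind build_search_url (hypothetical default table; the real invenio.utils.url helpers also handle washing and escaping):

```python
from urllib.parse import urlencode

DEFAULTS = {'of': 'hb', 'rg': 10, 'sc': 0}  # hypothetical default arguments

def build_search_url(site_url, **params):
    # Keep only parameters that differ from their defaults, so the
    # resulting URL is canonical (two equivalent queries compare equal).
    kept = {k: v for k, v in sorted(params.items())
            if DEFAULTS.get(k) != v}
    query = urlencode(kept)
    return site_url + '/search' + ('?' + query if query else '')

print(build_search_url('https://example.org', p='ellis', of='hb'))
# -> https://example.org/search?p=ellis
```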
def build_search_interface_url(self, known_parameters={}, **kargs):
""" Helper for generating a canonical search interface URL."""
parameters = {}
parameters.update(known_parameters)
parameters.update(kargs)
c = parameters['c']
del parameters['c']
# Now, we only have the arguments which have _not_ their default value
parameters = drop_default_urlargd(parameters, self.search_results_default_urlargd)
# Treat `as' argument specially:
if 'aas' in parameters:
parameters['as'] = parameters['aas']
del parameters['aas']
if c and c != CFG_SITE_NAME:
base = CFG_SITE_URL + '/collection/' + quote(c)
else:
base = CFG_SITE_URL
return create_url(base, parameters)
def build_rss_url(self, known_parameters, **kargs):
"""Helper for generating a canonical RSS URL"""
parameters = {}
parameters.update(known_parameters)
parameters.update(kargs)
# Keep only interesting parameters
argd = wash_urlargd(parameters, self.rss_default_urlargd)
if argd:
# Handle 'c' differently since it is a list
c = argd.get('c', [])
del argd['c']
# Create query, and drop empty params
args = make_canonical_urlargd(argd, self.rss_default_urlargd)
if c != []:
# Add collections
c = [quote(coll) for coll in c]
if args == '':
args += '?'
else:
args += '&amp;'
args += 'c=' + '&amp;c='.join(c)
return CFG_SITE_URL + '/rss' + args
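Because 'c' may appear several times in the query string, it is appended by hand rather than through a dict-based encoder; a sketch of that step (hypothetical helper, using plain '&' where the HTML output above uses '&amp;'):

```python
from urllib.parse import quote

def append_collections(args, collections):
    # 'c' is a repeatable parameter, so it is appended manually as
    # c=...&c=... instead of going through a dict-based encoder.
    if not collections:
        return args
    sep = '?' if args == '' else '&'
    return args + sep + '&'.join('c=' + quote(c) for c in collections)

print(append_collections('?p=ellis', ['Theses', 'Preprints']))
# -> ?p=ellis&c=Theses&c=Preprints
```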
def tmpl_record_page_header_content(self, req, recid, ln):
"""
Provide extra information in the header of /CFG_SITE_RECORD pages
Return (title, description, keywords), not escaped for HTML
"""
_ = gettext_set_language(ln)
title = get_fieldvalues(recid, "245__a") or \
get_fieldvalues(recid, "111__a")
if title:
title = title[0]
else:
title = _("Record") + ' #%d' % recid
keywords = ', '.join(get_fieldvalues(recid, "6531_a"))
description = ' '.join(get_fieldvalues(recid, "520__a"))
description += "\n"
description += '; '.join(get_fieldvalues(recid, "100__a") + get_fieldvalues(recid, "700__a"))
return (title, description, keywords)
def tmpl_exact_author_browse_help_link(self, p, p1, p2, p3, f, f1, f2, f3, rm, cc, ln, jrec, rg, aas, action, link_name):
"""
Creates the 'exact author' help link for browsing.
"""
_ = gettext_set_language(ln)
url = create_html_link(self.build_search_url(p=p,
p1=p1,
p2=p2,
p3=p3,
f=f,
f1=f1,
f2=f2,
f3=f3,
rm=rm,
cc=cc,
ln=ln,
jrec=jrec,
rg=rg,
aas=aas,
action=action),
{}, _(link_name), {'class': 'nearestterms'})
return "Did you mean to browse in the %s index?" % url
def tmpl_navtrail_links(self, aas, ln, dads):
"""
Creates the navigation bar at the top of each search page (*Home > Root collection > subcollection > ...*)
Parameters:
- 'aas' *int* - Should we display an advanced search box?
- 'ln' *string* - The language to display
- 'dads' *list* - The list of parent collections, each one a (url, name) tuple
"""
out = []
for url, name in dads:
args = {'c': url, 'as': aas, 'ln': ln}
out.append(create_html_link(self.build_search_interface_url(**args), {}, cgi.escape(name), {'class': 'navtrail'}))
return ' &gt; '.join(out)
def tmpl_webcoll_body(self, ln, collection, te_portalbox,
searchfor, np_portalbox, narrowsearch,
focuson, instantbrowse, ne_portalbox, show_body=True):
""" Creates the body of the main search page.
Parameters:
- 'ln' *string* - language of the page being generated
- 'collection' - collection id of the page being generated
- 'te_portalbox' *string* - The HTML code for the portalbox on top of search
- 'searchfor' *string* - The HTML code for the search for box
- 'np_portalbox' *string* - The HTML code for the portalbox on bottom of search
- 'narrowsearch' *string* - The HTML code for the search categories (left bottom of page)
- 'focuson' *string* - The HTML code for the "focuson" categories (right bottom of page)
- 'ne_portalbox' *string* - The HTML code for the bottom of the page
- 'instantbrowse' *string* - The HTML code for the instant browse box, used when 'narrowsearch' is empty
- 'show_body' *bool* - Whether to display the narrow search / focus on part of the page
"""
if not narrowsearch:
narrowsearch = instantbrowse
body = '''
<form name="search" action="%(siteurl)s/search" method="get">
%(searchfor)s
%(np_portalbox)s''' % {
'siteurl' : CFG_SITE_URL,
'searchfor' : searchfor,
'np_portalbox' : np_portalbox
}
if show_body:
body += '''
<table cellspacing="0" cellpadding="0" border="0" class="narrowandfocusonsearchbox">
<tr>
<td valign="top">%(narrowsearch)s</td>
''' % { 'narrowsearch' : narrowsearch }
if focuson:
body += """<td valign="top">""" + focuson + """</td>"""
body += """</tr></table>"""
elif focuson:
body += focuson
body += """%(ne_portalbox)s
</form>""" % {'ne_portalbox' : ne_portalbox}
return body
def tmpl_portalbox(self, title, body):
"""Creates portalboxes based on the parameters
Parameters:
- 'title' *string* - The title of the box
- 'body' *string* - The HTML code for the body of the box
"""
out = """<div class="portalbox">
<div class="portalboxheader">%(title)s</div>
<div class="portalboxbody">%(body)s</div>
</div>""" % {'title' : cgi.escape(title), 'body' : body}
return out
def tmpl_searchfor_light(self, ln, collection_id, collection_name, record_count,
example_search_queries): # EXPERIMENTAL
"""Produces light *Search for* box for the current collection.
Parameters:
- 'ln' *string* - The language to display
- 'collection_id' *string* - The collection id
- 'collection_name' *string* - The collection name in the current language
- 'record_count' *int* - Number of records in this collection
- 'example_search_queries' *list* - Search queries given as examples for this collection
"""
# load the right message language
_ = gettext_set_language(ln)
out = '''
<!--create_searchfor_light()-->
'''
argd = drop_default_urlargd({'ln': ln, 'sc': CFG_WEBSEARCH_SPLIT_BY_COLLECTION},
self.search_results_default_urlargd)
# Only add non-default hidden values
for field, value in argd.items():
out += self.tmpl_input_hidden(field, value)
header = _("Search %s records for:") % \
self.tmpl_nbrecs_info(record_count, "", "")
asearchurl = self.build_search_interface_url(c=collection_id,
aas=max(CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES),
ln=ln)
# Build example of queries for this collection
example_search_queries_links = [create_html_link(self.build_search_url(p=example_query,
ln=ln,
aas= -1,
c=collection_id),
{},
cgi.escape(example_query),
{'class': 'examplequery'}) \
for example_query in example_search_queries]
example_query_html = ''
if len(example_search_queries) > 0:
example_query_link = example_search_queries_links[0]
# offers more examples if possible
more = ''
if len(example_search_queries_links) > 1:
more = '''
<script type="text/javascript">
function toggle_more_example_queries_visibility(){
var more = document.getElementById('more_example_queries');
var link = document.getElementById('link_example_queries');
var sep = document.getElementById('more_example_sep');
if (more.style.display=='none'){
more.style.display = '';
link.innerHTML = "%(show_less)s"
link.style.color = "rgb(204,0,0)";
sep.style.display = 'none';
} else {
more.style.display = 'none';
link.innerHTML = "%(show_more)s"
link.style.color = "rgb(0,0,204)";
sep.style.display = '';
}
return false;
}
</script>
<span id="more_example_queries" style="display:none;text-align:right"><br/>%(more_example_queries)s<br/></span>
<a id="link_example_queries" href="#" onclick="toggle_more_example_queries_visibility()" style="display:none"></a>
<script type="text/javascript">
var link = document.getElementById('link_example_queries');
var sep = document.getElementById('more_example_sep');
link.style.display = '';
link.innerHTML = "%(show_more)s";
sep.style.display = '';
</script>
''' % {'more_example_queries': '<br/>'.join(example_search_queries_links[1:]),
'show_less':_("less"),
'show_more':_("more")}
example_query_html += '''<p style="text-align:right;margin:0px;">
%(example)s<span id="more_example_sep" style="display:none;">&nbsp;&nbsp;::&nbsp;</span>%(more)s
</p>
''' % {'example': _("Example: %(x_sample_search_query)s") % \
{'x_sample_search_query': example_query_link},
'more': more}
# display options to search in current collection or everywhere
search_in = ''
if collection_name != CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME):
search_in += '''
<input type="radio" name="cc" value="%(collection_id)s" id="searchCollection" checked="checked"/>
<label for="searchCollection">%(search_in_collection_name)s</label>
<input type="radio" name="cc" value="%(root_collection_name)s" id="searchEverywhere" />
<label for="searchEverywhere">%(search_everywhere)s</label>
''' % {'search_in_collection_name': _("Search in %(x_collection_name)s") % \
{'x_collection_name': collection_name},
'collection_id': collection_id,
'root_collection_name': CFG_SITE_NAME,
'search_everywhere': _("Search everywhere")}
# print commentary start:
out += '''
<table class="searchbox lightsearch">
<tbody>
<tr valign="baseline">
<td class="searchboxbody" align="right"><input type="text" name="p" size="%(sizepattern)d" value="" class="lightsearchfield"/><br/>
<small><small>%(example_query_html)s</small></small>
</td>
<td class="searchboxbody" align="left">
<input class="formbutton" type="submit" name="action_search" value="%(msg_search)s" />
</td>
<td class="searchboxbody" align="left" rowspan="2" valign="top">
<small><small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(msg_search_tips)s</a><br/>
%(asearch)s
</small></small>
</td>
</tr></table>
<!--<tr valign="baseline">
<td class="searchboxbody" colspan="2" align="left">
<small>
--><small>%(search_in)s</small><!--
</small>
</td>
</tr>
</tbody>
</table>-->
<!--/create_searchfor_light()-->
''' % {'ln' : ln,
'sizepattern' : CFG_WEBSEARCH_LIGHTSEARCH_PATTERN_BOX_WIDTH,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'siteurl' : CFG_SITE_URL,
'asearch' : create_html_link(asearchurl, {}, _('Advanced Search')),
'header' : header,
'msg_search' : _('Search'),
'msg_browse' : _('Browse'),
'msg_search_tips' : _('Search Tips'),
'search_in': search_in,
'example_query_html': example_query_html}
return out
def tmpl_searchfor_simple(self, ln, collection_id, collection_name, record_count, middle_option):
"""Produces simple *Search for* box for the current collection.
Parameters:
- 'ln' *string* - The language to display
- 'collection_id' *string* - The collection id
- 'collection_name' *string* - The collection name in the current language
- 'record_count' *int* - Number of records in this collection
- 'middle_option' *string* - HTML code for the options (any field, specific fields ...)
"""
# load the right message language
_ = gettext_set_language(ln)
out = '''
<!--create_searchfor_simple()-->
'''
argd = drop_default_urlargd({'ln': ln, 'cc': collection_id, 'sc': CFG_WEBSEARCH_SPLIT_BY_COLLECTION},
self.search_results_default_urlargd)
# Only add non-default hidden values
for field, value in argd.items():
out += self.tmpl_input_hidden(field, value)
header = _("Search %s records for:") % \
self.tmpl_nbrecs_info(record_count, "", "")
asearchurl = self.build_search_interface_url(c=collection_id,
aas=max(CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES),
ln=ln)
# print commentary start:
out += '''
<table class="searchbox simplesearch">
<thead>
<tr align="left">
<th colspan="3" class="searchboxheader">%(header)s</th>
</tr>
</thead>
<tbody>
<tr valign="baseline">
<td class="searchboxbody" align="left"><input type="text" name="p" size="%(sizepattern)d" value="" class="simplesearchfield"/></td>
<td class="searchboxbody" align="left">%(middle_option)s</td>
<td class="searchboxbody" align="left">
<input class="formbutton" type="submit" name="action_search" value="%(msg_search)s" />
<input class="formbutton" type="submit" name="action_browse" value="%(msg_browse)s" /></td>
</tr>
<tr valign="baseline">
<td class="searchboxbody" colspan="3" align="right">
<small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(msg_search_tips)s</a> ::
%(asearch)s
</small>
</td>
</tr>
</tbody>
</table>
<!--/create_searchfor_simple()-->
''' % {'ln' : ln,
'sizepattern' : CFG_WEBSEARCH_SIMPLESEARCH_PATTERN_BOX_WIDTH,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'siteurl' : CFG_SITE_URL,
'asearch' : create_html_link(asearchurl, {}, _('Advanced Search')),
'header' : header,
'middle_option' : middle_option,
'msg_search' : _('Search'),
'msg_browse' : _('Browse'),
'msg_search_tips' : _('Search Tips')}
return out
def tmpl_searchfor_advanced(self,
ln, # current language
collection_id,
collection_name,
record_count,
middle_option_1, middle_option_2, middle_option_3,
searchoptions,
sortoptions,
rankoptions,
displayoptions,
formatoptions
):
"""
Produces advanced *Search for* box for the current collection.
Parameters:
- 'ln' *string* - The language to display
- 'middle_option_1' *string* - HTML code for the first row of options (any field, specific fields ...)
- 'middle_option_2' *string* - HTML code for the second row of options (any field, specific fields ...)
- 'middle_option_3' *string* - HTML code for the third row of options (any field, specific fields ...)
- 'searchoptions' *string* - HTML code for the search options
- 'sortoptions' *string* - HTML code for the sort options
- 'rankoptions' *string* - HTML code for the rank options
- 'displayoptions' *string* - HTML code for the display options
- 'formatoptions' *string* - HTML code for the format options
"""
# load the right message language
_ = gettext_set_language(ln)
out = '''
<!--create_searchfor_advanced()-->
'''
argd = drop_default_urlargd({'ln': ln, 'aas': 1, 'cc': collection_id, 'sc': CFG_WEBSEARCH_SPLIT_BY_COLLECTION},
self.search_results_default_urlargd)
# Only add non-default hidden values
for field, value in argd.items():
out += self.tmpl_input_hidden(field, value)
header = _("Search %s records for") % \
self.tmpl_nbrecs_info(record_count, "", "")
header += ':'
ssearchurl = self.build_search_interface_url(c=collection_id, aas=min(CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES), ln=ln)
out += '''
<table class="searchbox advancedsearch">
<thead>
<tr>
<th class="searchboxheader" colspan="3">%(header)s</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody" style="white-space: nowrap;">
%(matchbox_m1)s<input type="text" name="p1" size="%(sizepattern)d" value="" class="advancedsearchfield"/>
</td>
<td class="searchboxbody" style="white-space: nowrap;">%(middle_option_1)s</td>
<td class="searchboxbody">%(andornot_op1)s</td>
</tr>
<tr valign="bottom">
<td class="searchboxbody" style="white-space: nowrap;">
%(matchbox_m2)s<input type="text" name="p2" size="%(sizepattern)d" value="" class="advancedsearchfield"/>
</td>
<td class="searchboxbody">%(middle_option_2)s</td>
<td class="searchboxbody">%(andornot_op2)s</td>
</tr>
<tr valign="bottom">
<td class="searchboxbody" style="white-space: nowrap;">
%(matchbox_m3)s<input type="text" name="p3" size="%(sizepattern)d" value="" class="advancedsearchfield"/>
</td>
<td class="searchboxbody">%(middle_option_3)s</td>
<td class="searchboxbody" style="white-space: nowrap;">
<input class="formbutton" type="submit" name="action_search" value="%(msg_search)s" />
<input class="formbutton" type="submit" name="action_browse" value="%(msg_browse)s" /></td>
</tr>
<tr valign="bottom">
<td colspan="3" class="searchboxbody" align="right">
<small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(msg_search_tips)s</a> ::
%(ssearch)s
</small>
</td>
</tr>
</tbody>
</table>
<!-- @todo - more imports -->
''' % {'ln' : ln,
'sizepattern' : CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'siteurl' : CFG_SITE_URL,
'ssearch' : create_html_link(ssearchurl, {}, _("Simple Search")),
'header' : header,
'matchbox_m1' : self.tmpl_matchtype_box('m1', ln=ln),
'middle_option_1' : middle_option_1,
'andornot_op1' : self.tmpl_andornot_box('op1', ln=ln),
'matchbox_m2' : self.tmpl_matchtype_box('m2', ln=ln),
'middle_option_2' : middle_option_2,
'andornot_op2' : self.tmpl_andornot_box('op2', ln=ln),
'matchbox_m3' : self.tmpl_matchtype_box('m3', ln=ln),
'middle_option_3' : middle_option_3,
'msg_search' : _("Search"),
'msg_browse' : _("Browse"),
'msg_search_tips' : _("Search Tips")}
if (searchoptions):
out += """<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(searchheader)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">%(searchoptions)s</td>
</tr>
</tbody>
</table>""" % {
'searchheader' : _("Search options:"),
'searchoptions' : searchoptions
}
out += """<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(added)s
</th>
<th class="searchboxheader">
%(until)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">%(added_or_modified)s %(date_added)s</td>
<td class="searchboxbody">%(date_until)s</td>
</tr>
</tbody>
</table>
<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(msg_sort)s
</th>
<th class="searchboxheader">
%(msg_display)s
</th>
<th class="searchboxheader">
%(msg_format)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">%(sortoptions)s %(rankoptions)s</td>
<td class="searchboxbody">%(displayoptions)s</td>
<td class="searchboxbody">%(formatoptions)s</td>
</tr>
</tbody>
</table>
<!--/create_searchfor_advanced()-->
""" % {
'added' : _("Added/modified since:"),
'until' : _("until:"),
'added_or_modified': self.tmpl_inputdatetype(ln=ln),
'date_added' : self.tmpl_inputdate("d1", ln=ln),
'date_until' : self.tmpl_inputdate("d2", ln=ln),
'msg_sort' : _("Sort by:"),
'msg_display' : _("Display results:"),
'msg_format' : _("Output format:"),
'sortoptions' : sortoptions,
'rankoptions' : rankoptions,
'displayoptions' : displayoptions,
'formatoptions' : formatoptions
}
return out
def tmpl_matchtype_box(self, name='m', value='', ln='en'):
"""Returns HTML code for the 'match type' selection box.
Parameters:
- 'name' *string* - The name of the produced select
- 'value' *string* - The selected value (if any value is already selected)
- 'ln' *string* - the language to display
"""
# load the right message language
_ = gettext_set_language(ln)
out = """
<select name="%(name)s">
<option value="a"%(sela)s>%(opta)s</option>
<option value="o"%(selo)s>%(opto)s</option>
<option value="e"%(sele)s>%(opte)s</option>
<option value="p"%(selp)s>%(optp)s</option>
<option value="r"%(selr)s>%(optr)s</option>
</select>
""" % {'name' : name,
'sela' : self.tmpl_is_selected('a', value),
'opta' : _("All of the words:"),
'selo' : self.tmpl_is_selected('o', value),
'opto' : _("Any of the words:"),
'sele' : self.tmpl_is_selected('e', value),
'opte' : _("Exact phrase:"),
'selp' : self.tmpl_is_selected('p', value),
'optp' : _("Partial phrase:"),
'selr' : self.tmpl_is_selected('r', value),
'optr' : _("Regular expression:")
}
return out
def tmpl_is_selected(self, var, fld):
"""
Checks if *var* and *fld* are equal, and if yes, returns ' selected="selected"'. Useful for select boxes.
Parameters:
- 'var' *string* - First value to compare
- 'fld' *string* - Second value to compare
"""
if var == fld:
return ' selected="selected"'
else:
return ""
def tmpl_andornot_box(self, name='op', value='', ln='en'):
"""
Returns HTML code for the AND/OR/NOT selection box.
Parameters:
- 'name' *string* - The name of the produced select
- 'value' *string* - The selected value (if any value is already selected)
- 'ln' *string* - the language to display
"""
# load the right message language
_ = gettext_set_language(ln)
out = """
<select name="%(name)s">
<option value="a"%(sela)s>%(opta)s</option>
<option value="o"%(selo)s>%(opto)s</option>
<option value="n"%(seln)s>%(optn)s</option>
</select>
""" % {'name' : name,
'sela' : self.tmpl_is_selected('a', value), 'opta' : _("AND"),
'selo' : self.tmpl_is_selected('o', value), 'opto' : _("OR"),
'seln' : self.tmpl_is_selected('n', value), 'optn' : _("AND NOT")
}
return out
def tmpl_inputdate(self, name, ln, sy=0, sm=0, sd=0):
"""
Produces *From Date*, *Until Date* kind of selection box. Suitable for search options.
Parameters:
- 'name' *string* - The base name of the produced selects
- 'ln' *string* - the language to display
"""
# load the right message language
_ = gettext_set_language(ln)
box = """
<select name="%(name)sd">
<option value=""%(sel)s>%(any)s</option>
""" % {
'name' : name,
'any' : _("any day"),
'sel' : self.tmpl_is_selected(sd, 0)
}
for day in range(1, 32):
box += """<option value="%02d"%s>%02d</option>""" % (day, self.tmpl_is_selected(sd, day), day)
box += """</select>"""
# month
box += """
<select name="%(name)sm">
<option value=""%(sel)s>%(any)s</option>
""" % {
'name' : name,
'any' : _("any month"),
'sel' : self.tmpl_is_selected(sm, 0)
}
# trailing space in May distinguishes short/long form of the month name
for mm, month in [(1, _("January")), (2, _("February")), (3, _("March")), (4, _("April")), \
(5, _("May ")), (6, _("June")), (7, _("July")), (8, _("August")), \
(9, _("September")), (10, _("October")), (11, _("November")), (12, _("December"))]:
box += """<option value="%02d"%s>%s</option>""" % (mm, self.tmpl_is_selected(sm, mm), month.strip())
box += """</select>"""
# year
box += """
<select name="%(name)sy">
<option value=""%(sel)s>%(any)s</option>
""" % {
'name' : name,
'any' : _("any year"),
'sel' : self.tmpl_is_selected(sy, 0)
}
this_year = int(time.strftime("%Y", time.localtime()))
for year in range(this_year - 20, this_year + 1):
box += """<option value="%d"%s>%d</option>""" % (year, self.tmpl_is_selected(sy, year), year)
box += """</select>"""
return box
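The year loop above always spans the last 21 years ending at the current one. That piece in isolation, as a self-contained sketch (the function name is ours):

```python
import time

def year_options(selected=0, span=20):
    """Build the year <option> list as tmpl_inputdate does: from
    (current year - span) through the current year, marking `selected`."""
    this_year = int(time.strftime("%Y", time.localtime()))
    out = []
    for year in range(this_year - span, this_year + 1):
        sel = ' selected="selected"' if year == selected else ''
        out.append('<option value="%d"%s>%d</option>' % (year, sel, year))
    return ''.join(out)
```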
def tmpl_inputdatetype(self, dt='', ln=CFG_SITE_LANG):
"""
Produces input date type selection box to choose
added-or-modified date search option.
Parameters:
- 'dt' *string* - date type (c=created, m=modified)
- 'ln' *string* - the language to display
"""
# load the right message language
_ = gettext_set_language(ln)
box = """<select name="dt">
<option value="">%(added)s </option>
<option value="m"%(sel)s>%(modified)s </option>
</select>
""" % { 'added': _("Added since:"),
'modified': _("Modified since:"),
'sel': self.tmpl_is_selected(dt, 'm'),
}
return box
def tmpl_narrowsearch(self, aas, ln, type, father,
has_grandchildren, sons, display_grandsons,
grandsons):
"""
Creates a list of the collection descendants of type *type*, under the appropriate title.
If aas==1, links point to the Advanced Search interface; otherwise to Simple Search.
Suitable for 'Narrow search' and 'Focus on' boxes.
Parameters:
- 'aas' *bool* - Should we display an advanced search box?
- 'ln' *string* - The language to display
- 'type' *string* - The type of the produced box (virtual collections or normal collections)
- 'father' *collection* - The current collection
- 'has_grandchildren' *bool* - Whether the current collection has grandchildren
- 'sons' *list* - The list of the sub-collections (first level)
- 'display_grandsons' *bool* - Whether the grandchildren collections should be displayed (2-level-deep display)
- 'grandsons' *list* - The list of sub-collections (second level)
"""
# load the right message language
_ = gettext_set_language(ln)
title = {'r': _("Narrow by collection:"),
'v': _("Focus on:")}[type]
if has_grandchildren:
style_prolog = "<strong>"
style_epilog = "</strong>"
else:
style_prolog = ""
style_epilog = ""
out = """<table class="%(narrowsearchbox)s">
<thead>
<tr>
<th colspan="2" align="left" class="%(narrowsearchbox)sheader">
%(title)s
</th>
</tr>
</thead>
<tbody>""" % {'title' : title,
'narrowsearchbox': {'r': 'narrowsearchbox',
'v': 'focusonsearchbox'}[type]}
# iterate through sons:
i = 0
for son in sons:
out += """<tr><td class="%(narrowsearchbox)sbody" valign="top">""" % \
{ 'narrowsearchbox': {'r': 'narrowsearchbox',
'v': 'focusonsearchbox'}[type]}
if type == 'r':
if son.restricted_p() and son.restricted_p() != father.restricted_p():
out += """<input type="checkbox" name="c" value="%(name)s" /></td>""" % {'name' : cgi.escape(son.name) }
# hosted collections are checked by default only when so configured
elif str(son.dbquery).startswith("hostedcollection:"):
external_collection_engine = get_external_collection_engine(str(son.name))
if external_collection_engine and external_collection_engine.selected_by_default:
out += """<input type="checkbox" name="c" value="%(name)s" checked="checked" /></td>""" % {'name' : cgi.escape(son.name) }
elif external_collection_engine and not external_collection_engine.selected_by_default:
out += """<input type="checkbox" name="c" value="%(name)s" /></td>""" % {'name' : cgi.escape(son.name) }
else:
# strangely, the external collection engine was never found. In that case,
# why was the hosted collection here in the first place?
out += """<input type="checkbox" name="c" value="%(name)s" /></td>""" % {'name' : cgi.escape(son.name) }
else:
out += """<input type="checkbox" name="c" value="%(name)s" checked="checked" /></td>""" % {'name' : cgi.escape(son.name) }
else:
out += '</td>'
out += """<td valign="top">%(link)s%(recs)s """ % {
'link': create_html_link(self.build_search_interface_url(c=son.name, ln=ln, aas=aas),
{}, style_prolog + cgi.escape(son.get_name(ln)) + style_epilog),
'recs' : self.tmpl_nbrecs_info(son.nbrecs, ln=ln)}
# the following prints the "external collection" arrow just after the name and
# number of records of the hosted collection
# 1) we might want to make the arrow work as an anchor to the hosted collection as well.
# That would probably require a new separate function under invenio.utils.url
# 2) we might want to place the arrow between the name and the number of records of the hosted collection
# That would require to edit/separate the above out += ...
if type == 'r':
if str(son.dbquery).startswith("hostedcollection:"):
out += """<img src="%(siteurl)s/img/external-icon-light-8x8.gif" border="0" alt="%(name)s"/>""" % \
{ 'siteurl' : CFG_SITE_URL, 'name' : cgi.escape(son.name), }
if son.restricted_p():
out += """ <small class="warning">[%(msg)s]</small> """ % { 'msg' : _("restricted") }
if display_grandsons and len(grandsons[i]):
# iterate through grandsons:
out += """<br />"""
for grandson in grandsons[i]:
out += """ <small>%(link)s%(nbrec)s</small> """ % {
'link': create_html_link(self.build_search_interface_url(c=grandson.name, ln=ln, aas=aas),
{},
cgi.escape(grandson.get_name(ln))),
'nbrec' : self.tmpl_nbrecs_info(grandson.nbrecs, ln=ln)}
# the following prints the "external collection" arrow just after the name and
# number of records of the hosted collection
# Related comments have been made just above
if type == 'r':
if str(grandson.dbquery).startswith("hostedcollection:"):
out += """<img src="%(siteurl)s/img/external-icon-light-8x8.gif" border="0" alt="%(name)s"/>""" % \
{ 'siteurl' : CFG_SITE_URL, 'name' : cgi.escape(grandson.name), }
out += """</td></tr>"""
i += 1
out += "</tbody></table>"
return out
def tmpl_searchalso(self, ln, engines_list, collection_id):
_ = gettext_set_language(ln)
box_name = _("Search also:")
html = """<table cellspacing="0" cellpadding="0" border="0">
<tr><td valign="top"><table class="searchalsosearchbox">
<thead><tr><th colspan="2" align="left" class="searchalsosearchboxheader">%(box_name)s
</th></tr></thead><tbody>
""" % locals()
for engine in engines_list:
internal_name = engine.name
name = _(internal_name)
base_url = engine.base_url
if external_collection_get_state(engine, collection_id) == 3:
checked = ' checked="checked"'
else:
checked = ''
html += """<tr><td class="searchalsosearchboxbody" valign="top">
<input type="checkbox" name="ec" id="%(id)s" value="%(internal_name)s" %(checked)s /></td>
<td valign="top" class="searchalsosearchboxbody">
<div style="white-space: nowrap"><label for="%(id)s">%(name)s</label>
<a href="%(base_url)s">
<img src="%(siteurl)s/img/external-icon-light-8x8.gif" border="0" alt="%(name)s"/></a>
</div></td></tr>""" % \
{ 'checked': checked,
'base_url': base_url,
'internal_name': internal_name,
'name': cgi.escape(name),
'id': "extSearch" + nmtoken_from_string(name),
'siteurl': CFG_SITE_URL, }
html += """</tbody></table></td></tr></table>"""
return html
def tmpl_nbrecs_info(self, number, prolog=None, epilog=None, ln=CFG_SITE_LANG):
"""
Return information on the number of records.
Parameters:
- 'number' *int* - The number of records
- 'prolog' *string* (optional) - An HTML code to prefix the number (if **None**, will be
'<small class="nbdoccoll">(')
- 'epilog' *string* (optional) - An HTML code to append to the number (if **None**, will be
')</small>')
"""
if number is None:
number = 0
if prolog is None:
prolog = '''&nbsp;<small class="nbdoccoll">('''
if epilog is None:
epilog = ''')</small>'''
return prolog + self.tmpl_nice_number(number, ln) + epilog
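Stripped of the i18n and nice-number formatting, the prolog/epilog defaulting reduces to this sketch (the helper name is ours):

```python
def nbrecs_info(number, prolog=None, epilog=None):
    """Wrap a record count in the default prolog/epilog markup
    (sketch of tmpl_nbrecs_info; locale-aware number formatting omitted)."""
    if number is None:
        number = 0
    if prolog is None:
        prolog = '&nbsp;<small class="nbdoccoll">('
    if epilog is None:
        epilog = ')</small>'
    return prolog + str(number) + epilog
```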
def tmpl_box_restricted_content(self, ln):
"""
Displays a box containing a *restricted content* message
Parameters:
- 'ln' *string* - The language to display
"""
# load the right message language
_ = gettext_set_language(ln)
return _("This collection is restricted. If you are authorized to access it, please click on the Search button.")
def tmpl_box_hosted_collection(self, ln):
"""
Displays a box containing a *hosted collection* message
Parameters:
- 'ln' *string* - The language to display
"""
# load the right message language
_ = gettext_set_language(ln)
return _("This is a hosted external collection. Please click on the Search button to see its content.")
def tmpl_box_no_records(self, ln):
"""
Displays a box containing a *no content* message
Parameters:
- 'ln' *string* - The language to display
"""
# load the right message language
_ = gettext_set_language(ln)
return _("This collection does not contain any document yet.")
def tmpl_instant_browse(self, aas, ln, recids, more_link=None, grid_layout=False):
"""
Formats a list of records (given in the recids list) from the database.
Parameters:
- 'aas' *int* - Advanced Search interface or not (0 or 1)
- 'ln' *string* - The language to display
- 'recids' *list* - the list of records from the database
- 'more_link' *string* - the URL for the "more" link; if not given, the link will not be displayed
- 'grid_layout' *bool* - whether to render the records in a grid rather than in a dated table
"""
# load the right message language
_ = gettext_set_language(ln)
body = '''<table class="latestadditionsbox">'''
if grid_layout:
body += '<tr><td><div>'
for recid in recids:
if grid_layout:
body += '''
<abbr class="unapi-id" title="%(recid)s"></abbr>
%(body)s
''' % {
'recid': recid['id'],
'body': recid['body']}
else:
body += '''
<tr>
<td class="latestadditionsboxtimebody">%(date)s</td>
<td class="latestadditionsboxrecordbody">
<abbr class="unapi-id" title="%(recid)s"></abbr>
%(body)s
</td>
</tr>''' % {
'recid': recid['id'],
'date': recid['date'],
'body': recid['body']
}
if grid_layout:
body += '''<div style="clear:both"></div>'''
body += '''</div></td></tr>'''
body += "</table>"
if more_link:
body += '<div align="right"><small>' + \
create_html_link(more_link, {}, '[&gt;&gt; %s]' % _("more")) + \
'</small></div>'
return '''
<table class="narrowsearchbox">
<thead>
<tr>
<th class="narrowsearchboxheader">%(header)s</th>
</tr>
</thead>
<tbody>
<tr>
<td class="narrowsearchboxbody">%(body)s</td>
</tr>
</tbody>
</table>''' % {'header' : _("Latest additions:"),
'body' : body,
}
def tmpl_searchwithin_select(self, ln, fieldname, selected, values):
"""
Produces 'search within' selection box for the current collection.
Parameters:
- 'ln' *string* - The language to display
- 'fieldname' *string* - the name of the select box produced
- 'selected' *string* - which of the values is selected
- 'values' *list* - the list of values in the select
"""
out = '<select name="%(fieldname)s">' % {'fieldname': fieldname}
if values:
for pair in values:
out += """<option value="%(value)s"%(selected)s>%(text)s</option>""" % {
'value' : cgi.escape(pair['value']),
'selected' : self.tmpl_is_selected(pair['value'], selected),
'text' : cgi.escape(pair['text'])
}
out += """</select>"""
return out
def tmpl_select(self, fieldname, values, selected=None, css_class=''):
"""
Produces a generic select box
Parameters:
- 'css_class' *string* - optional, a css class to display this select with
- 'fieldname' *string* - the name of the select box produced
- 'selected' *string* - which of the values is selected
- 'values' *list* - the list of values in the select
"""
if css_class != '':
class_field = ' class="%s"' % css_class
else:
class_field = ''
out = '<select name="%(fieldname)s"%(class)s>' % {
'fieldname' : fieldname,
'class' : class_field
}
for pair in values:
if pair.get('selected', False) or pair['value'] == selected:
flag = ' selected="selected"'
else:
flag = ''
out += '<option value="%(value)s"%(selected)s>%(text)s</option>' % {
'value' : cgi.escape(str(pair['value'])),
'selected' : flag,
'text' : cgi.escape(pair['text'])
}
out += """</select>"""
return out
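A standalone version of this generic select builder, for illustration. Note that `cgi.escape` no longer exists on Python 3; `html.escape` is the usual stand-in (an assumption of this sketch, not a change to the original code):

```python
import html

def render_select(fieldname, values, selected=None, css_class=''):
    """Generic select-box builder in the spirit of tmpl_select."""
    class_attr = ' class="%s"' % css_class if css_class else ''
    out = ['<select name="%s"%s>' % (fieldname, class_attr)]
    for pair in values:
        # an entry is selected either explicitly or by matching `selected`
        flag = ' selected="selected"' if (
            pair.get('selected', False) or pair['value'] == selected) else ''
        out.append('<option value="%s"%s>%s</option>' % (
            html.escape(str(pair['value']), quote=True), flag,
            html.escape(pair['text'])))
    out.append('</select>')
    return ''.join(out)
```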
def tmpl_record_links(self, recid, ln, sf='', so='d', sp='', rm=''):
"""
Displays the *More info* and *Find similar* links for a record
Parameters:
- 'ln' *string* - The language to display
- 'recid' *string* - the id of the displayed record
"""
# load the right message language
_ = gettext_set_language(ln)
out = '''<br /><span class="moreinfo">%(detailed)s - %(similar)s</span>''' % {
'detailed': create_html_link(self.build_search_url(recid=recid, ln=ln),
{},
_("Detailed record"), {'class': "moreinfo"}),
'similar': create_html_link(self.build_search_url(p="recid:%d" % recid, rm='wrd', ln=ln),
{},
_("Similar records"),
{'class': "moreinfo"})}
if CFG_BIBRANK_SHOW_CITATION_LINKS:
num_timescited = get_cited_by_count(recid)
if num_timescited:
out += '''<span class="moreinfo"> - %s </span>''' % \
create_html_link(self.build_search_url(p='refersto:recid:%d' % recid,
sf=sf,
so=so,
sp=sp,
rm=rm,
ln=ln),
{}, _("Cited by %i records") % num_timescited, {'class': "moreinfo"})
return out
def tmpl_record_body(self, titles, authors, dates, rns, abstracts, urls_u, urls_z, ln):
"""
Displays the "HTML basic" format of a record
Parameters:
- 'titles' *list* - the titles of the record (as strings)
- 'authors' *list* - the authors (as strings)
- 'dates' *list* - the dates of publication
- 'rns' *list* - the report numbers of the record (displayed as quick notes)
- 'abstracts' *list* - the abstracts for the record
- 'urls_u' *list* - URLs to the original versions of the record
- 'urls_z' *list* - Not used
"""
out = ""
for title in titles:
out += "<strong>%(title)s</strong> " % {
'title' : cgi.escape(title)
}
if authors:
out += " / "
for author in authors[:CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD]:
out += '%s ' % \
create_html_link(self.build_search_url(p=author, f='author', ln=ln),
{}, cgi.escape(author))
if len(authors) > CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD:
out += "<em>et al</em>"
for date in dates:
out += " %s." % cgi.escape(date)
for rn in rns:
out += """ <small class="quicknote">[%(rn)s]</small>""" % {'rn' : cgi.escape(rn)}
for abstract in abstracts:
out += "<br /><small>%(abstract)s [...]</small>" % {'abstract' : cgi.escape(abstract[:1 + string.find(abstract, '.')]) }
for idx in range(0, len(urls_u)):
out += """<br /><small class="note"><a class="note" href="%(url)s">%(name)s</a></small>""" % {
'url' : urls_u[idx],
'name' : urls_u[idx]
}
return out
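The author list above is cut at CFG_WEBSEARCH_AUTHOR_ET_AL_THRESHOLD with an "et al" suffix. The truncation rule in isolation, without the search links (the default threshold here is an arbitrary assumption):

```python
def format_authors(authors, threshold=3):
    """Link-free sketch of the et-al truncation in tmpl_record_body:
    keep the first `threshold` authors, then append 'et al'."""
    out = ' '.join(authors[:threshold])
    if len(authors) > threshold:
        out += ' et al'
    return out
```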
def tmpl_search_in_bibwords(self, p, f, ln, nearest_box):
"""
Displays the *Words like current ones* links for a search
Parameters:
- 'p' *string* - Current search words
- 'f' *string* - the fields in which the search was done
- 'nearest_box' *string* - the HTML code for the "nearest_terms" box - most probably from a create_nearest_terms_box call
"""
# load the right message language
_ = gettext_set_language(ln)
out = '<p>'
if f:
out += _("Words nearest to %(x_word)s inside %(x_field)s in any collection are:") % {'x_word': '<em>' + cgi.escape(p) + '</em>',
'x_field': '<em>' + cgi.escape(f) + '</em>'}
else:
out += _("Words nearest to %(x_word)s in any collection are:") % {'x_word': '<em>' + cgi.escape(p) + '</em>'}
out += '<br />' + nearest_box + '</p>'
return out
def tmpl_nearest_term_box(self, p, ln, f, terminfo, intro):
"""
Displays the *Nearest search terms* box
Parameters:
- 'p' *string* - Current search words
- 'f' *string* - the field in which the search was done
- 'ln' *string* - The language to display
- 'terminfo' *list* - (term, hits, argd) tuples, one per nearby term
- 'intro' *string* - the intro HTML to prefix the box with
"""
out = '''<table class="nearesttermsbox" cellpadding="0" cellspacing="0" border="0">'''
for term, hits, argd in terminfo:
if hits:
hitsinfo = str(hits)
else:
hitsinfo = '-'
argd['f'] = f
argd['p'] = term
term = cgi.escape(term)
# FIXME this is hack to get correct links to nearest terms
from flask import has_request_context, request
if has_request_context() and request.values.get('of', '') != argd.get('of', ''):
if 'of' in request.values:
argd['of'] = request.values.get('of')
else:
del argd['of']
if term == p: # print search word for orientation:
nearesttermsboxbody_class = "nearesttermsboxbodyselected"
if hits > 0:
term = create_html_link(self.build_search_url(argd), {},
term, {'class': "nearesttermsselected"})
else:
nearesttermsboxbody_class = "nearesttermsboxbody"
term = create_html_link(self.build_search_url(argd), {},
term, {'class': "nearestterms"})
out += '''\
<tr>
<td class="%(nearesttermsboxbody_class)s" align="right">%(hits)s</td>
<td class="%(nearesttermsboxbody_class)s" width="15">&nbsp;</td>
<td class="%(nearesttermsboxbody_class)s" align="left">%(term)s</td>
</tr>
''' % {'hits': hitsinfo,
'nearesttermsboxbody_class': nearesttermsboxbody_class,
'term': term}
out += "</table>"
return intro + "<blockquote>" + out + "</blockquote>"
def tmpl_browse_pattern(self, f, fn, ln, browsed_phrases_in_colls, colls, rg):
"""
Displays the *Browse* results box, listing the browsed phrases with their hit counts
Parameters:
- 'f' *string* - field (*not* i18nized)
- 'fn' *string* - field name (i18nized)
- 'ln' *string* - The language to display
- 'browsed_phrases_in_colls' *array* - the phrases to display
- 'colls' *array* - the list of collection parameters of the search (c's)
- 'rg' *int* - the number of records per page
"""
# load the right message language
_ = gettext_set_language(ln)
out = """<table class="searchresultsbox">
<thead>
<tr>
<th class="searchresultsboxheader" style="text-align: right;" width="15">
%(hits)s
</th>
<th class="searchresultsboxheader" width="15">
&nbsp;
</th>
<th class="searchresultsboxheader" style="text-align: left;">
%(fn)s
</th>
</tr>
</thead>
<tbody>""" % {
'hits' : _("Hits"),
'fn' : cgi.escape(fn)
}
if len(browsed_phrases_in_colls) == 1:
# one hit only found:
phrase, nbhits = browsed_phrases_in_colls[0][0], browsed_phrases_in_colls[0][1]
query = {'c': colls,
'ln': ln,
'p': '"%s"' % phrase.replace('"', '\\"'),
'f': f,
'rg' : rg}
out += """<tr>
<td class="searchresultsboxbody" style="text-align: right;">
%(nbhits)s
</td>
<td class="searchresultsboxbody" width="15">
&nbsp;
</td>
<td class="searchresultsboxbody" style="text-align: left;">
%(link)s
</td>
</tr>""" % {'nbhits': nbhits,
'link': create_html_link(self.build_search_url(query),
{}, cgi.escape(phrase))}
elif len(browsed_phrases_in_colls) > 1:
# first display all hits but the last one:
for phrase, nbhits in browsed_phrases_in_colls[:-1]:
query = {'c': colls,
'ln': ln,
'p': '"%s"' % phrase.replace('"', '\\"'),
'f': f,
'rg' : rg}
out += """<tr>
<td class="searchresultsboxbody" style="text-align: right;">
%(nbhits)s
</td>
<td class="searchresultsboxbody" width="15">
&nbsp;
</td>
<td class="searchresultsboxbody" style="text-align: left;">
%(link)s
</td>
</tr>""" % {'nbhits' : nbhits,
'link': create_html_link(self.build_search_url(query),
{},
cgi.escape(phrase))}
# now display first hit as "previous term":
phrase, nbhits = browsed_phrases_in_colls[0]
query_previous = {'c': colls,
'ln': ln,
'p': '"%s"' % phrase.replace('"', '\\"'),
'f': f,
'rg' : rg}
# now display last hit as "next term":
phrase, nbhits = browsed_phrases_in_colls[-1]
query_next = {'c': colls,
'ln': ln,
'p': '"%s"' % phrase.replace('"', '\\"'),
'f': f,
'rg' : rg}
out += """<tr><td colspan="2" class="normal">
&nbsp;
</td>
<td class="normal">
%(link_previous)s
<img src="%(siteurl)s/img/sp.gif" alt="" border="0" />
<img src="%(siteurl)s/img/sn.gif" alt="" border="0" />
%(link_next)s
</td>
</tr>""" % {'link_previous': create_html_link(self.build_search_url(query_previous, action='browse'), {}, _("Previous")),
'link_next': create_html_link(self.build_search_url(query_next, action='browse'),
{}, _("next")),
'siteurl' : CFG_SITE_URL}
out += """</tbody>
</table>"""
return out
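Each browse row links to a phrase search; the query construction, repeated three times above, amounts to this (the helper name is ours):

```python
def browse_query(phrase, colls, f, rg, ln='en'):
    """Build the phrase-search query dict used by tmpl_browse_pattern:
    the phrase is double-quoted, embedded quotes backslash-escaped."""
    return {'c': colls, 'ln': ln, 'f': f, 'rg': rg,
            'p': '"%s"' % phrase.replace('"', '\\"')}
```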
def tmpl_search_box(self, ln, aas, cc, cc_intl, ot, sp,
action, fieldslist, f1, f2, f3, m1, m2, m3,
p1, p2, p3, op1, op2, rm, p, f, coll_selects,
d1y, d2y, d1m, d2m, d1d, d2d, dt, sort_fields,
sf, so, ranks, sc, rg, formats, of, pl, jrec, ec,
show_colls=True, show_title=True):
"""
Displays the search box (light, simple, or advanced interface, depending on *aas*)
Parameters:
- 'ln' *string* - The language to display
- 'aas' *int* - which search interface to display: -1 to 1, from simpler to more advanced
- 'cc_intl' *string* - the i18nized current collection name, used for display
- 'cc' *string* - the internal current collection name
- 'ot', 'sp' *string* - hidden values
- 'action' *string* - the action demanded by the user
- 'fieldslist' *list* - the list of all fields available, for use in select within boxes in advanced search
- 'p, f, f1, f2, f3, m1, m2, m3, p1, p2, p3, op1, op2, rm' *strings* - the search parameters
- 'coll_selects' *array* - a list of lists, each containing the collections selects to display
- 'd1y, d2y, d1m, d2m, d1d, d2d' *int* - the search between dates
- 'dt' *string* - the dates' types (creation dates, modification dates)
- 'sort_fields' *array* - the select information for the sort fields
- 'sf' *string* - the currently selected sort field
- 'so' *string* - the currently selected sort order ("a" or "d")
- 'ranks' *array* - ranking methods
- 'rm' *string* - selected ranking method
- 'sc' *string* - split by collection or not
- 'rg' *string* - selected results/page
- 'formats' *array* - available output formats
- 'of' *string* - the selected output format
- 'pl' *string* - 'limit to' search pattern
- 'show_colls' *bool* - propose the collection selection box?
- 'show_title' *bool* - show cc_intl in the page title?
"""
# load the right message language
_ = gettext_set_language(ln)
# These are hidden fields the user does not manipulate
# directly
if aas == -1:
argd = drop_default_urlargd({
'ln': ln, 'aas': aas,
'ot': ot, 'sp': sp, 'ec': ec,
}, self.search_results_default_urlargd)
else:
argd = drop_default_urlargd({
'cc': cc, 'ln': ln, 'aas': aas,
'ot': ot, 'sp': sp, 'ec': ec,
}, self.search_results_default_urlargd)
out = ""
if show_title:
# display cc name if asked for
out += '''
<h1 class="headline">%(ccname)s</h1>''' % {'ccname' : cgi.escape(cc_intl), }
out += '''
<form name="search" action="%(siteurl)s/search" method="get">
''' % {'siteurl' : CFG_SITE_URL}
# Only add non-default hidden values
for field, value in argd.items():
out += self.tmpl_input_hidden(field, value)
leadingtext = _("Search")
if action == 'browse':
leadingtext = _("Browse")
if aas == 1:
# print Advanced Search form:
# define search box elements:
out += '''
<table class="searchbox advancedsearch">
<thead>
<tr>
<th colspan="3" class="searchboxheader">
%(leading)s:
</th>
</tr>
</thead>
<tbody>
<tr valign="top" style="white-space:nowrap;">
<td class="searchboxbody">%(matchbox1)s
<input type="text" name="p1" size="%(sizepattern)d" value="%(p1)s" class="advancedsearchfield"/>
</td>
<td class="searchboxbody">%(searchwithin1)s</td>
<td class="searchboxbody">%(andornot1)s</td>
</tr>
<tr valign="top">
<td class="searchboxbody">%(matchbox2)s
<input type="text" name="p2" size="%(sizepattern)d" value="%(p2)s" class="advancedsearchfield"/>
</td>
<td class="searchboxbody">%(searchwithin2)s</td>
<td class="searchboxbody">%(andornot2)s</td>
</tr>
<tr valign="top">
<td class="searchboxbody">%(matchbox3)s
<input type="text" name="p3" size="%(sizepattern)d" value="%(p3)s" class="advancedsearchfield"/>
</td>
<td class="searchboxbody">%(searchwithin3)s</td>
<td class="searchboxbody" style="white-space:nowrap;">
<input class="formbutton" type="submit" name="action_search" value="%(search)s" />
<input class="formbutton" type="submit" name="action_browse" value="%(browse)s" />&nbsp;
</td>
</tr>
<tr valign="bottom">
<td colspan="3" align="right" class="searchboxbody">
<small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(search_tips)s</a> ::
%(simple_search)s
</small>
</td>
</tr>
</tbody>
</table>
''' % {
'simple_search': create_html_link(self.build_search_url(p=p1, f=f1, rm=rm, cc=cc, ln=ln, jrec=jrec, rg=rg),
{}, _("Simple Search")),
'leading' : leadingtext,
'sizepattern' : CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH,
'matchbox1' : self.tmpl_matchtype_box('m1', m1, ln=ln),
'p1' : cgi.escape(p1, 1),
'searchwithin1' : self.tmpl_searchwithin_select(
ln=ln,
fieldname='f1',
selected=f1,
values=self._add_mark_to_field(value=f1, fields=fieldslist, ln=ln)
),
'andornot1' : self.tmpl_andornot_box(
name='op1',
value=op1,
ln=ln
),
'matchbox2' : self.tmpl_matchtype_box('m2', m2, ln=ln),
'p2' : cgi.escape(p2, 1),
'searchwithin2' : self.tmpl_searchwithin_select(
ln=ln,
fieldname='f2',
selected=f2,
values=self._add_mark_to_field(value=f2, fields=fieldslist, ln=ln)
),
'andornot2' : self.tmpl_andornot_box(
name='op2',
value=op2,
ln=ln
),
'matchbox3' : self.tmpl_matchtype_box('m3', m3, ln=ln),
'p3' : cgi.escape(p3, 1),
'searchwithin3' : self.tmpl_searchwithin_select(
ln=ln,
fieldname='f3',
selected=f3,
values=self._add_mark_to_field(value=f3, fields=fieldslist, ln=ln)
),
'search' : _("Search"),
'browse' : _("Browse"),
'siteurl' : CFG_SITE_URL,
'ln' : ln,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'search_tips': _("Search Tips")
}
elif aas == 0:
# print Simple Search form:
out += '''
<table class="searchbox simplesearch">
<thead>
<tr>
<th colspan="3" class="searchboxheader">
%(leading)s:
</th>
</tr>
</thead>
<tbody>
<tr valign="top">
<td class="searchboxbody"><input type="text" name="p" size="%(sizepattern)d" value="%(p)s" class="simplesearchfield"/></td>
<td class="searchboxbody">%(searchwithin)s</td>
<td class="searchboxbody">
<input class="formbutton" type="submit" name="action_search" value="%(search)s" />
<input class="formbutton" type="submit" name="action_browse" value="%(browse)s" />&nbsp;
</td>
</tr>
<tr valign="bottom">
<td colspan="3" align="right" class="searchboxbody">
<small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(search_tips)s</a> ::
%(advanced_search)s
</small>
</td>
</tr>
</tbody>
</table>
''' % {
'advanced_search': create_html_link(self.build_search_url(p1=p,
f1=f,
rm=rm,
aas=max(CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES),
cc=cc,
jrec=jrec,
ln=ln,
rg=rg),
{}, _("Advanced Search")),
'leading' : leadingtext,
'sizepattern' : CFG_WEBSEARCH_SIMPLESEARCH_PATTERN_BOX_WIDTH,
'p' : cgi.escape(p, 1),
'searchwithin' : self.tmpl_searchwithin_select(
ln=ln,
fieldname='f',
selected=f,
values=self._add_mark_to_field(value=f, fields=fieldslist, ln=ln)
),
'search' : _("Search"),
'browse' : _("Browse"),
'siteurl' : CFG_SITE_URL,
'ln' : ln,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'search_tips': _("Search Tips")
}
else:
# EXPERIMENTAL
# print light search form:
search_in = ''
if cc_intl != CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME):
search_in = '''
<input type="radio" name="cc" value="%(collection_id)s" id="searchCollection" checked="checked"/>
<label for="searchCollection">%(search_in_collection_name)s</label>
<input type="radio" name="cc" value="%(root_collection_name)s" id="searchEverywhere" />
<label for="searchEverywhere">%(search_everywhere)s</label>
''' % {'search_in_collection_name': _("Search in %(x_collection_name)s") % \
{'x_collection_name': cgi.escape(cc_intl)},
'collection_id': cc,
'root_collection_name': CFG_SITE_NAME,
'search_everywhere': _("Search everywhere")}
out += '''
<table class="searchbox lightsearch">
<tr valign="top">
<td class="searchboxbody"><input type="text" name="p" size="%(sizepattern)d" value="%(p)s" class="lightsearchfield"/></td>
<td class="searchboxbody">
<input class="formbutton" type="submit" name="action_search" value="%(search)s" />
</td>
<td class="searchboxbody" align="left" rowspan="2" valign="top">
<small><small>
<a href="%(siteurl)s/help/search-tips%(langlink)s">%(search_tips)s</a><br/>
%(advanced_search)s
</small></small></td>
</tr>
</table>
<small>%(search_in)s</small>
''' % {
'sizepattern' : CFG_WEBSEARCH_LIGHTSEARCH_PATTERN_BOX_WIDTH,
'advanced_search': create_html_link(self.build_search_url(p1=p,
f1=f,
rm=rm,
aas=max(CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES),
cc=cc,
jrec=jrec,
ln=ln,
rg=rg),
{}, _("Advanced Search")),
'leading' : leadingtext,
'p' : cgi.escape(p, 1),
'searchwithin' : self.tmpl_searchwithin_select(
ln=ln,
fieldname='f',
selected=f,
values=self._add_mark_to_field(value=f, fields=fieldslist, ln=ln)
),
'search' : _("Search"),
'browse' : _("Browse"),
'siteurl' : CFG_SITE_URL,
'ln' : ln,
'langlink': ln != CFG_SITE_LANG and '?ln=' + ln or '',
'search_tips': _("Search Tips"),
'search_in': search_in
}
## secondly, print Collection(s) box:
if show_colls and aas > -1:
# display collections only if there is more than one
selects = ''
for sel in coll_selects:
selects += self.tmpl_select(fieldname='c', values=sel)
out += """
<table class="searchbox">
<thead>
<tr>
<th colspan="3" class="searchboxheader">
%(leading)s %(msg_coll)s:
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td valign="top" class="searchboxbody">
%(colls)s
</td>
</tr>
</tbody>
</table>
""" % {
'leading' : leadingtext,
'msg_coll' : _("collections"),
'colls' : selects,
}
## thirdly, print search limits, if applicable:
if action != _("Browse") and pl:
out += """<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(limitto)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">
<input type="text" name="pl" size="%(sizepattern)d" value="%(pl)s" />
</td>
</tr>
</tbody>
</table>""" % {
'limitto' : _("Limit to:"),
'sizepattern' : CFG_WEBSEARCH_ADVANCEDSEARCH_PATTERN_BOX_WIDTH,
'pl' : cgi.escape(pl, 1),
}
## fourthly, print from/until date boxen, if applicable:
if action == _("Browse") or (d1y == 0 and d1m == 0 and d1d == 0 and d2y == 0 and d2m == 0 and d2d == 0):
pass # do not need it
else:
out += """<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(added)s
</th>
<th class="searchboxheader">
%(until)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">%(added_or_modified)s %(date1)s</td>
<td class="searchboxbody">%(date2)s</td>
</tr>
</tbody>
</table>""" % {
'added' : _("Added/modified since:"),
'until' : _("until:"),
'added_or_modified': self.tmpl_inputdatetype(dt, ln),
'date1' : self.tmpl_inputdate("d1", ln, d1y, d1m, d1d),
'date2' : self.tmpl_inputdate("d2", ln, d2y, d2m, d2d),
}
## fifthly, print Display results box, including sort/rank, formats, etc:
if action != _("Browse") and aas > -1:
rgs = []
for i in [10, 25, 50, 100, 250, 500]:
if i <= CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS:
rgs.append({ 'value' : i, 'text' : "%d %s" % (i, _("results"))})
# enrich sort fields list if we are sorting by some MARC tag:
sort_fields = self._add_mark_to_field(value=sf, fields=sort_fields, ln=ln)
# create sort by HTML box:
out += """<table class="searchbox">
<thead>
<tr>
<th class="searchboxheader">
%(sort_by)s
</th>
<th class="searchboxheader">
%(display_res)s
</th>
<th class="searchboxheader">
%(out_format)s
</th>
</tr>
</thead>
<tbody>
<tr valign="bottom">
<td class="searchboxbody">
%(select_sf)s %(select_so)s %(select_rm)s
</td>
<td class="searchboxbody">
%(select_rg)s %(select_sc)s
</td>
<td class="searchboxbody">%(select_of)s</td>
</tr>
</tbody>
</table>""" % {
'sort_by' : _("Sort by:"),
'display_res' : _("Display results:"),
'out_format' : _("Output format:"),
'select_sf' : self.tmpl_select(fieldname='sf', values=sort_fields, selected=sf, css_class='address'),
'select_so' : self.tmpl_select(fieldname='so', values=[{
'value' : 'a',
'text' : _("asc.")
}, {
'value' : 'd',
'text' : _("desc.")
}], selected=so, css_class='address'),
'select_rm' : self.tmpl_select(fieldname='rm', values=ranks, selected=rm, css_class='address'),
'select_rg' : self.tmpl_select(fieldname='rg', values=rgs, selected=rg, css_class='address'),
'select_sc' : self.tmpl_select(fieldname='sc', values=[{
'value' : 0,
'text' : _("single list")
}, {
'value' : 1,
'text' : _("split by collection")
}], selected=sc, css_class='address'),
'select_of' : self.tmpl_select(
fieldname='of',
selected=of,
values=self._add_mark_to_field(value=of, fields=formats, chars=3, ln=ln),
css_class='address'),
}
## last but not least, print end of search box:
out += """</form>"""
return out
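`tmpl_search_box` relies on `drop_default_urlargd` to keep the hidden fields minimal. The core idea, much simplified (the real helper also handles type information; this sketch is an assumption-laden reduction):

```python
def drop_defaults(argd, defaults):
    """Keep only the arguments whose value differs from the default,
    so generated URLs and hidden fields stay short (simplified sketch
    of the drop_default_urlargd idea)."""
    return dict((k, v) for k, v in argd.items() if defaults.get(k) != v)
```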
def tmpl_input_hidden(self, name, value):
"Produces the HTML code for a hidden field "
if isinstance(value, list):
list_input = [self.tmpl_input_hidden(name, val) for val in value]
return "\n".join(list_input)
# Treat `as', `aas' arguments specially:
if name == 'aas':
name = 'as'
return """<input type="hidden" name="%(name)s" value="%(value)s" />""" % {
'name' : cgi.escape(str(name), 1),
'value' : cgi.escape(str(value), 1),
}
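The hidden-field escaping above can be sketched standalone (a hypothetical reimplementation, not part of the original file; it uses `html.escape` where the original relies on the deprecated `cgi.escape`):

```python
from html import escape

def input_hidden(name, value):
    # Recurse over lists so each value gets its own <input> tag,
    # and quote attribute values to keep the generated HTML well-formed.
    if isinstance(value, list):
        return "\n".join(input_hidden(name, v) for v in value)
    return '<input type="hidden" name="%s" value="%s" />' % (
        escape(str(name), quote=True), escape(str(value), quote=True))
```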
def _add_mark_to_field(self, value, fields, ln, chars=1):
"""Adds the current value as a MARC tag in the fields array
Useful for advanced search"""
# load the right message language
_ = gettext_set_language(ln)
out = fields
if value and str(value[0:chars]).isdigit():
out.append({'value' : value,
'text' : str(value) + " " + _("MARC tag")
})
return out
def tmpl_search_pagestart(self, ln) :
"page start for search page. Will display after the page header"
return """<div class="pagebody"><div class="pagebodystripemiddle">"""
def tmpl_search_pageend(self, ln) :
"page end for search page. Will display just before the page footer"
return """</div></div>"""
def tmpl_print_search_info(self, ln, middle_only,
collection, collection_name, collection_id,
aas, sf, so, rm, rg, nb_found, of, ot, p, f, f1,
f2, f3, m1, m2, m3, op1, op2, p1, p2,
p3, d1y, d1m, d1d, d2y, d2m, d2d, dt,
all_fieldcodes, cpu_time, pl_in_url,
jrec, sc, sp):
"""Prints stripe with the information on 'collection' and 'nb_found' results and CPU time.
Also, prints navigation links (beg/next/prev/end) inside the results set.
If middle_only is set to 1, it will only print the middle box information (beg/netx/prev/end/etc) links.
This is suitable for displaying navigation links at the bottom of the search results page.
Parameters:
- 'ln' *string* - The language to display
- 'middle_only' *bool* - Only display parts of the interface
- 'collection' *string* - the collection name
- 'collection_name' *string* - the i18nized current collection name
- 'aas' *bool* - if we display the advanced search interface
- 'sf' *string* - the currently selected sort format
- 'so' *string* - the currently selected sort order ("a" or "d")
- 'rm' *string* - selected ranking method
- 'rg' *int* - selected results/page
- 'nb_found' *int* - number of results found
- 'of' *string* - the selected output format
- 'ot' *string* - hidden values
- 'p' *string* - Current search words
- 'f' *string* - the fields in which the search was done
- 'f1, f2, f3, m1, m2, m3, p1, p2, p3, op1, op2' *strings* - the search parameters
- 'jrec' *int* - number of first record on this page
- 'd1y, d2y, d1m, d2m, d1d, d2d' *int* - the search between dates
- 'dt' *string* the dates' type (creation date, modification date)
- 'all_fieldcodes' *array* - all the available fields
- 'cpu_time' *float* - the time of the query in seconds
"""
# load the right message language
_ = gettext_set_language(ln)
out = ""
# left table cells: print collection name
if not middle_only:
out += '''
<a name="%(collection_id)s"></a>
<form action="%(siteurl)s/search" method="get">
<table class="searchresultsbox"><tr><td class="searchresultsboxheader" align="left">
<strong><big>%(collection_link)s</big></strong></td>
''' % {
'collection_id': collection_id,
'siteurl' : CFG_SITE_URL,
'collection_link': create_html_link(self.build_search_interface_url(c=collection, aas=aas, ln=ln),
{}, cgi.escape(collection_name))
}
else:
out += """
<div style="clear:both"></div>
<form action="%(siteurl)s/search" method="get"><div align="center">
""" % { 'siteurl' : CFG_SITE_URL }
# middle table cell: print beg/next/prev/end arrows:
if not middle_only:
out += """<td class="searchresultsboxheader" align="center">
%(recs_found)s &nbsp;""" % {
'recs_found' : _("%s records found") % ('<strong>' + self.tmpl_nice_number(nb_found, ln) + '</strong>')
}
else:
out += "<small>"
if nb_found > rg:
out += "" + cgi.escape(collection_name) + " : " + _("%s records found") % ('<strong>' + self.tmpl_nice_number(nb_found, ln) + '</strong>') + " &nbsp; "
if nb_found > rg: # navig.arrows are needed, since we have many hits
query = {'p': p, 'f': f,
'cc': collection,
'sf': sf, 'so': so,
'sp': sp, 'rm': rm,
'of': of, 'ot': ot,
'aas': aas, 'ln': ln,
'p1': p1, 'p2': p2, 'p3': p3,
'f1': f1, 'f2': f2, 'f3': f3,
'm1': m1, 'm2': m2, 'm3': m3,
'op1': op1, 'op2': op2,
'sc': 0,
'd1y': d1y, 'd1m': d1m, 'd1d': d1d,
'd2y': d2y, 'd2m': d2m, 'd2d': d2d,
'dt': dt,
}
# @todo here
def img(gif, txt):
return '<img src="%(siteurl)s/img/%(gif)s.gif" alt="%(txt)s" border="0" />' % {
'txt': txt, 'gif': gif, 'siteurl': CFG_SITE_URL}
if jrec - rg > 1:
out += create_html_link(self.build_search_url(query, jrec=1, rg=rg),
{}, img('sb', _("begin")),
{'class': 'img'})
if jrec > 1:
out += create_html_link(self.build_search_url(query, jrec=max(jrec - rg, 1), rg=rg),
{}, img('sp', _("previous")),
{'class': 'img'})
if jrec + rg - 1 < nb_found:
out += "%d - %d" % (jrec, jrec + rg - 1)
else:
out += "%d - %d" % (jrec, nb_found)
if nb_found >= jrec + rg:
out += create_html_link(self.build_search_url(query,
jrec=jrec + rg,
rg=rg),
{}, img('sn', _("next")),
{'class':'img'})
if nb_found >= jrec + rg + rg:
out += create_html_link(self.build_search_url(query,
jrec=nb_found - rg + 1,
rg=rg),
{}, img('se', _("end")),
{'class': 'img'})
# still in the navigation part
cc = collection
sc = 0
for var in ['p', 'cc', 'f', 'sf', 'so', 'of', 'rg', 'aas', 'ln', 'p1', 'p2', 'p3', 'f1', 'f2', 'f3', 'm1', 'm2', 'm3', 'op1', 'op2', 'sc', 'd1y', 'd1m', 'd1d', 'd2y', 'd2m', 'd2d', 'dt']:
out += self.tmpl_input_hidden(name=var, value=vars()[var])
for var in ['ot', 'sp', 'rm']:
if vars()[var]:
out += self.tmpl_input_hidden(name=var, value=vars()[var])
if pl_in_url:
fieldargs = cgi.parse_qs(pl_in_url)
for fieldcode in all_fieldcodes:
# get_fieldcodes():
if fieldcode in fieldargs:
for val in fieldargs[fieldcode]:
out += self.tmpl_input_hidden(name=fieldcode, value=val)
out += """&nbsp; %(jump)s <input type="text" name="jrec" size="4" value="%(jrec)d" />""" % {
'jump' : _("jump to record:"),
'jrec' : jrec,
}
if not middle_only:
out += "</td>"
else:
out += "</small>"
# right table cell: cpu time info
if not middle_only:
if cpu_time > -1:
out += """<td class="searchresultsboxheader" align="right"><small>%(time)s</small>&nbsp;</td>""" % {
'time' : _("Search took %s seconds.") % ('%.2f' % cpu_time),
}
out += "</tr></table>"
else:
out += "</div>"
out += "</form>"
return out
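The jrec/rg windowing used for the begin/previous/next/end arrows above can be sketched as a small pure function (a hypothetical extraction; the name `nav_window` is not from the original file):

```python
def nav_window(jrec, rg, nb_found):
    # jrec: 1-based index of the first record shown; rg: records per page.
    # Mirrors the conditions guarding each navigation arrow in the template.
    show_begin = jrec - rg > 1
    show_prev = jrec > 1
    last = min(jrec + rg - 1, nb_found)          # last record on this page
    show_next = nb_found >= jrec + rg            # a full next step exists
    show_end = nb_found >= jrec + rg + rg        # jumping to the end is useful
    return show_begin, show_prev, (jrec, last), show_next, show_end
```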
def tmpl_print_hosted_search_info(self, ln, middle_only,
collection, collection_name, collection_id,
aas, sf, so, rm, rg, nb_found, of, ot, p, f, f1,
f2, f3, m1, m2, m3, op1, op2, p1, p2,
p3, d1y, d1m, d1d, d2y, d2m, d2d, dt,
all_fieldcodes, cpu_time, pl_in_url,
jrec, sc, sp):
"""Prints stripe with the information on 'collection' and 'nb_found' results and CPU time.
Also, prints navigation links (beg/next/prev/end) inside the results set.
If middle_only is set to 1, it will only print the middle box information (beg/netx/prev/end/etc) links.
This is suitable for displaying navigation links at the bottom of the search results page.
Parameters:
- 'ln' *string* - The language to display
- 'middle_only' *bool* - Only display parts of the interface
- 'collection' *string* - the collection name
- 'collection_name' *string* - the i18nized current collection name
- 'aas' *bool* - if we display the advanced search interface
- 'sf' *string* - the currently selected sort format
- 'so' *string* - the currently selected sort order ("a" or "d")
- 'rm' *string* - selected ranking method
- 'rg' *int* - selected results/page
- 'nb_found' *int* - number of results found
- 'of' *string* - the selected output format
- 'ot' *string* - hidden values
- 'p' *string* - Current search words
- 'f' *string* - the fields in which the search was done
- 'f1, f2, f3, m1, m2, m3, p1, p2, p3, op1, op2' *strings* - the search parameters
- 'jrec' *int* - number of first record on this page
- 'd1y, d2y, d1m, d2m, d1d, d2d' *int* - the search between dates
- 'dt' *string* the dates' type (creation date, modification date)
- 'all_fieldcodes' *array* - all the available fields
- 'cpu_time' *float* - the time of the query in seconds
"""
# load the right message language
_ = gettext_set_language(ln)
out = ""
# left table cells: print collection name
if not middle_only:
out += '''
<a name="%(collection_id)s"></a>
<form action="%(siteurl)s/search" method="get">
<table class="searchresultsbox"><tr><td class="searchresultsboxheader" align="left">
<strong><big>%(collection_link)s</big></strong></td>
''' % {
'collection_id': collection_id,
'siteurl' : CFG_SITE_URL,
'collection_link': create_html_link(self.build_search_interface_url(c=collection, aas=aas, ln=ln),
{}, cgi.escape(collection_name))
}
else:
out += """
<form action="%(siteurl)s/search" method="get"><div align="center">
""" % { 'siteurl' : CFG_SITE_URL }
# middle table cell: print beg/next/prev/end arrows:
if not middle_only:
# in case we have a hosted collection that timed out do not print its number of records, as it is yet unknown
if nb_found != -963:
out += """<td class="searchresultsboxheader" align="center">
%(recs_found)s &nbsp;""" % {
'recs_found' : _("%s records found") % ('<strong>' + self.tmpl_nice_number(nb_found, ln) + '</strong>')
}
#elif nb_found = -963:
# out += """<td class="searchresultsboxheader" align="center">
# %(recs_found)s &nbsp;""" % {
# 'recs_found' : _("%s records found") % ('<strong>' + self.tmpl_nice_number(nb_found, ln) + '</strong>')
# }
else:
out += "<small>"
# we do not care about timed-out hosted collections here, because the number
# of records found will never be bigger than rg anyway, since it is negative
if nb_found > rg:
out += "" + cgi.escape(collection_name) + " : " + _("%s records found") % ('<strong>' + self.tmpl_nice_number(nb_found, ln) + '</strong>') + " &nbsp; "
if nb_found > rg: # navig.arrows are needed, since we have many hits
query = {'p': p, 'f': f,
'cc': collection,
'sf': sf, 'so': so,
'sp': sp, 'rm': rm,
'of': of, 'ot': ot,
'aas': aas, 'ln': ln,
'p1': p1, 'p2': p2, 'p3': p3,
'f1': f1, 'f2': f2, 'f3': f3,
'm1': m1, 'm2': m2, 'm3': m3,
'op1': op1, 'op2': op2,
'sc': 0,
'd1y': d1y, 'd1m': d1m, 'd1d': d1d,
'd2y': d2y, 'd2m': d2m, 'd2d': d2d,
'dt': dt,
}
# @todo here
def img(gif, txt):
return '<img src="%(siteurl)s/img/%(gif)s.gif" alt="%(txt)s" border="0" />' % {
'txt': txt, 'gif': gif, 'siteurl': CFG_SITE_URL}
if jrec - rg > 1:
out += create_html_link(self.build_search_url(query, jrec=1, rg=rg),
{}, img('sb', _("begin")),
{'class': 'img'})
if jrec > 1:
out += create_html_link(self.build_search_url(query, jrec=max(jrec - rg, 1), rg=rg),
{}, img('sp', _("previous")),
{'class': 'img'})
if jrec + rg - 1 < nb_found:
out += "%d - %d" % (jrec, jrec + rg - 1)
else:
out += "%d - %d" % (jrec, nb_found)
if nb_found >= jrec + rg:
out += create_html_link(self.build_search_url(query,
jrec=jrec + rg,
rg=rg),
{}, img('sn', _("next")),
{'class':'img'})
if nb_found >= jrec + rg + rg:
out += create_html_link(self.build_search_url(query,
jrec=nb_found - rg + 1,
rg=rg),
{}, img('se', _("end")),
{'class': 'img'})
# still in the navigation part
cc = collection
sc = 0
for var in ['p', 'cc', 'f', 'sf', 'so', 'of', 'rg', 'aas', 'ln', 'p1', 'p2', 'p3', 'f1', 'f2', 'f3', 'm1', 'm2', 'm3', 'op1', 'op2', 'sc', 'd1y', 'd1m', 'd1d', 'd2y', 'd2m', 'd2d', 'dt']:
out += self.tmpl_input_hidden(name=var, value=vars()[var])
for var in ['ot', 'sp', 'rm']:
if vars()[var]:
out += self.tmpl_input_hidden(name=var, value=vars()[var])
if pl_in_url:
fieldargs = cgi.parse_qs(pl_in_url)
for fieldcode in all_fieldcodes:
# get_fieldcodes():
if fieldcode in fieldargs:
for val in fieldargs[fieldcode]:
out += self.tmpl_input_hidden(name=fieldcode, value=val)
out += """&nbsp; %(jump)s <input type="text" name="jrec" size="4" value="%(jrec)d" />""" % {
'jump' : _("jump to record:"),
'jrec' : jrec,
}
if not middle_only:
out += "</td>"
else:
out += "</small>"
# right table cell: cpu time info
if not middle_only:
if cpu_time > -1:
out += """<td class="searchresultsboxheader" align="right"><small>%(time)s</small>&nbsp;</td>""" % {
'time' : _("Search took %s seconds.") % ('%.2f' % cpu_time),
}
out += "</tr></table>"
else:
out += "</div>"
out += "</form>"
return out
def tmpl_nice_number(self, number, ln=CFG_SITE_LANG, thousands_separator=',', max_ndigits_after_dot=None):
"""
Return nicely printed number NUMBER in language LN using
given THOUSANDS_SEPARATOR character.
If max_ndigits_after_dot is specified and the number is a float, the
number is rounded to at most max_ndigits_after_dot digits after the
decimal point.
This version does not pay attention to locale. See
tmpl_nice_number_via_locale().
"""
if type(number) is float:
if max_ndigits_after_dot is not None:
number = round(number, max_ndigits_after_dot)
int_part, frac_part = str(number).split('.')
return '%s.%s' % (self.tmpl_nice_number(int(int_part), ln, thousands_separator), frac_part)
else:
chars_in = list(str(number))
number = len(chars_in)
chars_out = []
for i in range(0, number):
if i % 3 == 0 and i != 0:
chars_out.append(thousands_separator)
chars_out.append(chars_in[number - i - 1])
chars_out.reverse()
return ''.join(chars_out)
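The digit-grouping loop above can be sketched as a standalone function (a hypothetical reimplementation of the non-locale branch; the name `nice_number` is not from the original file):

```python
def nice_number(number, thousands_separator=','):
    # Walk the digit string from the right, emitting a separator
    # before every group of three digits, then reverse the result.
    chars_in = list(str(number))
    n = len(chars_in)
    chars_out = []
    for i in range(n):
        if i % 3 == 0 and i != 0:
            chars_out.append(thousands_separator)
        chars_out.append(chars_in[n - i - 1])
    chars_out.reverse()
    return ''.join(chars_out)
```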
def tmpl_nice_number_via_locale(self, number, ln=CFG_SITE_LANG):
"""
Return nicely printed number NUMBER in language LN using the locale.
See also version tmpl_nice_number().
"""
if number is None:
return None
# Temporarily switch the numeric locale to the requested one, and format the number
# In case the system has no locale definition, use the vanilla form
ol = locale.getlocale(locale.LC_NUMERIC)
try:
locale.setlocale(locale.LC_NUMERIC, self.tmpl_localemap.get(ln, self.tmpl_default_locale))
except locale.Error:
return str(number)
try:
number = locale.format('%d', number, True)
except TypeError:
return str(number)
locale.setlocale(locale.LC_NUMERIC, ol)
return number
def tmpl_record_format_htmlbrief_header(self, ln):
"""Returns the header of the search results list when output
is html brief. Note that this function is called for each collection
results when 'split by collection' is enabled.
See also: tmpl_record_format_htmlbrief_footer,
tmpl_record_format_htmlbrief_body
Parameters:
- 'ln' *string* - The language to display
"""
# load the right message language
_ = gettext_set_language(ln)
out = """
<form action="%(siteurl)s/yourbaskets/add" method="post">
<table>
""" % {
'siteurl' : CFG_SITE_URL,
}
return out
def tmpl_record_format_htmlbrief_footer(self, ln, display_add_to_basket=True):
"""Returns the footer of the search results list when output
is html brief. Note that this function is called for each collection
results when 'split by collection' is enabled.
See also: tmpl_record_format_htmlbrief_header(..),
tmpl_record_format_htmlbrief_body(..)
Parameters:
- 'ln' *string* - The language to display
- 'display_add_to_basket' *bool* - whether to display Add-to-basket button
"""
# load the right message language
_ = gettext_set_language(ln)
out = """</table>
<br />
<input type="hidden" name="colid" value="0" />
%(add_to_basket)s
</form>""" % {
'add_to_basket': display_add_to_basket and """<input class="formbutton" type="submit" name="action" value="%s" />""" % _("Add to basket") or "",
}
return out
def tmpl_record_format_htmlbrief_body(self, ln, recid,
row_number, relevance,
record, relevances_prologue,
relevances_epilogue,
display_add_to_basket=True):
"""Returns the html brief format of one record. Used in the
search results list for each record.
See also: tmpl_record_format_htmlbrief_header(..),
tmpl_record_format_htmlbrief_footer(..)
Parameters:
- 'ln' *string* - The language to display
- 'row_number' *int* - The position of this record in the list
- 'recid' *int* - The recID
- 'relevance' *string* - The relevance of the record
- 'record' *string* - The formatted record
- 'relevances_prologue' *string* - HTML code to prepend the relevance indicator
- 'relevances_epilogue' *string* - HTML code to append to the relevance indicator (used mostly for formatting)
"""
# load the right message language
_ = gettext_set_language(ln)
checkbox_for_baskets = """<input name="recid" type="checkbox" value="%(recid)s" />""" % \
{'recid': recid, }
if not display_add_to_basket:
checkbox_for_baskets = ''
out = """
<tr><td valign="top" align="right" style="white-space: nowrap;">
%(checkbox_for_baskets)s
<abbr class="unapi-id" title="%(recid)s"></abbr>
%(number)s.
""" % {'recid': recid,
'number': row_number,
'checkbox_for_baskets': checkbox_for_baskets}
if relevance:
out += """<br /><div class="rankscoreinfo"><a title="rank score">%(prologue)s%(relevance)s%(epilogue)s</a></div>""" % {
'prologue' : relevances_prologue,
'epilogue' : relevances_epilogue,
'relevance' : relevance
}
out += """</td><td valign="top">%s</td></tr>""" % record
return out
def tmpl_print_results_overview(self, ln, results_final_nb_total, cpu_time, results_final_nb, colls, ec, hosted_colls_potential_results_p=False):
"""Prints results overview box with links to particular collections below.
Parameters:
- 'ln' *string* - The language to display
- 'results_final_nb_total' *int* - The total number of hits for the query
- 'colls' *array* - The collections with hits, in the format:
- 'coll[code]' *string* - The code of the collection (canonical name)
- 'coll[name]' *string* - The display name of the collection
- 'results_final_nb' *array* - The number of hits, indexed by the collection codes:
- 'cpu_time' *string* - The time the query took
- 'url_args' *string* - The rest of the search query
- 'ec' *array* - selected external collections
- 'hosted_colls_potential_results_p' *boolean* - check if there are any hosted collections searches
that timed out during the pre-search
"""
if len(colls) == 1 and not ec:
# if one collection only and no external collections, print nothing:
return ""
# load the right message language
_ = gettext_set_language(ln)
# first find total number of hits:
# if there were no hosted collections that timed out during the pre-search print out the exact number of records found
if not hosted_colls_potential_results_p:
out = """<table class="searchresultsbox">
<thead><tr><th class="searchresultsboxheader">%(founds)s</th></tr></thead>
<tbody><tr><td class="searchresultsboxbody"> """ % {
'founds' : _("%(x_fmt_open)sResults overview:%(x_fmt_close)s Found %(x_nb_records)s records in %(x_nb_seconds)s seconds.") % \
{'x_fmt_open': '<strong>',
'x_fmt_close': '</strong>',
'x_nb_records': '<strong>' + self.tmpl_nice_number(results_final_nb_total, ln) + '</strong>',
'x_nb_seconds': '%.2f' % cpu_time}
}
# if there were (only) hosted_collections that timed out during the pre-search print out a fuzzier message
else:
if results_final_nb_total == 0:
out = """<table class="searchresultsbox">
<thead><tr><th class="searchresultsboxheader">%(founds)s</th></tr></thead>
<tbody><tr><td class="searchresultsboxbody"> """ % {
'founds' : _("%(x_fmt_open)sResults overview%(x_fmt_close)s") % \
{'x_fmt_open': '<strong>',
'x_fmt_close': '</strong>'}
}
elif results_final_nb_total > 0:
out = """<table class="searchresultsbox">
<thead><tr><th class="searchresultsboxheader">%(founds)s</th></tr></thead>
<tbody><tr><td class="searchresultsboxbody"> """ % {
'founds' : _("%(x_fmt_open)sResults overview:%(x_fmt_close)s Found at least %(x_nb_records)s records in %(x_nb_seconds)s seconds.") % \
{'x_fmt_open': '<strong>',
'x_fmt_close': '</strong>',
'x_nb_records': '<strong>' + self.tmpl_nice_number(results_final_nb_total, ln) + '</strong>',
'x_nb_seconds': '%.2f' % cpu_time}
}
# then print hits per collection:
out += """<script type="text/javascript">
$(document).ready(function() {
$('a.morecolls').click(function() {
$('.morecollslist').show();
$(this).hide();
$('.lesscolls').show();
return false;
});
$('a.lesscolls').click(function() {
$('.morecollslist').hide();
$(this).hide();
$('.morecolls').show();
return false;
});
});
</script>"""
count = 0
for coll in colls:
if coll['code'] in results_final_nb and results_final_nb[coll['code']] > 0:
count += 1
out += """
<span %(collclass)s><strong><a href="#%(coll)s">%(coll_name)s</a></strong>, <a href="#%(coll)s">%(number)s</a><br /></span>""" % \
{'collclass' : count > cfg['CFG_WEBSEARCH_RESULTS_OVERVIEW_MAX_COLLS_TO_PRINT'] and 'class="morecollslist" style="display:none"' or '',
'coll' : coll['id'],
'coll_name' : cgi.escape(coll['name']),
'number' : _("%s records found") % \
('<strong>' + self.tmpl_nice_number(results_final_nb[coll['code']], ln) + '</strong>')}
# the following is used for hosted collections that have timed out,
# i.e. for which we don't know the exact number of results yet.
elif coll['code'] in results_final_nb and results_final_nb[coll['code']] == -963:
count += 1
out += """
<span %(collclass)s><strong><a href="#%(coll)s">%(coll_name)s</a></strong><br /></span>""" % \
{'collclass' : count > cfg['CFG_WEBSEARCH_RESULTS_OVERVIEW_MAX_COLLS_TO_PRINT'] and 'class="morecollslist" style="display:none"' or '',
'coll' : coll['id'],
'coll_name' : cgi.escape(coll['name']),
'number' : _("%s records found") % \
('<strong>' + self.tmpl_nice_number(results_final_nb[coll['code']], ln) + '</strong>')}
if count > cfg['CFG_WEBSEARCH_RESULTS_OVERVIEW_MAX_COLLS_TO_PRINT']:
out += """<a class="lesscolls" style="display:none; color:red; font-size:small" href="#"><i>%s</i></a>""" % _("Show less collections")
out += """<a class="morecolls" style="color:red; font-size:small" href="#"><i>%s</i></a>""" % _("Show all collections")
out += "</td></tr></tbody></table>"
return out
def tmpl_print_hosted_results(self, url_and_engine, ln, of=None, req=None, limit=CFG_EXTERNAL_COLLECTION_MAXRESULTS, display_body=True, display_add_to_basket = True):
"""Print results of a given search engine.
"""
if display_body:
_ = gettext_set_language(ln)
#url = url_and_engine[0]
engine = url_and_engine[1]
#name = _(engine.name)
db_id = get_collection_id(engine.name)
#base_url = engine.base_url
out = ""
results = engine.parser.parse_and_get_results(None, of=of, req=req, limit=limit, parseonly=True)
if len(results) != 0:
if of == 'hb':
out += """
<form action="%(siteurl)s/yourbaskets/add" method="post">
<input type="hidden" name="colid" value="%(col_db_id)s" />
<table>
""" % {
'siteurl' : CFG_SITE_URL,
'col_db_id' : db_id,
}
else:
if of == 'hb':
out += """
<table>
"""
for result in results:
out += result.html.replace('>Detailed record<', '>External record<').replace('>Similar records<', '>Similar external records<')
if len(results) != 0:
if of == 'hb':
out += """</table>
<br />"""
if display_add_to_basket:
out += """<input class="formbutton" type="submit" name="action" value="%(basket)s" />
""" % {'basket' : _("Add to basket")}
out += """</form>"""
else:
if of == 'hb':
out += """
</table>
"""
# we have already checked whether there are results; perhaps the following check could be removed?
if not results:
if of.startswith("h"):
out = _('No results found...') + '<br />'
return out
else:
return ""
def tmpl_print_searchresultbox(self, header, body):
"""print a nicely formatted box for search results """
#_ = gettext_set_language(ln)
# first find total number of hits:
out = '<table class="searchresultsbox"><thead><tr><th class="searchresultsboxheader">' + header + '</th></tr></thead><tbody><tr><td class="searchresultsboxbody">' + body + '</td></tr></tbody></table>'
return out
def tmpl_search_no_boolean_hits(self, ln, nearestterms):
"""No hits found, proposes alternative boolean queries
Parameters:
- 'ln' *string* - The language to display
- 'nearestterms' *array* - Parts of the interface to display, in the format:
- 'nearestterms[nbhits]' *int* - The resulting number of hits
- 'nearestterms[url_args]' *string* - The search parameters
- 'nearestterms[p]' *string* - The search terms
"""
# load the right message language
_ = gettext_set_language(ln)
out = _("Boolean query returned no hits. Please combine your search terms differently.")
out += '''<blockquote><table class="nearesttermsbox" cellpadding="0" cellspacing="0" border="0">'''
for term, hits, argd in nearestterms:
out += '''\
<tr>
<td class="nearesttermsboxbody" align="right">%(hits)s</td>
<td class="nearesttermsboxbody" width="15">&nbsp;</td>
<td class="nearesttermsboxbody" align="left">
%(link)s
</td>
</tr>''' % {'hits' : hits,
'link': create_html_link(self.build_search_url(argd),
{}, cgi.escape(term),
{'class': "nearestterms"})}
out += """</table></blockquote>"""
return out
def tmpl_similar_author_names(self, authors, ln):
"""No hits found, proposes alternative boolean queries
Parameters:
- 'authors': a list of (name, hits) tuples
- 'ln' *string* - The language to display
"""
# load the right message language
_ = gettext_set_language(ln)
out = '''<a name="googlebox"></a>
<table class="googlebox"><tr><th colspan="2" class="googleboxheader">%(similar)s</th></tr>''' % {
'similar' : _("See also: similar author names")
}
for author, hits in authors:
out += '''\
<tr>
<td class="googleboxbody">%(nb)d</td>
<td class="googleboxbody">%(link)s</td>
</tr>''' % {'link': create_html_link(
self.build_search_url(p=author,
f='author',
ln=ln),
{}, cgi.escape(author), {'class':"google"}),
'nb' : hits}
out += """</table>"""
return out
def tmpl_print_record_detailed(self, recID, ln):
"""Displays a detailed on-the-fly record
Parameters:
- 'ln' *string* - The language to display
- 'recID' *int* - The record id
"""
# okay, need to construct a simple "Detailed record" format of our own:
out = "<p>&nbsp;"
# secondly, title:
titles = get_fieldvalues(recID, "245__a") or \
get_fieldvalues(recID, "111__a")
for title in titles:
out += "<p><center><big><strong>%s</strong></big></center></p>" % cgi.escape(title)
# thirdly, authors:
authors = get_fieldvalues(recID, "100__a") + get_fieldvalues(recID, "700__a")
if authors:
out += "<p><center>"
for author in authors:
out += '%s; ' % create_html_link(self.build_search_url(
ln=ln,
p=author,
f='author'),
{}, cgi.escape(author))
out += "</center></p>"
# fourthly, date of creation:
dates = get_fieldvalues(recID, "260__c")
for date in dates:
out += "<p><center><small>%s</small></center></p>" % date
# fifthly, abstract:
abstracts = get_fieldvalues(recID, "520__a")
for abstract in abstracts:
out += """<p style="margin-left: 15%%; width: 70%%">
<small><strong>Abstract:</strong> %s</small></p>""" % abstract
# fifthly bis, keywords:
keywords = get_fieldvalues(recID, "6531_a")
if keywords:
out += """<p style="margin-left: 15%%; width: 70%%">
<small><strong>Keyword(s):</strong>"""
for keyword in keywords:
out += '%s; ' % create_html_link(
self.build_search_url(ln=ln,
p=keyword,
f='keyword'),
{}, cgi.escape(keyword))
out += '</small></p>'
# fifthly bis bis, published in:
prs_p = get_fieldvalues(recID, "909C4p")
prs_v = get_fieldvalues(recID, "909C4v")
prs_y = get_fieldvalues(recID, "909C4y")
prs_n = get_fieldvalues(recID, "909C4n")
prs_c = get_fieldvalues(recID, "909C4c")
for idx in range(0, len(prs_p)):
out += """<p style="margin-left: 15%%; width: 70%%">
<small><strong>Publ. in:</strong> %s""" % prs_p[idx]
if prs_v and prs_v[idx]:
out += """<strong>%s</strong>""" % prs_v[idx]
if prs_y and prs_y[idx]:
out += """(%s)""" % prs_y[idx]
if prs_n and prs_n[idx]:
out += """, no.%s""" % prs_n[idx]
if prs_c and prs_c[idx]:
out += """, p.%s""" % prs_c[idx]
out += """.</small></p>"""
# sixthly, fulltext link:
urls_z = get_fieldvalues(recID, "8564_z")
urls_u = get_fieldvalues(recID, "8564_u")
# we separate the fulltext links and image links
# use enumerate so idx refers to this loop, not the leftover
# index from the "published in" loop above:
for idx, url_u in enumerate(urls_u):
if url_u.endswith('.png'):
continue
link_text = "URL"
try:
if urls_z[idx]:
link_text = urls_z[idx]
except IndexError:
pass
out += """<p style="margin-left: 15%%; width: 70%%">
<small><strong>%s:</strong> <a href="%s">%s</a></small></p>""" % (link_text, url_u, url_u)
# print some white space at the end:
out += "<br /><br />"
return out
def tmpl_print_record_list_for_similarity_boxen(self, title, recID_score_list, ln=CFG_SITE_LANG):
"""Print list of records in the "hs" (HTML Similarity) format for similarity boxes.
RECID_SCORE_LIST is a list of (recID1, score1), (recID2, score2), etc.
"""
from invenio.legacy.search_engine import print_record, record_public_p
recID_score_list_to_be_printed = []
# firstly find 5 first public records to print:
nb_records_to_be_printed = 0
nb_records_seen = 0
while nb_records_to_be_printed < 5 and nb_records_seen < len(recID_score_list) and nb_records_seen < 50:
# looking through first 50 records only, picking first 5 public ones
(recID, score) = recID_score_list[nb_records_seen]
nb_records_seen += 1
if record_public_p(recID):
nb_records_to_be_printed += 1
recID_score_list_to_be_printed.append([recID, score])
# secondly print them:
out = '''
<table><tr>
<td>
<table><tr><td class="blocknote">%(title)s</td></tr></table>
</td>
</tr>
<tr>
<td><table>
''' % { 'title': cgi.escape(title) }
for recid, score in recID_score_list_to_be_printed:
out += '''
<tr><td><font class="rankscoreinfo"><a>(%(score)s)&nbsp;</a></font><small>&nbsp;%(info)s</small></td></tr>''' % {
'score': score,
'info' : print_record(recid, format="hs", ln=ln),
}
out += """</table></td></tr></table> """
return out
def tmpl_print_record_brief(self, ln, recID):
"""Displays a brief record on-the-fly
Parameters:
- 'ln' *string* - The language to display
- 'recID' *int* - The record id
"""
out = ""
# record 'recID' does not exist in format 'format', so print some default format:
# firstly, title:
titles = get_fieldvalues(recID, "245__a") or \
get_fieldvalues(recID, "111__a")
# secondly, authors:
authors = get_fieldvalues(recID, "100__a") + get_fieldvalues(recID, "700__a")
# thirdly, date of creation:
dates = get_fieldvalues(recID, "260__c")
# thirdly bis, report numbers (037 and 088 fields; the second
# assignment used to discard the 037 values):
rns = get_fieldvalues(recID, "037__a") + get_fieldvalues(recID, "088__a")
# fourthly, beginning of abstract:
abstracts = get_fieldvalues(recID, "520__a")
# fifthly, fulltext link:
urls_z = get_fieldvalues(recID, "8564_z")
urls_u = get_fieldvalues(recID, "8564_u")
# get rid of images
images = []
non_image_urls_u = []
for url_u in urls_u:
if url_u.endswith('.png'):
images.append(url_u)
else:
non_image_urls_u.append(url_u)
## unAPI identifier
out = '<abbr class="unapi-id" title="%s"></abbr>\n' % recID
out += self.tmpl_record_body(
titles=titles,
authors=authors,
dates=dates,
rns=rns,
abstracts=abstracts,
urls_u=non_image_urls_u,
urls_z=urls_z,
ln=ln)
return out
def tmpl_print_record_brief_links(self, ln, recID, sf='', so='d', sp='', rm='', display_claim_link=False):
"""Displays links for brief record on-the-fly
Parameters:
- 'ln' *string* - The language to display
- 'recID' *int* - The record id
"""
from invenio.ext.template import render_template_to_string
tpl = """{%- from "search/helpers.html" import record_brief_links with context -%}
{{ record_brief_links(get_record(recid)) }}"""
return render_template_to_string(tpl, recid=recID, _from_string=True).encode('utf-8')
def tmpl_xml_rss_prologue(self, current_url=None,
previous_url=None, next_url=None,
first_url=None, last_url=None,
nb_found=None, jrec=None, rg=None, cc=None):
"""Creates XML RSS 2.0 prologue."""
title = CFG_SITE_NAME
description = '%s latest documents' % CFG_SITE_NAME
if cc and cc != CFG_SITE_NAME:
title += ': ' + cgi.escape(cc)
description += ' in ' + cgi.escape(cc)
out = """<rss version="2.0"
xmlns:media="http://search.yahoo.com/mrss/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
<channel>
<title>%(rss_title)s</title>
<link>%(siteurl)s</link>
<description>%(rss_description)s</description>
<language>%(sitelang)s</language>
<pubDate>%(timestamp)s</pubDate>
<category></category>
<generator>Invenio %(version)s</generator>
<webMaster>%(sitesupportemail)s</webMaster>
<ttl>%(timetolive)s</ttl>%(previous_link)s%(next_link)s%(current_link)s%(total_results)s%(start_index)s%(items_per_page)s
<image>
<url>%(siteurl)s/img/site_logo_rss.png</url>
<title>%(sitename)s</title>
<link>%(siteurl)s</link>
</image>
<atom:link rel="search" href="%(siteurl)s/opensearchdescription" type="application/opensearchdescription+xml" title="Content Search" />
<textInput>
<title>Search </title>
<description>Search this site:</description>
<name>p</name>
<link>%(siteurl)s/search</link>
</textInput>
""" % {'sitename': CFG_SITE_NAME,
'siteurl': CFG_SITE_URL,
'sitelang': CFG_SITE_LANG,
'search_syntax': self.tmpl_opensearch_rss_url_syntax,
'timestamp': time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime()),
'version': CFG_VERSION,
'sitesupportemail': CFG_SITE_SUPPORT_EMAIL,
'timetolive': CFG_WEBSEARCH_RSS_TTL,
'current_link': (current_url and \
'\n<atom:link rel="self" href="%s" />\n' % current_url) or '',
'previous_link': (previous_url and \
'\n<atom:link rel="previous" href="%s" />' % previous_url) or '',
'next_link': (next_url and \
'\n<atom:link rel="next" href="%s" />' % next_url) or '',
'first_link': (first_url and \
'\n<atom:link rel="first" href="%s" />' % first_url) or '',
'last_link': (last_url and \
'\n<atom:link rel="last" href="%s" />' % last_url) or '',
'total_results': (nb_found and \
'\n<opensearch:totalResults>%i</opensearch:totalResults>' % nb_found) or '',
'start_index': (jrec and \
'\n<opensearch:startIndex>%i</opensearch:startIndex>' % jrec) or '',
'items_per_page': (rg and \
'\n<opensearch:itemsPerPage>%i</opensearch:itemsPerPage>' % rg) or '',
'rss_title': title,
'rss_description': description
}
return out
def tmpl_xml_rss_epilogue(self):
"""Creates XML RSS 2.0 epilogue."""
out = """\
</channel>
</rss>\n"""
return out
def tmpl_xml_podcast_prologue(self, current_url=None,
previous_url=None, next_url=None,
first_url=None, last_url=None,
nb_found=None, jrec=None, rg=None, cc=None):
"""Creates XML podcast prologue."""
title = CFG_SITE_NAME
description = '%s latest documents' % CFG_SITE_NAME
if CFG_CERN_SITE:
title = 'CERN'
description = 'CERN latest documents'
if cc and cc != CFG_SITE_NAME:
title += ': ' + cgi.escape(cc)
description += ' in ' + cgi.escape(cc)
out = """<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0">
<channel>
<title>%(podcast_title)s</title>
<link>%(siteurl)s</link>
<description>%(podcast_description)s</description>
<language>%(sitelang)s</language>
<pubDate>%(timestamp)s</pubDate>
<category></category>
<generator>Invenio %(version)s</generator>
<webMaster>%(siteadminemail)s</webMaster>
<ttl>%(timetolive)s</ttl>%(previous_link)s%(next_link)s%(current_link)s
<image>
<url>%(siteurl)s/img/site_logo_rss.png</url>
<title>%(sitename)s</title>
<link>%(siteurl)s</link>
</image>
<itunes:owner>
<itunes:email>%(siteadminemail)s</itunes:email>
</itunes:owner>
""" % {'sitename': CFG_SITE_NAME,
'siteurl': CFG_SITE_URL,
'sitelang': CFG_SITE_LANG,
'siteadminemail': CFG_SITE_ADMIN_EMAIL,
'timestamp': time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime()),
'version': CFG_VERSION,
'sitesupportemail': CFG_SITE_SUPPORT_EMAIL,
'timetolive': CFG_WEBSEARCH_RSS_TTL,
'current_link': (current_url and \
'\n<atom:link rel="self" href="%s" />\n' % current_url) or '',
'previous_link': (previous_url and \
'\n<atom:link rel="previous" href="%s" />' % previous_url) or '',
'next_link': (next_url and \
'\n<atom:link rel="next" href="%s" />' % next_url) or '',
'first_link': (first_url and \
'\n<atom:link rel="first" href="%s" />' % first_url) or '',
'last_link': (last_url and \
'\n<atom:link rel="last" href="%s" />' % last_url) or '',
'podcast_title': title,
'podcast_description': description
}
return out
def tmpl_xml_podcast_epilogue(self):
"""Creates XML podcast epilogue."""
out = """\n</channel>
</rss>\n"""
return out
def tmpl_xml_nlm_prologue(self):
"""Creates XML NLM prologue."""
out = """<articles>\n"""
return out
def tmpl_xml_nlm_epilogue(self):
"""Creates XML NLM epilogue."""
out = """\n</articles>"""
return out
def tmpl_xml_refworks_prologue(self):
"""Creates XML RefWorks prologue."""
out = """<references>\n"""
return out
def tmpl_xml_refworks_epilogue(self):
"""Creates XML RefWorks epilogue."""
out = """\n</references>"""
return out
def tmpl_xml_endnote_prologue(self):
"""Creates XML EndNote prologue."""
out = """<xml>\n<records>\n"""
return out
def tmpl_xml_endnote_8x_prologue(self):
"""Creates XML EndNote prologue."""
out = """<records>\n"""
return out
def tmpl_xml_endnote_epilogue(self):
"""Creates XML EndNote epilogue."""
out = """\n</records>\n</xml>"""
return out
def tmpl_xml_endnote_8x_epilogue(self):
"""Creates XML EndNote epilogue."""
out = """\n</records>"""
return out
def tmpl_xml_marc_prologue(self):
"""Creates XML MARC prologue."""
out = """<collection xmlns="http://www.loc.gov/MARC21/slim">\n"""
return out
def tmpl_xml_marc_epilogue(self):
"""Creates XML MARC epilogue."""
out = """\n</collection>"""
return out
def tmpl_xml_mods_prologue(self):
"""Creates XML MODS prologue."""
out = """<modsCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\n
xsi:schemaLocation="http://www.loc.gov/mods/v3\n
http://www.loc.gov/standards/mods/v3/mods-3-3.xsd">\n"""
return out
def tmpl_xml_mods_epilogue(self):
"""Creates XML MODS epilogue."""
out = """\n</modsCollection>"""
return out
def tmpl_xml_default_prologue(self):
"""Creates XML default format prologue. (Sanity calls only.)"""
out = """<collection>\n"""
return out
def tmpl_xml_default_epilogue(self):
"""Creates XML default format epilogue. (Sanity calls only.)"""
out = """\n</collection>"""
return out
def tmpl_collection_not_found_page_title(self, colname, ln=CFG_SITE_LANG):
"""
Create the page title for the case when a nonexistent collection was requested.
"""
_ = gettext_set_language(ln)
out = _("Collection %s Not Found") % cgi.escape(colname)
return out
def tmpl_collection_not_found_page_body(self, colname, ln=CFG_SITE_LANG):
"""
Create the page body for the case when a nonexistent collection was requested.
"""
_ = gettext_set_language(ln)
out = """<h1>%(title)s</h1>
<p>%(sorry)s</p>
<p>%(you_may_want)s</p>
""" % { 'title': self.tmpl_collection_not_found_page_title(colname, ln),
'sorry': _("Sorry, collection %s does not seem to exist.") % \
('<strong>' + cgi.escape(colname) + '</strong>'),
'you_may_want': _("You may want to start browsing from %s.") % \
('<a href="' + CFG_SITE_URL + '?ln=' + ln + '">' + \
cgi.escape(CFG_SITE_NAME_INTL.get(ln, CFG_SITE_NAME)) + '</a>')}
return out
def tmpl_alert_rss_teaser_box_for_query(self, id_query, ln, display_email_alert_part=True):
"""Proposes a teaser for setting up this query as an alert or RSS feed.
Parameters:
- 'id_query' *int* - ID of the query we make teaser for
- 'ln' *string* - The language to display
- 'display_email_alert_part' *bool* - whether to display email alert part
"""
# load the right message language
_ = gettext_set_language(ln)
# get query arguments:
res = run_sql("SELECT urlargs FROM query WHERE id=%s", (id_query,))
argd = {}
if res:
argd = cgi.parse_qs(res[0][0])
rssurl = self.build_rss_url(argd)
alerturl = CFG_SITE_URL + '/youralerts/input?ln=%s&amp;idq=%s' % (ln, id_query)
if display_email_alert_part:
msg_alert = _("""Set up a personal %(x_url1_open)semail alert%(x_url1_close)s
or subscribe to the %(x_url2_open)sRSS feed%(x_url2_close)s.""") % \
{'x_url1_open': '<a href="%s"><img src="%s/img/mail-icon-12x8.gif" border="0" alt="" /></a> ' % (alerturl, CFG_SITE_URL) + ' <a class="google" href="%s">' % (alerturl),
'x_url1_close': '</a>',
'x_url2_open': '<a href="%s"><img src="%s/img/feed-icon-12x12.gif" border="0" alt="" /></a> ' % (rssurl, CFG_SITE_URL) + ' <a class="google" href="%s">' % rssurl,
'x_url2_close': '</a>', }
else:
msg_alert = _("""Subscribe to the %(x_url2_open)sRSS feed%(x_url2_close)s.""") % \
{'x_url2_open': '<a href="%s"><img src="%s/img/feed-icon-12x12.gif" border="0" alt="" /></a> ' % (rssurl, CFG_SITE_URL) + ' <a class="google" href="%s">' % rssurl,
'x_url2_close': '</a>', }
out = '''<a name="googlebox"></a>
<table class="googlebox"><tr><th class="googleboxheader">%(similar)s</th></tr>
<tr><td class="googleboxbody">%(msg_alert)s</td></tr>
</table>
''' % {
'similar' : _("Interested in being notified about new results for this query?"),
'msg_alert': msg_alert, }
return out
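The query arguments above are recovered from a stored URL query string via ``cgi.parse_qs``; a minimal Python 3 sketch of the same parsing step (``urllib.parse.parse_qs`` is the modern equivalent, and the sample string is hypothetical):

```python
from urllib.parse import parse_qs  # Python 3 home of the old cgi.parse_qs

# Hypothetical urlargs value, as stored in the "query" table.
urlargs = "p=ellis&f=author&of=hx"

# parse_qs maps each argument name to a *list* of values.
argd = parse_qs(urlargs)
print(argd["p"][0])  # → ellis
```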
def tmpl_detailed_record_metadata(self, recID, ln, format,
content,
creationdate=None,
modificationdate=None):
"""Returns the main detailed page of a record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- 'format' *string* - The format used to print the record
- 'content' *string* - The main content of the page
- 'creationdate' *string* - The creation date of the printed record
- 'modificationdate' *string* - The last modification date of the printed record
"""
_ = gettext_set_language(ln)
## unAPI identifier
out = '<abbr class="unapi-id" title="%s"></abbr>\n' % recID
out += content
return out
def tmpl_display_back_to_search(self, req, recID, ln):
"""
Displays next-hit/previous-hit/back-to-search links
on the detailed record pages in order to be able to quickly
flip between detailed record pages
@param req: Apache request object
@type req: Apache request object
@param recID: detailed record ID
@type recID: int
@param ln: language of the page
@type ln: string
@return: html output
@rtype: html
"""
_ = gettext_set_language(ln)
# if this limit is set to zero, nothing is displayed
if not CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT:
return ''
# retrieve the last query and its hits from the session
wlq = session_param_get(req, 'websearch-last-query', '')
wlqh = session_param_get(req, 'websearch-last-query-hits')
out = '''<br/><br/><div align="right">'''
# if the limit CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT was exceeded,
# only the back-to-search link will be displayed
if wlqh is None:
out += '''<div style="padding-bottom:2px;padding-top:30px;"><span class="moreinfo" style="margin-right:10px;">
%(back)s </span></div></div>''' % \
{'back': create_html_link(wlq, {}, _("Back to search"), {'class': "moreinfo"})}
return out
# let's look for the recID's collection
record_found = False
for coll in wlqh:
if recID in coll:
record_found = True
coll_recID = coll
break
# let's calculate the length of the recID's collection
if record_found:
recIDs = coll_recID[::-1]
totalrec = len(recIDs)
# the record was reached directly, without any prior search
else:
return ''
# if there is only one hit,
# to show only the "back to search" link
if totalrec == 1:
# to go back to the last search results page
out += '''<div style="padding-bottom:2px;padding-top:30px;"><span class="moreinfo" style="margin-right:10px;">
%(back)s </span></div></div>''' % \
{'back': create_html_link(wlq, {}, _("Back to search"), {'class': "moreinfo"})}
elif totalrec > 1:
pos = recIDs.index(recID)
numrec = pos + 1
if pos == 0:
recIDnext = recIDs[pos + 1]
recIDlast = recIDs[totalrec - 1]
# to display only next and last links
out += '''<div><span class="moreinfo" style="margin-right:10px;">
%(numrec)s %(totalrec)s %(next)s %(last)s </span></div> ''' % {
'numrec': _("%s of") % ('<strong>' + self.tmpl_nice_number(numrec, ln) + '</strong>'),
'totalrec': ("%s") % ('<strong>' + self.tmpl_nice_number(totalrec, ln) + '</strong>'),
'next': create_html_link(self.build_search_url(recid=recIDnext, ln=ln),
{}, ('<font size="4">&rsaquo;</font>'), {'class': "moreinfo"}),
'last': create_html_link(self.build_search_url(recid=recIDlast, ln=ln),
{}, ('<font size="4">&raquo;</font>'), {'class': "moreinfo"})}
elif pos == totalrec - 1:
recIDfirst = recIDs[0]
recIDprev = recIDs[pos - 1]
# to display only first and previous links
out += '''<div style="padding-top:30px;"><span class="moreinfo" style="margin-right:10px;">
%(first)s %(previous)s %(numrec)s %(totalrec)s</span></div>''' % {
'first': create_html_link(self.build_search_url(recid=recIDfirst, ln=ln),
{}, ('<font size="4">&laquo;</font>'), {'class': "moreinfo"}),
'previous': create_html_link(self.build_search_url(recid=recIDprev, ln=ln),
{}, ('<font size="4">&lsaquo;</font>'), {'class': "moreinfo"}),
'numrec': _("%s of") % ('<strong>' + self.tmpl_nice_number(numrec, ln) + '</strong>'),
'totalrec': ("%s") % ('<strong>' + self.tmpl_nice_number(totalrec, ln) + '</strong>')}
else:
# to display all links
recIDfirst = recIDs[0]
recIDprev = recIDs[pos - 1]
recIDnext = recIDs[pos + 1]
recIDlast = recIDs[len(recIDs) - 1]
out += '''<div style="padding-top:30px;"><span class="moreinfo" style="margin-right:10px;">
%(first)s %(previous)s
%(numrec)s %(totalrec)s %(next)s %(last)s </span></div>''' % {
'first': create_html_link(self.build_search_url(recid=recIDfirst, ln=ln),
{}, ('<font size="4">&laquo;</font>'),
{'class': "moreinfo"}),
'previous': create_html_link(self.build_search_url(recid=recIDprev, ln=ln),
{}, ('<font size="4">&lsaquo;</font>'), {'class': "moreinfo"}),
'numrec': _("%s of") % ('<strong>' + self.tmpl_nice_number(numrec, ln) + '</strong>'),
'totalrec': ("%s") % ('<strong>' + self.tmpl_nice_number(totalrec, ln) + '</strong>'),
'next': create_html_link(self.build_search_url(recid=recIDnext, ln=ln),
{}, ('<font size="4">&rsaquo;</font>'), {'class': "moreinfo"}),
'last': create_html_link(self.build_search_url(recid=recIDlast, ln=ln),
{}, ('<font size="4">&raquo;</font>'), {'class': "moreinfo"})}
out += '''<div style="padding-bottom:2px;"><span class="moreinfo" style="margin-right:10px;">
%(back)s </span></div></div>''' % {
'back': create_html_link(wlq, {}, _("Back to search"), {'class': "moreinfo"})}
return out
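The prev/next navigation above reduces to locating ``recID`` inside the reversed hit list and picking its neighbours; a minimal Python 3 sketch of that index arithmetic (``neighbour_recids`` is an illustrative name, not part of this class):

```python
def neighbour_recids(recids, recid):
    """Return the (previous, next) record IDs around recid, or None at the edges."""
    pos = recids.index(recid)
    prev_id = recids[pos - 1] if pos > 0 else None
    next_id = recids[pos + 1] if pos < len(recids) - 1 else None
    return prev_id, next_id

print(neighbour_recids([10, 11, 12], 11))  # → (10, 12)
```

At the first hit only next/last links are shown, at the last hit only first/previous, which matches the ``None`` edges here.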
def tmpl_record_plots(self, recID, ln):
"""
Displays little tables with the images and captions found in the specified document.
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
"""
from invenio.legacy.search_engine import get_record
from invenio.legacy.bibrecord import field_get_subfield_values
from invenio.legacy.bibrecord import record_get_field_instances
_ = gettext_set_language(ln)
out = ''
rec = get_record(recID)
flds = record_get_field_instances(rec, '856', '4')
images = []
for fld in flds:
image = field_get_subfield_values(fld, 'u')
caption = field_get_subfield_values(fld, 'y')
if type(image) == list and len(image) > 0:
image = image[0]
else:
continue
if type(caption) == list and len(caption) > 0:
caption = caption[0]
else:
continue
if not image.endswith('.png'):
# not a PNG plot image; skip it
continue
if len(caption) >= 5:
images.append((int(caption[:5]), image, caption[5:]))
else:
# we have no ordering information; sort this image last
images.append((99999, image, caption))
images = sorted(images, key=lambda x: x[0])
for (index, image, caption) in images:
# let's put everything in nice little subtables with the image
# next to the caption
out = out + '<table width="95%" style="display: inline;">' + \
'<tr><td width="66%"><a name="' + str(index) + '" ' + \
'href="' + image + '">' + \
'<img src="' + image + '" width="95%"/></a></td>' + \
'<td width="33%">' + caption + '</td></tr>' + \
'</table>'
out = out + '<br /><br />'
return out
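The loop above relies on a convention where the first five characters of a caption encode a numeric sort index, with 99999 as the fallback for unindexed images. A defensive Python 3 sketch of that parsing (the ``isdigit`` guard is an addition; the original code assumes the prefix is numeric):

```python
FALLBACK_INDEX = 99999  # images without an index sort last

def parse_plot_caption(caption):
    """Split a caption into (sort_index, text) using a 5-digit prefix."""
    if len(caption) >= 5 and caption[:5].isdigit():
        return int(caption[:5]), caption[5:]
    return FALLBACK_INDEX, caption

print(parse_plot_caption("00002Mass spectrum"))  # → (2, 'Mass spectrum')
```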
def tmpl_detailed_record_statistics(self, recID, ln,
downloadsimilarity,
downloadhistory, viewsimilarity):
"""Returns the statistics page of a record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- downloadsimilarity *string* - downloadsimilarity box
- downloadhistory *string* - downloadhistory box
- viewsimilarity *string* - viewsimilarity box
"""
# load the right message language
_ = gettext_set_language(ln)
out = ''
if CFG_BIBRANK_SHOW_DOWNLOAD_STATS and downloadsimilarity is not None:
similar = self.tmpl_print_record_list_for_similarity_boxen (
_("People who downloaded this document also downloaded:"), downloadsimilarity, ln)
out = '<table>'
out += '''
<tr><td>%(graph)s</td></tr>
<tr><td>%(similar)s</td></tr>
''' % { 'siteurl': CFG_SITE_URL, 'recid': recID, 'ln': ln,
'similar': similar, 'more': _("more"),
'graph': downloadsimilarity
}
out += '</table>'
out += '<br />'
if CFG_BIBRANK_SHOW_READING_STATS and viewsimilarity is not None:
out += self.tmpl_print_record_list_for_similarity_boxen (
_("People who viewed this page also viewed:"), viewsimilarity, ln)
if CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS and downloadhistory is not None:
out += downloadhistory + '<br />'
return out
def tmpl_detailed_record_citations_prologue(self, recID, ln):
"""Returns the prologue of the citations page of a record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
"""
return '<table>'
def tmpl_detailed_record_citations_epilogue(self, recID, ln):
"""Returns the epilogue of the citations page of a record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
"""
return '</table>'
def tmpl_detailed_record_citations_citing_list(self, recID, ln,
citinglist,
sf='', so='d', sp='', rm=''):
"""Returns the list of record citing this one
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- citinglist *list* - a list of tuples [(x1,y1),(x2,y2),..] where x is doc id and y is number of citations
"""
# load the right message language
_ = gettext_set_language(ln)
out = ''
if CFG_BIBRANK_SHOW_CITATION_STATS and citinglist is not None:
similar = self.tmpl_print_record_list_for_similarity_boxen(
_("Cited by: %s records") % len (citinglist), citinglist, ln)
out += '''
<tr><td>
%(similar)s&nbsp;%(more)s
<br /><br />
</td></tr>''' % {
'more': create_html_link(
self.build_search_url(p='refersto:recid:%d' % recID, #XXXX
sf=sf,
so=so,
sp=sp,
rm=rm,
ln=ln),
{}, _("more")),
'similar': similar}
return out
def tmpl_detailed_record_citations_citation_history(self, recID, ln,
citationhistory):
"""Returns the citations history graph of this record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- citationhistory *string* - citationhistory box
"""
# load the right message language
_ = gettext_set_language(ln)
out = ''
if CFG_BIBRANK_SHOW_CITATION_GRAPHS and citationhistory is not None:
out = '<!--citation history--><tr><td>%s</td></tr>' % citationhistory
else:
out = "<!--not showing citation history. CFG_BIBRANK_SHOW_CITATION_GRAPHS:"
out += str(CFG_BIBRANK_SHOW_CITATION_GRAPHS) + " citationhistory "
if citationhistory:
out += str(len(citationhistory)) + "-->"
else:
out += "no citationhistory -->"
return out
def tmpl_detailed_record_citations_co_citing(self, recID, ln,
cociting):
"""Returns the list of cocited records
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- cociting *string* - cociting box
"""
# load the right message language
_ = gettext_set_language(ln)
out = ''
if CFG_BIBRANK_SHOW_CITATION_STATS and cociting is not None:
similar = self.tmpl_print_record_list_for_similarity_boxen (
_("Co-cited with: %s records") % len (cociting), cociting, ln)
out = '''
<tr><td>
%(similar)s&nbsp;%(more)s
<br />
</td></tr>''' % { 'more': create_html_link(self.build_search_url(p='cocitedwith:%d' % recID, ln=ln),
{}, _("more")),
'similar': similar }
return out
def tmpl_detailed_record_citations_self_cited(self, recID, ln,
selfcited, citinglist):
"""Returns the list of self-citations for this record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- selfcited list - a list of self-citations for recID
"""
# load the right message language
_ = gettext_set_language(ln)
out = ''
if CFG_BIBRANK_SHOW_CITATION_GRAPHS and selfcited is not None:
sc_scorelist = [] # a score list for printing
for s in selfcited:
#copy weight from citations
weight = 0
for c in citinglist:
(crec, score) = c
if crec == s:
weight = score
tmp = [s, weight]
sc_scorelist.append(tmp)
scite = self.tmpl_print_record_list_for_similarity_boxen (
_(".. of which self-citations: %s records") % len (selfcited), sc_scorelist, ln)
out = '<tr><td>' + scite + '</td></tr>'
return out
def tmpl_author_information(self, req, pubs, authorname, num_downloads,
aff_pubdict, citedbylist, kwtuples, authors,
vtuples, names_dict, person_link,
bibauthorid_data, ln, return_html=False):
"""Prints stuff about the author given as authorname.
1. Author name + his/her institutes. Each institute I has a link
to papers where the author has I as institute.
2. Publications, number: link to search by author.
3. Keywords
4. Author collabs
5. Publication venues like journals
The parameters are data structures needed to produce 1-5, as follows:
req - request
pubs - list of recids, probably the records that have the author as an author
authorname - evident
num_downloads - evident
aff_pubdict - a dictionary where keys are inst names and values lists of recordids
citedbylist - list of recs that cite pubs
kwtuples - keyword tuples like ('HIGGS BOSON',[3,4]) where 3 and 4 are recids
authors - a list of authors that have collaborated with authorname
names_dict - a dict of {name: frequency}
"""
from invenio.legacy.search_engine import perform_request_search
from operator import itemgetter
_ = gettext_set_language(ln)
ib_pubs = intbitset(pubs)
html = []
# construct an extended search as an interim solution for author id
# searches. Will build "(exactauthor:v1 OR exactauthor:v2)" strings
# extended_author_search_str = ""
# if bibauthorid_data["is_baid"]:
# if len(names_dict.keys()) > 1:
# extended_author_search_str = '('
#
# for name_index, name_query in enumerate(names_dict.keys()):
# if name_index > 0:
# extended_author_search_str += " OR "
#
# extended_author_search_str += 'exactauthor:"' + name_query + '"'
#
# if len(names_dict.keys()) > 1:
# extended_author_search_str += ')'
# rec_query = 'exactauthor:"' + authorname + '"'
#
# if bibauthorid_data["is_baid"] and extended_author_search_str:
# rec_query = extended_author_search_str
baid_query = ""
extended_author_search_str = ""
if 'is_baid' in bibauthorid_data and bibauthorid_data['is_baid']:
if bibauthorid_data["cid"]:
baid_query = 'author:%s' % bibauthorid_data["cid"]
elif bibauthorid_data["pid"] > -1:
baid_query = 'author:%s' % bibauthorid_data["pid"]
## todo: figure out if the author index is filled with pids/cids.
## if not: fall back to exactauthor search.
# if not index:
# baid_query = ""
if not baid_query:
baid_query = 'exactauthor:"' + authorname + '"'
if bibauthorid_data['is_baid']:
if len(names_dict.keys()) > 1:
extended_author_search_str = '('
for name_index, name_query in enumerate(names_dict.keys()):
if name_index > 0:
extended_author_search_str += " OR "
extended_author_search_str += 'exactauthor:"' + name_query + '"'
if len(names_dict.keys()) > 1:
extended_author_search_str += ')'
if bibauthorid_data['is_baid'] and extended_author_search_str:
baid_query = extended_author_search_str
baid_query = baid_query + " "
sorted_names_list = sorted(names_dict.iteritems(), key=itemgetter(1),
reverse=True)
# Prepare data for display
# construct names box
header = "<strong>" + _("Name variants") + "</strong>"
content = []
for name, frequency in sorted_names_list:
prquery = baid_query + ' exactauthor:"' + name + '"'
name_lnk = create_html_link(self.build_search_url(p=prquery),
{},
str(frequency),)
content.append("%s (%s)" % (name, name_lnk))
if not content:
content = [_("No Name Variants")]
names_box = self.tmpl_print_searchresultbox(header, "<br />\n".join(content))
# construct papers box
rec_query = baid_query
searchstr = create_html_link(self.build_search_url(p=rec_query),
{}, "<strong>" + "All papers (" + str(len(pubs)) + ")" + "</strong>",)
line1 = "<strong>" + _("Papers") + "</strong>"
line2 = searchstr
if CFG_BIBRANK_SHOW_DOWNLOAD_STATS and num_downloads:
line2 += " (" + _("downloaded") + " "
line2 += str(num_downloads) + " " + _("times") + ")"
if CFG_INSPIRE_SITE:
CFG_COLLS = ['Book',
'Conference',
'Introductory',
'Lectures',
'Preprint',
'Published',
'Review',
'Thesis']
else:
CFG_COLLS = ['Article',
'Book',
'Preprint', ]
collsd = {}
for coll in CFG_COLLS:
coll_papers = list(ib_pubs & intbitset(perform_request_search(f="collection", p=coll)))
if coll_papers:
collsd[coll] = coll_papers
colls = collsd.keys()
colls.sort(lambda x, y: cmp(len(collsd[y]), len(collsd[x]))) # sort by number of papers
for coll in colls:
rec_query = baid_query + 'collection:' + coll
line2 += "<br />" + create_html_link(self.build_search_url(p=rec_query),
{}, coll + " (" + str(len(collsd[coll])) + ")",)
if not pubs:
line2 = _("No Papers")
papers_box = self.tmpl_print_searchresultbox(line1, line2)
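The collection links above are sorted by paper count using a Python 2 ``cmp`` comparator; under Python 3 the same descending order comes from a ``key`` function. A sketch with hypothetical sample data:

```python
# Hypothetical collection -> record-id lists, as built from collsd above.
collsd = {"Article": [1, 2, 3], "Book": [4], "Preprint": [5, 6]}

# Sort collection names by number of papers, descending.
colls = sorted(collsd, key=lambda coll: len(collsd[coll]), reverse=True)
print(colls)  # → ['Article', 'Preprint', 'Book']
```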
# make an authoraff string that looks like CERN (1), Caltech (2), etc.
authoraff = ""
aff_pubdict_keys = aff_pubdict.keys()
aff_pubdict_keys.sort(lambda x, y: cmp(len(aff_pubdict[y]), len(aff_pubdict[x])))
if aff_pubdict_keys:
for a in aff_pubdict_keys:
print_a = a
if (print_a == ' '):
print_a = _("unknown affiliation")
if authoraff:
authoraff += '<br>'
authoraff += create_html_link(self.build_search_url(p=' or '.join(["%s" % x for x in aff_pubdict[a]]),
f='recid'),
{}, print_a + ' (' + str(len(aff_pubdict[a])) + ')',)
else:
authoraff = _("No Affiliations")
line1 = "<strong>" + _("Affiliations") + "</strong>"
line2 = authoraff
affiliations_box = self.tmpl_print_searchresultbox(line1, line2)
# print frequent keywords:
keywstr = ""
if (kwtuples):
for (kw, freq) in kwtuples:
if keywstr:
keywstr += '<br>'
rec_query = baid_query + 'keyword:"' + kw + '"'
searchstr = create_html_link(self.build_search_url(p=rec_query),
{}, kw + " (" + str(freq) + ")",)
keywstr = keywstr + " " + searchstr
else:
keywstr += _('No Keywords')
line1 = "<strong>" + _("Frequent keywords") + "</strong>"
line2 = keywstr
keyword_box = self.tmpl_print_searchresultbox(line1, line2)
header = "<strong>" + _("Frequent co-authors") + "</strong>"
content = []
sorted_coauthors = sorted(sorted(authors.iteritems(), key=itemgetter(0)),
key=itemgetter(1), reverse=True)
for name, frequency in sorted_coauthors:
rec_query = baid_query + 'exactauthor:"' + name + '"'
lnk = create_html_link(self.build_search_url(p=rec_query), {}, "%s (%s)" % (name, frequency),)
content.append("%s" % lnk)
if not content:
content = [_("No Frequent Co-authors")]
coauthor_box = self.tmpl_print_searchresultbox(header, "<br />\n".join(content))
pubs_to_papers_link = create_html_link(self.build_search_url(p=baid_query), {}, str(len(pubs)))
display_name = ""
try:
display_name = sorted_names_list[0][0]
except IndexError:
display_name = "&nbsp;"
headertext = ('<h1>%s <span style="font-size:50%%;">(%s papers)</span></h1>'
% (display_name, pubs_to_papers_link))
if return_html:
html.append(headertext)
else:
req.write(headertext)
#req.write("<h1>%s</h1>" % (authorname))
if person_link:
cmp_link = ('<div><a href="%s/person/claimstub?person=%s">%s</a></div>'
% (CFG_SITE_URL, person_link,
_("This is me. Verify my publication list.")))
if return_html:
html.append(cmp_link)
else:
req.write(cmp_link)
if return_html:
html.append("<table width=80%><tr valign=top><td>")
html.append(names_box)
html.append("<br />")
html.append(papers_box)
html.append("<br />")
html.append(keyword_box)
html.append("</td>")
html.append("<td>&nbsp;</td>")
html.append("<td>")
html.append(affiliations_box)
html.append("<br />")
html.append(coauthor_box)
html.append("</td></tr></table>")
else:
req.write("<table width=80%><tr valign=top><td>")
req.write(names_box)
req.write("<br />")
req.write(papers_box)
req.write("<br />")
req.write(keyword_box)
req.write("</td>")
req.write("<td>&nbsp;</td>")
req.write("<td>")
req.write(affiliations_box)
req.write("<br />")
req.write(coauthor_box)
req.write("</td></tr></table>")
# print citations:
rec_query = baid_query
if len(citedbylist):
line1 = "<strong>" + _("Citations:") + "</strong>"
line2 = ""
if not pubs:
line2 = _("No Citation Information available")
sr_box = self.tmpl_print_searchresultbox(line1, line2)
if return_html:
html.append(sr_box)
else:
req.write(sr_box)
if return_html:
return "\n".join(html)
# print frequent co-authors:
# collabstr = ""
# if (authors):
# for c in authors:
# c = c.strip()
# if collabstr:
# collabstr += '<br>'
# #do not add this person him/herself in the list
# cUP = c.upper()
# authornameUP = authorname.upper()
# if not cUP == authornameUP:
# commpubs = intbitset(pubs) & intbitset(perform_request_search(p="exactauthor:\"%s\" exactauthor:\"%s\"" % (authorname, c)))
# collabstr = collabstr + create_html_link(self.build_search_url(p='exactauthor:"' + authorname + '" exactauthor:"' + c + '"'),
# {}, c + " (" + str(len(commpubs)) + ")",)
# else: collabstr += 'None'
# banner = self.tmpl_print_searchresultbox("<strong>" + _("Frequent co-authors:") + "</strong>", collabstr)
# print frequently publishes in journals:
#if (vtuples):
# pubinfo = ""
# for t in vtuples:
# (journal, num) = t
# pubinfo += create_html_link(self.build_search_url(p='exactauthor:"' + authorname + '" ' + \
# 'journal:"' + journal + '"'),
# {}, journal + " ("+str(num)+")<br/>")
# banner = self.tmpl_print_searchresultbox("<strong>" + _("Frequently publishes in:") + "<strong>", pubinfo)
# req.write(banner)
def tmpl_detailed_record_references(self, recID, ln, content):
"""Returns the references page of a record
Parameters:
- 'recID' *int* - The ID of the printed record
- 'ln' *string* - The language to display
- 'content' *string* - The main content of the page
"""
# load the right message language
out = ''
if content is not None:
out += content
return out
def tmpl_citesummary_title(self, ln=CFG_SITE_LANG):
"""HTML citesummary title and breadcrumbs
A part of HCS format suite."""
return ''
def tmpl_citesummary2_title(self, searchpattern, ln=CFG_SITE_LANG):
"""HTML citesummary title and breadcrumbs
A part of HCS2 format suite."""
return ''
def tmpl_citesummary_back_link(self, searchpattern, ln=CFG_SITE_LANG):
"""HTML back to citesummary link
A part of HCS2 format suite."""
_ = gettext_set_language(ln)
out = ''
params = {'ln': 'en',
'p': quote(searchpattern),
'of': 'hcs'}
msg = _('Back to citesummary')
url = CFG_SITE_URL + '/search?' + \
'&'.join(['='.join(i) for i in params.iteritems()])
out += '<p><a href="%(url)s">%(msg)s</a></p>' % {'url': url, 'msg': msg}
return out
def tmpl_citesummary_more_links(self, searchpattern, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
out = ''
msg = _('<p><a href="%(url)s">%(msg)s</a></p>')
params = {'ln': ln,
'p': quote(searchpattern),
'of': 'hcs2'}
url = CFG_SITE_URL + '/search?' + \
'&amp;'.join(['='.join(i) for i in params.iteritems()])
out += msg % {'url': url,
'msg': _('Exclude self-citations')}
return out
def tmpl_citesummary_prologue(self, d_recids, collections, search_patterns,
searchfield, citable_recids, total_count,
ln=CFG_SITE_LANG):
"""HTML citesummary format, prologue. A part of HCS format suite."""
_ = gettext_set_language(ln)
out = """<table id="citesummary">
<tr>
<td>
<strong class="headline">%(msg_title)s</strong>
</td>""" % \
{'msg_title': _("Citation summary results"), }
for coll, dummy in collections:
out += '<td align="right">%s</td>' % _(coll)
out += '</tr>'
out += """<tr><td><strong>%(msg_recs)s</strong></td>""" % \
{'msg_recs': _("Total number of papers analyzed:"), }
for coll, colldef in collections:
link_url = CFG_SITE_URL + '/search?p='
if search_patterns[coll]:
p = search_patterns[coll]
if searchfield:
if " " in p:
p = searchfield + ':"' + p + '"'
else:
p = searchfield + ':' + p
link_url += quote(p)
if colldef:
link_url += '%20AND%20' + quote(colldef)
link_text = self.tmpl_nice_number(len(d_recids[coll]), ln)
out += '<td align="right"><a href="%s">%s</a></td>' % (link_url,
link_text)
out += '</tr>'
return out
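The link construction above wraps multi-word patterns in double quotes before prefixing the search field, then URL-encodes the result; a Python 3 sketch of that rule (``build_field_query`` is an illustrative name; ``urllib.parse.quote`` plays the role of the ``quote`` used here):

```python
from urllib.parse import quote

def build_field_query(searchfield, p):
    """Prefix p with searchfield, quoting multi-word patterns, then URL-encode."""
    if searchfield:
        if " " in p:
            p = searchfield + ':"' + p + '"'
        else:
            p = searchfield + ':' + p
    return quote(p)

print(build_field_query("author", "ellis"))  # → author%3Aellis
```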
def tmpl_citesummary_overview(self, collections, d_total_cites,
d_avg_cites, ln=CFG_SITE_LANG):
"""HTML citesummary format, overview. A part of HCS format suite."""
_ = gettext_set_language(ln)
out = """<tr><td><strong>%(msg_cites)s</strong></td>""" % \
{'msg_cites': _("Total number of citations:"), }
for coll, dummy in collections:
total_cites = d_total_cites[coll]
out += '<td align="right">%s</td>' % \
self.tmpl_nice_number(total_cites, ln)
out += '</tr>'
out += """<tr><td><strong>%(msg_avgcit)s</strong></td>""" % \
{'msg_avgcit': _("Average citations per paper:"), }
for coll, dummy in collections:
avg_cites = d_avg_cites[coll]
out += '<td align="right">%.1f</td>' % avg_cites
out += '</tr>'
return out
def tmpl_citesummary_minus_self_cites(self, d_total_cites, d_avg_cites,
ln=CFG_SITE_LANG):
"""HTML citesummary format, overview. A part of HCS format suite."""
_ = gettext_set_language(ln)
msg = _("Total number of citations excluding self-citations")
out = """<tr><td><strong>%(msg_cites)s</strong>""" % \
{'msg_cites': msg, }
# use ? help linking in the style of oai_repository_admin.py
msg = ' <small><small>[<a href="%s%s">?</a>]</small></small></td>'
out += msg % (CFG_SITE_URL,
'/help/citation-metrics#citesummary_self-cites')
for total_cites in d_total_cites.values():
out += '<td align="right">%s</td>' % \
self.tmpl_nice_number(total_cites, ln)
out += '</tr>'
msg = _("Average citations per paper excluding self-citations")
out += """<tr><td><strong>%(msg_avgcit)s</strong>""" % \
{'msg_avgcit': msg, }
# use ? help linking in the style of oai_repository_admin.py
msg = ' <small><small>[<a href="%s%s">?</a>]</small></small></td>'
out += msg % (CFG_SITE_URL,
'/help/citation-metrics#citesummary_self-cites')
for avg_cites in d_avg_cites.itervalues():
out += '<td align="right">%.1f</td>' % avg_cites
out += '</tr>'
return out
def tmpl_citesummary_footer(self):
return ''
def tmpl_citesummary_breakdown_header(self, ln=CFG_SITE_LANG):
_ = gettext_set_language(ln)
return """<tr><td><strong>%(msg_breakdown)s</strong></td></tr>""" % \
{'msg_breakdown': _("Breakdown of papers by citations:"), }
def tmpl_citesummary_breakdown_by_fame(self, d_cites, low, high, fame,
l_colls, searchpatterns,
searchfield, ln=CFG_SITE_LANG):
"""HTML citesummary format, breakdown by fame.
A part of HCS format suite."""
_ = gettext_set_language(ln)
out = """<tr><td>%(fame)s</td>""" % \
{'fame': _(fame), }
for coll, colldef in l_colls:
link_url = CFG_SITE_URL + '/search?p='
if searchpatterns.get(coll, None):
p = searchpatterns.get(coll, None)
if searchfield:
if " " in p:
p = searchfield + ':"' + p + '"'
else:
p = searchfield + ':' + p
link_url += quote(p) + '%20AND%20'
if colldef:
link_url += quote(colldef) + '%20AND%20'
if low == 0 and high == 0:
link_url += quote('cited:0')
else:
link_url += quote('cited:%i->%i' % (low, high))
link_text = self.tmpl_nice_number(d_cites[coll], ln)
out += '<td align="right"><a href="%s">%s</a></td>' % (link_url,
link_text)
out += '</tr>'
return out
def tmpl_citesummary_h_index(self, collections,
d_h_factors, ln=CFG_SITE_LANG):
"""HTML citesummary format, h factor output. A part of the HCS suite."""
_ = gettext_set_language(ln)
out = "<tr><td></td></tr><tr><td><strong>%(msg_metrics)s</strong> <small><small>[<a href=\"%(help_url)s\">?</a>]</small></small></td></tr>" % \
{'msg_metrics': _("Citation metrics"),
'help_url': CFG_SITE_URL + '/help/citation-metrics', }
out += '<tr><td>h-index'
# use ? help linking in the style of oai_repository_admin.py
msg = ' <small><small>[<a href="%s%s">?</a>]</small></small></td>'
out += msg % (CFG_SITE_URL,
'/help/citation-metrics#citesummary_h-index')
for coll, dummy in collections:
h_factors = d_h_factors[coll]
out += '<td align="right">%s</td>' % \
self.tmpl_nice_number(h_factors, ln)
out += '</tr>'
return out
def tmpl_citesummary_epilogue(self, ln=CFG_SITE_LANG):
"""HTML citesummary format, epilogue. A part of HCS format suite."""
out = "</table>"
return out
def tmpl_unapi(self, formats, identifier=None):
"""
Provide the list of object formats available from the unAPI service
for the object identified by IDENTIFIER.
"""
out = '<?xml version="1.0" encoding="UTF-8" ?>\n'
if identifier:
out += '<formats id="%i">\n' % (identifier)
else:
out += "<formats>\n"
for format_name, format_type in formats.iteritems():
docs = ''
if format_name == 'xn':
docs = 'http://www.nlm.nih.gov/databases/dtd/'
format_type = 'application/xml'
format_name = 'nlm'
elif format_name == 'xm':
docs = 'http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd'
format_type = 'application/xml'
format_name = 'marcxml'
elif format_name == 'xr':
format_type = 'application/rss+xml'
docs = 'http://www.rssboard.org/rss-2-0/'
elif format_name == 'xw':
format_type = 'application/xml'
docs = 'http://www.refworks.com/RefWorks/help/RefWorks_Tagged_Format.htm'
elif format_name == 'xoaidc':
format_type = 'application/xml'
docs = 'http://www.openarchives.org/OAI/2.0/oai_dc.xsd'
elif format_name == 'xe':
format_type = 'application/xml'
docs = 'http://www.endnote.com/support/'
format_name = 'endnote'
elif format_name == 'xd':
format_type = 'application/xml'
docs = 'http://dublincore.org/schemas/'
format_name = 'dc'
elif format_name == 'xo':
format_type = 'application/xml'
docs = 'http://www.loc.gov/standards/mods/v3/mods-3-3.xsd'
format_name = 'mods'
if docs:
out += '<format name="%s" type="%s" docs="%s" />\n' % (xml_escape(format_name), xml_escape(format_type), xml_escape(docs))
else:
out += '<format name="%s" type="%s" />\n' % (xml_escape(format_name), xml_escape(format_type))
out += "</formats>"
return out
diff --git a/invenio/legacy/websearch/webcoll.py b/invenio/legacy/websearch/webcoll.py
index fac751e0f..8172baa1f 100644
--- a/invenio/legacy/websearch/webcoll.py
+++ b/invenio/legacy/websearch/webcoll.py
@@ -1,1102 +1,1102 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Create Invenio collection cache."""
__revision__ = "$Id$"
import calendar
import copy
import sys
import cgi
import re
import os
import string
import time
import cPickle
from invenio.config import \
CFG_CERN_SITE, \
CFG_WEBSEARCH_INSTANT_BROWSE, \
CFG_WEBSEARCH_NARROW_SEARCH_SHOW_GRANDSONS, \
CFG_WEBSEARCH_I18N_LATEST_ADDITIONS, \
CFG_CACHEDIR, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_SITE_LANGS, \
CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES, \
CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, \
CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS
from invenio.base.i18n import gettext_set_language, language_list_long
from invenio.legacy.search_engine import search_pattern_parenthesised, get_creation_date, get_field_i18nname, collection_restricted_p, sort_records, EM_REPOSITORY
from invenio.legacy.dbquery import run_sql, Error, get_table_update_time
from invenio.legacy.bibrank.record_sorter import get_bibrank_methods
from invenio.utils.date import convert_datestruct_to_dategui, strftime
from invenio.modules.formatter import format_record
from invenio.utils.shell import mymkdir
from invenio.intbitset import intbitset
from invenio.legacy.websearch_external_collections import \
external_collection_load_states, \
dico_collection_external_searches, \
external_collection_sort_engine_by_name
-from invenio.bibtask import task_init, task_get_option, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, task_get_option, task_set_option, \
write_message, task_has_option, task_update_progress, \
task_sleep_now_if_required
import invenio.legacy.template
websearch_templates = invenio.legacy.template.load('websearch')
from invenio.legacy.websearch_external_collections.websearch_external_collections_searcher import external_collections_dictionary
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_EXTERNAL_COLLECTION_TIMEOUT
from invenio.legacy.websearch_external_collections.websearch_external_collections_config import CFG_HOSTED_COLLECTION_TIMEOUT_NBRECS
from invenio.base.signals import webcoll_after_webpage_cache_update
## global vars
COLLECTION_HOUSE = {} # will hold collections we treat in this run of the program; a dict of {collname1: collobject1, collname2: collobject2, ...}
# CFG_CACHE_LAST_UPDATED_TIMESTAMP_TOLERANCE -- cache timestamp
# tolerance (in seconds), to account for the fact that an admin might
# accidentally happen to edit the collection definitions at exactly
# the same second when some webcoll process was about to be started.
# In order to be safe, let's put an exaggerated timestamp tolerance
# value such as 20 seconds:
CFG_CACHE_LAST_UPDATED_TIMESTAMP_TOLERANCE = 20
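# For illustration (hypothetical timestamps): if the collection cache was
# written at Unix time 1000000000 and an admin edited the collection
# definitions at 1000000015, the 15-second difference falls within the
# 20-second tolerance, so the cache is still treated as up to date and is
# not needlessly rebuilt.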
# CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE -- location of the cache
# timestamp file:
CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE = "%s/collections/last_updated" % CFG_CACHEDIR
# CFG_CACHE_LAST_FAST_UPDATED_TIMESTAMP_FILE -- location of the cache
# timestamp file used when running webcoll in fast mode.
CFG_CACHE_LAST_FAST_UPDATED_TIMESTAMP_FILE = "%s/collections/last_fast_updated" % CFG_CACHEDIR
def get_collection(colname):
"""Return collection object from the collection house for given colname.
If does not exist, then create it."""
if not COLLECTION_HOUSE.has_key(colname):
colobject = Collection(colname)
COLLECTION_HOUSE[colname] = colobject
return COLLECTION_HOUSE[colname]
## auxiliary functions:
def is_selected(var, fld):
"Checks if the two are equal, and if yes, returns ' selected'. Useful for select boxes."
if var == fld:
return ' selected="selected"'
else:
return ""
def get_field(recID, tag):
"Gets list of field 'tag' for the record with 'recID' system number."
out = []
digit = tag[0:2]
bx = "bib%sx" % digit
bibx = "bibrec_bib%sx" % digit
query = "SELECT bx.value FROM %s AS bx, %s AS bibx WHERE bibx.id_bibrec='%s' AND bx.id=bibx.id_bibxxx AND bx.tag='%s'" \
% (bx, bibx, recID, tag)
res = run_sql(query)
for row in res:
out.append(row[0])
return out
def check_nbrecs_for_all_external_collections():
"""Check if any of the external collections have changed their total number of records, aka nbrecs.
Return True if any of the total numbers of records have changed and False if they're all the same."""
res = run_sql("SELECT name FROM collection WHERE dbquery LIKE 'hostedcollection:%';")
for row in res:
coll_name = row[0]
if (get_collection(coll_name)).check_nbrecs_for_external_collection():
return True
return False
class Collection:
"Holds the information on collections (id,name,dbquery)."
def __init__(self, name=""):
"Creates collection instance by querying the DB configuration database about 'name'."
self.calculate_reclist_run_already = 0 # to speed things up without much refactoring
self.update_reclist_run_already = 0 # to speed things up without much refactoring
self.reclist_updated_since_start = 0 # to check if webpage cache need rebuilding
self.reclist_with_nonpublic_subcolls = intbitset()
# used to store the temporary result of the calculation of nbrecs of an external collection
self.nbrecs_tmp = None
if not name:
self.name = CFG_SITE_NAME # by default we are working on the home page
self.id = 1
self.dbquery = None
self.nbrecs = None
self.reclist = intbitset()
self.old_reclist = intbitset()
self.reclist_updated_since_start = 1
else:
self.name = name
try:
res = run_sql("""SELECT id,name,dbquery,nbrecs,reclist FROM collection
WHERE name=%s""", (name,))
if res:
self.id = res[0][0]
self.name = res[0][1]
self.dbquery = res[0][2]
self.nbrecs = res[0][3]
try:
self.reclist = intbitset(res[0][4])
except:
self.reclist = intbitset()
self.reclist_updated_since_start = 1
else: # collection does not exist!
self.id = None
self.dbquery = None
self.nbrecs = None
self.reclist = intbitset()
self.reclist_updated_since_start = 1
self.old_reclist = intbitset(self.reclist)
except Error, e:
print "Error %d: %s" % (e.args[0], e.args[1])
sys.exit(1)
def get_example_search_queries(self):
"""Returns list of sample search queries for this collection.
"""
res = run_sql("""SELECT example.body FROM example
LEFT JOIN collection_example on example.id=collection_example.id_example
WHERE collection_example.id_collection=%s ORDER BY collection_example.score""", (self.id,))
return [query[0] for query in res]
def get_name(self, ln=CFG_SITE_LANG, name_type="ln", prolog="", epilog="", prolog_suffix=" ", epilog_suffix=""):
"""Return nicely formatted collection name for language LN.
The NAME_TYPE may be 'ln' (=long name), 'sn' (=short name), etc."""
out = prolog
i18name = ""
res = run_sql("SELECT value FROM collectionname WHERE id_collection=%s AND ln=%s AND type=%s", (self.id, ln, name_type))
try:
i18name += res[0][0]
except IndexError:
pass
if i18name:
out += i18name
else:
out += self.name
out += epilog
return out
def get_ancestors(self):
"Returns list of ancestors of the current collection."
ancestors = []
ancestors_ids = intbitset()
id_son = self.id
while 1:
query = "SELECT cc.id_dad,c.name FROM collection_collection AS cc, collection AS c "\
"WHERE cc.id_son=%d AND c.id=cc.id_dad" % int(id_son)
res = run_sql(query, None, 1)
if res:
col_ancestor = get_collection(res[0][1])
# looking for loops
if self.id in ancestors_ids:
write_message("Loop found in collection %s" % self.name, stream=sys.stderr)
raise OverflowError("Loop found in collection %s" % self.name)
else:
ancestors.append(col_ancestor)
ancestors_ids.add(col_ancestor.id)
id_son = res[0][0]
else:
break
ancestors.reverse()
return ancestors
def restricted_p(self):
"""Predicate to test if the collection is restricted or not. Return the contect of the
`restrited' column of the collection table (typically Apache group). Otherwise return
None if the collection is public."""
if collection_restricted_p(self.name):
return 1
return None
def get_sons(self, type='r'):
"Returns list of direct sons of type 'type' for the current collection."
sons = []
id_dad = self.id
query = "SELECT cc.id_son,c.name FROM collection_collection AS cc, collection AS c "\
"WHERE cc.id_dad=%d AND cc.type='%s' AND c.id=cc.id_son ORDER BY score DESC, c.name ASC" % (int(id_dad), type)
res = run_sql(query)
for row in res:
sons.append(get_collection(row[1]))
return sons
def get_descendants(self, type='r'):
"Returns list of all descendants of type 'type' for the current collection."
descendants = []
descendant_ids = intbitset()
id_dad = self.id
query = "SELECT cc.id_son,c.name FROM collection_collection AS cc, collection AS c "\
"WHERE cc.id_dad=%d AND cc.type='%s' AND c.id=cc.id_son ORDER BY score DESC" % (int(id_dad), type)
res = run_sql(query)
for row in res:
col_desc = get_collection(row[1])
# looking for loops
if self.id in descendant_ids:
write_message("Loop found in collection %s" % self.name, stream=sys.stderr)
raise OverflowError("Loop found in collection %s" % self.name)
else:
descendants.append(col_desc)
descendant_ids.add(col_desc.id)
tmp_descendants = col_desc.get_descendants()
for descendant in tmp_descendants:
descendant_ids.add(descendant.id)
descendants += tmp_descendants
return descendants
def write_cache_file(self, filename='', filebody={}):
"Write a file inside collection cache."
# open file:
dirname = "%s/collections" % (CFG_CACHEDIR)
mymkdir(dirname)
fullfilename = dirname + "/%s.html" % filename
try:
os.umask(022)
f = open(fullfilename, "wb")
except IOError, v:
try:
(code, message) = v
except:
code = 0
message = v
print "I/O Error: " + str(message) + " (" + str(code) + ")"
sys.exit(1)
# print user info:
write_message("... creating %s" % fullfilename, verbose=6)
sys.stdout.flush()
# print page body:
cPickle.dump(filebody, f, cPickle.HIGHEST_PROTOCOL)
# close file:
f.close()
def update_webpage_cache(self, lang):
"""Create collection page header, navtrail, body (including left and right stripes) and footer, and
call write_cache_file() afterwards to update the collection webpage cache."""
return {} ## webpage cache update is not really needed in
## Invenio-on-Flask, so let's return quickly here
## for great speed-up benefit
## precalculate latest additions for non-aggregate
## collections (the info is ln- and aas-independent)
if self.dbquery:
if CFG_WEBSEARCH_I18N_LATEST_ADDITIONS:
self.create_latest_additions_info(ln=lang)
else:
self.create_latest_additions_info()
# load the right message language
_ = gettext_set_language(lang)
# create dictionary with data
cache = {"te_portalbox" : self.create_portalbox(lang, 'te'),
"np_portalbox" : self.create_portalbox(lang, 'np'),
"ne_portalbox" : self.create_portalbox(lang, 'ne'),
"tp_portalbox" : self.create_portalbox(lang, "tp"),
"lt_portalbox" : self.create_portalbox(lang, "lt"),
"rt_portalbox" : self.create_portalbox(lang, "rt"),
"last_updated" : convert_datestruct_to_dategui(time.localtime(),
ln=lang)}
for aas in CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES: # do light, simple and advanced search pages:
cache["navtrail_%s" % aas] = self.create_navtrail_links(aas, lang)
cache["searchfor_%s" % aas] = self.create_searchfor(aas, lang)
cache["narrowsearch_%s" % aas] = self.create_narrowsearch(aas, lang, 'r')
cache["focuson_%s" % aas] = self.create_narrowsearch(aas, lang, "v")+ \
self.create_external_collections_box(lang)
cache["instantbrowse_%s" % aas] = self.create_instant_browse(aas=aas, ln=lang)
# write cache file
self.write_cache_file("%s-ln=%s"%(self.name, lang), cache)
return cache
def create_navtrail_links(self, aas=CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, ln=CFG_SITE_LANG):
"""Creates navigation trail links, i.e. links to collection
ancestors (except Home collection). If aas==1, then links to
Advanced Search interfaces; otherwise Simple Search.
"""
dads = []
for dad in self.get_ancestors():
if dad.name != CFG_SITE_NAME: # exclude Home collection
dads.append((dad.name, dad.get_name(ln)))
return websearch_templates.tmpl_navtrail_links(
aas=aas, ln=ln, dads=dads)
def create_portalbox(self, lang=CFG_SITE_LANG, position="rt"):
"""Creates portalboxes of language CFG_SITE_LANG of the position POSITION by consulting DB configuration database.
The position may be: 'lt'='left top', 'rt'='right top', etc."""
out = ""
query = "SELECT p.title,p.body FROM portalbox AS p, collection_portalbox AS cp "\
" WHERE cp.id_collection=%d AND p.id=cp.id_portalbox AND cp.ln='%s' AND cp.position='%s' "\
" ORDER BY cp.score DESC" % (self.id, lang, position)
res = run_sql(query)
for row in res:
title, body = row[0], row[1]
if title:
out += websearch_templates.tmpl_portalbox(title = title,
body = body)
else:
# no title specified, so print body ``as is'' only:
out += body
return out
def create_narrowsearch(self, aas=CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, ln=CFG_SITE_LANG, type="r"):
"""Creates list of collection descendants of type 'type' under title 'title'.
If aas==1, then links to Advanced Search interfaces; otherwise Simple Search.
Suitable for 'Narrow search' and 'Focus on' boxes."""
# get list of sons and analyse it
sons = self.get_sons(type)
if not sons:
return ''
# get descendants
descendants = self.get_descendants(type)
grandsons = []
if CFG_WEBSEARCH_NARROW_SEARCH_SHOW_GRANDSONS:
# load grandsons for each son
for son in sons:
grandsons.append(son.get_sons())
# return ""
return websearch_templates.tmpl_narrowsearch(
aas = aas,
ln = ln,
type = type,
father = self,
has_grandchildren = len(descendants)>len(sons),
sons = sons,
display_grandsons = CFG_WEBSEARCH_NARROW_SEARCH_SHOW_GRANDSONS,
grandsons = grandsons
)
def create_external_collections_box(self, ln=CFG_SITE_LANG):
external_collection_load_states()
if not dico_collection_external_searches.has_key(self.id):
return ""
engines_list = external_collection_sort_engine_by_name(dico_collection_external_searches[self.id])
return websearch_templates.tmpl_searchalso(ln, engines_list, self.id)
def create_latest_additions_info(self, rg=CFG_WEBSEARCH_INSTANT_BROWSE, ln=CFG_SITE_LANG):
"""
Create info about latest additions that will be used for
create_instant_browse() later.
"""
self.latest_additions_info = []
if self.nbrecs and self.reclist:
# firstly, get last 'rg' records:
recIDs = list(self.reclist)
of = 'hb'
# CERN hack begins: tweak latest additions for selected collections:
if CFG_CERN_SITE:
# alter recIDs list for some CERN collections:
this_year = time.strftime("%Y", time.localtime())
if self.name in ['CERN Yellow Reports','Videos']:
last_year = str(int(this_year) - 1)
# detect recIDs only from this and past year:
recIDs = list(self.reclist & \
search_pattern_parenthesised(p='year:%s or year:%s' % \
(this_year, last_year)))
elif self.name in ['VideosXXX']:
# detect recIDs only from this year:
recIDs = list(self.reclist & \
search_pattern_parenthesised(p='year:%s' % this_year))
elif self.name == 'CMS Physics Analysis Summaries' and \
1281585 in self.reclist:
# REALLY, REALLY temporary hack
recIDs = list(self.reclist)
recIDs.remove(1281585)
# apply special filters:
if self.name in ['Videos']:
# select only videos with movies:
recIDs = list(intbitset(recIDs) & \
search_pattern_parenthesised(p='collection:"PUBLVIDEOMOVIE"'))
of = 'hvp'
# sort some CERN collections specially:
if self.name in ['Videos',
'Video Clips',
'Video Movies',
'Video News',
'Video Rushes',
'Webcast',
'ATLAS Videos',
'Restricted Video Movies',
'Restricted Video Rushes',
'LHC First Beam Videos',
'CERN openlab Videos']:
recIDs = sort_records(None, recIDs, '269__c')
elif self.name in ['LHCb Talks']:
recIDs = sort_records(None, recIDs, 'reportnumber')
# CERN hack ends.
total = len(recIDs)
to_display = min(rg, total)
for idx in range(total-1, total-to_display-1, -1):
recid = recIDs[idx]
self.latest_additions_info.append({'id': recid,
'format': format_record(recid, of, ln=ln),
'date': get_creation_date(recid, fmt="%Y-%m-%d<br />%H:%i")})
return
def create_instant_browse(self, rg=CFG_WEBSEARCH_INSTANT_BROWSE, aas=CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, ln=CFG_SITE_LANG):
"Searches database and produces list of last 'rg' records."
if self.restricted_p():
return websearch_templates.tmpl_box_restricted_content(ln = ln)
if str(self.dbquery).startswith("hostedcollection:"):
return websearch_templates.tmpl_box_hosted_collection(ln = ln)
if rg == 0:
# do not show latest additions box
return ""
# CERN hack: do not display latest additions for some CERN collections:
if CFG_CERN_SITE and self.name in ['Periodicals', 'Electronic Journals',
'Press Office Photo Selection',
'Press Office Video Selection']:
return ""
try:
self.latest_additions_info
latest_additions_info_p = True
except:
latest_additions_info_p = False
if latest_additions_info_p:
passIDs = []
for idx in range(0, min(len(self.latest_additions_info), rg)):
# CERN hack: display the records in a grid layout, so do not show the related links
if CFG_CERN_SITE and self.name in ['Videos']:
passIDs.append({'id': self.latest_additions_info[idx]['id'],
'body': self.latest_additions_info[idx]['format'],
'date': self.latest_additions_info[idx]['date']})
else:
passIDs.append({'id': self.latest_additions_info[idx]['id'],
'body': self.latest_additions_info[idx]['format'] + \
websearch_templates.tmpl_record_links(recid=self.latest_additions_info[idx]['id'],
rm='citation',
ln=ln),
'date': self.latest_additions_info[idx]['date']})
if self.nbrecs > rg:
url = websearch_templates.build_search_url(
cc=self.name, jrec=rg+1, ln=ln, aas=aas)
else:
url = ""
# CERN hack: display the records in a grid layout
if CFG_CERN_SITE and self.name in ['Videos']:
return websearch_templates.tmpl_instant_browse(
aas=aas, ln=ln, recids=passIDs, more_link=url, grid_layout=True)
return websearch_templates.tmpl_instant_browse(
aas=aas, ln=ln, recids=passIDs, more_link=url)
return websearch_templates.tmpl_box_no_records(ln=ln)
def create_searchoptions(self):
"Produces 'Search options' portal box."
box = ""
query = """SELECT DISTINCT(cff.id_field),f.code,f.name FROM collection_field_fieldvalue AS cff, field AS f
WHERE cff.id_collection=%d AND cff.id_fieldvalue IS NOT NULL AND cff.id_field=f.id
ORDER BY cff.score DESC""" % self.id
res = run_sql(query)
if res:
for row in res:
field_id = row[0]
field_code = row[1]
field_name = row[2]
query_bis = """SELECT fv.value,fv.name FROM fieldvalue AS fv, collection_field_fieldvalue AS cff
WHERE cff.id_collection=%d AND cff.type='seo' AND cff.id_field=%d AND fv.id=cff.id_fieldvalue
ORDER BY cff.score_fieldvalue DESC, cff.score DESC, fv.name ASC""" % (self.id, field_id)
res_bis = run_sql(query_bis)
if res_bis:
values = [{'value' : '', 'text' : 'any' + ' ' + field_name}] # FIXME: internationalisation of "any"
for row_bis in res_bis:
values.append({'value' : cgi.escape(row_bis[0], 1), 'text' : row_bis[1]})
box += websearch_templates.tmpl_select(
fieldname = field_code,
values = values
)
return box
def create_sortoptions(self, ln=CFG_SITE_LANG):
"""Produces 'Sort options' portal box."""
# load the right message language
_ = gettext_set_language(ln)
box = ""
query = """SELECT f.code,f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE id_collection=%d AND cff.type='soo' AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""" % self.id
values = [{'value' : '', 'text': "- %s -" % _("latest first")}]
res = run_sql(query)
if res:
for row in res:
values.append({'value' : row[0], 'text': get_field_i18nname(row[1], ln)})
else:
for tmp in ('title', 'author', 'report number', 'year'):
values.append({'value' : tmp.replace(' ', ''), 'text' : get_field_i18nname(tmp, ln)})
box = websearch_templates.tmpl_select(
fieldname = 'sf',
css_class = 'address',
values = values
)
box += websearch_templates.tmpl_select(
fieldname = 'so',
css_class = 'address',
values = [
{'value' : 'a' , 'text' : _("asc.")},
{'value' : 'd' , 'text' : _("desc.")}
]
)
return box
def create_rankoptions(self, ln=CFG_SITE_LANG):
"Produces 'Rank options' portal box."
# load the right message language
_ = gettext_set_language(ln)
values = [{'value' : '', 'text': "- %s %s -" % (string.lower(_("OR")), _("rank by"))}]
for (code, name) in get_bibrank_methods(self.id, ln):
values.append({'value' : code, 'text': name})
box = websearch_templates.tmpl_select(
fieldname = 'rm',
css_class = 'address',
values = values
)
return box
def create_displayoptions(self, ln=CFG_SITE_LANG):
"Produces 'Display options' portal box."
# load the right message language
_ = gettext_set_language(ln)
values = []
for i in ['10', '25', '50', '100', '250', '500']:
values.append({'value' : i, 'text' : i + ' ' + _("results")})
box = websearch_templates.tmpl_select(
fieldname = 'rg',
selected = str(CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS),
css_class = 'address',
values = values
)
if self.get_sons():
box += websearch_templates.tmpl_select(
fieldname = 'sc',
css_class = 'address',
values = [
{'value' : '1' , 'text' : _("split by collection")},
{'value' : '0' , 'text' : _("single list")}
]
)
return box
def create_formatoptions(self, ln=CFG_SITE_LANG):
"Produces 'Output format options' portal box."
# load the right message language
_ = gettext_set_language(ln)
box = ""
values = []
query = """SELECT f.code,f.name FROM format AS f, collection_format AS cf
WHERE cf.id_collection=%d AND cf.id_format=f.id AND f.visibility='1'
ORDER BY cf.score DESC, f.name ASC""" % self.id
res = run_sql(query)
if res:
for row in res:
values.append({'value' : row[0], 'text': row[1]})
else:
values.append({'value' : 'hb', 'text' : "HTML %s" % _("brief")})
box = websearch_templates.tmpl_select(
fieldname = 'of',
css_class = 'address',
values = values
)
return box
def create_searchwithin_selection_box(self, fieldname='f', value='', ln='en'):
"""Produces 'search within' selection box for the current collection."""
# get values
query = """SELECT f.code,f.name FROM field AS f, collection_field_fieldvalue AS cff
WHERE cff.type='sew' AND cff.id_collection=%d AND cff.id_field=f.id
ORDER BY cff.score DESC, f.name ASC""" % self.id
res = run_sql(query)
values = [{'value' : '', 'text' : get_field_i18nname("any field", ln)}]
if res:
for row in res:
values.append({'value' : row[0], 'text' : get_field_i18nname(row[1], ln)})
else:
if CFG_CERN_SITE:
for tmp in ['title', 'author', 'abstract', 'report number', 'year']:
values.append({'value' : tmp.replace(' ', ''), 'text' : get_field_i18nname(tmp, ln)})
else:
for tmp in ['title', 'author', 'abstract', 'keyword', 'report number', 'journal', 'year', 'fulltext', 'reference']:
values.append({'value' : tmp.replace(' ', ''), 'text' : get_field_i18nname(tmp, ln)})
return websearch_templates.tmpl_searchwithin_select(
fieldname = fieldname,
ln = ln,
selected = value,
values = values
)
def create_searchexample(self):
"Produces search example(s) for the current collection."
out = "$collSearchExamples = getSearchExample(%d, $se);" % self.id
return out
def create_searchfor(self, aas=CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, ln=CFG_SITE_LANG):
"Produces either Simple or Advanced 'Search for' box for the current collection."
if aas == 1:
return self.create_searchfor_advanced(ln)
elif aas == 0:
return self.create_searchfor_simple(ln)
else:
return self.create_searchfor_light(ln)
def create_searchfor_light(self, ln=CFG_SITE_LANG):
"Produces light 'Search for' box for the current collection."
return websearch_templates.tmpl_searchfor_light(
ln=ln,
collection_id = self.name,
collection_name=self.get_name(ln=ln),
record_count=self.nbrecs,
example_search_queries=self.get_example_search_queries(),
)
def create_searchfor_simple(self, ln=CFG_SITE_LANG):
"Produces simple 'Search for' box for the current collection."
return websearch_templates.tmpl_searchfor_simple(
ln=ln,
collection_id = self.name,
collection_name=self.get_name(ln=ln),
record_count=self.nbrecs,
middle_option = self.create_searchwithin_selection_box(ln=ln),
)
def create_searchfor_advanced(self, ln=CFG_SITE_LANG):
"Produces advanced 'Search for' box for the current collection."
return websearch_templates.tmpl_searchfor_advanced(
ln = ln,
collection_id = self.name,
collection_name=self.get_name(ln=ln),
record_count=self.nbrecs,
middle_option_1 = self.create_searchwithin_selection_box('f1', ln=ln),
middle_option_2 = self.create_searchwithin_selection_box('f2', ln=ln),
middle_option_3 = self.create_searchwithin_selection_box('f3', ln=ln),
searchoptions = self.create_searchoptions(),
sortoptions = self.create_sortoptions(ln),
rankoptions = self.create_rankoptions(ln),
displayoptions = self.create_displayoptions(ln),
formatoptions = self.create_formatoptions(ln)
)
def calculate_reclist(self):
"""Calculate, set and return the (reclist, reclist_with_nonpublic_subcolls) tuple for given collection."""
if self.calculate_reclist_run_already or str(self.dbquery).startswith("hostedcollection:"):
# do we have to recalculate?
return (self.reclist, self.reclist_with_nonpublic_subcolls)
write_message("... calculating reclist of %s" % self.name, verbose=6)
reclist = intbitset() # will hold results for public sons only; good for storing into DB
reclist_with_nonpublic_subcolls = intbitset() # will hold results for both public and nonpublic sons; good for deducing total
# number of documents
if not self.dbquery:
# A - collection does not have dbquery, so query recursively all its sons
# that are either non-restricted or that have the same restriction rules
for coll in self.get_sons():
coll_reclist, coll_reclist_with_nonpublic_subcolls = coll.calculate_reclist()
if ((coll.restricted_p() is None) or
(coll.restricted_p() == self.restricted_p())):
# add this reclist ``for real'' only if it is public
reclist.union_update(coll_reclist)
reclist_with_nonpublic_subcolls.union_update(coll_reclist_with_nonpublic_subcolls)
else:
# B - collection does have dbquery, so compute it:
# (note: explicitly remove DELETED records)
if CFG_CERN_SITE:
reclist = search_pattern_parenthesised(None, self.dbquery + \
' -980__:"DELETED" -980__:"DUMMY"', ap=-9) #ap=-9 for allow queries containing hidden tags
else:
reclist = search_pattern_parenthesised(None, self.dbquery + ' -980__:"DELETED"', ap=-9) #ap=-9 allow queries containing hidden tags
reclist_with_nonpublic_subcolls = copy.deepcopy(reclist)
# store the results:
self.nbrecs = len(reclist_with_nonpublic_subcolls)
self.reclist = reclist
self.reclist_with_nonpublic_subcolls = reclist_with_nonpublic_subcolls
# last but not least, update the speed-up flag:
self.calculate_reclist_run_already = 1
# return the two sets:
return (self.reclist, self.reclist_with_nonpublic_subcolls)
def calculate_nbrecs_for_external_collection(self, timeout=CFG_EXTERNAL_COLLECTION_TIMEOUT):
"""Calculate the total number of records, aka nbrecs, for given external collection."""
#if self.calculate_reclist_run_already:
# do we have to recalculate?
#return self.nbrecs
#write_message("... calculating nbrecs of external collection %s" % self.name, verbose=6)
if external_collections_dictionary.has_key(self.name):
engine = external_collections_dictionary[self.name]
if engine.parser:
self.nbrecs_tmp = engine.parser.parse_nbrecs(timeout)
if self.nbrecs_tmp >= 0: return self.nbrecs_tmp
# the parse_nbrecs() function returns negative values in some specific cases;
# maybe these cases could be handled with warnings or similar;
# for now the total number of records silently remains the same
else: return self.nbrecs
else: write_message("External collection %s does not have a parser!" % self.name, verbose=6)
else: write_message("External collection %s not found!" % self.name, verbose=6)
return 0
# last but not least, update the speed-up flag:
#self.calculate_reclist_run_already = 1
def check_nbrecs_for_external_collection(self):
"""Check if the external collections has changed its total number of records, aka nbrecs.
Rerurns True if the total number of records has changed and False if it's the same"""
write_message("*** self.nbrecs = %s / self.cal...ion = %s ***" % (str(self.nbrecs), str(self.calculate_nbrecs_for_external_collection())), verbose=6)
write_message("*** self.nbrecs != self.cal...ion = %s ***" % (str(self.nbrecs != self.calculate_nbrecs_for_external_collection()),), verbose=6)
return self.nbrecs != self.calculate_nbrecs_for_external_collection(CFG_HOSTED_COLLECTION_TIMEOUT_NBRECS)
def set_nbrecs_for_external_collection(self):
"""Set this external collection's total number of records, aka nbrecs"""
if self.calculate_reclist_run_already:
# do we have to recalculate?
return
write_message("... calculating nbrecs of external collection %s" % self.name, verbose=6)
if self.nbrecs_tmp:
self.nbrecs = self.nbrecs_tmp
else:
self.nbrecs = self.calculate_nbrecs_for_external_collection(CFG_HOSTED_COLLECTION_TIMEOUT_NBRECS)
# last but not least, update the speed-up flag:
self.calculate_reclist_run_already = 1
def update_reclist(self):
"Update the record universe for given collection; nbrecs, reclist of the collection table."
if self.update_reclist_run_already:
# do we have to reupdate?
return 0
write_message("... updating reclist of %s (%s recs)" % (self.name, self.nbrecs), verbose=6)
sys.stdout.flush()
try:
## In principle we could skip this update if old_reclist==reclist
## however we just update it here in case of race-conditions.
run_sql("UPDATE collection SET nbrecs=%s, reclist=%s WHERE id=%s",
(self.nbrecs, self.reclist.fastdump(), self.id))
if self.old_reclist != self.reclist:
self.reclist_updated_since_start = 1
else:
write_message("... no changes in reclist detected", verbose=6)
except Error, e:
print "Database Query Error %d: %s." % (e.args[0], e.args[1])
sys.exit(1)
# last but not least, update the speed-up flag:
self.update_reclist_run_already = 1
return 0
def perform_display_collection(colID, colname, aas, ln, em, show_help_boxes):
"""Returns the data needed to display a collection page
The arguments are as follows:
colID - id of the collection to display
colname - name of the collection to display
aas - 0 if simple search, 1 if advanced search
ln - language of the page
em - code to display just part of the page
show_help_boxes - whether to show the help boxes or not"""
# check and update cache if necessary
try:
cachedfile = open("%s/collections/%s-ln=%s.html" % \
(CFG_CACHEDIR, colname, ln), "rb")
data = cPickle.load(cachedfile)
cachedfile.close()
except:
data = get_collection(colname).update_webpage_cache(ln)
# check em value to return just part of the page
if em != "":
if EM_REPOSITORY["search_box"] not in em:
data["searchfor_%s" % aas] = ""
if EM_REPOSITORY["see_also_box"] not in em:
data["focuson_%s" % aas] = ""
if EM_REPOSITORY["all_portalboxes"] not in em:
if EM_REPOSITORY["te_portalbox"] not in em:
data["te_portalbox"] = ""
if EM_REPOSITORY["np_portalbox"] not in em:
data["np_portalbox"] = ""
if EM_REPOSITORY["ne_portalbox"] not in em:
data["ne_portalbox"] = ""
if EM_REPOSITORY["tp_portalbox"] not in em:
data["tp_portalbox"] = ""
if EM_REPOSITORY["lt_portalbox"] not in em:
data["lt_portalbox"] = ""
if EM_REPOSITORY["rt_portalbox"] not in em:
data["rt_portalbox"] = ""
c_body = websearch_templates.tmpl_webcoll_body(ln, colID, data["te_portalbox"],
data["searchfor_%s"%aas], data["np_portalbox"], data["narrowsearch_%s"%aas],
data["focuson_%s"%aas], data["instantbrowse_%s"%aas], data["ne_portalbox"],
em=="" or EM_REPOSITORY["body"] in em)
if show_help_boxes <= 0:
data["rt_portalbox"] = ""
return (c_body, data["navtrail_%s"%aas], data["lt_portalbox"], data["rt_portalbox"],
data["tp_portalbox"], data["te_portalbox"], data["last_updated"])
def get_datetime(var, format_string="%Y-%m-%d %H:%M:%S"):
"""Returns a date string according to the format string.
It can handle normal date strings and shifts with respect
to now."""
date = time.time()
shift_re = re.compile(r"([-+]?)(\d+)([dhms])")
factors = {"d":24*3600, "h":3600, "m":60, "s":1}
m = shift_re.match(var)
if m:
sign = m.groups()[0] == "-" and -1 or 1
factor = factors[m.groups()[2]]
value = float(m.groups()[1])
date = time.localtime(date + sign * factor * value)
date = strftime(format_string, date)
else:
date = time.strptime(var, format_string)
date = strftime(format_string, date)
return date
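# For illustration (assumed values), get_datetime() above accepts either a
# relative shift matched by shift_re, or an absolute date in format_string:
#   get_datetime("-7d")                  --> date string for 7 days before now
#   get_datetime("+3h")                  --> date string for 3 hours from now
#   get_datetime("2005-03-31 17:37:26")  --> same date, re-formatted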
def get_current_time_timestamp():
"""Return timestamp corresponding to the current time."""
return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
def compare_timestamps_with_tolerance(timestamp1,
timestamp2,
tolerance=0):
"""Compare two timestamps TIMESTAMP1 and TIMESTAMP2, of the form
'2005-03-31 17:37:26'. Optionally receives a TOLERANCE argument
(in seconds). Return -1 if TIMESTAMP1 is less than TIMESTAMP2
minus TOLERANCE, 0 if they are equal within TOLERANCE limit,
and 1 if TIMESTAMP1 is greater than TIMESTAMP2 plus TOLERANCE.
"""
# remove any trailing .00 in timestamps:
timestamp1 = re.sub(r'\.[0-9]+$', '', timestamp1)
timestamp2 = re.sub(r'\.[0-9]+$', '', timestamp2)
# first convert timestamps to Unix epoch seconds:
timestamp1_seconds = calendar.timegm(time.strptime(timestamp1, "%Y-%m-%d %H:%M:%S"))
timestamp2_seconds = calendar.timegm(time.strptime(timestamp2, "%Y-%m-%d %H:%M:%S"))
# now compare them:
if timestamp1_seconds < timestamp2_seconds - tolerance:
return -1
elif timestamp1_seconds > timestamp2_seconds + tolerance:
return 1
else:
return 0
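# Illustrative calls (assumed values) for the comparison above:
#   compare_timestamps_with_tolerance('2005-03-31 17:37:26',
#                                     '2005-03-31 17:37:30')               --> -1
#   compare_timestamps_with_tolerance('2005-03-31 17:37:26',
#                                     '2005-03-31 17:37:30', tolerance=5)  -->  0
# i.e. a 4-second difference counts as "equal" once it falls within the tolerance.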
def get_database_last_updated_timestamp():
"""Return last updated timestamp for collection-related and
record-related database tables.
"""
database_tables_timestamps = []
database_tables_timestamps.append(get_table_update_time('bibrec'))
database_tables_timestamps.append(get_table_update_time('bibfmt'))
try:
database_tables_timestamps.append(get_table_update_time('idxWORD%'))
except ValueError:
# There are no indexes in the database. That's OK.
pass
database_tables_timestamps.append(get_table_update_time('collection%'))
database_tables_timestamps.append(get_table_update_time('portalbox'))
database_tables_timestamps.append(get_table_update_time('field%'))
database_tables_timestamps.append(get_table_update_time('format%'))
database_tables_timestamps.append(get_table_update_time('rnkMETHODNAME'))
database_tables_timestamps.append(get_table_update_time('accROLE_accACTION_accARGUMENT', run_on_slave=True))
return max(database_tables_timestamps)
def get_cache_last_updated_timestamp():
"""Return last updated cache timestamp."""
try:
f = open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, "r")
except:
return "1970-01-01 00:00:00"
timestamp = f.read()
f.close()
return timestamp
def set_cache_last_updated_timestamp(timestamp):
"""Set last updated cache timestamp to TIMESTAMP."""
try:
with open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, "w") as f:
f.write(timestamp)
except:
# FIXME: do something here
pass
return timestamp
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key it checks it's meaning, eventually using the value.
Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False, if it doesn't
know that key.
eg:
if key in ['-n', '--number']:
self.options['number'] = value
return True
return False
"""
if key in ("-c", "--collection"):
task_set_option("collection", value)
elif key in ("-r", "--recursive"):
task_set_option("recursive", 1)
elif key in ("-f", "--force"):
task_set_option("force", 1)
elif key in ("-q", "--quick"):
task_set_option("quick", 1)
elif key in ("-p", "--part"):
task_set_option("part", int(value))
elif key in ("-l", "--language"):
languages = task_get_option("language", [])
languages += value.split(',')
for ln in languages:
if ln not in CFG_SITE_LANGS:
print 'ERROR: "%s" is not a recognized language code' % ln
return False
task_set_option("language", languages)
else:
return False
return True
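# For illustration, the options parsed above map to task invocations such as
# (collection name assumed):
#   webcoll -c "Articles" -r   # update "Articles" and all its descendants
#   webcoll -f -p 1            # force a run of the reclist cache part only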
def task_submit_check_options():
if task_has_option('collection'):
coll = get_collection(task_get_option("collection"))
if coll.id is None:
print 'ERROR: Collection "%s" does not exist' % coll.name
return False
return True
def task_run_core():
""" Reimplement to add the body of the task."""
##
## ------->--->time--->------>
## (-1) | ( 0) | ( 1)
## | | |
## [T.db] | [T.fc] | [T.db]
## | | |
## |<-tol|tol->|
##
## the above is the compare_timestamps_with_tolerance result "diagram"
## [T.db] stands for the database timestamp and [T.fc] for the file cache timestamp
## ( -1, 0, 1) stand for the returned value
## tol stands for the tolerance in seconds
##
## When a record has been added to or deleted from one of the collections, T.db becomes greater than T.fc,
## so the next webcoll run is a full run: it recalculates the reclists and nbrecs, and since it updates the
## collection db table it also updates T.db. T.fc is set to the moment the task started running, thus
## slightly before T.db (practically by the time distance between the start of the task and the last call of
## update_reclist). Therefore when webcoll runs again, even if no database changes have taken place in the
## meanwhile, it does a full run (because compare_timestamps_with_tolerance returns 0). This time though, if
## no database changes have taken place, T.db remains the same while T.fc is updated, so the next
## webcoll run will not be a full run
##
task_run_start_timestamp = get_current_time_timestamp()
colls = []
# decide whether we need to run or not, by comparing last updated timestamps:
write_message("Database timestamp is %s." % get_database_last_updated_timestamp(), verbose=3)
write_message("Collection cache timestamp is %s." % get_cache_last_updated_timestamp(), verbose=3)
if task_has_option("part"):
write_message("Running cache update part %s only." % task_get_option("part"), verbose=3)
if check_nbrecs_for_all_external_collections() or task_has_option("force") or \
compare_timestamps_with_tolerance(get_database_last_updated_timestamp(),
get_cache_last_updated_timestamp(),
CFG_CACHE_LAST_UPDATED_TIMESTAMP_TOLERANCE) >= 0:
## either forced update was requested or cache is not up to date, so recreate it:
# firstly, decide which collections to do:
if task_has_option("collection"):
coll = get_collection(task_get_option("collection"))
colls.append(coll)
if task_has_option("recursive"):
r_type_descendants = coll.get_descendants(type='r')
colls += r_type_descendants
v_type_descendants = coll.get_descendants(type='v')
colls += v_type_descendants
else:
res = run_sql("SELECT name FROM collection ORDER BY id")
for row in res:
colls.append(get_collection(row[0]))
# secondly, update collection reclist cache:
if task_get_option('part', 1) == 1:
i = 0
for coll in colls:
i += 1
write_message("%s / reclist cache update" % coll.name)
if str(coll.dbquery).startswith("hostedcollection:"):
coll.set_nbrecs_for_external_collection()
else:
coll.calculate_reclist()
coll.update_reclist()
task_update_progress("Part 1/2: done %d/%d" % (i, len(colls)))
task_sleep_now_if_required(can_stop_too=True)
# thirdly, update collection webpage cache:
if task_get_option("part", 2) == 2:
i = 0
for coll in colls:
i += 1
if coll.reclist_updated_since_start or task_has_option("collection") or task_get_option("force") or not task_get_option("quick"):
write_message("%s / webpage cache update" % coll.name)
for lang in CFG_SITE_LANGS:
coll.update_webpage_cache(lang)
webcoll_after_webpage_cache_update.send(coll.name, collection=coll, lang=lang)
else:
write_message("%s / webpage cache seems not to need an update and --quick was used" % coll.name, verbose=2)
task_update_progress("Part 2/2: done %d/%d" % (i, len(colls)))
task_sleep_now_if_required(can_stop_too=True)
# finally update the cache last updated timestamp:
# (but only when all collections were updated, not when only
# some of them were forced-updated as per admin's demand)
if not task_has_option("collection"):
set_cache_last_updated_timestamp(task_run_start_timestamp)
write_message("Collection cache timestamp is set to %s." % get_cache_last_updated_timestamp(), verbose=3)
else:
## cache up to date, we don't have to run
write_message("Collection cache is up to date, no need to run.")
## we are done:
return True
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/websearch/webinterface.py b/invenio/legacy/websearch/webinterface.py
index c246d898b..3d7f9abf9 100644
--- a/invenio/legacy/websearch/webinterface.py
+++ b/invenio/legacy/websearch/webinterface.py
@@ -1,1151 +1,1151 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSearch URL handler."""
__revision__ = "$Id$"
import cgi
import os
import datetime
import time
import sys
from urllib import quote
from invenio.utils import apache
import threading
#maximum number of collaborating authors etc shown in GUI
MAX_COLLAB_LIST = 10
MAX_KEYWORD_LIST = 10
MAX_VENUE_LIST = 10
#tag constants
AUTHOR_TAG = "100__a"
AUTHOR_INST_TAG = "100__u"
COAUTHOR_TAG = "700__a"
COAUTHOR_INST_TAG = "700__u"
VENUE_TAG = "909C4p"
KEYWORD_TAG = "695__a"
FKEYWORD_TAG = "6531_a"
CFG_INSPIRE_UNWANTED_KEYWORDS_START = ['talk',
'conference',
'conference proceedings',
'numerical calculations',
'experimental results',
'review',
'bibliography',
'upper limit',
'lower limit',
'tables',
'search for',
'on-shell',
'off-shell',
'formula',
'lectures',
'book',
'thesis']
CFG_INSPIRE_UNWANTED_KEYWORDS_MIDDLE = ['GeV',
'((']
if sys.hexversion < 0x2040000:
# pylint: disable=W0622
from sets import Set as set
# pylint: enable=W0622
from invenio.config import \
CFG_SITE_URL, \
CFG_SITE_NAME, \
CFG_CACHEDIR, \
CFG_SITE_LANG, \
CFG_SITE_SECURE_URL, \
CFG_BIBRANK_SHOW_DOWNLOAD_STATS, \
CFG_WEBSEARCH_INSTANT_BROWSE_RSS, \
CFG_WEBSEARCH_RSS_TTL, \
CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS, \
CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE, \
CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES, \
CFG_WEBDIR, \
CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS, \
CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS, \
CFG_WEBSEARCH_USE_ALEPH_SYSNOS, \
CFG_WEBSEARCH_RSS_I18N_COLLECTIONS, \
CFG_INSPIRE_SITE, \
CFG_WEBSEARCH_WILDCARD_LIMIT, \
CFG_SITE_RECORD
from invenio.legacy.dbquery import Error
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import redirect_to_url, make_canonical_urlargd, drop_default_urlargd
from invenio.utils.html import get_mathjax_header
from invenio.utils.html import nmtoken_from_string
from invenio.legacy.webuser import getUid, page_not_authorized, get_user_preferences, \
collect_user_info, logoutUser, isUserSuperAdmin
from invenio.legacy.webcomment.webinterface import WebInterfaceCommentsPages
from invenio.legacy.weblinkback.webinterface import WebInterfaceRecordLinkbacksPages
from invenio.legacy.bibcirculation.webinterface import WebInterfaceHoldingsPages
from invenio.legacy.webpage import page, pageheaderonly, create_error_box
from invenio.base.i18n import gettext_set_language
from invenio.legacy.search_engine import check_user_can_view_record, \
collection_reclist_cache, \
collection_restricted_p, \
create_similarly_named_authors_link_box, \
get_colID, \
get_coll_i18nname, \
get_most_popular_field_values, \
get_mysql_recid_from_aleph_sysno, \
guess_primary_collection_of_a_record, \
page_end, \
page_start, \
perform_request_cache, \
perform_request_log, \
perform_request_search, \
restricted_collection_cache, \
get_coll_normalised_name, \
EM_REPOSITORY
-from invenio.websearch_webcoll import perform_display_collection
+from invenio.legacy.websearch.webcoll import perform_display_collection
from invenio.legacy.bibrecord import get_fieldvalues, \
get_fieldvalues_alephseq_like
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import mail_cookie_create_authorize_action
from invenio.modules.formatter import format_records
from invenio.modules.formatter.engine import get_output_formats
-from invenio.websearch_webcoll import get_collection
+from invenio.legacy.websearch.webcoll import get_collection
from invenio.intbitset import intbitset
-from invenio.bibupload import find_record_from_sysno
+from invenio.legacy.bibupload.engine import find_record_from_sysno
from invenio.legacy.bibrank.citation_searcher import get_cited_by_list
from invenio.legacy.bibrank.downloads_indexer import get_download_weight_total
from invenio.legacy.search_engine.summarizer import summarize_records
from invenio.ext.logging import register_exception
from invenio.legacy.bibedit.webinterface import WebInterfaceEditPages
from invenio.bibeditmulti_webinterface import WebInterfaceMultiEditPages
from invenio.legacy.bibmerge.webinterface import WebInterfaceMergePages
from invenio.legacy.bibdocfile.webinterface import WebInterfaceManageDocFilesPages, WebInterfaceFilesPages
from invenio.legacy.search_engine import get_record
from invenio.utils.shell import mymkdir
import invenio.legacy.template
websearch_templates = invenio.legacy.template.load('websearch')
search_results_default_urlargd = websearch_templates.search_results_default_urlargd
search_interface_default_urlargd = websearch_templates.search_interface_default_urlargd
try:
output_formats = [output_format['attrs']['code'].lower() for output_format in \
get_output_formats(with_attributes=True).values()]
except KeyError:
output_formats = ['xd', 'xm', 'hd', 'hb', 'hs', 'hx']
output_formats.extend(['hm', 't', 'h'])
def wash_search_urlargd(form):
"""
Create canonical search arguments from those passed via web form.
"""
argd = wash_urlargd(form, search_results_default_urlargd)
if argd.has_key('as'):
argd['aas'] = argd['as']
del argd['as']
if argd.get('aas', CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE) not in CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES:
argd['aas'] = CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE
# Sometimes, users pass ot=245,700 instead of
# ot=245&ot=700. Normalize that.
ots = []
for ot in argd['ot']:
ots += ot.split(',')
argd['ot'] = ots
# We can either get the mode of function as
# action=<browse|search>, or by setting action_browse or
# action_search.
if argd['action_browse']:
argd['action'] = 'browse'
elif argd['action_search']:
argd['action'] = 'search'
else:
if argd['action'] not in ('browse', 'search'):
argd['action'] = 'search'
del argd['action_browse']
del argd['action_search']
if argd['em'] != "":
argd['em'] = argd['em'].split(",")
return argd
class WebInterfaceUnAPIPages(WebInterfaceDirectory):
""" Handle /unapi set of pages."""
_exports = ['']
def __call__(self, req, form):
argd = wash_urlargd(form, {
'id' : (int, 0),
'format' : (str, '')})
formats_dict = get_output_formats(True)
formats = {}
for format in formats_dict.values():
if format['attrs']['visibility']:
formats[format['attrs']['code'].lower()] = format['attrs']['content_type']
del formats_dict
if argd['id'] and argd['format']:
## Translate back common format names
format = {
'nlm' : 'xn',
'marcxml' : 'xm',
'dc' : 'xd',
'endnote' : 'xe',
'mods' : 'xo'
}.get(argd['format'], argd['format'])
if format in formats:
redirect_to_url(req, '%s/%s/%s/export/%s' % (CFG_SITE_URL, CFG_SITE_RECORD, argd['id'], format))
else:
raise apache.SERVER_RETURN, apache.HTTP_NOT_ACCEPTABLE
elif argd['id']:
return websearch_templates.tmpl_unapi(formats, identifier=argd['id'])
else:
return websearch_templates.tmpl_unapi(formats)
index = __call__
class WebInterfaceRecordPages(WebInterfaceDirectory):
""" Handling of a /CFG_SITE_RECORD/<recid> URL fragment """
_exports = ['', 'files', 'reviews', 'comments', 'usage',
'references', 'export', 'citations', 'holdings', 'edit',
'keywords', 'multiedit', 'merge', 'plots', 'linkbacks']
#_exports.extend(output_formats)
def __init__(self, recid, tab, format=None):
self.recid = recid
self.tab = tab
self.format = format
self.files = WebInterfaceFilesPages(self.recid)
self.reviews = WebInterfaceCommentsPages(self.recid, reviews=1)
self.comments = WebInterfaceCommentsPages(self.recid)
self.usage = self
self.references = self
self.keywords = self
self.holdings = WebInterfaceHoldingsPages(self.recid)
self.citations = self
self.plots = self
self.export = WebInterfaceRecordExport(self.recid, self.format)
self.edit = WebInterfaceEditPages(self.recid)
self.merge = WebInterfaceMergePages(self.recid)
self.linkbacks = WebInterfaceRecordLinkbacksPages(self.recid)
return
def __call__(self, req, form):
argd = wash_search_urlargd(form)
argd['recid'] = self.recid
argd['tab'] = self.tab
if self.format is not None:
argd['of'] = self.format
req.argd = argd
uid = getUid(req)
if uid == -1:
return page_not_authorized(req, "../",
text="You are not authorized to view this record.",
navmenuid='search')
elif uid > 0:
pref = get_user_preferences(uid)
try:
if not form.has_key('rg'):
# fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records'])
except (KeyError, ValueError):
pass
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS
# check if the user has the right to set a high wildcard limit;
# if not, replace the limit set by the user with the default one
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
if acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
# only superadmins can use verbose parameter for obtaining debug information
if not isUserSuperAdmin(user_info):
argd['verbose'] = 0
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
from invenio.legacy.search_engine import record_exists, get_merged_recid
# check if the current record has been deleted
# and merged, in which case the deleted record
# will be redirected to the new one
record_status = record_exists(argd['recid'])
merged_recid = get_merged_recid(argd['recid'])
if record_status == -1 and merged_recid:
url = CFG_SITE_URL + '/' + CFG_SITE_RECORD + '/%s?ln=%s'
url %= (str(merged_recid), argd['ln'])
redirect_to_url(req, url)
elif record_status == -1:
req.status = apache.HTTP_GONE ## The record is gone!
# mod_python does not like to return [] in case when of=id:
out = perform_request_search(req, **argd)
if isinstance(out, intbitset):
return out.fastdump()
elif out == []:
return str(out)
else:
return out
# Return the same page whether we ask for /CFG_SITE_RECORD/123 or /CFG_SITE_RECORD/123/
index = __call__
class WebInterfaceRecordRestrictedPages(WebInterfaceDirectory):
""" Handling of a /record-restricted/<recid> URL fragment """
_exports = ['', 'files', 'reviews', 'comments', 'usage',
'references', 'export', 'citations', 'holdings', 'edit',
'keywords', 'multiedit', 'merge', 'plots', 'linkbacks']
#_exports.extend(output_formats)
def __init__(self, recid, tab, format=None):
self.recid = recid
self.tab = tab
self.format = format
self.files = WebInterfaceFilesPages(self.recid)
self.reviews = WebInterfaceCommentsPages(self.recid, reviews=1)
self.comments = WebInterfaceCommentsPages(self.recid)
self.usage = self
self.references = self
self.keywords = self
self.holdings = WebInterfaceHoldingsPages(self.recid)
self.citations = self
self.plots = self
self.export = WebInterfaceRecordExport(self.recid, self.format)
self.edit = WebInterfaceEditPages(self.recid)
self.merge = WebInterfaceMergePages(self.recid)
self.linkbacks = WebInterfaceRecordLinkbacksPages(self.recid)
return
def __call__(self, req, form):
argd = wash_search_urlargd(form)
argd['recid'] = self.recid
if self.format is not None:
argd['of'] = self.format
req.argd = argd
uid = getUid(req)
user_info = collect_user_info(req)
if uid == -1:
return page_not_authorized(req, "../",
text="You are not authorized to view this record.",
navmenuid='search')
elif uid > 0:
pref = get_user_preferences(uid)
try:
if not form.has_key('rg'):
# fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records'])
except (KeyError, ValueError):
pass
if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS
# check if the user has the right to set a high wildcard limit;
# if not, replace the limit set by the user with the default one
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
if acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
# only superadmins can use verbose parameter for obtaining debug information
if not isUserSuperAdmin(user_info):
argd['verbose'] = 0
record_primary_collection = guess_primary_collection_of_a_record(self.recid)
if collection_restricted_p(record_primary_collection):
(auth_code, dummy) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=record_primary_collection)
if auth_code:
return page_not_authorized(req, "../",
text="You are not authorized to view this record.",
navmenuid='search')
# Keep all the arguments, they might be reused in the
# record page itself to derive other queries
req.argd = argd
# mod_python does not like to return [] in case when of=id:
out = perform_request_search(req, **argd)
if isinstance(out, intbitset):
return out.fastdump()
elif out == []:
return str(out)
else:
return out
# Return the same page whether we ask for /CFG_SITE_RECORD/123 or /CFG_SITE_RECORD/123/
index = __call__
class WebInterfaceSearchResultsPages(WebInterfaceDirectory):
""" Handling of the /search URL and its sub-pages. """
_exports = ['', 'authenticate', 'cache', 'log']
def __call__(self, req, form):
""" Perform a search. """
argd = wash_search_urlargd(form)
_ = gettext_set_language(argd['ln'])
if req.method == 'POST':
raise apache.SERVER_RETURN, apache.HTTP_METHOD_NOT_ALLOWED
uid = getUid(req)
user_info = collect_user_info(req)
if uid == -1:
return page_not_authorized(req, "../",
text=_("You are not authorized to view this area."),
navmenuid='search')
elif uid > 0:
pref = get_user_preferences(uid)
try:
if not form.has_key('rg'):
# fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records'])
except (KeyError, ValueError):
pass
if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS
involved_collections = set()
involved_collections.update(argd['c'])
involved_collections.add(argd['cc'])
if argd['id'] > 0:
argd['recid'] = argd['id']
if argd['idb'] > 0:
argd['recidb'] = argd['idb']
if argd['sysno']:
tmp_recid = find_record_from_sysno(argd['sysno'])
if tmp_recid:
argd['recid'] = tmp_recid
if argd['sysnb']:
tmp_recid = find_record_from_sysno(argd['sysnb'])
if tmp_recid:
argd['recidb'] = tmp_recid
if argd['recid'] > 0:
if argd['recidb'] > argd['recid']:
# Hack: check whether at least one record of the range
# belongs to a restricted collection, and then whether
# the user is not authorized for that collection.
recids = intbitset(xrange(argd['recid'], argd['recidb']))
restricted_collection_cache.recreate_cache_if_needed()
for collname in restricted_collection_cache.cache:
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=collname)
if auth_code and user_info['email'] == 'guest':
coll_recids = get_collection(collname).reclist
if coll_recids & recids:
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : collname})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
else:
involved_collections.add(guess_primary_collection_of_a_record(argd['recid']))
# If any of the collection requires authentication, redirect
# to the authentication form.
for coll in involved_collections:
if collection_restricted_p(coll):
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
# check if the user has the right to set a high wildcard limit;
# if not, replace the limit set by the user with the default one
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
auth_code, auth_message = acc_authorize_action(req, 'runbibedit')
if auth_code != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
# only superadmins can use verbose parameter for obtaining debug information
if not isUserSuperAdmin(user_info):
argd['verbose'] = 0
# Keep all the arguments, they might be reused in the
# search_engine itself to derive other queries
req.argd = argd
# mod_python does not like to return [] in case when of=id:
out = perform_request_search(req, **argd)
if isinstance(out, intbitset):
return out.fastdump()
elif out == []:
return str(out)
else:
return out
def cache(self, req, form):
"""Search cache page."""
argd = wash_urlargd(form, {'action': (str, 'show')})
return perform_request_cache(req, action=argd['action'])
def log(self, req, form):
"""Search log page."""
argd = wash_urlargd(form, {'date': (str, '')})
return perform_request_log(req, date=argd['date'])
def authenticate(self, req, form):
"""Restricted search results pages."""
argd = wash_search_urlargd(form)
user_info = collect_user_info(req)
for coll in argd['c'] + [argd['cc']]:
if collection_restricted_p(coll):
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
# Check whether the user has the right to set a high wildcard limit;
# if not, reduce the user-set limit to the default one.
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
auth_code, auth_message = acc_authorize_action(req, 'runbibedit')
if auth_code != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
# only superadmins can use verbose parameter for obtaining debug information
if not isUserSuperAdmin(user_info):
argd['verbose'] = 0
# Keep all the arguments; they might be reused in the
# search engine itself to derive other queries
req.argd = argd
uid = getUid(req)
if uid > 0:
pref = get_user_preferences(uid)
try:
if not form.has_key('rg'):
# fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records'])
except (KeyError, ValueError):
pass
# mod_python does not like returning [] in the case when of=id:
out = perform_request_search(req, **argd)
if isinstance(out, intbitset):
return out.fastdump()
elif out == []:
return str(out)
else:
return out
index = __call__
class WebInterfaceLegacySearchPages(WebInterfaceDirectory):
""" Handling of the /search.py URL and its sub-pages. """
_exports = ['', ('authenticate', 'index')]
def __call__(self, req, form):
""" Perform a search. """
argd = wash_search_urlargd(form)
# We either jump into the generic search form, or the specific
# /CFG_SITE_RECORD/... display if a recid is requested
if argd['recid'] != -1:
target = '/%s/%d' % (CFG_SITE_RECORD, argd['recid'])
del argd['recid']
else:
target = '/search'
target += make_canonical_urlargd(argd, search_results_default_urlargd)
return redirect_to_url(req, target, apache.HTTP_MOVED_PERMANENTLY)
index = __call__
# Parameters for the legacy URLs, of the form /?c=ALEPH
legacy_collection_default_urlargd = {
'as': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
'aas': (int, CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE),
'verbose': (int, 0),
'c': (str, CFG_SITE_NAME)}
class WebInterfaceSearchInterfacePages(WebInterfaceDirectory):
""" Handling of collection navigation."""
_exports = [('index.py', 'legacy_collection'),
('', 'legacy_collection'),
('search.py', 'legacy_search'),
'search', 'openurl',
'opensearchdescription', 'logout_SSO_hook']
search = WebInterfaceSearchResultsPages()
legacy_search = WebInterfaceLegacySearchPages()
def logout_SSO_hook(self, req, form):
"""Script triggered by the display of the centralized SSO logout
dialog. It logouts the user from Invenio and stream back the
expected picture."""
logoutUser(req)
req.content_type = 'image/gif'
req.encoding = None
req.filename = 'wsignout.gif'
req.headers_out["Content-Disposition"] = "inline; filename=wsignout.gif"
req.set_content_length(os.path.getsize('%s/img/wsignout.gif' % CFG_WEBDIR))
req.send_http_header()
req.sendfile('%s/img/wsignout.gif' % CFG_WEBDIR)
def _lookup(self, component, path):
""" This handler is invoked for the dynamic URLs (for
collections and records)"""
if component == 'collection':
c = '/'.join(path)
def answer(req, form):
"""Accessing collections cached pages."""
# Accessing collections: this is for accessing the
# cached page on top of each collection.
argd = wash_urlargd(form, search_interface_default_urlargd)
# We simply return the cached page of the collection
argd['c'] = c
if not argd['c']:
# collection argument not present; display
# home collection by default
argd['c'] = CFG_SITE_NAME
# Treat `as' argument specially:
if argd.has_key('as'):
argd['aas'] = argd['as']
del argd['as']
if argd.get('aas', CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE) not in CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES:
argd['aas'] = CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE
return display_collection(req, **argd)
return answer, []
elif component == CFG_SITE_RECORD and path and path[0] == 'merge':
return WebInterfaceMergePages(), path[1:]
elif component == CFG_SITE_RECORD and path and path[0] == 'edit':
return WebInterfaceEditPages(), path[1:]
elif component == CFG_SITE_RECORD and path and path[0] == 'multiedit':
return WebInterfaceMultiEditPages(), path[1:]
elif component == CFG_SITE_RECORD and path and path[0] in ('managedocfiles', 'managedocfilesasync'):
return WebInterfaceManageDocFilesPages(), path
elif component == CFG_SITE_RECORD or component == 'record-restricted':
try:
if CFG_WEBSEARCH_USE_ALEPH_SYSNOS:
# let us try to recognize /<CFG_SITE_RECORD>/<SYSNO> style of URLs:
# check for SYSNOs with an embedded slash; needed for [ARXIVINV-15]
if len(path) > 1 and get_mysql_recid_from_aleph_sysno(path[0] + "/" + path[1]):
path[0] = path[0] + "/" + path[1]
del path[1]
x = get_mysql_recid_from_aleph_sysno(path[0])
if x:
recid = x
else:
recid = int(path[0])
else:
recid = int(path[0])
except IndexError:
# display record #1 for URL /CFG_SITE_RECORD without a number
recid = 1
except ValueError:
if path[0] == '':
# display record #1 for URL /CFG_SITE_RECORD/ without a number
recid = 1
else:
# display page not found for URLs like /CFG_SITE_RECORD/foo
return None, []
from invenio.intbitset import __maxelem__
if recid <= 0 or recid > __maxelem__:
# __maxelem__ = 2147483647
# display page not found for URLs like /CFG_SITE_RECORD/-5 or /CFG_SITE_RECORD/0 or /CFG_SITE_RECORD/2147483649
return None, []
format = None
tab = ''
try:
if path[1] in ['', 'files', 'reviews', 'comments', 'usage',
'references', 'citations', 'holdings', 'edit',
'keywords', 'multiedit', 'merge', 'plots', 'linkbacks']:
tab = path[1]
elif path[1] == 'export':
tab = ''
format = path[2]
# format = None
# elif path[1] in output_formats:
# tab = ''
# format = path[1]
else:
# display page not found for URLs like /CFG_SITE_RECORD/references
# for a collection where the 'references' tab is not visible
return None, []
except IndexError:
# Keep the normal URL if no tab is specified
pass
#if component == 'record-restricted':
#return WebInterfaceRecordRestrictedPages(recid, tab, format), path[1:]
#else:
return WebInterfaceRecordPages(recid, tab, format), path[1:]
elif component == 'sslredirect':
## Fallback solution for sslredirect special path that should
## be rather implemented as an Apache level redirection
def redirecter(req, form):
real_url = "http://" + '/'.join(path)
redirect_to_url(req, real_url)
return redirecter, []
return None, []
def openurl(self, req, form):
""" OpenURL Handler."""
argd = wash_urlargd(form, websearch_templates.tmpl_openurl_accepted_args)
ret_url = websearch_templates.tmpl_openurl2invenio(argd)
if ret_url:
return redirect_to_url(req, ret_url)
else:
return redirect_to_url(req, CFG_SITE_URL)
def opensearchdescription(self, req, form):
"""OpenSearch description file"""
req.content_type = "application/opensearchdescription+xml"
req.send_http_header()
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG),
'verbose': (int, 0) })
return websearch_templates.tmpl_opensearch_description(ln=argd['ln'])
def legacy_collection(self, req, form):
"""Collection URL backward compatibility handling."""
accepted_args = dict(legacy_collection_default_urlargd)
argd = wash_urlargd(form, accepted_args)
# Treat `as' argument specially:
if argd.has_key('as'):
argd['aas'] = argd['as']
del argd['as']
if argd.get('aas', CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE) not in (0, 1):
argd['aas'] = CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE
# If we specify no collection, then we don't need to redirect
# the user, so that accessing <http://yoursite/> returns the
# default collection.
if not form.has_key('c'):
return display_collection(req, **argd)
# make the collection an element of the path, and keep the
# other query elements as is. If the collection is CFG_SITE_NAME,
# however, redirect to the main URL.
c = argd['c']
del argd['c']
if c == CFG_SITE_NAME:
target = '/'
else:
target = '/collection/' + quote(c)
# Treat `as' argument specially:
# We are going to redirect, so replace `aas' by `as' visible argument:
if argd.has_key('aas'):
argd['as'] = argd['aas']
del argd['aas']
target += make_canonical_urlargd(argd, legacy_collection_default_urlargd)
return redirect_to_url(req, target)
def display_collection(req, c, aas, verbose, ln, em=""):
"""Display search interface page for collection c by looking
in the collection cache."""
_ = gettext_set_language(ln)
req.argd = drop_default_urlargd({'aas': aas, 'verbose': verbose, 'ln': ln, 'em' : em},
search_interface_default_urlargd)
if em != "":
em = em.split(",")
# get user ID:
try:
uid = getUid(req)
user_preferences = {}
if uid == -1:
return page_not_authorized(req, "../",
text="You are not authorized to view this collection",
navmenuid='search')
elif uid > 0:
user_preferences = get_user_preferences(uid)
except Error:
register_exception(req=req, alert_admin=True)
return page(title=_("Internal Error"),
body=create_error_box(req, verbose=verbose, ln=ln),
description="%s - Internal Error" % CFG_SITE_NAME,
keywords="%s, Internal Error" % CFG_SITE_NAME,
language=ln,
req=req,
navmenuid='search')
# start display:
req.content_type = "text/html"
req.send_http_header()
# deduce collection id:
colID = get_colID(get_coll_normalised_name(c))
if type(colID) is not int:
page_body = '<p>' + (_("Sorry, collection %s does not seem to exist.") % ('<strong>' + str(c) + '</strong>')) + '</p>'
page_body += '<p>' + (_("You may want to start browsing from %s.") % ('<a href="' + CFG_SITE_URL + '?ln=' + ln + '">' + get_coll_i18nname(CFG_SITE_NAME, ln) + '</a>')) + '</p>'
if req.method == 'HEAD':
raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
return page(title=_("Collection %s Not Found") % cgi.escape(c),
body=page_body,
description=(CFG_SITE_NAME + ' - ' + _("Not found") + ': ' + cgi.escape(str(c))),
keywords="%s" % CFG_SITE_NAME,
uid=uid,
language=ln,
req=req,
navmenuid='search')
c_body, c_navtrail, c_portalbox_lt, c_portalbox_rt, c_portalbox_tp, c_portalbox_te, \
c_last_updated = perform_display_collection(colID, c, aas, ln, em,
user_preferences.get('websearch_helpbox', 1))
if em == "" or EM_REPOSITORY["body"] in em:
try:
title = get_coll_i18nname(c, ln)
except:
title = ""
else:
title = ""
show_title_p = True
body_css_classes = []
if c == CFG_SITE_NAME:
# Do not display title on home collection
show_title_p = False
body_css_classes.append('home')
if len(collection_reclist_cache.cache.keys()) == 1:
# if there is only one collection defined, do not print its
# title on the page as it would be displayed repetitively.
show_title_p = False
if aas == -1:
show_title_p = False
if CFG_INSPIRE_SITE == 1:
# INSPIRE should never show title, but instead use css to
# style collections
show_title_p = False
body_css_classes.append(nmtoken_from_string(c))
# RSS:
rssurl = CFG_SITE_URL + '/rss'
rssurl_params = []
if c != CFG_SITE_NAME:
rssurl_params.append('cc=' + quote(c))
if ln != CFG_SITE_LANG and \
c in CFG_WEBSEARCH_RSS_I18N_COLLECTIONS:
rssurl_params.append('ln=' + ln)
if rssurl_params:
rssurl += '?' + '&amp;'.join(rssurl_params)
if 'hb' in CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS:
metaheaderadd = get_mathjax_header(req.is_https())
else:
metaheaderadd = ''
return page(title=title,
body=c_body,
navtrail=c_navtrail,
description="%s - %s" % (CFG_SITE_NAME, c),
keywords="%s, %s" % (CFG_SITE_NAME, c),
metaheaderadd=metaheaderadd,
uid=uid,
language=ln,
req=req,
cdspageboxlefttopadd=c_portalbox_lt,
cdspageboxrighttopadd=c_portalbox_rt,
titleprologue=c_portalbox_tp,
titleepilogue=c_portalbox_te,
lastupdated=c_last_updated,
navmenuid='search',
rssurl=rssurl,
body_css_classes=body_css_classes,
show_title_p=show_title_p,
show_header=em == "" or EM_REPOSITORY["header"] in em,
show_footer=em == "" or EM_REPOSITORY["footer"] in em)
class WebInterfaceRSSFeedServicePages(WebInterfaceDirectory):
"""RSS 2.0 feed service pages."""
def __call__(self, req, form):
"""RSS 2.0 feed service."""
# Keep only interesting parameters for the search
default_params = websearch_templates.rss_default_urlargd
# We need to keep 'jrec' and 'rg' here in order to have
# 'multi-page' RSS. These parameters are not kept by default,
# as we don't want to consider them when building RSS links
# from search and browse pages.
default_params.update({'jrec':(int, 1),
'rg': (int, CFG_WEBSEARCH_INSTANT_BROWSE_RSS)})
argd = wash_urlargd(form, default_params)
user_info = collect_user_info(req)
for coll in argd['c'] + [argd['cc']]:
if collection_restricted_p(coll):
(auth_code, auth_msg) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=coll)
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : coll})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
# Create a standard filename with these parameters
current_url = websearch_templates.build_rss_url(argd)
cache_filename = current_url.split('/')[-1]
# In the same way as previously, add 'jrec' & 'rg'
req.content_type = "application/rss+xml"
req.send_http_header()
try:
# Try to read from cache
path = "%s/rss/%s.xml" % (CFG_CACHEDIR, cache_filename)
# Check if cache needs refresh
filedesc = open(path, "r")
last_update_time = datetime.datetime.fromtimestamp(os.stat(os.path.abspath(path)).st_mtime)
assert(datetime.datetime.now() < last_update_time + datetime.timedelta(minutes=CFG_WEBSEARCH_RSS_TTL))
c_rss = filedesc.read()
filedesc.close()
req.write(c_rss)
return
except Exception, e:
# do it live and cache
previous_url = None
if argd['jrec'] > 1:
prev_jrec = argd['jrec'] - argd['rg']
if prev_jrec < 1:
prev_jrec = 1
previous_url = websearch_templates.build_rss_url(argd,
jrec=prev_jrec)
# Check whether the user has the right to set a high wildcard limit;
# if not, reduce the user-set limit to the default one.
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
if acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
req.argd = argd
recIDs = perform_request_search(req, of="id",
c=argd['c'], cc=argd['cc'],
p=argd['p'], f=argd['f'],
p1=argd['p1'], f1=argd['f1'],
m1=argd['m1'], op1=argd['op1'],
p2=argd['p2'], f2=argd['f2'],
m2=argd['m2'], op2=argd['op2'],
p3=argd['p3'], f3=argd['f3'],
m3=argd['m3'], wl=argd['wl'])
nb_found = len(recIDs)
next_url = None
if len(recIDs) >= argd['jrec'] + argd['rg']:
next_url = websearch_templates.build_rss_url(argd,
jrec=(argd['jrec'] + argd['rg']))
first_url = websearch_templates.build_rss_url(argd, jrec=1)
last_url = websearch_templates.build_rss_url(argd, jrec=nb_found - argd['rg'] + 1)
recIDs = recIDs[-argd['jrec']:(-argd['rg'] - argd['jrec']):-1]
rss_prologue = '<?xml version="1.0" encoding="UTF-8"?>\n' + \
websearch_templates.tmpl_xml_rss_prologue(current_url=current_url,
previous_url=previous_url,
next_url=next_url,
first_url=first_url, last_url=last_url,
nb_found=nb_found,
jrec=argd['jrec'], rg=argd['rg'],
cc=argd['cc']) + '\n'
req.write(rss_prologue)
rss_body = format_records(recIDs,
of='xr',
ln=argd['ln'],
user_info=user_info,
record_separator="\n",
req=req, epilogue="\n")
rss_epilogue = websearch_templates.tmpl_xml_rss_epilogue() + '\n'
req.write(rss_epilogue)
# update cache
dirname = "%s/rss" % (CFG_CACHEDIR)
mymkdir(dirname)
fullfilename = "%s/rss/%s.xml" % (CFG_CACHEDIR, cache_filename)
try:
# Remove the file just in case it already existed
# so that a bit of space is created
os.remove(fullfilename)
except OSError:
pass
# Check if there's enough space to cache the request.
if len(os.listdir(dirname)) < CFG_WEBSEARCH_RSS_MAX_CACHED_REQUESTS:
try:
os.umask(022)
f = open(fullfilename, "w")
f.write(rss_prologue + rss_body + rss_epilogue)
f.close()
except IOError, v:
if v[0] == 36:
# Cache file name (derived from the URL) was too long; never mind, don't cache
pass
else:
raise
index = __call__
class WebInterfaceRecordExport(WebInterfaceDirectory):
""" Handling of a /<CFG_SITE_RECORD>/<recid>/export/<format> URL fragment """
_exports = output_formats
def __init__(self, recid, format=None):
self.recid = recid
self.format = format
for output_format in output_formats:
self.__dict__[output_format] = self
return
def __call__(self, req, form):
argd = wash_search_urlargd(form)
argd['recid'] = self.recid
if self.format is not None:
argd['of'] = self.format
req.argd = argd
uid = getUid(req)
if uid == -1:
return page_not_authorized(req, "../",
text="You are not authorized to view this record.",
navmenuid='search')
elif uid > 0:
pref = get_user_preferences(uid)
try:
if not form.has_key('rg'):
# fetch user rg preference only if not overridden via URL
argd['rg'] = int(pref['websearch_group_records'])
except (KeyError, ValueError):
pass
# Check if the record belongs to a restricted primary
# collection. If yes, redirect to the authenticated URL.
user_info = collect_user_info(req)
(auth_code, auth_msg) = check_user_can_view_record(user_info, self.recid)
if argd['rg'] > CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS and acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['rg'] = CFG_WEBSEARCH_MAX_RECORDS_IN_GROUPS
#check if the user has rights to set a high wildcard limit
#if not, reduce the limit set by user, with the default one
if CFG_WEBSEARCH_WILDCARD_LIMIT > 0 and (argd['wl'] > CFG_WEBSEARCH_WILDCARD_LIMIT or argd['wl'] == 0):
if acc_authorize_action(req, 'runbibedit')[0] != 0:
argd['wl'] = CFG_WEBSEARCH_WILDCARD_LIMIT
# only superadmins can use verbose parameter for obtaining debug information
if not isUserSuperAdmin(user_info):
argd['verbose'] = 0
if auth_code and user_info['email'] == 'guest':
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {'collection' : guess_primary_collection_of_a_record(self.recid)})
target = CFG_SITE_SECURE_URL + '/youraccount/login' + \
make_canonical_urlargd({'action': cookie, 'ln' : argd['ln'], 'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri}, {})
return redirect_to_url(req, target, norobot=True)
elif auth_code:
return page_not_authorized(req, "../", \
text=auth_msg, \
navmenuid='search')
# mod_python does not like returning [] in the case when of=id:
out = perform_request_search(req, **argd)
if isinstance(out, intbitset):
return out.fastdump()
elif out == []:
return str(out)
else:
return out
# Return the same page whether we ask for /CFG_SITE_RECORD/123/export/xm or /CFG_SITE_RECORD/123/export/xm/
index = __call__
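The WebInterface classes above use mod_python publisher-style traversal: a `_lookup(component, path)` hook consumes the first URL component and returns a `(handler, remaining_path)` pair, or `(None, [])` for an unknown path. A toy dispatcher in the same spirit (Python 3; the `lookup` function and the `routes` dict are illustrative, not Invenio's actual API):

```python
def lookup(routes, path):
    """Resolve the first component of `path` against `routes`;
    return (handler, remaining_path), or (None, []) if unknown --
    mirroring the (handler, path[1:]) returns in _lookup above."""
    if not path:
        return None, []
    handler = routes.get(path[0])
    if handler is None:
        return None, []
    return handler, path[1:]

# e.g. /collection/ALEPH -> collection handler with ["ALEPH"] left over
routes = {"search": "search-handler", "collection": "collection-handler"}
```

The leftover path segments are what let handlers like WebInterfaceRecordExport receive the record id and output format without the dispatcher knowing about them.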
diff --git a/invenio/legacy/websession/inveniogc.py b/invenio/legacy/websession/inveniogc.py
index 348e839f5..3273ab158 100644
--- a/invenio/legacy/websession/inveniogc.py
+++ b/invenio/legacy/websession/inveniogc.py
@@ -1,633 +1,633 @@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio garbage collector.
"""
__revision__ = "$Id$"
import sys
import datetime
import time
import os
try:
from invenio.legacy.dbquery import run_sql, wash_table_column_name
from invenio.config import CFG_LOGDIR, CFG_TMPDIR, CFG_CACHEDIR, \
CFG_TMPSHAREDDIR, CFG_WEBSEARCH_RSS_TTL, CFG_PREFIX, \
CFG_WEBSESSION_NOT_CONFIRMED_EMAIL_ADDRESS_EXPIRE_IN_DAYS
- from invenio.bibtask import task_init, task_set_option, task_get_option, \
+ from invenio.legacy.bibsched.bibtask import task_init, task_set_option, task_get_option, \
write_message, write_messages
from invenio.modules.access.mailcookie import mail_cookie_gc
- from invenio.bibdocfile import BibDoc
- from invenio.bibsched import gc_tasks
+ from invenio.legacy.bibdocfile.api import BibDoc
+ from invenio.legacy.bibsched.scripts.bibsched import gc_tasks
from invenio.legacy.websubmit.config import CFG_WEBSUBMIT_TMP_VIDEO_PREFIX
from invenio.utils.date import convert_datestruct_to_datetext
except ImportError, e:
print "Error: %s" % (e,)
sys.exit(1)
# configure variables
CFG_MYSQL_ARGUMENTLIST_SIZE = 100
# After how many days to remove obsolete log/err files
CFG_MAX_ATIME_RM_LOG = 28
# After how many days to zip obsolete log/err files
CFG_MAX_ATIME_ZIP_LOG = 7
# After how many days to remove obsolete bibreformat fmt xml files
CFG_MAX_ATIME_RM_FMT = 28
# After how many days to zip obsolete bibreformat fmt xml files
CFG_MAX_ATIME_ZIP_FMT = 7
# After how many days to remove obsolete oaiharvest fmt xml files
CFG_MAX_ATIME_RM_OAI = 14
# After how many days to zip obsolete oaiharvest fmt xml files
CFG_MAX_ATIME_ZIP_OAI = 3
# After how many days to remove deleted bibdocs
CFG_DELETED_BIBDOC_MAXLIFE = 365 * 10
# After how many days to remove old cached webjournal files
CFG_WEBJOURNAL_TTL = 7
# After how many days to zip obsolete bibsword xml log files
CFG_MAX_ATIME_ZIP_BIBSWORD = 7
# After how many days to remove obsolete bibsword xml log files
CFG_MAX_ATIME_RM_BIBSWORD = 28
# After how many days to remove temporary video uploads
CFG_MAX_ATIME_WEBSUBMIT_TMP_VIDEO = 3
# After how many days to remove obsolete refextract xml output files
CFG_MAX_ATIME_RM_REFEXTRACT = 28
# After how many days to remove obsolete bibdocfiles temporary files
CFG_MAX_ATIME_RM_BIBDOC = 4
# After how many days to remove obsolete WebSubmit-created temporary
# icon files
CFG_MAX_ATIME_RM_ICON = 7
# After how many days to remove obsolete WebSubmit-created temporary
# stamp files
CFG_MAX_ATIME_RM_STAMP = 7
# After how many days to remove obsolete WebJournal-created update XML
CFG_MAX_ATIME_RM_WEBJOURNAL_XML = 7
# After how many days to remove obsolete temporary files attached with
# the CKEditor in WebSubmit context
CFG_MAX_ATIME_RM_WEBSUBMIT_CKEDITOR_FILE = 28
# After how many days to remove obsolete temporary files related to BibEdit
# cache
CFG_MAX_ATIME_BIBEDIT_TMP = 3
def gc_exec_command(command):
""" Exec the command logging in appropriate way its output."""
write_message(' %s' % command, verbose=9)
(dummy, output, errors) = os.popen3(command)
write_messages(errors.read())
write_messages(output.read())
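`gc_exec_command` shells out via `os.popen3`, which has been deprecated since Python 2.6. A hedged modern equivalent using `subprocess` (Python 3 sketch; the `log` parameter stands in for the `write_message`/`write_messages` calls above):

```python
import subprocess

def gc_exec_command(command, log=print):
    """Run a shell command and log its stderr then stdout,
    in the same order as the os.popen3-based helper above."""
    proc = subprocess.run(command, shell=True,
                          capture_output=True, text=True)
    log(proc.stderr)
    log(proc.stdout)
    return proc.returncode
```

Unlike `os.popen3`, `subprocess.run` also exposes the exit status, so callers can notice a failed `find`/`rm` instead of silently logging its stderr.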
def clean_logs():
""" Clean the logs from obsolete files. """
write_message("""CLEANING OF LOG FILES STARTED""")
write_message("- deleting/gzipping bibsched empty/old err/log "
"BibSched files")
vstr = task_get_option('verbose') > 1 and '-v' or ''
gc_exec_command('find %s -name "bibsched_task_*"'
' -size 0c -exec rm %s -f {} \;' \
% (CFG_LOGDIR, vstr))
gc_exec_command('find %s -name "bibsched_task_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_LOGDIR, CFG_MAX_ATIME_RM_LOG, vstr))
gc_exec_command('find %s -name "bibsched_task_*"'
' -atime +%s -exec gzip %s -9 {} \;' \
% (CFG_LOGDIR, CFG_MAX_ATIME_ZIP_LOG, vstr))
write_message("""CLEANING OF LOG FILES FINISHED""")
def clean_tempfiles():
""" Clean old temporary files. """
write_message("""CLEANING OF TMP FILES STARTED""")
write_message("- deleting/gzipping temporary empty/old "
"BibReformat xml files")
vstr = task_get_option('verbose') > 1 and '-v' or ''
gc_exec_command('find %s %s -name "rec_fmt_*"'
' -size 0c -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, vstr))
gc_exec_command('find %s %s -name "rec_fmt_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_FMT, vstr))
gc_exec_command('find %s %s -name "rec_fmt_*"'
' -atime +%s -exec gzip %s -9 {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_ZIP_FMT, vstr))
write_message("- deleting/gzipping temporary old "
"OAIHarvest xml files")
gc_exec_command('find %s %s -name "oaiharvestadmin.*"'
' -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, vstr))
gc_exec_command('find %s %s -name "bibconvertrun.*"'
' -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, vstr))
# Using mtime and -r here to include directories.
gc_exec_command('find %s %s -name "oaiharvest*"'
' -mtime +%s -exec gzip %s -9 {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_ZIP_OAI, vstr))
gc_exec_command('find %s %s -name "oaiharvest*"'
' -mtime +%s -exec rm %s -rf {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_OAI, vstr))
gc_exec_command('find %s %s -name "oai_archive*"'
' -mtime +%s -exec rm %s -rf {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_OAI, vstr))
write_message("- deleting/gzipping temporary old "
"BibSword files")
gc_exec_command('find %s %s -name "bibsword_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_BIBSWORD, vstr))
gc_exec_command('find %s %s -name "bibsword_*"'
' -atime +%s -exec gzip %s -9 {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_ZIP_BIBSWORD, vstr))
# DELETE ALL FILES CREATED DURING VIDEO SUBMISSION
write_message("- deleting old video submissions")
gc_exec_command('find %s -name "%s*" -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPSHAREDDIR, CFG_WEBSUBMIT_TMP_VIDEO_PREFIX,
CFG_MAX_ATIME_WEBSUBMIT_TMP_VIDEO, vstr))
write_message("- deleting temporary old "
"RefExtract files")
gc_exec_command('find %s %s -name "refextract*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR,
CFG_MAX_ATIME_RM_REFEXTRACT, vstr))
write_message("- deleting temporary old bibdocfiles")
gc_exec_command('find %s %s -name "bibdocfile_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_BIBDOC, vstr))
write_message("- deleting old temporary WebSubmit icons")
gc_exec_command('find %s %s -name "websubmit_icon_creator_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_ICON, vstr))
write_message("- deleting old temporary WebSubmit stamps")
gc_exec_command('find %s %s -name "websubmit_file_stamper_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_STAMP, vstr))
write_message("- deleting old temporary WebJournal XML files")
gc_exec_command('find %s %s -name "webjournal_publish_*"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPDIR, CFG_TMPSHAREDDIR, \
CFG_MAX_ATIME_RM_WEBJOURNAL_XML, vstr))
write_message("- deleting old temporary files attached with CKEditor")
gc_exec_command('find %s/var/tmp/attachfile/ '
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_PREFIX, CFG_MAX_ATIME_RM_WEBSUBMIT_CKEDITOR_FILE,
vstr))
write_message("- deleting old temporary files attached with BibEdit")
gc_exec_command('find %s -name "bibedit*.tmp"'
' -atime +%s -exec rm %s -f {} \;' \
% (CFG_TMPSHAREDDIR + '/bibedit-cache/', CFG_MAX_ATIME_BIBEDIT_TMP,
vstr))
write_message("""CLEANING OF TMP FILES FINISHED""")
def clean_cache():
"""Clean the cache for expired and old files."""
write_message("""CLEANING OF OLD CACHED RSS REQUEST STARTED""")
rss_cache_dir = "%s/rss/" % CFG_CACHEDIR
try:
filenames = os.listdir(rss_cache_dir)
except OSError:
filenames = []
count = 0
for filename in filenames:
filename = os.path.join(rss_cache_dir, filename)
last_update_time = datetime.datetime.fromtimestamp(os.stat(os.path.abspath(filename)).st_mtime)
if not (datetime.datetime.now() < last_update_time + datetime.timedelta(minutes=CFG_WEBSEARCH_RSS_TTL)):
try:
os.remove(filename)
count += 1
except OSError, e:
write_message("Error: %s" % e)
write_message("""%s rss cache file pruned out of %s.""" % (count, len(filenames)))
write_message("""CLEANING OF OLD CACHED RSS REQUEST FINISHED""")
write_message("""CLEANING OF OLD CACHED WEBJOURNAL FILES STARTED""")
webjournal_cache_dir = "%s/webjournal/" % CFG_CACHEDIR
filenames = []
try:
for root, dummy, files in os.walk(webjournal_cache_dir):
filenames.extend(os.path.join(root, filename) for filename in files)
except OSError:
pass
count = 0
for filename in filenames:
filename = os.path.join(webjournal_cache_dir, filename)
last_update_time = datetime.datetime.fromtimestamp(os.stat(os.path.abspath(filename)).st_mtime)
if not (datetime.datetime.now() < last_update_time + datetime.timedelta(days=CFG_WEBJOURNAL_TTL)):
try:
os.remove(filename)
count += 1
except OSError, e:
write_message("Error: %s" % e)
write_message("""%s webjournal cache file pruned out of %s.""" % (count, len(filenames)))
write_message("""CLEANING OF OLD CACHED WEBJOURNAL FILES FINISHED""")
def clean_bibxxx():
"""
Clean unreferenced bibliographic values from bibXXx tables.
This is useful to prettify browse results, as it removes
old, no longer used values.
WARNING: this function must be run only when no bibupload is
running and/or sleeping.
"""
write_message("""CLEANING OF UNREFERENCED bibXXx VALUES STARTED""")
for xx in range(0, 100):
bibxxx = 'bib%02dx' % xx
bibrec_bibxxx = 'bibrec_bib%02dx' % xx
if task_get_option('verbose') >= 9:
num_unref_values = run_sql("""SELECT COUNT(*) FROM %(bibxxx)s
LEFT JOIN %(bibrec_bibxxx)s
ON %(bibxxx)s.id=%(bibrec_bibxxx)s.id_bibxxx
WHERE %(bibrec_bibxxx)s.id_bibrec IS NULL""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx, })[0][0]
run_sql("""DELETE %(bibxxx)s FROM %(bibxxx)s
LEFT JOIN %(bibrec_bibxxx)s
ON %(bibxxx)s.id=%(bibrec_bibxxx)s.id_bibxxx
WHERE %(bibrec_bibxxx)s.id_bibrec IS NULL""" % \
{'bibxxx': bibxxx,
'bibrec_bibxxx': bibrec_bibxxx, })
if task_get_option('verbose') >= 9:
write_message(""" - %d unreferenced %s values cleaned""" % \
(num_unref_values, bibxxx))
write_message("""CLEANING OF UNREFERENCED bibXXx VALUES FINISHED""")
def clean_documents():
"""Delete all the bibdocs that have been set as deleted and have not been
modified since CFG_DELETED_BIBDOC_MAXLIFE days. Returns the number of
bibdocs involved."""
write_message("""CLEANING OF OBSOLETED DELETED DOCUMENTS STARTED""")
write_message("select id from bibdoc where status='DELETED' and NOW()>ADDTIME(modification_date, '%s 0:0:0')" % CFG_DELETED_BIBDOC_MAXLIFE, verbose=9)
records = run_sql("select id from bibdoc where status='DELETED' and NOW()>ADDTIME(modification_date, '%s 0:0:0')", (CFG_DELETED_BIBDOC_MAXLIFE,))
for record in records:
bibdoc = BibDoc.create_instance(record[0])
bibdoc.expunge()
write_message("DELETE FROM bibdoc WHERE id=%i" % int(record[0]), verbose=9)
run_sql("DELETE FROM bibdoc WHERE id=%s", (record[0],))
write_message("""%s obsoleted deleted documents cleaned""" % len(records))
write_message("""CLEANING OF OBSOLETED DELETED DOCUMENTS FINISHED""")
return len(records)
def check_tables():
"""
Check all DB tables. Useful to run from time to time when the
site is idle, say once a month during a weekend night.
FIXME: should produce useful output about outcome.
"""
res = run_sql("SHOW TABLES")
for row in res:
table_name = row[0]
write_message("checking table %s" % table_name)
run_sql("CHECK TABLE %s" % wash_table_column_name(table_name)) # kwalitee: disable=sql
def optimise_tables():
"""
Optimise all DB tables to defragment them in order to increase DB
performance. Useful to run from time to time when the site is
idle, say once a month during a weekend night.
FIXME: should produce useful output about outcome.
"""
res = run_sql("SHOW TABLES")
for row in res:
table_name = row[0]
write_message("optimising table %s" % table_name)
run_sql("OPTIMIZE TABLE %s" % wash_table_column_name(table_name)) # kwalitee: disable=sql
def guest_user_garbage_collector():
"""Session Garbage Collector
program flow/tasks:
1: delete expired sessions
1b: delete guest users without session
2: delete queries not attached to any user
3: delete baskets not attached to any user
4: delete alerts not attached to any user
5: delete expired mailcookies
5b: delete expired unconfirmed email addresses
6: delete expired role memberships
verbose - level of program output.
0 - nothing
1 - default
9 - max, debug"""
# dictionary used to keep track of number of deleted entries
delcount = {'session': 0,
'user': 0,
'user_query': 0,
'query': 0,
'bskBASKET': 0,
'user_bskBASKET': 0,
'bskREC': 0,
'bskRECORDCOMMENT': 0,
'bskEXTREC': 0,
'bskEXTFMT': 0,
'user_query_basket': 0,
'mail_cookie': 0,
'email_addresses': 0,
'role_membership' : 0}
write_message("CLEANING OF GUEST SESSIONS STARTED")
# 1 - DELETE EXPIRED SESSIONS
write_message("- deleting expired sessions")
timelimit = convert_datestruct_to_datetext(time.gmtime())
write_message(" DELETE FROM session WHERE"
" session_expiry < %s \n" % (timelimit,), verbose=9)
delcount['session'] += run_sql("DELETE FROM session WHERE"
" session_expiry < %s", (timelimit,))
# 1b - DELETE GUEST USERS WITHOUT SESSION
write_message("- deleting guest users without session")
# get uids
write_message(""" SELECT u.id\n FROM user AS u LEFT JOIN session AS s\n ON u.id = s.uid\n WHERE s.uid IS NULL AND u.email = ''""", verbose=9)
result = run_sql("""SELECT u.id
FROM user AS u LEFT JOIN session AS s
ON u.id = s.uid
WHERE s.uid IS NULL AND u.email = ''""")
write_message(result, verbose=9)
if result:
# work on slices of result list in case of big result
for i in range(0, len(result), CFG_MYSQL_ARGUMENTLIST_SIZE):
# create a comma-separated string of uids
uidstr = ','.join(str(id_user) for (id_user,) in
result[i:i + CFG_MYSQL_ARGUMENTLIST_SIZE])
# delete users
write_message(" DELETE FROM user WHERE"
" id IN (TRAVERSE LAST RESULT) AND email = '' \n", verbose=9)
delcount['user'] += run_sql("DELETE FROM user WHERE"
" id IN (%s) AND email = ''" % (uidstr,))
# 2 - DELETE QUERIES NOT ATTACHED TO ANY USER
# first step, delete from user_query
write_message("- deleting user_queries referencing"
" non-existent users")
# find user_queries referencing non-existent users
write_message(" SELECT DISTINCT uq.id_user\n"
" FROM user_query AS uq LEFT JOIN user AS u\n"
" ON uq.id_user = u.id\n WHERE u.id IS NULL", verbose=9)
result = run_sql("""SELECT DISTINCT uq.id_user
FROM user_query AS uq LEFT JOIN user AS u
ON uq.id_user = u.id
WHERE u.id IS NULL""")
write_message(result, verbose=9)
# delete in user_query one by one
write_message(" DELETE FROM user_query WHERE"
" id_user = 'TRAVERSE LAST RESULT' \n", verbose=9)
for (id_user,) in result:
delcount['user_query'] += run_sql("""DELETE FROM user_query
WHERE id_user = %s""", (id_user,))
# delete the actual queries
write_message("- deleting queries not attached to any user")
# select queries that must be deleted
write_message(""" SELECT DISTINCT q.id\n FROM query AS q LEFT JOIN user_query AS uq\n ON uq.id_query = q.id\n WHERE uq.id_query IS NULL AND\n q.type <> 'p' """, verbose=9)
result = run_sql("""SELECT DISTINCT q.id
FROM query AS q LEFT JOIN user_query AS uq
ON uq.id_query = q.id
WHERE uq.id_query IS NULL AND
q.type <> 'p'""")
write_message(result, verbose=9)
# delete queries one by one
write_message(""" DELETE FROM query WHERE id = 'TRAVERSE LAST RESULT' \n""", verbose=9)
for (id_query,) in result:
delcount['query'] += run_sql("""DELETE FROM query WHERE id = %s""", (id_query,))
# 3 - DELETE BASKETS NOT OWNED BY ANY USER
write_message("- deleting baskets not owned by any user")
# select basket ids
write_message(""" SELECT ub.id_bskBASKET\n FROM user_bskBASKET AS ub LEFT JOIN user AS u\n ON u.id = ub.id_user\n WHERE u.id IS NULL""", verbose=9)
try:
result = run_sql("""SELECT ub.id_bskBASKET
FROM user_bskBASKET AS ub LEFT JOIN user AS u
ON u.id = ub.id_user
WHERE u.id IS NULL""")
except Exception:
result = []
write_message(result, verbose=9)
# delete from user_basket and basket one by one
write_message(""" DELETE FROM user_bskBASKET WHERE id_bskBASKET = 'TRAVERSE LAST RESULT' """, verbose=9)
write_message(""" DELETE FROM bskBASKET WHERE id = 'TRAVERSE LAST RESULT' """, verbose=9)
write_message(""" DELETE FROM bskREC WHERE id_bskBASKET = 'TRAVERSE LAST RESULT'""", verbose=9)
write_message(""" DELETE FROM bskRECORDCOMMENT WHERE id_bskBASKET = 'TRAVERSE LAST RESULT' \n""", verbose=9)
for (id_basket,) in result:
delcount['user_bskBASKET'] += run_sql("""DELETE FROM user_bskBASKET WHERE id_bskBASKET = %s""", (id_basket,))
delcount['bskBASKET'] += run_sql("""DELETE FROM bskBASKET WHERE id = %s""", (id_basket,))
delcount['bskREC'] += run_sql("""DELETE FROM bskREC WHERE id_bskBASKET = %s""", (id_basket,))
delcount['bskRECORDCOMMENT'] += run_sql("""DELETE FROM bskRECORDCOMMENT WHERE id_bskBASKET = %s""", (id_basket,))
write_message(""" SELECT DISTINCT ext.id FROM bskEXTREC AS ext \nLEFT JOIN bskREC AS rec ON ext.id=-rec.id_bibrec_or_bskEXTREC \nWHERE id_bibrec_or_bskEXTREC is NULL""", verbose=9)
try:
result = run_sql("""SELECT DISTINCT ext.id FROM bskEXTREC AS ext
LEFT JOIN bskREC AS rec ON ext.id=-rec.id_bibrec_or_bskEXTREC
WHERE id_bibrec_or_bskEXTREC is NULL""")
except Exception:
result = []
write_message(result, verbose=9)
write_message(""" DELETE FROM bskEXTREC WHERE id = 'TRAVERSE LAST RESULT' """, verbose=9)
write_message(""" DELETE FROM bskEXTFMT WHERE id_bskEXTREC = 'TRAVERSE LAST RESULT' \n""", verbose=9)
for (id_basket,) in result:
delcount['bskEXTREC'] += run_sql("""DELETE FROM bskEXTREC WHERE id=%s""", (id_basket,))
delcount['bskEXTFMT'] += run_sql("""DELETE FROM bskEXTFMT WHERE id_bskEXTREC=%s""", (id_basket,))
# 4 - DELETE ALERTS NOT OWNED BY ANY USER
write_message('- deleting alerts not owned by any user')
# select user ids in uqb that reference non-existent users
write_message("""SELECT DISTINCT uqb.id_user FROM user_query_basket AS uqb LEFT JOIN user AS u ON uqb.id_user = u.id WHERE u.id IS NULL""", verbose=9)
result = run_sql("""SELECT DISTINCT uqb.id_user FROM user_query_basket AS uqb LEFT JOIN user AS u ON uqb.id_user = u.id WHERE u.id IS NULL""")
write_message(result, verbose=9)
# delete all these entries
for (id_user,) in result:
write_message("""DELETE FROM user_query_basket WHERE id_user = 'TRAVERSE LAST RESULT' """, verbose=9)
delcount['user_query_basket'] += run_sql("""DELETE FROM user_query_basket WHERE id_user = %s """, (id_user,))
# 5 - delete expired mailcookies
write_message("""mail_cookie_gc()""", verbose=9)
delcount['mail_cookie'] = mail_cookie_gc()
## 5b - delete expired non-confirmed email addresses
write_message("""DELETE FROM user WHERE note='2' AND NOW()>ADDTIME(last_login, '%s 0:0:0')""" % CFG_WEBSESSION_NOT_CONFIRMED_EMAIL_ADDRESS_EXPIRE_IN_DAYS, verbose=9)
delcount['email_addresses'] = run_sql("""DELETE FROM user WHERE note='2' AND NOW()>ADDTIME(last_login, '%s 0:0:0')""", (CFG_WEBSESSION_NOT_CONFIRMED_EMAIL_ADDRESS_EXPIRE_IN_DAYS,))
# 6 - delete expired roles memberships
write_message("""DELETE FROM user_accROLE WHERE expiration<NOW()""", verbose=9)
delcount['role_membership'] = run_sql("""DELETE FROM user_accROLE WHERE expiration<NOW()""")
# print STATISTICS
write_message("""- statistics about deleted data: """)
write_message(""" %7s sessions.""" % (delcount['session'],))
write_message(""" %7s users.""" % (delcount['user'],))
write_message(""" %7s user_queries.""" % (delcount['user_query'],))
write_message(""" %7s queries.""" % (delcount['query'],))
write_message(""" %7s baskets.""" % (delcount['bskBASKET'],))
write_message(""" %7s user_baskets.""" % (delcount['user_bskBASKET'],))
write_message(""" %7s basket_records.""" % (delcount['bskREC'],))
write_message(""" %7s basket_external_records.""" % (delcount['bskEXTREC'],))
write_message(""" %7s basket_external_formats.""" % (delcount['bskEXTFMT'],))
write_message(""" %7s basket_comments.""" % (delcount['bskRECORDCOMMENT'],))
write_message(""" %7s user_query_baskets.""" % (delcount['user_query_basket'],))
write_message(""" %7s mail_cookies.""" % (delcount['mail_cookie'],))
write_message(""" %7s non confirmed email addresses.""" % delcount['email_addresses'])
write_message(""" %7s role_memberships.""" % (delcount['role_membership'],))
write_message("""CLEANING OF GUEST SESSIONS FINISHED""")
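Step 1b above deletes orphaned guest users in slices of `CFG_MYSQL_ARGUMENTLIST_SIZE` ids per `DELETE ... IN (...)` statement, to keep each query below the server's argument limits. A minimal sketch of that chunked-deletion pattern against an in-memory SQLite database (a stand-in for `run_sql`; the chunk size and helper name are illustrative, and placeholders are bound instead of building the id string by hand):

```python
import sqlite3

CHUNK = 3  # stands in for CFG_MYSQL_ARGUMENTLIST_SIZE

def delete_in_chunks(conn, table, ids, chunk=CHUNK):
    """Delete rows whose id is in `ids`, at most `chunk` ids per statement.

    The table name is interpolated (identifiers cannot be bound), while the
    ids themselves are passed as bound parameters. Returns rows deleted.
    """
    deleted = 0
    cur = conn.cursor()
    for i in range(0, len(ids), chunk):
        part = ids[i:i + chunk]
        marks = ",".join("?" * len(part))  # one placeholder per id
        cur.execute("DELETE FROM %s WHERE id IN (%s)" % (table, marks), part)
        deleted += cur.rowcount
    conn.commit()
    return deleted
```

With a chunk of 3, deleting five ids issues two statements (3 ids, then 2) rather than one unbounded `IN` list.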
def main():
"""Main function that constructs the bibtask."""
task_init(authorization_action='runinveniogc',
authorization_msg="InvenioGC Task Submission",
help_specific_usage=" -l, --logs\t\tClean old logs.\n" \
" -p, --tempfiles\t\tClean old temporary files.\n" \
" -g, --guests\t\tClean expired guest user related information. [default action]\n" \
" -b, --bibxxx\t\tClean unreferenced bibliographic values in bibXXx tables.\n" \
" -c, --cache\t\tClean cache by removing old files.\n" \
" -d, --documents\tClean deleted documents and revisions older than %s days.\n" \
" -T, --tasks\t\tClean the BibSched queue removing/archiving old DONE tasks.\n" \
" -a, --all\t\tClean all of the above (but do not run check/optimise table options below).\n" \
" -k, --check-tables\tCheck DB tables to discover potential problems.\n" \
" -o, --optimise-tables\tOptimise DB tables to increase performance.\n" % CFG_DELETED_BIBDOC_MAXLIFE,
version=__revision__,
specific_params=("lpgbdacTko", ["logs", "tempfiles", "guests", "bibxxx", "documents", "all", "cache", "tasks", "check-tables", "optimise-tables"]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=task_submit_check_options,
task_run_fnc=task_run_core)
def task_submit_check_options():
if not task_get_option('logs') and \
not task_get_option('tempfiles') and \
not task_get_option('guests') and \
not task_get_option('bibxxx') and \
not task_get_option('documents') and \
not task_get_option('cache') and \
not task_get_option('tasks') and \
not task_get_option('check-tables') and \
not task_get_option('optimise-tables'):
task_set_option('guests', True)
return True
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key it checks its meaning, possibly using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, False if it does
not know that key.
eg:
if key in ['-n', '--number']:
self.options['number'] = value
return True
return False
"""
if key in ('-l', '--logs'):
task_set_option('logs', True)
return True
elif key in ('-p', '--tempfiles'):
task_set_option('tempfiles', True)
return True
elif key in ('-g', '--guests'):
task_set_option('guests', True)
return True
elif key in ('-b', '--bibxxx'):
task_set_option('bibxxx', True)
return True
elif key in ('-d', '--documents'):
task_set_option('documents', True)
return True
elif key in ('-c', '--cache'):
task_set_option('cache', True)
return True
elif key in ('-T', '--tasks'):
task_set_option('tasks', True)
return True
elif key in ('-k', '--check-tables'):
task_set_option('check-tables', True)
return True
elif key in ('-o', '--optimise-tables'):
task_set_option('optimise-tables', True)
return True
elif key in ('-a', '--all'):
task_set_option('logs', True)
task_set_option('tempfiles', True)
task_set_option('guests', True)
task_set_option('bibxxx', True)
task_set_option('documents', True)
task_set_option('cache', True)
task_set_option('tasks', True)
return True
return False
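The parameter handler above maps each short/long flag onto a task option, with `-a`/`--all` expanding to the whole cleaning suite except the table check/optimise passes. A minimal dict-backed sketch of that dispatch (the function and option-store shapes here are illustrative, not the real `task_set_option` API):

```python
def elaborate(key, options):
    """Stand-in for task_submit_elaborate_specific_parameter: record the
    option a CLI flag selects in `options`; return True if the flag is
    known, False otherwise. -a enables everything except the table
    check/optimise passes, mirroring the handler above."""
    flags = {
        '-l': 'logs', '-p': 'tempfiles', '-g': 'guests', '-b': 'bibxxx',
        '-d': 'documents', '-c': 'cache', '-T': 'tasks',
        '-k': 'check-tables', '-o': 'optimise-tables',
    }
    if key in ('-a', '--all'):
        for opt in ('logs', 'tempfiles', 'guests', 'bibxxx',
                    'documents', 'cache', 'tasks'):
            options[opt] = True
        return True
    if key in flags:
        options[flags[key]] = True
        return True
    return False
```

Returning False for an unknown key lets the generic bibtask option parser fall through to its own handling.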
def task_run_core():
""" Reimplement to add the body of the task."""
if task_get_option('guests'):
guest_user_garbage_collector()
if task_get_option('logs'):
clean_logs()
if task_get_option('tempfiles'):
clean_tempfiles()
if task_get_option('bibxxx'):
clean_bibxxx()
if task_get_option('documents'):
clean_documents()
if task_get_option('cache'):
clean_cache()
if task_get_option('tasks'):
gc_tasks()
if task_get_option('check-tables'):
check_tables()
if task_get_option('optimise-tables'):
optimise_tables()
return True
if __name__ == '__main__':
main()
diff --git a/invenio/legacy/websession/webinterface.py b/invenio/legacy/websession/webinterface.py
index 0b27e30ab..836a35e25 100644
--- a/invenio/legacy/websession/webinterface.py
+++ b/invenio/legacy/websession/webinterface.py
@@ -1,1796 +1,1796 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
"""Invenio ACCOUNT HANDLING"""
__revision__ = "$Id$"
__lastupdated__ = """$Date$"""
import cgi
from datetime import timedelta
import os
import re
from invenio.config import \
CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS, \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT, \
CFG_SITE_NAME, \
CFG_SITE_NAME_INTL, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_SITE_SECURE_URL, \
CFG_SITE_URL, \
CFG_CERN_SITE, \
CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS, \
CFG_OPENAIRE_SITE
from invenio.legacy import webuser
from invenio.legacy.webpage import page
from invenio import webaccount
from invenio import webbasket
from invenio import webalert
from invenio.legacy.dbquery import run_sql
from invenio.legacy.webmessage.api import account_new_mail
from invenio.modules.access.engine import acc_authorize_action
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.apache import SERVER_RETURN, HTTP_NOT_FOUND
from invenio.utils.url import redirect_to_url, make_canonical_urlargd
from invenio import webgroup
from invenio import webgroup_dblayer
from invenio.base.i18n import gettext_set_language, wash_language
from invenio.ext.email import send_email
from invenio.ext.logging import register_exception
from invenio.modules.access.mailcookie import mail_cookie_retrieve_kind, \
mail_cookie_check_pw_reset, mail_cookie_delete_cookie, \
mail_cookie_create_pw_reset, mail_cookie_check_role, \
mail_cookie_check_mail_activation, InvenioWebAccessMailCookieError, \
InvenioWebAccessMailCookieDeletedError, mail_cookie_check_authorize_action
from invenio.modules.access.local_config import CFG_WEBACCESS_WARNING_MSGS, \
CFG_EXTERNAL_AUTH_USING_SSO, CFG_EXTERNAL_AUTH_LOGOUT_SSO, \
CFG_EXTERNAL_AUTHENTICATION, CFG_EXTERNAL_AUTH_SSO_REFRESH, \
CFG_OPENID_CONFIGURATIONS, CFG_OAUTH2_CONFIGURATIONS, \
CFG_OAUTH1_CONFIGURATIONS, CFG_OAUTH2_PROVIDERS, CFG_OAUTH1_PROVIDERS, \
CFG_OPENID_PROVIDERS, CFG_OPENID_AUTHENTICATION, \
CFG_OAUTH1_AUTHENTICATION, CFG_OAUTH2_AUTHENTICATION
from invenio.session import get_session
from invenio.modules import apikeys as web_api_key
import invenio.legacy.template
websession_templates = invenio.legacy.template.load('websession')
bibcatalog_templates = invenio.legacy.template.load('bibcatalog')
class WebInterfaceYourAccountPages(WebInterfaceDirectory):
_exports = ['', 'edit', 'change', 'lost', 'display',
'send_email', 'youradminactivities', 'access',
'delete', 'logout', 'login', 'register', 'resetpassword',
'robotlogin', 'robotlogout', 'apikey', 'openid',
'oauth1', 'oauth2']
_force_https = True
def index(self, req, form):
redirect_to_url(req, '%s/youraccount/display' % CFG_SITE_SECURE_URL)
def access(self, req, form):
args = wash_urlargd(form, {'mailcookie' : (str, '')})
_ = gettext_set_language(args['ln'])
title = _("Mail Cookie Service")
try:
kind = mail_cookie_retrieve_kind(args['mailcookie'])
if kind == 'pw_reset':
redirect_to_url(req, '%s/youraccount/resetpassword?k=%s&ln=%s' % (CFG_SITE_SECURE_URL, args['mailcookie'], args['ln']))
elif kind == 'role':
uid = webuser.getUid(req)
try:
(role_name, expiration) = mail_cookie_check_role(args['mailcookie'], uid)
except InvenioWebAccessMailCookieDeletedError:
return page(title=_("Role authorization request"), req=req, body=_("This request for an authorization has already been authorized."), uid=webuser.getUid(req), navmenuid='youraccount', language=args['ln'], secure_page_p=1)
return page(title=title,
body=webaccount.perform_back(
_("You have successfully obtained an authorization as %(x_role)s! "
"This authorization will last until %(x_expiration)s and until "
"you close your browser if you are a guest user.") %
{'x_role' : '<strong>%s</strong>' % role_name,
'x_expiration' : '<em>%s</em>' % expiration.strftime("%Y-%m-%d %H:%M:%S")},
'/youraccount/display?ln=%s' % args['ln'], _('login'), args['ln']),
req=req,
uid=webuser.getUid(req),
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount',
secure_page_p=1)
elif kind == 'mail_activation':
try:
email = mail_cookie_check_mail_activation(args['mailcookie'])
if not email:
raise StandardError
webuser.confirm_email(email)
body = "<p>" + _("You have confirmed the validity of your email"
" address!") + "</p>"
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1:
body += "<p>" + _("Please, wait for the administrator to "
"enable your account.") + "</p>"
else:
uid = webuser.update_Uid(req, email)
body += "<p>" + _("You can now go to %(x_url_open)syour account page%(x_url_close)s.") % {'x_url_open' : '<a href="/youraccount/display?ln=%s">' % args['ln'], 'x_url_close' : '</a>'} + "</p>"
return page(title=_("Email address successfully activated"),
body=body, req=req, language=args['ln'], uid=webuser.getUid(req), lastupdated=__lastupdated__, navmenuid='youraccount', secure_page_p=1)
except InvenioWebAccessMailCookieDeletedError, e:
body = "<p>" + _("You have already confirmed the validity of your email address!") + "</p>"
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS == 1:
body += "<p>" + _("Please, wait for the administrator to "
"enable your account.") + "</p>"
else:
body += "<p>" + _("You can now go to %(x_url_open)syour account page%(x_url_close)s.") % {'x_url_open' : '<a href="/youraccount/display?ln=%s">' % args['ln'], 'x_url_close' : '</a>'} + "</p>"
return page(title=_("Email address successfully activated"),
body=body, req=req, language=args['ln'], uid=webuser.getUid(req), lastupdated=__lastupdated__, navmenuid='youraccount', secure_page_p=1)
return webuser.page_not_authorized(req, "../youraccount/access",
text=_("This request for confirmation of an email "
"address is not valid or"
" is expired."), navmenuid='youraccount')
except InvenioWebAccessMailCookieError:
return webuser.page_not_authorized(req, "../youraccount/access",
text=_("This request for an authorization is not valid or"
" is expired."), navmenuid='youraccount')
def resetpassword(self, req, form):
args = wash_urlargd(form, {
'k' : (str, ''),
'reset' : (int, 0),
'password' : (str, ''),
'password2' : (str, '')
})
_ = gettext_set_language(args['ln'])
title = _('Reset password')
reset_key = args['k']
try:
email = mail_cookie_check_pw_reset(reset_key)
except InvenioWebAccessMailCookieDeletedError:
return page(title=title, req=req, body=_("This request for resetting a password has already been used."), uid=webuser.getUid(req), navmenuid='youraccount', language=args['ln'], secure_page_p=1)
except InvenioWebAccessMailCookieError:
return webuser.page_not_authorized(req, "../youraccount/access",
text=_("This request for resetting a password is not valid or"
" is expired."), navmenuid='youraccount')
if email is None or CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 3:
return webuser.page_not_authorized(req, "../youraccount/resetpassword",
text=_("This request for resetting the password is not valid or"
" is expired."), navmenuid='youraccount')
if not args['reset']:
return page(title=title,
body=webaccount.perform_reset_password(args['ln'], email, reset_key),
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
elif args['password'] != args['password2']:
msg = _('The two provided passwords aren\'t equal.')
return page(title=title,
body=webaccount.perform_reset_password(args['ln'], email, reset_key, msg),
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
run_sql('UPDATE user SET password=AES_ENCRYPT(email,%s) WHERE email=%s', (args['password'], email))
mail_cookie_delete_cookie(reset_key)
return page(title=title,
body=webaccount.perform_back(
_("The password was successfully set! "
"You can now proceed with the login."),
CFG_SITE_SECURE_URL + '/youraccount/login?ln=%s' % args['ln'], _('login'), args['ln']),
req=req,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount', secure_page_p=1)
def display(self, req, form):
args = wash_urlargd(form, {})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/display",
navmenuid='youraccount')
if webuser.isGuestUser(uid):
return page(title=_("Your Account"),
body=webaccount.perform_info(req, args['ln']),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
username = webuser.get_nickname_or_email(uid)
user_info = webuser.collect_user_info(req)
bask = user_info['precached_usebaskets'] and webbasket.account_list_baskets(uid, ln=args['ln']) or ''
aler = user_info['precached_usealerts'] and webalert.account_list_alerts(uid, ln=args['ln']) or ''
sear = webalert.account_list_searches(uid, ln=args['ln'])
msgs = user_info['precached_usemessages'] and account_new_mail(uid, ln=args['ln']) or ''
grps = user_info['precached_usegroups'] and webgroup.account_group(uid, ln=args['ln']) or ''
appr = user_info['precached_useapprove']
sbms = user_info['precached_viewsubmissions']
comments = user_info['precached_sendcomments']
loan = ''
admn = webaccount.perform_youradminactivities(user_info, args['ln'])
return page(title=_("Your Account"),
body=webaccount.perform_display_account(req, username, bask, aler, sear, msgs, loan, grps, sbms, appr, admn, args['ln'], comments),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def apikey(self, req, form):
args = wash_urlargd(form, {
'key_description' : (str, None),
'key_id' : (str, None),
'referer': (str, '')
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/edit",
navmenuid='youraccount')
if webuser.isGuestUser(uid):
return webuser.page_not_authorized(req, "../youraccount/edit",
text=_("This functionality is forbidden to guest users."),
navmenuid='youraccount')
if args['key_id']:
web_api_key.mark_web_api_key_as_removed(args['key_id'])
else:
uid = webuser.getUid(req)
web_api_key.create_new_web_api_key(uid, args['key_description'])
if args['referer']:
redirect_to_url(req, args['referer'])
else:
redirect_to_url(req, '%s/youraccount/edit?ln=%s' % (CFG_SITE_SECURE_URL, args['ln']))
def edit(self, req, form):
args = wash_urlargd(form, {"verbose" : (int, 0)})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/edit",
navmenuid='youraccount')
if webuser.isGuestUser(uid):
return webuser.page_not_authorized(req, "../youraccount/edit",
text=_("This functionality is forbidden to guest users."),
navmenuid='youraccount')
body = ''
user_info = webuser.collect_user_info(req)
if args['verbose'] == 9:
keys = user_info.keys()
keys.sort()
for key in keys:
body += "<b>%s</b>:%s<br />" % (key, user_info[key])
#check if the user should see bibcatalog user name / passwd in the settings
can_config_bibcatalog = (acc_authorize_action(user_info, 'runbibedit')[0] == 0)
return page(title= _("Your Settings"),
body=body+webaccount.perform_set(webuser.get_email(uid),
args['ln'], can_config_bibcatalog,
verbose=args['verbose']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description=_("%s Personalize, Your Settings") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def change(self, req, form):
args = wash_urlargd(form, {
'nickname': (str, None),
'email': (str, None),
'old_password': (str, None),
'password': (str, None),
'password2': (str, None),
'login_method': (str, ""),
'group_records' : (int, None),
'latestbox' : (int, None),
'helpbox' : (int, None),
'lang' : (str, None),
'bibcatalog_username' : (str, None),
'bibcatalog_password' : (str, None),
})
## Wash arguments:
args['login_method'] = wash_login_method(args['login_method'])
if args['email']:
args['email'] = args['email'].lower()
## Load the right message language:
_ = gettext_set_language(args['ln'])
## Identify user and load old preferences:
uid = webuser.getUid(req)
prefs = webuser.get_user_preferences(uid)
## Check rights:
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/change",
navmenuid='youraccount')
# FIXME: the branching below is far from optimal. Should be
# based on the submitted form name ids, to know precisely on
# which form the user clicked. Not on the passed values, as
# is the case now. The function body is too big and in bad
# need of refactoring anyway.
## Will hold the output messages:
mess = ''
## Would hold link to previous page and title for the link:
act = None
linkname = None
title = None
## Change login method if needed:
if args['login_method'] and CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 4 \
and args['login_method'] in CFG_EXTERNAL_AUTHENTICATION:
title = _("Settings edited")
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
if prefs['login_method'] != args['login_method']:
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 4:
mess += '<p>' + _("Unable to change login method.")
elif not CFG_EXTERNAL_AUTHENTICATION[args['login_method']]:
# Switching to internal authentication: we drop any external data
p_email = webuser.get_email(uid)
webuser.drop_external_settings(uid)
webgroup_dblayer.drop_external_groups(uid)
prefs['login_method'] = args['login_method']
webuser.set_user_preferences(uid, prefs)
mess += "<p>" + _("Switched to internal login method.") + " "
mess += _("Please note that if this is the first time that you are using this account "
"with the internal login method then the system has set for you "
"a randomly generated password. Please click the "
"following button to obtain a password reset request "
"link sent to you via email:") + '</p>'
mess += """<p><form method="post" action="../youraccount/send_email">
<input type="hidden" name="p_email" value="%s">
<input class="formbutton" type="submit" value="%s">
</form></p>""" % (p_email, _("Send Password"))
else:
res = run_sql("SELECT email FROM user WHERE id=%s", (uid,))
if res:
email = res[0][0]
else:
email = None
if not email:
mess += '<p>' + _("Unable to switch to external login method %s, because your email address is unknown.") % cgi.escape(args['login_method'])
else:
try:
if not CFG_EXTERNAL_AUTHENTICATION[args['login_method']].user_exists(email):
mess += '<p>' + _("Unable to switch to external login method %s, because your email address is unknown to the external login system.") % cgi.escape(args['login_method'])
else:
prefs['login_method'] = args['login_method']
webuser.set_user_preferences(uid, prefs)
mess += '<p>' + _("Login method successfully selected.")
except AttributeError:
mess += '<p>' + _("The external login method %s does not support email address based logins. Please contact the site administrators.") % cgi.escape(args['login_method'])
## Change email or nickname:
if args['email'] or args['nickname']:
uid2 = webuser.emailUnique(args['email'])
uid_with_the_same_nickname = webuser.nicknameUnique(args['nickname'])
current_nickname = webuser.get_nickname(uid)
if current_nickname and args['nickname'] and \
current_nickname != args['nickname']:
# User tried to set nickname while one is already
# defined (policy is that nickname is not to be
# changed)
mess += '<p>' + _("Your nickname has not been updated")
elif (CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 2 or (CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS <= 1 and \
webuser.email_valid_p(args['email']))) \
and (args['nickname'] is None or webuser.nickname_valid_p(args['nickname'])) \
and uid2 != -1 and (uid2 == uid or uid2 == 0) \
and uid_with_the_same_nickname != -1 and (uid_with_the_same_nickname == uid or uid_with_the_same_nickname == 0):
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS < 3:
change = webuser.updateDataUser(uid,
args['email'],
args['nickname'])
else:
return webuser.page_not_authorized(req, "../youraccount/change",
navmenuid='youraccount')
if change:
mess += '<p>' + _("Settings successfully edited.")
mess += '<p>' + _("Note that if you have changed your email address, "
"you will have to %(x_url_open)sreset your password%(x_url_close)s anew.") % \
{'x_url_open': '<a href="%s">' % (CFG_SITE_SECURE_URL + '/youraccount/lost?ln=%s' % args['ln']),
'x_url_close': '</a>'}
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
title = _("Settings edited")
elif args['nickname'] is not None and not webuser.nickname_valid_p(args['nickname']):
mess += '<p>' + _("Desired nickname %s is invalid.") % cgi.escape(args['nickname'])
mess += " " + _("Please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing settings failed")
elif not webuser.email_valid_p(args['email']):
mess += '<p>' + _("Supplied email address %s is invalid.") % cgi.escape(args['email'])
mess += " " + _("Please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing settings failed")
elif uid2 == -1 or uid2 != uid and not uid2 == 0:
mess += '<p>' + _("Supplied email address %s already exists in the database.") % cgi.escape(args['email'])
mess += " " + websession_templates.tmpl_lost_your_password_teaser(args['ln'])
mess += " " + _("Or please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing settings failed")
elif uid_with_the_same_nickname == -1 or uid_with_the_same_nickname != uid and not uid_with_the_same_nickname == 0:
mess += '<p>' + _("Desired nickname %s is already in use.") % cgi.escape(args['nickname'])
mess += " " + _("Please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing settings failed")
## Change passwords:
if args['old_password'] or args['password'] or args['password2']:
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 3:
mess += '<p>' + _("Users cannot edit passwords on this site.")
else:
res = run_sql("SELECT id FROM user "
"WHERE AES_ENCRYPT(email,%s)=password AND id=%s",
(args['old_password'], uid))
if res:
if args['password'] == args['password2']:
webuser.updatePasswordUser(uid, args['password'])
mess += '<p>' + _("Password successfully edited.")
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
title = _("Password edited")
else:
mess += '<p>' + _("Both passwords must match.")
mess += " " + _("Please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing password failed")
else:
mess += '<p>' + _("Wrong old password inserted.")
mess += " " + _("Please try again.")
act = "/youraccount/edit?ln=%s" % args['ln']
linkname = _("Edit settings")
title = _("Editing password failed")
## Change search-related settings:
if args['group_records']:
prefs = webuser.get_user_preferences(uid)
prefs['websearch_group_records'] = args['group_records']
prefs['websearch_latestbox'] = args['latestbox']
prefs['websearch_helpbox'] = args['helpbox']
webuser.set_user_preferences(uid, prefs)
title = _("Settings edited")
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
mess += '<p>' + _("User settings saved correctly.")
## Change language-related settings:
if args['lang']:
lang = wash_language(args['lang'])
prefs = webuser.get_user_preferences(uid)
prefs['language'] = lang
args['ln'] = lang
_ = gettext_set_language(lang)
webuser.set_user_preferences(uid, prefs)
title = _("Settings edited")
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
mess += '<p>' + _("User settings saved correctly.")
## Edit cataloging-related settings:
if args['bibcatalog_username'] or args['bibcatalog_password']:
act = "/youraccount/display?ln=%s" % args['ln']
linkname = _("Show account")
if not args['bibcatalog_username'] or not args['bibcatalog_password']:
title = _("Editing bibcatalog authorization failed")
mess += '<p>' + _("Empty username or password")
else:
title = _("Settings edited")
prefs = webuser.get_user_preferences(uid)
prefs['bibcatalog_username'] = args['bibcatalog_username']
prefs['bibcatalog_password'] = args['bibcatalog_password']
webuser.set_user_preferences(uid, prefs)
mess += '<p>' + _("User settings saved correctly.")
if not mess:
mess = _("Unable to update settings.")
if not act:
act = "/youraccount/edit?ln=%s" % args['ln']
if not linkname:
linkname = _("Edit settings")
if not title:
title = _("Editing settings failed")
## Finally, output the results:
return page(title=title,
body=webaccount.perform_back(mess, act, linkname, args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def lost(self, req, form):
args = wash_urlargd(form, {})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/lost",
navmenuid='youraccount')
return page(title=_("Lost your password?"),
body=webaccount.perform_lost(args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def send_email(self, req, form):
# set all the declared query fields as local variables
args = wash_urlargd(form, {'p_email': (str, None)})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/send_email",
navmenuid='youraccount')
user_prefs = webuser.get_user_preferences(webuser.emailUnique(args['p_email']))
if user_prefs:
if user_prefs['login_method'] in CFG_EXTERNAL_AUTHENTICATION and \
CFG_EXTERNAL_AUTHENTICATION[user_prefs['login_method']] is not None:
eMsg = _("Cannot send password reset request since you are using an external authentication system.")
return page(title=_("Your Account"),
body=webaccount.perform_emailMessage(eMsg, args['ln']),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid, req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
try:
reset_key = mail_cookie_create_pw_reset(args['p_email'], cookie_timeout=timedelta(days=CFG_WEBSESSION_RESET_PASSWORD_EXPIRE_IN_DAYS))
except InvenioWebAccessMailCookieError:
reset_key = None
if reset_key is None:
eMsg = _("The entered email address does not exist in the database.")
return page(title=_("Your Account"),
body=webaccount.perform_emailMessage(eMsg, args['ln']),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid, req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
ip_address = req.remote_host or req.remote_ip
if not send_email(CFG_SITE_SUPPORT_EMAIL, args['p_email'], "%s %s"
% (_("Password reset request for"),
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME)),
websession_templates.tmpl_account_reset_password_email_body(
args['p_email'], reset_key, ip_address, args['ln'])):
eMsg = _("The entered email address is incorrect, please check that it is written correctly (e.g. johndoe@example.com).")
return page(title=_("Incorrect email address"),
body=webaccount.perform_emailMessage(eMsg, args['ln']),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
return page(title=_("Reset password link sent"),
body=webaccount.perform_emailSent(args['p_email'], args['ln']),
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid, req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def youradminactivities(self, req, form):
args = wash_urlargd(form, {})
uid = webuser.getUid(req)
user_info = webuser.collect_user_info(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/youradminactivities",
navmenuid='admin')
return page(title=_("Your Administrative Activities"),
body=webaccount.perform_youradminactivities(user_info, args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='admin')
def delete(self, req, form):
args = wash_urlargd(form, {})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/delete",
navmenuid='youraccount')
return page(title=_("Delete Account"),
body=webaccount.perform_delete(args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def logout(self, req, form):
args = wash_urlargd(form, {})
uid = webuser.logoutUser(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../youraccount/logout",
navmenuid='youraccount')
if CFG_EXTERNAL_AUTH_USING_SSO:
return redirect_to_url(req, CFG_EXTERNAL_AUTH_LOGOUT_SSO)
return page(title=_("Logout"),
body=webaccount.perform_logout(req, args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords=_("%s, personalize") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def robotlogout(self, req, form):
"""
Implement logout method for external service providers.
"""
webuser.logoutUser(req)
if CFG_OPENAIRE_SITE:
from invenio.config import CFG_OPENAIRE_PORTAL_URL
redirect_to_url(req, CFG_OPENAIRE_PORTAL_URL)
else:
redirect_to_url(req, "%s/img/pix.png" % CFG_SITE_SECURE_URL)
def robotlogin(self, req, form):
"""
Implement authentication method for external service providers.
"""
from invenio.modules.access.external_authentication import InvenioWebAccessExternalAuthError
args = wash_urlargd(form, {
'login_method': (str, None),
'remember_me' : (str, ''),
'referer': (str, ''),
'p_un': (str, ''),
'p_pw': (str, '')
})
# sanity checks:
args['login_method'] = wash_login_method(args['login_method'])
args['remember_me'] = args['remember_me'] != ''
locals().update(args)
if CFG_ACCESS_CONTROL_LEVEL_SITE > 0:
return webuser.page_not_authorized(req, CFG_SITE_SECURE_URL + "/youraccount/login?ln=%s" % args['ln'],
navmenuid='youraccount')
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
try:
(iden, args['p_un'], args['p_pw'], msgcode) = webuser.loginUser(req, args['p_un'], args['p_pw'], args['login_method'])
except InvenioWebAccessExternalAuthError, err:
return page("Error", body=str(err), req=req)
if iden:
uid = webuser.update_Uid(req, args['p_un'], args['remember_me'])
uid2 = webuser.getUid(req)
if uid2 == -1:
webuser.logoutUser(req)
return webuser.page_not_authorized(req, CFG_SITE_SECURE_URL + "/youraccount/login?ln=%s" % args['ln'], uid=uid,
navmenuid='youraccount')
# login successful!
if args['referer']:
if CFG_OPENAIRE_SITE and args['referer'].startswith('https://openaire.cern.ch/deposit'):
## HACK for OpenAIRE
args['referer'] = args['referer'].replace('https://openaire.cern.ch/deposit', 'http://openaire.cern.ch/deposit')
redirect_to_url(req, args['referer'])
else:
return self.display(req, form)
else:
mess = CFG_WEBACCESS_WARNING_MSGS[msgcode] % cgi.escape(args['login_method'])
if msgcode == 14:
if webuser.username_exists_p(args['p_un']):
mess = CFG_WEBACCESS_WARNING_MSGS[15] % cgi.escape(args['login_method'])
act = CFG_SITE_SECURE_URL + '/youraccount/login%s' % make_canonical_urlargd({'ln' : args['ln'], 'referer' : args['referer']}, {})
return page(title=_("Login"),
body=webaccount.perform_back(mess, act, _("login"), args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def login(self, req, form):
args = wash_urlargd(form, {
'p_un': (str, None),
'p_pw': (str, None),
'login_method': (str, None),
'action': (str, ''),
'remember_me' : (str, ''),
'referer': (str, '')})
if CFG_OPENAIRE_SITE:
from invenio.config import CFG_OPENAIRE_PORTAL_URL
if CFG_OPENAIRE_PORTAL_URL:
from invenio.utils.url import create_url
from base64 import encodestring
invenio_loginurl = args['referer'] or '%s/youraccount/display?ln=%s' % (CFG_SITE_SECURE_URL, args['ln'])
loginurl = create_url(CFG_OPENAIRE_PORTAL_URL, {"option": "com_openaire", "view": "login", "return": encodestring(invenio_loginurl)})
redirect_to_url(req, loginurl)
# sanity checks:
args['login_method'] = wash_login_method(args['login_method'])
if args['p_un']:
args['p_un'] = args['p_un'].strip()
args['remember_me'] = args['remember_me'] != ''
locals().update(args)
if CFG_ACCESS_CONTROL_LEVEL_SITE > 0:
return webuser.page_not_authorized(req, CFG_SITE_SECURE_URL + "/youraccount/login?ln=%s" % args['ln'],
navmenuid='youraccount')
uid = webuser.getUid(req)
# If the user is already logged in, redirect to the referer or to
# the account page
if uid > 0:
redirect_to_url(req, args['referer'] or '%s/youraccount/display?ln=%s' % (CFG_SITE_SECURE_URL, args['ln']))
# load the right message language
_ = gettext_set_language(args['ln'])
if args['action']:
cookie = args['action']
try:
action, arguments = mail_cookie_check_authorize_action(cookie)
except InvenioWebAccessMailCookieError:
pass
if not CFG_EXTERNAL_AUTH_USING_SSO:
if (args['p_un'] is None or not args['login_method']) and args['login_method'] not in ['openid', 'oauth1', 'oauth2']:
return page(title=_("Login"),
body=webaccount.create_login_page_box(args['referer'], args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p=1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
(iden, args['p_un'], args['p_pw'], msgcode) = webuser.loginUser(req, args['p_un'], args['p_pw'], args['login_method'])
else:
# Fake parameters for p_un & p_pw because SSO takes them from the environment
(iden, args['p_un'], args['p_pw'], msgcode) = webuser.loginUser(req, '', '', CFG_EXTERNAL_AUTH_USING_SSO)
args['remember_me'] = False
if iden:
uid = webuser.update_Uid(req, args['p_un'], args['remember_me'])
uid2 = webuser.getUid(req)
if uid2 == -1:
webuser.logoutUser(req)
return webuser.page_not_authorized(req, CFG_SITE_SECURE_URL + "/youraccount/login?ln=%s" % args['ln'], uid=uid,
navmenuid='youraccount')
# login successful!
try:
register_customevent("login", [req.remote_host or req.remote_ip, uid, args['p_un']])
except:
register_exception(suffix="Do the webstat tables exist? Try with 'webstatadmin --load-config'")
if args['referer']:
redirect_to_url(req, args['referer'].replace(CFG_SITE_URL, CFG_SITE_SECURE_URL))
else:
return self.display(req, form)
else:
mess = None
if isinstance(msgcode, (str, unicode)):
# If msgcode is a string, show it as-is.
mess = msgcode
elif msgcode in [21, 22, 23]:
mess = CFG_WEBACCESS_WARNING_MSGS[msgcode]
elif msgcode == 14:
if webuser.username_exists_p(args['p_un']):
mess = CFG_WEBACCESS_WARNING_MSGS[15] % cgi.escape(args['login_method'])
if not mess:
mess = CFG_WEBACCESS_WARNING_MSGS[msgcode] % cgi.escape(args['login_method'])
act = CFG_SITE_SECURE_URL + '/youraccount/login%s' % make_canonical_urlargd({'ln' : args['ln'], 'referer' : args['referer']}, {})
return page(title=_("Login"),
body=webaccount.perform_back(mess, act, _("login"), args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description="%s Personalize, Main page" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def register(self, req, form):
args = wash_urlargd(form, {
'p_nickname': (str, None),
'p_email': (str, None),
'p_pw': (str, None),
'p_pw2': (str, None),
'action': (str, "login"),
'referer': (str, "")})
if CFG_ACCESS_CONTROL_LEVEL_SITE > 0:
return webuser.page_not_authorized(req, "../youraccount/register?ln=%s" % args['ln'],
navmenuid='youraccount')
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(args['ln'])
if args['p_nickname'] is None or args['p_email'] is None:
return page(title=_("Register"),
body=webaccount.create_register_page_box(args['referer'], args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description=_("%s Personalize, Main page") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
mess = ""
act = ""
if args['p_pw'] == args['p_pw2']:
ruid = webuser.registerUser(req, args['p_email'], args['p_pw'],
args['p_nickname'], ln=args['ln'])
else:
ruid = -2
if ruid == 0:
mess = _("Your account has been successfully created.")
title = _("Account created")
if CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT == 1:
mess += " " + _("In order to confirm its validity, an email message containing an account activation key has been sent to the given email address.")
mess += " " + _("Please follow instructions presented there in order to complete the account registration process.")
if CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS >= 1:
mess += " " + _("A second email will be sent when the account has been activated and can be used.")
elif CFG_ACCESS_CONTROL_NOTIFY_USER_ABOUT_NEW_ACCOUNT != 1:
uid = webuser.update_Uid(req, args['p_email'])
mess += " " + _("You can now access your %(x_url_open)saccount%(x_url_close)s.") %\
{'x_url_open': '<a href="' + CFG_SITE_SECURE_URL + '/youraccount/display?ln=' + args['ln'] + '">',
'x_url_close': '</a>'}
elif ruid == -2:
mess = _("Both passwords must match.")
mess += " " + _("Please try again.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 1:
mess = _("Supplied email address %s is invalid.") % cgi.escape(args['p_email'])
mess += " " + _("Please try again.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 2:
mess = _("Desired nickname %s is invalid.") % cgi.escape(args['p_nickname'])
mess += " " + _("Please try again.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 3:
mess = _("Supplied email address %s already exists in the database.") % cgi.escape(args['p_email'])
mess += " " + websession_templates.tmpl_lost_your_password_teaser(args['ln'])
mess += " " + _("Or please try again.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 4:
mess = _("Desired nickname %s already exists in the database.") % cgi.escape(args['p_nickname'])
mess += " " + _("Please try again.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 5:
mess = _("Users cannot register themselves, only admin can register them.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
elif ruid == 6:
mess = _("The site is having trouble sending you an email to confirm your email address.") + " " + _("The error has been logged and will be taken into consideration as soon as possible.")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
else:
# this should never happen
mess = _("Internal Error")
act = "/youraccount/register?ln=%s" % args['ln']
title = _("Registration failure")
return page(title=title,
body=webaccount.perform_back(mess,act, _("register"), args['ln']),
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description=_("%s Personalize, Main page") % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords="%s , personalize" % CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid=uid,
req=req,
secure_page_p = 1,
language=args['ln'],
lastupdated=__lastupdated__,
navmenuid='youraccount')
def openid(self, req, form):
"""
Construct the URL of the OpenID provider's login page and either
redirect the user to it or return an HTML form that posts to it.
"""
def get_consumer(req):
"""
Return a stateless OpenID consumer (one without a persistent store).
"""
return consumer.Consumer({"id": get_session(req)}, None)
def request_registration_data(request, provider):
"""
Adds simple registration (sreg) and attribute exchange (ax)
extensions to the given OpenID request.
@param request: OpenID request
@type request: openid.consumer.consumer.AuthRequest
@param provider: OpenID provider
@type provider: str
"""
# Ask for the user's nickname if the provider accepts sreg requests.
sreg_request = sreg.SRegRequest(required = ['nickname'])
request.addExtension(sreg_request)
# If the provider is trusted, we may also ask for the user's email.
ax_request = ax.FetchRequest()
if CFG_OPENID_CONFIGURATIONS[provider].get('trust_email', False):
ax_request.add(ax.AttrInfo(
'http://axschema.org/contact/email',
required = True))
ax_request.add(ax.AttrInfo(
'http://axschema.org/namePerson/friendly',
required = True))
request.addExtension(ax_request)
# All arguments must be extracted
content = {
'provider': (str, ''),
'identifier': (str, ''),
'referer': (str, '')
}
for key in CFG_OPENID_CONFIGURATIONS.keys():
content[key] = (str, '')
args = wash_urlargd(form, content)
# Load the right message language
_ = gettext_set_language(args['ln'])
try:
from openid.consumer import consumer
from openid.extensions import ax
from openid.extensions import sreg
except:
# Return login page with 'Need to install python-openid' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=openid-python' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
# If the provider isn't activated or OpenID authentication is
# disabled, redirect to the login page.
if not (args['provider'] in CFG_OPENID_PROVIDERS and
CFG_OPENID_AUTHENTICATION):
redirect_to_url(req, CFG_SITE_SECURE_URL + "/youraccount/login")
# Load the right message language
_ = gettext_set_language(args['ln'])
# Construct the OpenID identifier URL from the template given in the
# configuration.
openid_url = CFG_OPENID_CONFIGURATIONS[args['provider']]['identifier'].\
format(args['identifier'])
oidconsumer = get_consumer(req)
try:
request = oidconsumer.begin(openid_url)
except consumer.DiscoveryFailure:
# If the identifier is invalid, display the login form with an error
# message.
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=openid-invalid' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
else:
trust_root = CFG_SITE_SECURE_URL + "/"
return_to = CFG_SITE_SECURE_URL + "/youraccount/login?"
if args['provider'] == 'openid':
# Check whether the identifier matches a known provider.
for key in CFG_OPENID_CONFIGURATIONS.keys():
if CFG_OPENID_CONFIGURATIONS[key]['identifier'] != '{0}':
regexp = re.compile(
CFG_OPENID_CONFIGURATIONS[key]['identifier'].format(r"\w+"),
re.IGNORECASE)
if openid_url in CFG_OPENID_CONFIGURATIONS[key]['identifier'] or \
regexp.match(openid_url):
args['provider'] = key
break
return_to += "login_method=openid&provider=%s" % (
args['provider']
)
request_registration_data(request, args['provider'])
if args['referer']:
return_to += "&referer=%s" % args['referer']
if request.shouldSendRedirect():
redirect_url = request.redirectURL(
trust_root,
return_to,
immediate = False)
redirect_to_url(req, redirect_url)
else:
form_html = request.htmlMarkup(trust_root,
return_to,
form_tag_attrs = {
'id':'openid_message'
},
immediate = False)
return form_html
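# Note on the branch above: with python-openid,
# request.shouldSendRedirect() is true when the whole checkid request
# fits in a GET redirect; otherwise the consumer must POST, and
# request.htmlMarkup() returns an HTML page containing a form (tagged
# 'openid_message' here) that submits the same parameters to the
# provider.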
def oauth2(self, req, form):
args = wash_urlargd(form, {'provider': (str, '')})
# If the provider isn't activated or OAuth2 authentication is
# disabled, redirect to the login page.
if not (args['provider'] in CFG_OAUTH2_PROVIDERS and
CFG_OAUTH2_AUTHENTICATION):
redirect_to_url(req, CFG_SITE_SECURE_URL + "/youraccount/login")
# Load the right message language
_ = gettext_set_language(args['ln'])
try:
from rauth.service import OAuth2Service
except:
# Return login page with 'Need to install rauth' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=oauth-rauth' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
provider_name = args['provider']
# Load the configurations of the OAuth2 provider
config = CFG_OAUTH2_CONFIGURATIONS[provider_name]
try:
if not (config['consumer_key'] and config['consumer_secret']):
raise Exception
provider = OAuth2Service(
name = provider_name,
consumer_key = config['consumer_key'],
consumer_secret = config['consumer_secret'],
access_token_url = config['access_token_url'],
authorize_url = config['authorize_url']
)
except:
# Return login page with 'OAuth service isn't configured' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=oauth-config' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
# Construct the authorization URL
params = config.get('authorize_parameters', {})
params['redirect_uri'] = ('%s/youraccount/login?login_method=oauth2'
'&provider=%s' % (CFG_SITE_SECURE_URL, args['provider']))
url = provider.get_authorize_url(**params)
redirect_to_url(req, url)
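# Sketch of the OAuth2 authorization-code flow started above: the user
# is sent to config['authorize_url'] with a redirect_uri pointing back
# at /youraccount/login?login_method=oauth2&provider=<name>; that login
# handler is then expected to exchange the returned code for an access
# token via config['access_token_url'].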
def oauth1(self, req, form):
args = wash_urlargd(form, {'provider': (str, '')})
# If the provider isn't activated or OAuth1 authentication is
# disabled, redirect to the login page.
if not (args['provider'] in CFG_OAUTH1_PROVIDERS and
CFG_OAUTH1_AUTHENTICATION):
redirect_to_url(req, CFG_SITE_SECURE_URL + "/youraccount/login")
# Load the right message language
_ = gettext_set_language(args['ln'])
try:
from rauth.service import OAuth1Service
except:
# Return login page with 'Need to install rauth' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=oauth-rauth' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
# Load the configurations of the OAuth1 provider
config = CFG_OAUTH1_CONFIGURATIONS[args['provider']]
try:
if not (config['consumer_key'] and config['consumer_secret']):
raise Exception
provider = OAuth1Service(
name = args['provider'],
consumer_key = config['consumer_key'],
consumer_secret = config['consumer_secret'],
request_token_url = config['request_token_url'],
access_token_url = config['access_token_url'],
authorize_url = config['authorize_url'],
header_auth = True
)
except:
# Return login page with 'OAuth service isn't configured' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=oauth-config' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
try:
# Obtain request token and its secret.
request_token, request_token_secret = \
provider.get_request_token(
method = 'GET',
data = {
'oauth_callback': \
"%s/youraccount/login?login_method=oauth1&provider=%s" % (
CFG_SITE_SECURE_URL,
args['provider']
)
}
)
except:
# Return login page with 'Cannot connect the provider' error
return page(title = _("Login"),
body = webaccount.create_login_page_box(
'%s/youraccount/login?error=connection-error' % \
CFG_SITE_SECURE_URL,
args['ln']
),
navtrail = """
<a class="navtrail" href="%s/youraccount/display?ln=%s">
""" % (CFG_SITE_SECURE_URL, args['ln']) + _("Your Account") + """</a>""",
description = "%s Personalize, Main page" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
keywords = "%s , personalize" % \
CFG_SITE_NAME_INTL.get(args['ln'], CFG_SITE_NAME),
uid = 0,
req = req,
secure_page_p = 1,
language = args['ln'],
lastupdated = __lastupdated__,
navmenuid = 'youraccount')
# Construct the authorization url.
authorize_parameters = config.get('authorize_parameters', {})
authorize_url = provider.get_authorize_url(request_token,
**authorize_parameters)
# Save the request token in the database; it will be needed when the
# provider calls back during authentication.
query = """INSERT INTO oauth1_storage VALUES(%s, %s, NOW())"""
params = (request_token, request_token_secret)
run_sql(query, params)
redirect_to_url(req, authorize_url)
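# Rough sketch of the three-legged OAuth1 flow implemented above:
# 1. get_request_token() -> (request_token, request_token_secret),
#    with oauth_callback pointing back at the login handler;
# 2. the pair is stored in oauth1_storage and the user is redirected
#    to the provider's authorize URL;
# 3. after approval the provider calls back to
#    /youraccount/login?login_method=oauth1&provider=<name>, where the
#    stored secret is used to exchange the request token for an access
#    token.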
class WebInterfaceYourTicketsPages(WebInterfaceDirectory):
# Support for the /yourtickets URL.
_exports = ['', 'display']
def __call__(self, req, form):
# No trailing slash: fall through to index.
return self.index(req, form)
def index(self, req, form):
# Preserve the query string when redirecting.
unparsed_uri = req.unparsed_uri
qstr = ""
if '?' in unparsed_uri:
dummy, qstr = unparsed_uri.split('?', 1)
qstr = '?' + qstr
redirect_to_url(req, '/yourtickets/display' + qstr)
def display(self, req, form):
# Show the tickets of the current user.
argd = wash_urlargd(form, {'ln': (str, ''), 'start': (int, 1) })
uid = webuser.getUid(req)
ln = argd['ln']
start = argd['start']
_ = gettext_set_language(ln)
body = bibcatalog_templates.tmpl_your_tickets(uid, ln, start)
return page(title=_("Your tickets"),
body=body,
navtrail="""<a class="navtrail" href="%s/youraccount/display?ln=%s">""" % (CFG_SITE_SECURE_URL, argd['ln']) + _("Your Account") + """</a>""",
uid=uid,
req=req,
language=argd['ln'],
lastupdated=__lastupdated__,
secure_page_p=1)
class WebInterfaceYourGroupsPages(WebInterfaceDirectory):
_exports = ['', 'display', 'create', 'join', 'leave', 'edit', 'members']
def index(self, req, form):
redirect_to_url(req, '/yourgroups/display')
def display(self, req, form):
"""
Displays the groups the user administers and the groups the user
is a member of (but does not administer).
@param ln: language
@return: the page for all the groups
"""
argd = wash_urlargd(form, {})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/display",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
body = webgroup.perform_request_groups_display(uid=uid,
ln=argd['ln'])
return page(title = _("Your Groups"),
body = body,
navtrail = webgroup.get_navtrail(argd['ln']),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def create(self, req, form):
"""create(): interface for creating a new group
@param group_name: name of the new webgroup. Must be filled.
@param group_description: description of the new webgroup (optional).
@param join_policy: join policy of the new webgroup. Must be chosen.
@param *button: which button was pressed
@param ln: language
@return: the compose page Create group
"""
argd = wash_urlargd(form, {'group_name': (str, ""),
'group_description': (str, ""),
'join_policy': (str, ""),
'create_button':(str, ""),
'cancel':(str, "")
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/create",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourgroups/display?ln=%s'
url %= argd['ln']
redirect_to_url(req, url)
if argd['create_button'] :
body= webgroup.perform_request_create_group(uid=uid,
group_name=argd['group_name'],
group_description=argd['group_description'],
join_policy=argd['join_policy'],
ln = argd['ln'])
else:
body = webgroup.perform_request_input_create_group(group_name=argd['group_name'],
group_description=argd['group_description'],
join_policy=argd['join_policy'],
ln=argd['ln'])
title = _("Create new group")
return page(title = title,
body = body,
navtrail = webgroup.get_navtrail(argd['ln'], title),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def join(self, req, form):
"""join(): interface for joining a new group
@param grpID: list of the groups the user wants to become a member of.
The user must select only one group.
@param group_name: will search for groups matching group_name
@param *button: which button was pressed
@param ln: language
@return: the compose page Join group
"""
argd = wash_urlargd(form, {'grpID':(list, []),
'group_name':(str, ""),
'find_button':(str, ""),
'join_button':(str, ""),
'cancel':(str, "")
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/join",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourgroups/display?ln=%s'
url %= argd['ln']
redirect_to_url(req, url)
if argd['join_button']:
search = 0
if argd['group_name']:
search = 1
body = webgroup.perform_request_join_group(uid,
argd['grpID'],
argd['group_name'],
search,
argd['ln'])
else:
search = 0
if argd['find_button']:
search = 1
body = webgroup.perform_request_input_join_group(uid,
argd['group_name'],
search,
ln=argd['ln'])
title = _("Join New Group")
return page(title = title,
body = body,
navtrail = webgroup.get_navtrail(argd['ln'], title),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def leave(self, req, form):
"""leave(): interface for leaving a group
@param grpID: group the user wants to leave.
@param group_name: name of the group the user wants to leave
@param *button: which button was pressed
@param confirmed: the user is first asked to confirm
@param ln: language
@return: the compose page Leave group
"""
argd = wash_urlargd(form, {'grpID':(int, 0),
'group_name':(str, ""),
'leave_button':(str, ""),
'cancel':(str, ""),
'confirmed': (int, 0)
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/leave",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourgroups/display?ln=%s'
url %= argd['ln']
redirect_to_url(req, url)
if argd['leave_button']:
body = webgroup.perform_request_leave_group(uid,
argd['grpID'],
argd['confirmed'],
argd['ln'])
else:
body = webgroup.perform_request_input_leave_group(uid=uid,
ln=argd['ln'])
title = _("Leave Group")
return page(title = title,
body = body,
navtrail = webgroup.get_navtrail(argd['ln'], title),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def edit(self, req, form):
"""edit(): interface for editing group
@param grpID: group ID
@param group_name: name of the new webgroup. Must be filled.
@param group_description: description of the new webgroup (optional).
@param join_policy: join policy of the new webgroup. Must be chosen.
@param update: button update group pressed
@param delete: button delete group pressed
@param cancel: button cancel pressed
@param confirmed: the user is first asked to confirm before deleting
@param ln: language
@return: the main page displaying all the groups
"""
argd = wash_urlargd(form, {'grpID': (int, 0),
'update': (str, ""),
'cancel': (str, ""),
'delete': (str, ""),
'group_name': (str, ""),
'group_description': (str, ""),
'join_policy': (str, ""),
'confirmed': (int, 0)
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/display",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourgroups/display?ln=%s'
url %= argd['ln']
redirect_to_url(req, url)
elif argd['delete']:
body = webgroup.perform_request_delete_group(uid=uid,
grpID=argd['grpID'],
confirmed=argd['confirmed'])
elif argd['update']:
body = webgroup.perform_request_update_group(uid= uid,
grpID=argd['grpID'],
group_name=argd['group_name'],
group_description=argd['group_description'],
join_policy=argd['join_policy'],
ln=argd['ln'])
else :
body= webgroup.perform_request_edit_group(uid=uid,
grpID=argd['grpID'],
ln=argd['ln'])
title = _("Edit Group")
return page(title = title,
body = body,
navtrail = webgroup.get_navtrail(argd['ln'], title),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def members(self, req, form):
"""members(): interface for managing members of a group
@param grpID: group ID
@param add_member: button add_member pressed
@param remove_member: button remove_member pressed
@param reject_member: button reject_member pressed
@param delete: button delete group pressed
@param member_id: ID of the existing member selected
@param pending_member_id: ID of the pending member selected
@param cancel: button cancel pressed
@param info: info about last user action
@param ln: language
@return: the same page with data updated
"""
argd = wash_urlargd(form, {'grpID': (int, 0),
'cancel': (str, ""),
'add_member': (str, ""),
'remove_member': (str, ""),
'reject_member': (str, ""),
'member_id': (int, 0),
'pending_member_id': (int, 0)
})
uid = webuser.getUid(req)
# load the right message language
_ = gettext_set_language(argd['ln'])
if uid == -1 or webuser.isGuestUser(uid) or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return webuser.page_not_authorized(req, "../yourgroups/display",
navmenuid='yourgroups')
user_info = webuser.collect_user_info(req)
if not user_info['precached_usegroups']:
return webuser.page_not_authorized(req, "../", \
text = _("You are not authorized to use groups."))
if argd['cancel']:
url = CFG_SITE_SECURE_URL + '/yourgroups/display?ln=%s'
url %= argd['ln']
redirect_to_url(req, url)
if argd['remove_member']:
body = webgroup.perform_request_remove_member(uid=uid,
grpID=argd['grpID'],
member_id=argd['member_id'],
ln=argd['ln'])
elif argd['reject_member']:
body = webgroup.perform_request_reject_member(uid=uid,
grpID=argd['grpID'],
user_id=argd['pending_member_id'],
ln=argd['ln'])
elif argd['add_member']:
body = webgroup.perform_request_add_member(uid=uid,
grpID=argd['grpID'],
user_id=argd['pending_member_id'],
ln=argd['ln'])
else:
body= webgroup.perform_request_manage_member(uid=uid,
grpID=argd['grpID'],
ln=argd['ln'])
title = _("Edit group members")
return page(title = title,
body = body,
navtrail = webgroup.get_navtrail(argd['ln'], title),
uid = uid,
req = req,
language = argd['ln'],
lastupdated = __lastupdated__,
navmenuid = 'yourgroups',
secure_page_p = 1)
def wash_login_method(login_method):
"""
Wash the login_method parameter that came from the web input form.
@param login_method: Wanted login_method value as it came from the
web input form.
@type login_method: string
@return: Washed version of login_method. If the login_method
value is valid, then return it. If it is not valid, then
return `Local' (the default login method).
@rtype: string
@warning: Beware, 'Local' is hardcoded here!
"""
if login_method in CFG_EXTERNAL_AUTHENTICATION:
return login_method
else:
return 'Local'
diff --git a/invenio/legacy/webstat/admin.py b/invenio/legacy/webstat/admin.py
index 29a57054e..c6e4411a7 100644
--- a/invenio/legacy/webstat/admin.py
+++ b/invenio/legacy/webstat/admin.py
@@ -1,249 +1,249 @@
## $id: webstatadmin.py,v 1.28 2007/04/01 23:46:46 tibor exp $
##
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
__lastupdated__ = "$Date$"
import sys
from invenio import webstat
from invenio.legacy.dbquery import run_sql
-from invenio.bibtask import task_init, task_get_option, task_set_option, \
+from invenio.legacy.bibsched.bibtask import task_init, task_get_option, task_set_option, \
task_has_option, task_update_progress, write_message
-from invenio.webstat_config import CFG_WEBSTAT_CONFIG_PATH
+from invenio.legacy.webstat.config import CFG_WEBSTAT_CONFIG_PATH
from invenio.config import CFG_SITE_RECORD
def main():
"""Main dealing with all the BibTask magic."""
task_init(authorization_action="runwebstatadmin",
authorization_msg="Webstat Administrator",
description="Description: %s Creates/deletes custom events. Can be set\n"
" to cache key events and previously defined custom events.\n" % sys.argv[0],
help_specific_usage=" -n, --new-event=ID create a new custom event with the human-readable ID\n"
" -r, --remove-event=ID remove the custom event with id ID and all its data\n"
" -S, --show-events show all currently available custom events\n"
" -c, --cache-events=CLASS|[ID] caches the events defined by the class or IDs, e.g.:\n"
" -c ALL\n"
" -c KEYEVENTS\n"
" -c CUSTOMEVENTS\n"
" -c 'event id1',id2,'testevent'\n"
" -d,--dump-config dump default config file\n"
" -e,--load-config create the custom events described in config_file\n"
"\nWhen creating events (-n) the following parameters are also applicable:\n"
" -l, --event-label=NAME set a descriptive label to the custom event\n"
" -a, --args=[NAME] set column headers for additional custom event arguments\n"
" (e.g. -a country,person,car)\n",
version=__revision__,
specific_params=("n:r:Sl:a:c:de", ["new-event=", "remove-event=", "show-events",
"event-label=", "args=", "cache-events=", "dump-config",
"load-config"]),
task_submit_elaborate_specific_parameter_fnc=task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=task_submit_check_options,
task_run_fnc=task_run_core)
def task_submit_elaborate_specific_parameter(key, value, opts, args):
"""
Given the string key, check its meaning, possibly using the value.
Usually this fills some key in the options dict. It must return True if
it has handled the key, and False if it does not know that key.
"""
if key in ("-n", "--new-event"):
task_set_option("create_event_with_id", value)
elif key in ("-r", "--remove-event"):
task_set_option("destroy_event_with_id", value)
elif key in ("-S", "--show-events"):
task_set_option("list_events", True)
elif key in ("-l", "--event-label"):
task_set_option("event_name", value)
elif key in ("-a", "--args"):
task_set_option("column_headers", value.split(','))
elif key in ("-c", "--cache-events"):
task_set_option("cache_events", value.split(','))
elif key in ("-d", "--dump-config"):
task_set_option("dump_config", True)
elif key in ("-e", "--load-config"):
task_set_option("load_config", True)
else:
return False
return True
def task_submit_check_options():
"""
NOTE: Depending on the parameters, either "BibSched mode" or plain
straightforward execution mode is entered.
"""
if task_has_option("create_event_with_id"):
print webstat.create_customevent(task_get_option("create_event_with_id"),
task_get_option("event_name", None),
task_get_option("column_headers", []))
sys.exit(0)
elif task_has_option("destroy_event_with_id"):
print webstat.destroy_customevent(task_get_option("destroy_event_with_id"))
sys.exit(0)
elif task_has_option("list_events"):
events = webstat._get_customevents()
if len(events) == 0:
print "There are no custom events available."
else:
print "Available custom events are:\n"
print '\n'.join([x[0] + ": " + ((x[1] == None) and "No descriptive name" or str(x[1])) for x in events])
sys.exit(0)
elif task_has_option("cache_events"):
events = task_get_option("cache_events")
write_message(str(events), verbose=9)
if events[0] == 'ALL':
keyevents_to_cache = webstat.KEYEVENT_REPOSITORY.keys()
customevents_to_cache = [x[0] for x in webstat._get_customevents()]
elif events[0] == 'KEYEVENTS':
keyevents_to_cache = webstat.KEYEVENT_REPOSITORY.keys()
customevents_to_cache = []
elif events[0] == 'CUSTOMEVENTS':
keyevents_to_cache = []
customevents_to_cache = [x[0] for x in webstat._get_customevents()]
elif events[0] != '':
keyevents_to_cache = [x for x in webstat.KEYEVENT_REPOSITORY.keys() if x in events]
customevents_to_cache = [x[0] for x in webstat._get_customevents() if x in events]
# Control so that we have valid event names
if len(keyevents_to_cache + customevents_to_cache) == 0:
# Oops, no events. Abort and display help.
return False
else:
task_set_option("keyevents", keyevents_to_cache)
task_set_option("customevents", customevents_to_cache)
return True
elif task_has_option("dump_config"):
print """\
[general]
visitors_box = True
search_box = True
record_box = True
bibsched_box = True
basket_box = True
apache_box = True
uptime_box = True
[webstat_custom_event_1]
name = baskets
param1 = action
param2 = basket
param3 = user
[apache_log_analyzer]
profile = nil
nb-histogram-items-to-print = 20
exclude-ip-list = ("137.138.249.162")
home-collection = "Atlantis Institute of Fictive Science"
search-interface-url = "/?"
detailed-record-url = "/%s/"
search-engine-url = "/search?"
search-engine-url-old-style = "/search.py?"
basket-url = "/yourbaskets/"
add-to-basket-url = "/yourbaskets/add"
display-basket-url = "/yourbaskets/display"
display-public-basket-url = "/yourbaskets/display_public"
alert-url = "/youralerts/"
display-your-alerts-url = "/youralerts/list"
display-your-searches-url = "/youralerts/display"
""" % CFG_SITE_RECORD
sys.exit(0)
elif task_has_option("load_config"):
from ConfigParser import ConfigParser
conf = ConfigParser()
conf.read(CFG_WEBSTAT_CONFIG_PATH)
for section in conf.sections():
if section[:21] == "webstat_custom_event_":
cols = []
name = ""
for option, value in conf.items(section):
if option == "name":
name = value
if option[:5] == "param":
# add the column name in its position
index = int(option[-1]) - 1
while len(cols) <= index:
cols.append("")
cols[index] = value
if name:
res = run_sql("SELECT COUNT(id) FROM staEVENT WHERE id = %s", (name, ))
if res[0][0] == 0:
# name does not exist, create customevent
webstat.create_customevent(name, name, cols)
else:
# name already exists, update customevent
webstat.modify_customevent(name, cols=cols)
sys.exit(0)
else:
# False means that the --help should be displayed
return False
def task_run_core():
"""
When this function is called, the tool has entered BibSched mode, which means
that we're going to cache events according to the parameters.
"""
write_message("Initiating rawdata caching")
task_update_progress("Initiating rawdata caching")
# Cache key events
keyevents = task_get_option("keyevents")
if keyevents and len(keyevents) > 0:
for i in range(len(keyevents)):
write_message("Caching key event: %s" % keyevents[i])
webstat.cache_keyevent_trend(keyevents)
task_update_progress("Part 1/2: done %d/%d" % (i + 1, len(keyevents)))
# Cache custom events
customevents = task_get_option("customevents")
if len(customevents) > 0:
for i in range(len(customevents)):
write_message("Caching custom event: %s" % customevents[i])
webstat.cache_customevent_trend(customevents)
task_update_progress("Part 2/2: done %d/%d" % (i + 1, len(customevents)))
write_message("Finished rawdata caching successfully")
task_update_progress("Finished rawdata caching successfully")
return True
diff --git a/invenio/legacy/webstat/api.py b/invenio/legacy/webstat/api.py
index e28334e7a..704e443aa 100644
--- a/invenio/legacy/webstat/api.py
+++ b/invenio/legacy/webstat/api.py
@@ -1,1990 +1,1990 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
__lastupdated__ = "$Date$"
import os
import time
import re
import datetime
import cPickle
import calendar
from datetime import timedelta
from urllib import quote
from invenio.legacy import template
from invenio.config import \
CFG_WEBDIR, \
CFG_TMPDIR, \
CFG_SITE_URL, \
CFG_SITE_LANG, \
CFG_WEBSTAT_BIBCIRCULATION_START_YEAR
-from invenio.webstat_config import CFG_WEBSTAT_CONFIG_PATH
+from invenio.legacy.webstat.config import CFG_WEBSTAT_CONFIG_PATH
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import CFG_JOURNAL_TAG
from invenio.legacy.search_engine import get_coll_i18nname, \
wash_index_term
from invenio.legacy.dbquery import run_sql, wash_table_column_name, ProgrammingError
-from invenio.bibsched import is_task_scheduled, \
+from invenio.legacy.bibsched.scripts.bibsched import is_task_scheduled, \
get_task_ids_by_descending_date, \
get_task_options
# Imports handling key events and error log
-from invenio.webstat_engine import get_keyevent_trend_collection_population, \
+from invenio.legacy.webstat.engine import get_keyevent_trend_collection_population, \
get_keyevent_trend_new_records, \
get_keyevent_trend_search_frequency, \
get_keyevent_trend_search_type_distribution, \
get_keyevent_trend_download_frequency, \
get_keyevent_trend_comments_frequency, \
get_keyevent_trend_number_of_loans, \
get_keyevent_trend_web_submissions, \
get_keyevent_snapshot_apache_processes, \
get_keyevent_snapshot_bibsched_status, \
get_keyevent_snapshot_uptime_cmd, \
get_keyevent_snapshot_sessions, \
get_keyevent_bibcirculation_report, \
get_keyevent_loan_statistics, \
get_keyevent_loan_lists, \
get_keyevent_renewals_lists, \
get_keyevent_returns_table, \
get_keyevent_trend_returns_percentage, \
get_keyevent_ill_requests_statistics, \
get_keyevent_ill_requests_lists, \
get_keyevent_trend_satisfied_ill_requests_percentage, \
get_keyevent_items_statistics, \
get_keyevent_items_lists, \
get_keyevent_loan_request_statistics, \
get_keyevent_loan_request_lists, \
get_keyevent_user_statistics, \
get_keyevent_user_lists, \
_get_doctypes, \
_get_item_statuses, \
_get_item_doctype, \
_get_request_statuses, \
_get_libraries, \
_get_loan_periods, \
get_invenio_error_log_ranking, \
get_invenio_last_n_errors, \
update_error_log_analyzer, \
get_apache_error_log_ranking, \
get_last_updates, \
get_list_link, \
get_general_status, \
get_ingestion_matching_records, \
get_record_ingestion_status, \
get_specific_ingestion_status, \
get_title_ingestion, \
get_record_last_modification
# Imports handling custom events
-from invenio.webstat_engine import get_customevent_table, \
+from invenio.legacy.webstat.engine import get_customevent_table, \
get_customevent_trend, \
get_customevent_dump
# Imports handling custom report
-from invenio.webstat_engine import get_custom_summary_data, \
+from invenio.legacy.webstat.engine import get_custom_summary_data, \
_get_tag_name, \
create_custom_summary_graph
# Imports for handling outputting
-from invenio.webstat_engine import create_graph_trend, \
+from invenio.legacy.webstat.engine import create_graph_trend, \
create_graph_dump, \
create_graph_table, \
get_numeric_stats
# Imports for handling exports
-from invenio.webstat_engine import export_to_python, \
+from invenio.legacy.webstat.engine import export_to_python, \
export_to_csv, \
export_to_file
TEMPLATES = template.load('webstat')
# Constants
WEBSTAT_CACHE_INTERVAL = 600 # Seconds, cache_* functions not affected by this.
# Also not taking into account if BibSched has
# webstatadmin process.
WEBSTAT_RAWDATA_DIRECTORY = CFG_TMPDIR + "/"
WEBSTAT_GRAPH_DIRECTORY = CFG_WEBDIR + "/img/"
TYPE_REPOSITORY = [('gnuplot', 'Image - Gnuplot'),
('asciiart', 'Image - ASCII art'),
('flot', 'Image - Flot'),
('asciidump', 'Image - ASCII dump'),
('python', 'Data - Python code', export_to_python),
('csv', 'Data - CSV', export_to_csv)]
def get_collection_list_plus_all():
""" Return all the collection names plus the name All"""
coll = [('All', 'All')]
res = run_sql("SELECT name FROM collection WHERE (dbquery IS NULL OR dbquery \
NOT LIKE 'hostedcollection:%') ORDER BY name ASC")
for c_name in res:
# make a nice printable name (e.g. truncate c_printable for
# long collection names in given language):
c_printable_fullname = get_coll_i18nname(c_name[0], CFG_SITE_LANG, False)
c_printable = wash_index_term(c_printable_fullname, 30, False)
if c_printable != c_printable_fullname:
c_printable = c_printable + "..."
coll.append([c_name[0], c_printable])
return coll
# Key event repository, add an entry here to support new key measures.
KEYEVENT_REPOSITORY = {'collection population':
{'fullname': 'Collection population',
'specificname':
'Population in collection "%(collection)s"',
'description':
('The collection population is the number of \
documents existing in the selected collection.', ),
'gatherer':
get_keyevent_trend_collection_population,
'extraparams': {'collection': ('combobox', 'Collection',
get_collection_list_plus_all)},
'cachefilename':
'webstat_%(event_id)s_%(collection)s_%(timespan)s',
'ylabel': 'Number of records',
'multiple': None,
'output': 'Graph'},
'new records':
{'fullname': 'New records',
'specificname':
'New records in collection "%(collection)s"',
'description':
('The graph shows the new documents created in \
the selected collection and time span.', ),
'gatherer':
get_keyevent_trend_new_records,
'extraparams': {'collection': ('combobox', 'Collection',
get_collection_list_plus_all)},
'cachefilename':
'webstat_%(event_id)s_%(collection)s_%(timespan)s',
'ylabel': 'Number of records',
'multiple': None,
'output': 'Graph'},
'search frequency':
{'fullname': 'Search frequency',
'specificname': 'Search frequency',
'description':
('The search frequency is the number of searches \
performed in a specific time span.', ),
'gatherer': get_keyevent_trend_search_frequency,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'ylabel': 'Number of searches',
'multiple': None,
'output': 'Graph'},
'search type distribution':
{'fullname': 'Search type distribution',
'specificname': 'Search type distribution',
'description':
('The search type distribution shows both the \
number of simple searches and the number of advanced searches in the same graph.', ),
'gatherer':
get_keyevent_trend_search_type_distribution,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'ylabel': 'Number of searches',
'multiple': ['Simple searches',
'Advanced searches'],
'output': 'Graph'},
'download frequency':
{'fullname': 'Download frequency',
'specificname': 'Download frequency in collection "%(collection)s"',
'description':
('The download frequency is the number of fulltext \
downloads of the documents.', ),
'gatherer': get_keyevent_trend_download_frequency,
'extraparams': {'collection': ('combobox', 'Collection',
get_collection_list_plus_all)},
'cachefilename': 'webstat_%(event_id)s_%(collection)s_%(timespan)s',
'ylabel': 'Number of downloads',
'multiple': None,
'output': 'Graph'},
'comments frequency':
{'fullname': 'Comments frequency',
'specificname': 'Comments frequency in collection "%(collection)s"',
'description':
('The comments frequency is the amount of comments written \
for all the documents.', ),
'gatherer': get_keyevent_trend_comments_frequency,
'extraparams': {'collection': ('combobox', 'Collection',
get_collection_list_plus_all)},
'cachefilename': 'webstat_%(event_id)s_%(collection)s_%(timespan)s',
'ylabel': 'Number of comments',
'multiple': None,
'output': 'Graph'},
'number of loans':
{'fullname': 'Number of circulation loans',
'specificname': 'Number of circulation loans',
'description':
('The number of loans shows the total number of records loaned \
over a time span.', ),
'gatherer': get_keyevent_trend_number_of_loans,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'ylabel': 'Number of loans',
'multiple': None,
'output': 'Graph',
'type': 'bibcirculation'},
'web submissions':
{'fullname': 'Number of web submissions',
'specificname':
'Number of web submissions of "%(doctype)s"',
'description':
("The web submissions are the number of submitted \
documents using the web form.", ),
'gatherer': get_keyevent_trend_web_submissions,
'extraparams': {
'doctype': ('combobox', 'Type of document', _get_doctypes)},
'cachefilename':
'webstat_%(event_id)s_%(doctype)s_%(timespan)s',
'ylabel': 'Web submissions',
'multiple': None,
'output': 'Graph'},
'loans statistics':
{'fullname': 'Circulation loans statistics',
'specificname': 'Circulation loans statistics',
'description':
('The loan statistics consist of different numbers \
related to the records loaned. It is important to see the difference between document \
and item. The item is the physical representation of a document (like every copy of a \
book). There may be more items than documents, but never the opposite.', ),
'gatherer':
get_keyevent_loan_statistics,
'extraparams': {
'udc': ('textbox', 'UDC'),
'item_status': ('combobox', 'Item status', _get_item_statuses),
'publication_date': ('textbox', 'Publication date'),
'creation_date': ('textbox', 'Creation date')},
'cachefilename':
'webstat_%(event_id)s_%(udc)s_%(item_status)s_%(publication_date)s' + \
'_%(creation_date)s_%(timespan)s',
'rows': ['Number of documents loaned',
'Number of items loaned on the total number of items (%)',
'Number of items never loaned on the \
total number of items (%)',
'Average time between the date of \
the record creation and the date of the first loan (in days)'],
'output': 'Table',
'type': 'bibcirculation'},
'loans lists':
{'fullname': 'Circulation loans lists',
'specificname': 'Circulation loans lists',
'description':
('The loan lists show the most loaned and the never loaned \
records in a time span. The most loaned records are ranked by the number of loans per copy.', ),
'gatherer':
get_keyevent_loan_lists,
'extraparams': {
'udc': ('textbox', 'UDC'),
'loan_period': ('combobox', 'Loan period', _get_loan_periods),
'max_loans': ('textbox', 'Maximum number of loans'),
'min_loans': ('textbox', 'Minimum number of loans'),
'publication_date': ('textbox', 'Publication date'),
'creation_date': ('textbox', 'Creation date')},
'cachefilename':
'webstat_%(event_id)s_%(udc)s_%(loan_period)s' + \
'_%(min_loans)s_%(max_loans)s_%(publication_date)s_' + \
'%(creation_date)s_%(timespan)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'},
'renewals':
{'fullname': 'Circulation renewals',
'specificname': 'Circulation renewals',
'description':
('Here the list of the most renewed items is shown \
in decreasing order', ),
'gatherer':
get_keyevent_renewals_lists,
'extraparams': {
'udc': ('textbox', 'UDC')},
'cachefilename':
'webstat_%(event_id)s_%(udc)s_%(timespan)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'},
'number returns':
{'fullname': 'Number of circulation overdue returns',
'specificname': 'Number of circulation overdue returns',
'description':
('The number of overdue returns is the number of loans \
that have not been returned by the due date (they may have been returned late or never).', ),
'gatherer':
get_keyevent_returns_table,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'rows': ['Number of overdue returns'],
'output': 'Table',
'type': 'bibcirculation'},
'percentage returns':
{'fullname': 'Percentage of circulation overdue returns',
'specificname': 'Percentage of overdue returns',
'description':
('This graph shows both the overdue returns and the total \
number of returns.', ),
'gatherer':
get_keyevent_trend_returns_percentage,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'ylabel': 'Percentage of overdue returns',
'multiple': ['Overdue returns',
'Total returns'],
'output': 'Graph',
'type': 'bibcirculation'},
'ill requests statistics':
{'fullname': 'Circulation ILL Requests statistics',
'specificname': 'Circulation ILL Requests statistics',
'description':
('The ILL requests statistics are different numbers \
related to the requests to other libraries.', ),
'gatherer':
get_keyevent_ill_requests_statistics,
'extraparams': {
'doctype': ('combobox', 'Type of document', _get_item_doctype),
'status': ('combobox', 'Status of request', _get_request_statuses),
'supplier': ('combobox', 'Supplier', _get_libraries)},
'cachefilename':
'webstat_%(event_id)s_%(doctype)s_%(status)s_%(supplier)s_%(timespan)s',
'rows': ['Number of ILL requests',
'Number of satisfied ILL requests 2 weeks \
after the date of request creation',
'Average time between the day \
of the ILL request date and day \
of the delivery item to the user (in days)',
'Average time between the day \
the ILL request was sent to the supplier and \
the day of the delivery item (in days)'],
'output': 'Table',
'type': 'bibcirculation'},
'ill requests list':
{'fullname': 'Circulation ILL Requests list',
'specificname': 'Circulation ILL Requests list',
'description':
('The ILL requests list shows 50 requests to other \
libraries in the selected time span.', ),
'gatherer':
get_keyevent_ill_requests_lists,
'extraparams': {
'doctype': ('combobox', 'Type of document', _get_item_doctype),
'supplier': ('combobox', 'Supplier', _get_libraries)},
'cachefilename':
'webstat_%(event_id)s_%(doctype)s_%(supplier)s_%(timespan)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'},
'percentage satisfied ill requests':
{'fullname': 'Percentage of circulation satisfied ILL requests',
'specificname': 'Percentage of circulation satisfied ILL requests',
'description':
('This graph shows both the satisfied ILL requests and \
the total number of requests in the selected time span.', ),
'gatherer':
get_keyevent_trend_satisfied_ill_requests_percentage,
'extraparams': {
'doctype': ('combobox', 'Type of document', _get_item_doctype),
'status': ('combobox', 'Status of request', _get_request_statuses),
'supplier': ('combobox', 'Supplier', _get_libraries)},
'cachefilename':
'webstat_%(event_id)s_%(doctype)s_%(status)s_%(supplier)s_%(timespan)s',
'ylabel': 'Percentage of satisfied ILL requests',
'multiple': ['Satisfied ILL requests',
'Total requests'],
'output': 'Graph',
'type': 'bibcirculation'},
'items stats':
{'fullname': 'Circulation items statistics',
'specificname': 'Circulation items statistics',
'description':
('The items statistics show the total number of items at \
the moment and the number of new items in the selected time span.', ),
'gatherer':
get_keyevent_items_statistics,
'extraparams': {
'udc': ('textbox', 'UDC'),
},
'cachefilename':
'webstat_%(event_id)s_%(udc)s_%(timespan)s',
'rows': ['The total number of items', 'Total number of new items'],
'output': 'Table',
'type': 'bibcirculation'},
'items list':
{'fullname': 'Circulation items list',
'specificname': 'Circulation items list',
'description':
('The item list shows data about the existing items.', ),
'gatherer':
get_keyevent_items_lists,
'extraparams': {
'library': ('combobox', 'Library', _get_libraries),
'status': ('combobox', 'Status', _get_item_statuses)},
'cachefilename':
'webstat_%(event_id)s_%(library)s_%(status)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'},
'loan request statistics':
{'fullname': 'Circulation hold requests statistics',
'specificname': 'Circulation hold requests statistics',
'description':
('The hold request statistics show figures about the \
requests for documents. For the numbers to be correct, there must be data in the loanrequest \
custom event.', ),
'gatherer':
get_keyevent_loan_request_statistics,
'extraparams': {
'item_status': ('combobox', 'Item status', _get_item_statuses)},
'cachefilename':
'webstat_%(event_id)s_%(item_status)s_%(timespan)s',
'rows': ['Number of hold requests, one week after the date of \
request creation',
'Number of successful hold requests transactions',
'Average time between the hold request date and \
the date of delivery document in a year'],
'output': 'Table',
'type': 'bibcirculation'},
'loan request lists':
{'fullname': 'Circulation hold requests lists',
'specificname': 'Circulation hold requests lists',
'description':
('The hold requests list shows the most requested items.', ),
'gatherer':
get_keyevent_loan_request_lists,
'extraparams': {
'udc': ('textbox', 'UDC')},
'cachefilename':
'webstat_%(event_id)s_%(udc)s_%(timespan)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'},
'user statistics':
{'fullname': 'Circulation users statistics',
'specificname': 'Circulation users statistics',
'description':
('The user statistics show the number of active users \
(at least one transaction) in the selected timespan.', ),
'gatherer':
get_keyevent_user_statistics,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'rows': ['Number of active users'],
'output': 'Table',
'type': 'bibcirculation'},
'user lists':
{'fullname': 'Circulation users lists',
'specificname': 'Circulation users lists',
'description':
('The user list shows the most intensive users \
(ILL requests + Loans).', ),
'gatherer':
get_keyevent_user_lists,
'extraparams': {},
'cachefilename':
'webstat_%(event_id)s_%(timespan)s',
'rows': [],
'output': 'List',
'type': 'bibcirculation'}
}
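Each repository entry above resolves its 'cachefilename' template with %-style named substitution against the gathered arguments, as done later in cache_keyevent_trend() and perform_display_keyevent(); a minimal sketch with hypothetical values:

```python
# Hypothetical argument values; the template is the 'cachefilename'
# form used by the repository entries above.
template = 'webstat_%(event_id)s_%(udc)s_%(timespan)s'
args = {'event_id': 'loans_lists', 'udc': '025_4', 'timespan': 'last_month'}
filename = template % args
print(filename)  # webstat_loans_lists_025_4_last_month
```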
# CLI
def create_customevent(event_id=None, name=None, cols=[]):
"""
Creates a new custom event by setting up the necessary MySQL tables.
@param event_id: Proposed human-readable id of the new event.
@type event_id: str
@param name: Optionally, a descriptive name.
@type name: str
@param cols: Optionally, the name of the additional columns.
@type cols: [str]
@return: A status message
@type: str
"""
if event_id is None:
return "Please specify a human-readable ID for the event."
# Only accept id and name with standard characters
if re.search(r"[^\w]", str(event_id) + str(name)) is not None:
return "Please note that both the event id and the event name need to be " + \
"written without any non-standard characters."
# Make sure the chosen id is not already taken
if len(run_sql("SELECT NULL FROM staEVENT WHERE id = %s",
(event_id, ))) != 0:
return "Event id [%s] already exists! Aborted." % event_id
# Check if the cols are valid titles
for argument in cols:
if (argument == "creation_time") or (argument == "id"):
return "Invalid column title: %s! Aborted." % argument
# Insert a new row into the events table describing the new event
sql_param = [event_id]
if name is not None:
sql_name = "%s"
sql_param.append(name)
else:
sql_name = "NULL"
if len(cols) != 0:
sql_cols = "%s"
sql_param.append(cPickle.dumps(cols))
else:
sql_cols = "NULL"
run_sql("INSERT INTO staEVENT (id, name, cols) VALUES (%s, " + \
sql_name + ", " + sql_cols + ")", tuple(sql_param))
tbl_name = get_customevent_table(event_id)
# Create a table for the new event
sql_query = ["CREATE TABLE %s (" % wash_table_column_name(tbl_name)]
sql_query.append("id MEDIUMINT unsigned NOT NULL auto_increment,")
sql_query.append("creation_time TIMESTAMP DEFAULT NOW(),")
for argument in cols:
arg = wash_table_column_name(argument)
sql_query.append("`%s` MEDIUMTEXT NULL," % arg)
sql_query.append("INDEX `%s` (`%s` (50))," % (arg, arg))
sql_query.append("PRIMARY KEY (id))")
sql_str = ' '.join(sql_query)
run_sql(sql_str)
# We're done! Print notice containing the name of the event.
return ("Event table [%s] successfully created.\n" +
"Please use event id [%s] when registering an event.") \
% (tbl_name, event_id)
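The CREATE TABLE assembly above can be sketched in isolation (table and column names below are hypothetical; the real code additionally washes names via wash_table_column_name before interpolation):

```python
# Sketch of the CREATE TABLE assembly used by create_customevent():
# fixed id/creation_time columns plus one MEDIUMTEXT column and a
# prefix index per extra column.
def build_create_table_sql(tbl_name, cols):
    parts = ["CREATE TABLE %s (" % tbl_name,
             "id MEDIUMINT unsigned NOT NULL auto_increment,",
             "creation_time TIMESTAMP DEFAULT NOW(),"]
    for col in cols:
        parts.append("`%s` MEDIUMTEXT NULL," % col)
        parts.append("INDEX `%s` (`%s` (50))," % (col, col))
    parts.append("PRIMARY KEY (id))")
    return ' '.join(parts)
```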
def modify_customevent(event_id=None, name=None, cols=[]):
"""
Modify a custom event. It can modify the column definitions
and/or the descriptive name.
@param event_id: Human-readable id of the event.
@type event_id: str
@param name: Optionally, a descriptive name.
@type name: str
@param cols: Optionally, the name of the additional columns.
@type cols: [str]
@return: A status message
@type: str
"""
if event_id is None:
return "Please specify a human-readable ID for the event."
# Only accept name with standard characters
if re.search(r"[^\w]", str(name)) is not None:
return "Please note that the event name needs to be written " + \
"without any non-standard characters."
# Check if the cols are valid titles
for argument in cols:
if (argument == "creation_time") or (argument == "id"):
return "Invalid column title: %s! Aborted." % argument
res = run_sql("SELECT CONCAT('staEVENT', number), cols " + \
"FROM staEVENT WHERE id = %s", (event_id, ))
if not res:
return "Invalid event id: %s! Aborted" % event_id
if not run_sql("SHOW TABLES LIKE %s", res[0][0]):
run_sql("DELETE FROM staEVENT WHERE id=%s", (event_id, ))
return create_customevent(event_id, event_id, cols)
cols_orig = cPickle.loads(res[0][1])
# add new cols
cols_add = []
for col in cols:
if not col in cols_orig:
cols_add.append(col)
# del old cols
cols_del = []
for col in cols_orig:
if not col in cols:
cols_del.append(col)
#modify event table
if cols_del or cols_add:
sql_query = ["ALTER TABLE %s " % wash_table_column_name(res[0][0])]
# check if a column was renamed
for col_del in cols_del:
result = -1
while result < 1 or result > len(cols_add) + 1:
print """What do you want to do with the column %s in event %s?:
1.- Delete it""" % (col_del, event_id)
for i in range(len(cols_add)):
print "%d.- Rename it to %s" % (i + 2, cols_add[i])
result = int(raw_input("\n"))
if result == 1:
sql_query.append("DROP COLUMN `%s`" % col_del)
sql_query.append(", ")
else:
col_add = cols_add[result-2]
sql_query.append("CHANGE `%s` `%s` MEDIUMTEXT NULL"%(col_del, col_add))
sql_query.append(", ")
cols_add.remove(col_add)
# add the rest of the columns
for col_add in cols_add:
sql_query.append("ADD COLUMN `%s` MEDIUMTEXT NULL, " % col_add)
sql_query.append("ADD INDEX `%s` (`%s`(50))" % (col_add, col_add))
sql_query.append(", ")
sql_query[-1] = ";"
run_sql("".join(sql_query))
#modify event definition
sql_query = ["UPDATE staEVENT SET"]
sql_param = []
if cols_del or cols_add:
sql_query.append("cols = %s")
sql_query.append(",")
sql_param.append(cPickle.dumps(cols))
if name:
sql_query.append("name = %s")
sql_query.append(",")
sql_param.append(name)
if sql_param:
sql_query[-1] = "WHERE id = %s"
sql_param.append(event_id)
sql_str = ' '.join(sql_query)
run_sql(sql_str, sql_param)
# We're done! Print notice containing the name of the event.
return "Event [%s] successfully modified." % event_id
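The column reconciliation performed above boils down to two set differences between the old and the new definitions; a minimal sketch (the interactive rename step is left out):

```python
# Sketch of modify_customevent()'s column diff: columns only in the
# new definition get ADDed, columns only in the old one get DROPped
# (or offered for renaming interactively), order preserved.
def diff_columns(cols_orig, cols_new):
    cols_add = [c for c in cols_new if c not in cols_orig]
    cols_del = [c for c in cols_orig if c not in cols_new]
    return cols_add, cols_del
```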
def destroy_customevent(event_id=None):
"""
Removes an existing custom event by destroying the MySQL tables and
the event data that might be around. Use with caution!
@param event_id: Human-readable id of the event to be removed.
@type event_id: str
@return: A status message
@type: str
"""
if event_id is None:
return "Please specify an existing event id."
# Check if the specified id exists
if len(run_sql("SELECT NULL FROM staEVENT WHERE id = %s",
(event_id, ))) == 0:
return "Custom event ID '%s' doesn't exist! Aborted." % event_id
else:
tbl_name = get_customevent_table(event_id)
run_sql("DROP TABLE %s" % wash_table_column_name(tbl_name)) # kwalitee: disable=sql
run_sql("DELETE FROM staEVENT WHERE id = %s", (event_id, ))
return ("Custom event ID '%s' table '%s' was successfully destroyed.\n") \
% (event_id, tbl_name)
def destroy_customevents():
"""
Removes all existing custom events by destroying the MySQL tables and
the events data that might be around. Use with caution!
@return: A status message
@type: str
"""
msg = ''
try:
res = run_sql("SELECT id FROM staEVENT")
except ProgrammingError:
return msg
for event in res:
msg += destroy_customevent(event[0])
return msg
def register_customevent(event_id, *arguments):
"""
Registers a custom event. Will add to the database's event tables
as created by create_customevent().
This function constitutes the "function hook" that should be
called throughout Invenio where one wants to register a
custom event! Refer to the help section on the admin web page.
@param event_id: Human-readable id of the event to be registered
@type event_id: str
@param *arguments: The rest of the parameters of the function call
@type *arguments: [params]
"""
res = run_sql("SELECT CONCAT('staEVENT', number),cols " + \
"FROM staEVENT WHERE id = %s", (event_id, ))
if not res:
return # the id does not exist
tbl_name = res[0][0]
if res[0][1]:
col_titles = cPickle.loads(res[0][1])
else:
col_titles = []
if len(col_titles) != len(arguments[0]):
return # the number of arguments differs from the number of columns
# Make sql query
if len(arguments[0]) != 0:
sql_param = []
sql_query = ["INSERT INTO %s (" % wash_table_column_name(tbl_name)]
for title in col_titles:
sql_query.append("`%s`" % title)
sql_query.append(",")
sql_query.pop() # del the last ','
sql_query.append(") VALUES (")
for argument in arguments[0]:
sql_query.append("%s")
sql_query.append(",")
sql_param.append(argument)
sql_query.pop() # del the last ','
sql_query.append(")")
sql_str = ''.join(sql_query)
run_sql(sql_str, tuple(sql_param))
else:
run_sql("INSERT INTO %s () VALUES ()" % wash_table_column_name(tbl_name)) # kwalitee: disable=sql
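The parameterised INSERT assembly above interpolates only the (washed) column names and keeps every value as a %s placeholder for the database driver to escape; a sketch with hypothetical names:

```python
# Sketch of register_customevent()'s INSERT assembly: backtick-quoted
# column names, one %s placeholder per value.
def build_insert_sql(tbl_name, col_titles):
    cols = ", ".join("`%s`" % t for t in col_titles)
    placeholders = ", ".join("%s" for _ in col_titles)
    return "INSERT INTO %s (%s) VALUES (%s)" % (tbl_name, cols, placeholders)
```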
def cache_keyevent_trend(ids=[]):
"""
Runs the rawdata gatherer for the specific key events.
Intended to be run mainly by the BibSched daemon interface.
For a specific id, all possible timespans' rawdata is gathered.
@param ids: The key event ids that are subject to caching.
@type ids: []
"""
args = {}
for event_id in ids:
args['event_id'] = event_id
if 'type' in KEYEVENT_REPOSITORY[event_id] and \
KEYEVENT_REPOSITORY[event_id]['type'] == 'bibcirculation':
timespans = _get_timespans(bibcirculation_stat=True)[:-1]
else:
timespans = _get_timespans()[:-1]
extraparams = KEYEVENT_REPOSITORY[event_id]['extraparams']
# Construct all combinations of extraparams and store them as
# [{param name: arg value}] so that we can loop over them and just
# pattern-replace each dictionary against
# KEYEVENT_REPOSITORY[event_id]['cachefilename'].
combos = [[]]
for extra in [[(param, extra[0]) for extra in extraparams[param][1]()]
for param in extraparams]:
combos = [i + [y] for y in extra for i in combos]
combos = [dict(extra) for extra in combos]
for i in range(len(timespans)):
# Get timespans parameters
args['timespan'] = timespans[i][0]
args.update({'t_start': timespans[i][2], 't_end': timespans[i][3],
'granularity': timespans[i][4],
't_format': timespans[i][5],
'xtic_format': timespans[i][6]})
for combo in combos:
args.update(combo)
# Create unique filename for this combination of parameters
filename = KEYEVENT_REPOSITORY[event_id]['cachefilename'] \
% dict([(param, re.subn("[^\w]", "_",
args[param])[0]) for param in args])
# Create closure of gatherer function in case cache
# needs to be refreshed
gatherer = lambda: KEYEVENT_REPOSITORY[event_id] \
['gatherer'](args)
# Get data file from cache, ALWAYS REFRESH DATA!
_get_file_using_cache(filename, gatherer, True).read()
return True
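The nested comprehension above enumerates every combination of the extraparams values as a list of {param: value} dicts; the same result can be sketched with itertools.product over a simplified mapping of param name to a plain value list (the real code draws the values from the combobox callbacks):

```python
# Sketch of the combination building in cache_keyevent_trend(),
# assuming a simplified extraparams of {param_name: [value, ...]}.
import itertools

def build_combos(extraparams):
    params = sorted(extraparams)
    return [dict(zip(params, values))
            for values in itertools.product(*(extraparams[p] for p in params))]
```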
def cache_customevent_trend(ids=[]):
"""
Runs the rawdata gatherer for the specific custom events.
Intended to be run mainly by the BibSched daemon interface.
For a specific id, all possible timespans' rawdata is gathered.
@param ids: The custom event ids that are subject to caching.
@type ids: []
"""
args = {}
timespans = _get_timespans()
for event_id in ids:
args['event_id'] = event_id
args['cols'] = []
for i in range(len(timespans)):
# Get timespans parameters
args['timespan'] = timespans[i][0]
args.update({'t_start': timespans[i][2], 't_end': timespans[i][3],
'granularity': timespans[i][4],
't_format': timespans[i][5],
'xtic_format': timespans[i][6]})
# Create unique filename for this combination of parameters
filename = "webstat_customevent_%(event_id)s_%(timespan)s" \
% {'event_id': re.subn("[^\w]", "_", event_id)[0],
'timespan': re.subn("[^\w]", "_", args['timespan'])[0]}
# Create closure of gatherer function in case cache
# needs to be refreshed
gatherer = lambda: get_customevent_trend(args)
# Get data file from cache, ALWAYS REFRESH DATA!
_get_file_using_cache(filename, gatherer, True).read()
return True
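Both caching functions above sanitise every value that enters a cache file name by replacing non-word characters with underscores; a minimal sketch:

```python
# Sketch of the filename sanitisation used in the caching functions:
# any character outside [A-Za-z0-9_] becomes '_'.
import re

def sanitize(value):
    return re.subn(r"[^\w]", "_", value)[0]
```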
def basket_display():
"""
Display basket statistics.
"""
tbl_name = get_customevent_table("baskets")
if not tbl_name:
# custom event baskets not defined, so return empty output:
return []
try:
res = run_sql("SELECT creation_time FROM %s ORDER BY creation_time" % wash_table_column_name(tbl_name)) # kwalitee: disable=sql
days = (res[-1][0] - res[0][0]).days + 1
public = run_sql("SELECT COUNT(*) FROM %s " % wash_table_column_name(tbl_name) + " WHERE action = 'display_public'")[0][0] # kwalitee: disable=sql
users = run_sql("SELECT COUNT(DISTINCT user) FROM %s" % wash_table_column_name(tbl_name))[0][0] # kwalitee: disable=sql
adds = run_sql("SELECT COUNT(*) FROM %s WHERE action = 'add'" % wash_table_column_name(tbl_name))[0][0] # kwalitee: disable=sql
displays = run_sql("SELECT COUNT(*) FROM %s " % wash_table_column_name(tbl_name) + " WHERE action = 'display' OR action = 'display_public'")[0][0] # kwalitee: disable=sql
hits = adds + displays
average = hits / days
res = [("Basket page hits", hits)]
res.append((" Average per day", average))
res.append((" Unique users", users))
res.append((" Additions", adds))
res.append((" Public", public))
except IndexError:
res = []
return res
def alert_display():
"""
Display alert statistics.
"""
tbl_name = get_customevent_table("alerts")
if not tbl_name:
# custom event alerts not defined, so return empty output:
return []
try:
res = run_sql("SELECT creation_time FROM %s ORDER BY creation_time"
% wash_table_column_name(tbl_name))
days = (res[-1][0] - res[0][0]).days + 1
res = run_sql("SELECT COUNT(DISTINCT user),COUNT(*) FROM %s" % wash_table_column_name(tbl_name)) # kwalitee: disable=sql
users = res[0][0]
hits = res[0][1]
displays = run_sql("SELECT COUNT(*) FROM %s WHERE action = 'list'"
% wash_table_column_name(tbl_name))[0][0]
search = run_sql("SELECT COUNT(*) FROM %s WHERE action = 'display'"
% wash_table_column_name(tbl_name))[0][0]
average = hits / days
res = [("Alerts page hits", hits)]
res.append((" Average per day", average))
res.append((" Unique users", users))
res.append((" Displays", displays))
res.append((" Searches history display", search))
except IndexError:
res = []
return res
def loan_display():
"""
Display loan statistics.
"""
try:
loans, renewals, returns, illrequests, holdrequests = \
get_keyevent_bibcirculation_report()
res = [("Yearly report", '')]
res.append((" Loans", loans))
res.append((" Renewals", renewals))
res.append((" Returns", returns))
res.append((" ILL requests", illrequests))
res.append((" Hold requests", holdrequests))
return res
except IndexError:
return []
def get_url_customevent(url_dest, event_id, *arguments):
"""
Get a URL that registers a custom event. Every time the URL is
loaded, it registers the custom event via register_customevent().
@param url_dest: URL to redirect to after the event is registered
@type url_dest: str
@param event_id: Human-readable id of the event to be registered
@type event_id: str
@param *arguments: The rest of the parameters of the function call;
the special value "WEBSTAT_IP" tells webstat to
substitute the IP address of the client requesting the URL
@type *arguments: [params]
@return: URL that registers the event
@type: str
"""
return "%s/stats/customevent_register?event_id=%s&arg=%s&url=%s" % \
(CFG_SITE_URL, event_id, ','.join(arguments[0]), quote(url_dest))
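A sketch of the tracking URL produced above (site URL, event id, and arguments below are hypothetical); the destination URL is percent-encoded so it survives as a single query argument:

```python
# Sketch of get_url_customevent()'s URL building.
from urllib.parse import quote  # Python 2: from urllib import quote

def customevent_url(site_url, event_id, args, url_dest):
    return "%s/stats/customevent_register?event_id=%s&arg=%s&url=%s" % (
        site_url, event_id, ','.join(args), quote(url_dest))
```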
# WEB
def perform_request_index(ln=CFG_SITE_LANG):
"""
Displays some informative text, the health box, and the list of
key/custom events.
"""
out = TEMPLATES.tmpl_welcome(ln=ln)
# Display the health box
out += TEMPLATES.tmpl_system_health_list(get_general_status(), ln=ln)
# Produce a list of the key statistics
out += TEMPLATES.tmpl_keyevent_list(ln=ln)
# Display the custom statistics
out += TEMPLATES.tmpl_customevent_list(_get_customevents(), ln=ln)
# Display error log analyzer
out += TEMPLATES.tmpl_error_log_statistics_list(ln=ln)
# Display annual report
out += TEMPLATES.tmpl_custom_summary(ln=ln)
out += TEMPLATES.tmpl_yearly_report_list(ln=ln)
# Display test for collections
out += TEMPLATES.tmpl_collection_stats_main_list(ln=ln)
return out
def perform_display_current_system_health(ln=CFG_SITE_LANG):
"""
Display the current general system health:
- Uptime/load average
- Apache status
- Session information
- Searches recount
- New records
- Bibsched queue
- New/modified records
- Indexing, ranking, sorting and collecting methods
- Baskets
- Alerts
"""
from ConfigParser import ConfigParser
conf = ConfigParser()
conf.read(CFG_WEBSTAT_CONFIG_PATH)
# Prepare the health base data
health_indicators = []
now = datetime.datetime.now()
yesterday = (now - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
today = now.strftime("%Y-%m-%d")
tomorrow = (now + datetime.timedelta(days=1)).strftime("%Y-%m-%d")
# Append uptime and load average to the health box
if conf.get("general", "uptime_box") == "True":
health_indicators.append(("Uptime cmd",
get_keyevent_snapshot_uptime_cmd()))
# Append number of Apache processes to the health box
if conf.get("general", "apache_box") == "True":
health_indicators.append(("Apache processes",
get_keyevent_snapshot_apache_processes()))
health_indicators.append(None)
# Append session information to the health box
if conf.get("general", "visitors_box") == "True":
sess = get_keyevent_snapshot_sessions()
health_indicators.append(("Total active visitors", sum(sess)))
health_indicators.append((" Logged in", sess[1]))
health_indicators.append(None)
# Append searches information to the health box
if conf.get("general", "search_box") == "True":
args = {'t_start': today, 't_end': tomorrow,
'granularity': "day", 't_format': "%Y-%m-%d"}
searches = get_keyevent_trend_search_type_distribution(args)
health_indicators.append(("Searches since midnight",
sum(searches[0][1])))
health_indicators.append((" Simple", searches[0][1][0]))
health_indicators.append((" Advanced", searches[0][1][1]))
health_indicators.append(None)
# Append new records information to the health box
if conf.get("general", "record_box") == "True":
args = {'collection': "All", 't_start': today,
't_end': tomorrow, 'granularity': "day",
't_format': "%Y-%m-%d"}
try:
tot_records = get_keyevent_trend_collection_population(args)[0][1]
except IndexError:
tot_records = 0
args = {'collection': "All", 't_start': yesterday,
't_end': today, 'granularity': "day", 't_format': "%Y-%m-%d"}
try:
new_records = tot_records - \
get_keyevent_trend_collection_population(args)[0][1]
except IndexError:
new_records = 0
health_indicators.append(("Total records", tot_records))
health_indicators.append((" New records since midnight",
new_records))
health_indicators.append(None)
# Append status of BibSched queue to the health box
if conf.get("general", "bibsched_box") == "True":
bibsched = get_keyevent_snapshot_bibsched_status()
health_indicators.append(("BibSched queue",
sum([x[1] for x in bibsched])))
for item in bibsched:
health_indicators.append((" " + item[0], str(item[1])))
health_indicators.append(None)
# Append records pending
if conf.get("general", "waiting_box") == "True":
last_index, last_rank, last_sort, last_coll = get_last_updates()
index_categories = ('global', 'collection', 'abstract',
'author', 'keyword', 'reference',
'reportnumber', 'title', 'fulltext',
'year', 'journal', 'collaboration',
'affiliation', 'exactauthor', 'caption',
'firstauthor', 'exactfirstauthor',
'authorcount')
rank_categories = ('wrd', 'demo_jif', 'citation',
'citerank_citation_t',
'citerank_pagerank_c',
'citerank_pagerank_t')
sort_categories = ('latest first', 'title', 'author', 'report number',
'most cited')
health_indicators.append(("Records pending per indexing method since", last_index))
for ic in index_categories:
health_indicators.append((" - " + str(ic), get_list_link('index', ic)))
health_indicators.append(None)
health_indicators.append(("Records pending per ranking method since", last_rank))
for rc in rank_categories:
health_indicators.append((" - " + str(rc), get_list_link('rank', rc)))
health_indicators.append(None)
health_indicators.append(("Records pending per sorting method since", last_sort))
for sc in sort_categories:
health_indicators.append((" - " + str(sc), get_list_link('sort', sc)))
health_indicators.append(None)
health_indicators.append(("Records pending for webcolling since", last_coll))
health_indicators.append((" - webcoll", get_list_link('collect')))
health_indicators.append(None)
# Append basket stats to the health box
if conf.get("general", "basket_box") == "True":
health_indicators += basket_display()
health_indicators.append(None)
# Append alerts stats to the health box
if conf.get("general", "alert_box") == "True":
health_indicators += alert_display()
health_indicators.append(None)
# Display the health box
return TEMPLATES.tmpl_system_health(health_indicators, ln=ln)
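The health boxes above are toggled by comparing the raw option string against "True"; a sketch, under hypothetical config content, of the equivalent ConfigParser.getboolean() reading, which also tolerates yes/no/on/off spellings:

```python
# Sketch of reading the webstat box toggles with getboolean()
# (section/option names are the ones used above, values hypothetical).
from configparser import ConfigParser  # Python 2: the ConfigParser module

conf = ConfigParser()
conf.read_string(u"[general]\nuptime_box = yes\napache_box = False\n")
uptime_on = conf.getboolean("general", "uptime_box")  # True
apache_on = conf.getboolean("general", "apache_box")  # False
```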
def perform_display_ingestion_status(req_ingestion, ln=CFG_SITE_LANG):
"""
Display the updating status for the records matching a
given request.
@param req_ingestion: Search pattern request
@type req_ingestion: str
"""
# preconfigured values
index_methods = ('global', 'collection', 'abstract', 'author', 'keyword',
'reference', 'reportnumber', 'title', 'fulltext',
'year', 'journal', 'collaboration', 'affiliation',
'exactauthor', 'caption', 'firstauthor',
'exactfirstauthor', 'authorcount')
rank_methods = ('wrd', 'demo_jif', 'citation', 'citerank_citation_t',
'citerank_pagerank_c', 'citerank_pagerank_t')
sort_methods = ('latest first', 'title', 'author', 'report number',
'most cited')
from ConfigParser import ConfigParser
conf = ConfigParser()
conf.read(CFG_WEBSTAT_CONFIG_PATH)
general = get_general_status()
flag = 0 # match with pending records
stats = []
list_records = get_ingestion_matching_records(req_ingestion, \
int(conf.get("general", "max_ingestion_health")))
if list_records == []:
stats.append(("No matches for your query!", " "*60))
return TEMPLATES.tmpl_ingestion_health(general, req_ingestion, stats, \
ln=ln)
else:
for record in list_records:
if record == 0:
return TEMPLATES.tmpl_ingestion_health(general, None, \
None, ln=ln)
elif record == -1:
stats.append(("Invalid pattern! Please retry", " "*60))
return TEMPLATES.tmpl_ingestion_health(general, None, \
stats, ln=ln)
else:
stat = get_record_ingestion_status(record)
last_mod = get_record_last_modification(record)
if stat != 0:
flag = 1 # match
# Indexing
stats.append((get_title_ingestion(record, last_mod)," "*90))
stats.append(("Pending for indexing methods:", " "*80))
for im in index_methods:
last = get_specific_ingestion_status(record,"index", im)
if last != None:
stats.append((" - %s"%im, "last: " + last))
# Ranking
stats.append(("Pending for ranking methods:", " "*80))
for rm in rank_methods:
last = get_specific_ingestion_status(record, "rank", rm)
if last != None:
stats.append((" - %s"%rm, "last: " + last))
# Sorting
stats.append(("Pending for sorting methods:", " "*80))
for sm in sort_methods:
last = get_specific_ingestion_status(record, "sort", sm)
if last != None:
stats.append((" - %s"%sm, "last: " + last))
# Collecting
stats.append(("Pending for webcolling:", " "*80))
last = get_specific_ingestion_status(record, "collect", )
if last != None:
stats.append((" - webcoll", "last: " + last))
# if there was no match
if flag == 0:
stats.append(("All matching records up to date!", " "*60))
return TEMPLATES.tmpl_ingestion_health(general, req_ingestion, stats, ln=ln)
def perform_display_yearly_report(ln=CFG_SITE_LANG):
"""
Display the yearly report.
"""
# Append loans stats to the box
year_report = []
year_report += loan_display()
year_report.append(None)
return TEMPLATES.tmpl_yearly_report(year_report, ln=ln)
def perform_display_keyevent(event_id=None, args={},
req=None, ln=CFG_SITE_LANG):
"""
Display key events using a certain output type over the given time span.
@param event_id: The id of the key event that is to be displayed.
@type event_id: str
@param args: { param name: argument value }
@type args: { str: str }
@param req: The Apache request object, necessary for export redirect.
@type req:
"""
# Get all the option lists:
# { parameter name: [(argument internal name, argument full name)]}
options = dict()
order = []
for param in KEYEVENT_REPOSITORY[event_id]['extraparams']:
# Order of options
order.append(param)
if KEYEVENT_REPOSITORY[event_id]['extraparams'][param][0] == 'combobox':
options[param] = ('combobox',
KEYEVENT_REPOSITORY[event_id]['extraparams'][param][1],
KEYEVENT_REPOSITORY[event_id]['extraparams'][param][2]())
else:
options[param] = (KEYEVENT_REPOSITORY[event_id]['extraparams'][param][0],
(KEYEVENT_REPOSITORY[event_id]['extraparams'][param][1]))
# Build a dictionary for the selected parameters:
# { parameter name: argument internal name }
choosed = dict([(param, args[param]) for param in KEYEVENT_REPOSITORY
[event_id]['extraparams']])
if KEYEVENT_REPOSITORY[event_id]['output'] == 'Graph':
options['format'] = ('combobox', 'Output format', _get_formats())
choosed['format'] = args['format']
order += ['format']
if event_id != 'items list':
if 'type' in KEYEVENT_REPOSITORY[event_id] and \
KEYEVENT_REPOSITORY[event_id]['type'] == 'bibcirculation':
options['timespan'] = ('combobox', 'Time span', _get_timespans(bibcirculation_stat=True))
else:
options['timespan'] = ('combobox', 'Time span', _get_timespans())
choosed['timespan'] = args['timespan']
order += ['timespan']
choosed['s_date'] = args['s_date']
choosed['f_date'] = args['f_date']
# Send to template to prepare event customization FORM box
is_list = KEYEVENT_REPOSITORY[event_id]['output'] == 'List'
out = "\n".join(["<p>%s</p>" % parr for parr in KEYEVENT_REPOSITORY[event_id]['description']]) \
+ TEMPLATES.tmpl_keyevent_box(options, order, choosed, ln=ln, list=is_list)
# Arguments OK?
# Check for existence. If nothing, only show the FORM box from above.
if len(choosed) == 0:
return out
# Make sure extraparams are valid, if any
if KEYEVENT_REPOSITORY[event_id]['output'] == 'Graph' and \
event_id != 'percentage satisfied ill requests':
for param in choosed:
if param in options and options[param][0] == 'combobox' and \
not choosed[param] in [x[0] for x in options[param][2]]:
return out + TEMPLATES.tmpl_error(
'Please specify a valid value for parameter "%s".'
% options[param][1], ln=ln)
# Arguments OK beyond this point!
# Get unique name for caching purposes (make sure that the params used
# in the filename are safe!)
filename = KEYEVENT_REPOSITORY[event_id]['cachefilename'] \
% dict([(param, re.subn("[^\w]", "_", choosed[param])[0])
for param in choosed] +
[('event_id', re.subn("[^\w]", "_", event_id)[0])])
# Get time parameters from repository
if 'timespan' in choosed:
if choosed['timespan'] == "select date":
t_args = _get_time_parameters_select_date(args["s_date"], args["f_date"])
else:
t_args = _get_time_parameters(options, choosed['timespan'])
else:
t_args = args
for param in KEYEVENT_REPOSITORY[event_id]['extraparams']:
t_args[param] = choosed[param]
if 'format' in args and args['format'] == 'Full list':
gatherer = lambda: KEYEVENT_REPOSITORY[event_id]['gatherer'](t_args, limit=-1)
export_to_file(gatherer(), req)
return out
# Create closure of frequency function in case cache needs to be refreshed
gatherer = lambda return_sql: KEYEVENT_REPOSITORY[event_id]['gatherer'](t_args, return_sql=return_sql)
# Determine if this particular file is due for scheduled caching;
# in that case we must not allow refreshing of the rawdata.
allow_refresh = not _is_scheduled_for_cacheing(event_id)
# Get data file from cache (refresh if necessary)
force = 'timespan' in choosed and choosed['timespan'] == "select date"
data = eval(_get_file_using_cache(filename, gatherer, force,
allow_refresh=allow_refresh).read())
if KEYEVENT_REPOSITORY[event_id]['output'] == 'Graph':
# If type indicates an export, run the export function and we're done
if _is_type_export(choosed['format']):
_get_export_closure(choosed['format'])(data, req)
return out
# Prepare the graph settings that are being passed on to grapher
settings = {"title": KEYEVENT_REPOSITORY[event_id]['specificname']\
% choosed,
"xlabel": t_args['t_fullname'] + ' (' + \
t_args['granularity'] + ')',
"ylabel": KEYEVENT_REPOSITORY[event_id]['ylabel'],
"xtic_format": t_args['xtic_format'],
"format": choosed['format'],
"multiple": KEYEVENT_REPOSITORY[event_id]['multiple']}
else:
settings = {"title": KEYEVENT_REPOSITORY[event_id]['specificname']\
% choosed, "format": 'Table',
"rows": KEYEVENT_REPOSITORY[event_id]['rows']}
if args['sql']:
sql = gatherer(True)
else:
sql = ''
return out + _perform_display_event(data,
os.path.basename(filename), settings, ln=ln) + sql
def perform_display_customevent(ids=[], args={}, req=None, ln=CFG_SITE_LANG):
"""
Display custom events using a certain output type over the given time span.
@param ids: The ids for the custom events that are to be displayed.
@type ids: [str]
@param args: { param name: argument value }
@type args: { str: str }
@param req: The Apache request object, necessary for export redirect.
@type req:
"""
# Get all the option lists:
# { parameter name: [(argument internal name, argument full name)]}
cols_dict = _get_customevent_cols()
cols_dict['__header'] = 'Argument'
cols_dict['__none'] = []
options = {'ids': ('Custom event', _get_customevents()),
'timespan': ('Time span', _get_timespans()),
'format': ('Output format', _get_formats(True)),
'cols': cols_dict}
# Build a dictionary for the selected parameters:
# { parameter name: argument internal name }
choosed = {'ids': args['ids'], 'timespan': args['timespan'],
'format': args['format'], 's_date': args['s_date'],
'f_date': args['f_date']}
# Calculate cols
index = []
for key in args.keys():
if key[:4] == 'cols':
index.append(key[4:])
index.sort()
choosed['cols'] = [zip([""] + args['bool' + i], args['cols' + i],
args['col_value' + i]) for i in index]
# Send to template to prepare event customization FORM box
out = TEMPLATES.tmpl_customevent_box(options, choosed, ln=ln)
# Arguments OK?
# Make sure extraparams are valid, if any
for param in ['ids', 'timespan', 'format']:
legalvalues = [x[0] for x in options[param][1]]
if type(args[param]) is list:
# If the argument is a list, like the content of 'ids'
# every value has to be checked
if len(args[param]) == 0:
return out + TEMPLATES.tmpl_error(
'Please specify a valid value for parameter "%s".'
% options[param][0], ln=ln)
for arg in args[param]:
if arg not in legalvalues:
return out + TEMPLATES.tmpl_error(
'Please specify a valid value for parameter "%s".'
% options[param][0], ln=ln)
else:
if args[param] not in legalvalues:
return out + TEMPLATES.tmpl_error(
'Please specify a valid value for parameter "%s".'
% options[param][0], ln=ln)
# Fetch time parameters from repository
if choosed['timespan'] == "select date":
args_req = _get_time_parameters_select_date(args["s_date"],
args["f_date"])
else:
args_req = _get_time_parameters(options, choosed['timespan'])
# ASCII dump data is different from the standard formats
if choosed['format'] == 'asciidump':
data = perform_display_customevent_data_ascii_dump(ids, args,
args_req, choosed)
else:
data = perform_display_customevent_data(ids, args_req, choosed)
# If type indicates an export, run the export function and we're done
if _is_type_export(args['format']):
_get_export_closure(args['format'])(data, req)
return out
# Get full names, for those that have them
names = []
events = _get_customevents()
for event_id in ids:
temp = events[[x[0] for x in events].index(event_id)]
if temp[1] is not None:
names.append(temp[1])
else:
names.append(temp[0])
# Generate a filename for the graph
filename = "tmp_webstat_customevent_" + ''.join([re.subn("[^\w]", "",
event_id)[0] for event_id in ids]) + "_"
if choosed['timespan'] == "select date":
filename += args_req['t_start'] + "_" + args_req['t_end']
else:
filename += choosed['timespan']
settings = {"title": 'Custom event',
"xlabel": args_req['t_fullname'] + ' (' + \
args_req['granularity'] + ')',
"ylabel": "Action quantity",
"xtic_format": args_req['xtic_format'],
"format": choosed['format'],
"multiple": (type(ids) is list) and names or []}
return out + _perform_display_event(data, os.path.basename(filename),
settings, ln=ln)
def perform_display_customevent_data(ids, args_req, choosed):
"""Returns the trend data"""
data_unmerged = []
for event_id, i in [(ids[i], str(i)) for i in range(len(ids))]:
# Calculate cols
args_req['cols'] = choosed['cols'][int(i)]
# Get unique name for the rawdata file (wash arguments!)
filename = "webstat_customevent_" + re.subn("[^\w]", "", event_id + \
"_" + choosed['timespan'] + "_" + '-'.join([':'.join(col)
for col in args_req['cols']]))[0]
# Add the current id to the gatherer's arguments
args_req['event_id'] = event_id
# Prepare raw data gatherer, if cache needs refreshing.
gatherer = lambda x: get_customevent_trend(args_req)
# Determine whether this particular file is scheduled for caching;
# in that case we must not allow refreshing of the raw data.
allow_refresh = not _is_scheduled_for_cacheing(event_id)
# Get file from cache, and evaluate it to trend data
force = choosed['timespan'] == "select date"
data_unmerged.append(eval(_get_file_using_cache(filename, gatherer,
force, allow_refresh=allow_refresh).read()))
# Merge data from the unmerged trends into the final destination
return [(x[0][0], tuple([y[1] for y in x])) for x in zip(*data_unmerged)]
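The final zip/merge expression above lines up the per-event trend lists on their shared date axis and tuples the counts together. A standalone sketch of that merge (the function name is illustrative, not part of the module):

```python
def merge_trends(trends):
    """Merge several per-event trend lists that share the same date axis,
    [[(date, count), ...], ...], into one [(date, (count_1, count_2, ...)), ...]
    list, as the zip(*data_unmerged) expression above does."""
    return [(rows[0][0], tuple(count for _, count in rows))
            for rows in zip(*trends)]
```

For example, merging two single-event trends over the same two months yields one row per date with a count tuple per event.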
def perform_display_customevent_data_ascii_dump(ids, args, args_req, choosed):
"""Returns the trend data"""
for i in [str(j) for j in range(len(ids))]:
args['bool' + i].insert(0, "")
args_req['cols' + i] = zip(args['bool' + i], args['cols' + i],
args['col_value' + i])
filename = "webstat_customevent_" + re.subn("[^\w]", "", ''.join(ids) +
"_" + choosed['timespan'] + "_" + '-'.join([':'.join(col) for
col in [args['cols' + str(i)] for i in range(len(ids))]]) +
"_asciidump")[0]
args_req['ids'] = ids
gatherer = lambda: get_customevent_dump(args_req)
force = choosed['timespan'] == "select date"
return eval(_get_file_using_cache(filename, gatherer, force).read())
def perform_display_coll_list(req=None, ln=CFG_SITE_LANG):
"""
Display list of collections
@param req: The Apache request object, necessary for export redirect.
@type req:
"""
return TEMPLATES.tmpl_collection_stats_complete_list(get_collection_list_plus_all())
def perform_display_stats_per_coll(args={}, req=None, ln=CFG_SITE_LANG):
"""
Display general statistics for a given collection
@param args: { param name: argument value }
@type args: { str: str }
@param req: The Apache request object, necessary for export redirect.
@type req:
"""
events_id = ('collection population', 'download frequency', 'comments frequency')
# Get all the option lists:
# Make sure extraparams are valid, if any
if not args['collection'] in [x[0] for x in get_collection_list_plus_all()]:
return TEMPLATES.tmpl_error('Please specify a valid value for parameter "Collection".')
# { parameter name: [(argument internal name, argument full name)]}
options = {'collection': ('combobox', 'Collection', get_collection_list_plus_all()),
'timespan': ('combobox', 'Time span', _get_timespans()),
'format': ('combobox', 'Output format', _get_formats())}
order = options.keys()
# Arguments OK beyond this point!
# Get unique name for caching purposes (make sure that the params
# used in the filename are safe!)
out = TEMPLATES.tmpl_keyevent_box(options, order, args, ln=ln)
out += "<table>"
pair = False
for event_id in events_id:
# Get unique name for caching purposes (make sure that the params used
# in the filename are safe!)
filename = KEYEVENT_REPOSITORY[event_id]['cachefilename'] \
% dict([(param, re.subn("[^\w]", "_", args[param])[0])
for param in args] +
[('event_id', re.subn("[^\w]", "_", event_id)[0])])
# Get time parameters from repository
if args['timespan'] == "select date":
t_args = _get_time_parameters_select_date(args["s_date"], args["f_date"])
else:
t_args = _get_time_parameters(options, args['timespan'])
for param in KEYEVENT_REPOSITORY[event_id]['extraparams']:
t_args[param] = args[param]
# Create closure of frequency function in case cache needs to be refreshed
gatherer = lambda return_sql: KEYEVENT_REPOSITORY[event_id]['gatherer'](t_args, return_sql=return_sql)
# Determine whether this particular file is scheduled for caching;
# in that case we must not allow refreshing of the raw data.
allow_refresh = not _is_scheduled_for_cacheing(event_id)
# Get data file from cache (refresh if necessary)
data = eval(_get_file_using_cache(filename, gatherer, allow_refresh=allow_refresh).read())
# Prepare the graph settings that are being passed on to grapher
settings = {"title": KEYEVENT_REPOSITORY[event_id]['specificname'] % t_args,
"xlabel": t_args['t_fullname'] + ' (' + \
t_args['granularity'] + ')',
"ylabel": KEYEVENT_REPOSITORY[event_id]['ylabel'],
"xtic_format": t_args['xtic_format'],
"format": args['format'],
"multiple": KEYEVENT_REPOSITORY[event_id]['multiple'],
"size": '360,270'}
if not pair:
out += '<tr>'
out += '<td>%s</td>' % _perform_display_event(data,
os.path.basename(filename), settings, ln=ln)
if pair:
out += '</tr>'
pair = not pair
return out + "</table>"
def perform_display_customevent_help(ln=CFG_SITE_LANG):
"""Display the custom event help"""
return TEMPLATES.tmpl_customevent_help(ln=ln)
def perform_display_error_log_analyzer(ln=CFG_SITE_LANG):
"""Display the error log analyzer"""
update_error_log_analyzer()
return TEMPLATES.tmpl_error_log_analyzer(get_invenio_error_log_ranking(),
get_invenio_last_n_errors(5),
get_apache_error_log_ranking())
def perform_display_custom_summary(args, ln=CFG_SITE_LANG):
"""Display the custom summary (annual report)
@param args: { param name: argument value } (chart title, search query and output tag)
@type args: { str: str }
"""
if args['tag'] == '':
args['tag'] = CFG_JOURNAL_TAG.replace("%", "p")
data = get_custom_summary_data(args['query'], args['tag'])
tag_name = _get_tag_name(args['tag'])
if tag_name == '':
tag_name = args['tag']
path = WEBSTAT_GRAPH_DIRECTORY + os.path.basename("tmp_webstat_custom_summary_"
+ args['query'] + args['tag'])
create_custom_summary_graph(data[:-1], path, args['title'])
return TEMPLATES.tmpl_display_custom_summary(tag_name, data, args['title'],
args['query'], args['tag'], path, ln=ln)
# INTERNALS
def _perform_display_event(data, name, settings, ln=CFG_SITE_LANG):
"""
Retrieves a graph or a table.
@param data: The trend/dump data
@type data: [(str, str|int|(str|int,...))] | [(str|int,...)]
@param name: The name of the trend (to be used as basename of graph file)
@type name: str
@param settings: Dictionary of graph parameters
@type settings: dict
@return: The URL of the graph (ASCII or image)
@type: str
"""
path = WEBSTAT_GRAPH_DIRECTORY + "tmp_" + name
# Generate, and insert using the appropriate template
if settings["format"] == "asciidump":
path += "_asciidump"
create_graph_dump(data, path)
out = TEMPLATES.tmpl_display_event_trend_ascii(settings["title"],
path, ln=ln)
if settings["format"] == "Table":
create_graph_table(data, path, settings)
return TEMPLATES.tmpl_display_event_trend_text(settings["title"], path, ln=ln)
create_graph_trend(data, path, settings)
if settings["format"] == "asciiart":
out = TEMPLATES.tmpl_display_event_trend_ascii(
settings["title"], path, ln=ln)
else:
if settings["format"] == "gnuplot":
try:
import Gnuplot
except ImportError:
out = 'Gnuplot is not installed. Returning ASCII art.' + \
TEMPLATES.tmpl_display_event_trend_ascii(
settings["title"], path, ln=ln)
else:
out = TEMPLATES.tmpl_display_event_trend_image(
settings["title"], path, ln=ln)
elif settings["format"] == "flot":
out = TEMPLATES.tmpl_display_event_trend_text(
settings["title"], path, ln=ln)
else:
out = TEMPLATES.tmpl_display_event_trend_ascii(
settings["title"], path, ln=ln)
avgs, maxs, mins = get_numeric_stats(data, settings["multiple"] is not None)
return out + TEMPLATES.tmpl_display_numeric_stats(settings["multiple"],
avgs, maxs, mins)
def _get_customevents():
"""
Retrieves registered custom events from the database.
@return: [(internal name, readable name)]
@type: [(str, str)]
"""
return [(x[0], x[1]) for x in run_sql("SELECT id, name FROM staEVENT")]
def _get_timespans(dttime=None, bibcirculation_stat=False):
"""
Helper function that generates possible time spans to be put in the
drop-down in the generation box. Computes possible years, and also some
pre-defined simpler values. Some items in the returned list also tweak the
output graph, if any, since such values are closely related to the nature
of the time span.
@param dttime: A datetime object indicating the current date and time
@type dttime: datetime.datetime
@return: [(Internal name, Readable name, t_start, t_end, granularity, format, xtic_format)]
@type [(str, str, str, str, str, str, str)]
"""
if dttime is None:
dttime = datetime.datetime.now()
dtformat = "%Y-%m-%d"
# Helper function to return a timedelta object reflecting a diff of x days
d_diff = lambda x: datetime.timedelta(days=x)
# Helper function to return the number of days in the month x months ago
d_in_m = lambda x: calendar.monthrange(
((dttime.month - x < 1) and dttime.year - 1 or dttime.year),
(((dttime.month - 1) - x) % 12 + 1))[1]
to_str = lambda x: x.strftime(dtformat)
dt_str = to_str(dttime)
spans = [("today", "Today",
dt_str,
to_str(dttime + d_diff(1)),
"hour", dtformat, "%H"),
("this week", "This week",
to_str(dttime - d_diff(dttime.weekday())),
to_str(dttime + d_diff(1)),
"day", dtformat, "%a"),
("last week", "Last week",
to_str(dttime - d_diff(dttime.weekday() + 7)),
to_str(dttime - d_diff(dttime.weekday())),
"day", dtformat, "%a"),
("this month", "This month",
to_str(dttime - d_diff(dttime.day) + d_diff(1)),
to_str(dttime + d_diff(1)),
"day", dtformat, "%d"),
("last month", "Last month",
to_str(dttime - d_diff(d_in_m(1)) - d_diff(dttime.day) + d_diff(1)),
to_str(dttime - d_diff(dttime.day) + d_diff(1)),
"day", dtformat, "%d"),
("last three months", "Last three months",
to_str(dttime - d_diff(d_in_m(1)) - d_diff(d_in_m(2)) -
d_diff(dttime.day) + d_diff(1)),
dt_str,
"month", dtformat, "%b"),
("last year", "Last year",
to_str((dttime - datetime.timedelta(days=365)).replace(day=1)),
to_str((dttime + datetime.timedelta(days=31)).replace(day=1)),
"month", dtformat, "%b")]
# Get the first year, as indicated by the contents of bibrec or
# CFG_WEBSTAT_BIBCIRCULATION_START_YEAR
try:
if bibcirculation_stat and CFG_WEBSTAT_BIBCIRCULATION_START_YEAR:
year1 = int(CFG_WEBSTAT_BIBCIRCULATION_START_YEAR)
else:
year1 = run_sql("SELECT creation_date FROM bibrec ORDER BY \
creation_date LIMIT 1")[0][0].year
except:
year1 = dttime.year
year2 = time.localtime()[0]
diff_year = year2 - year1
if diff_year >= 2:
spans.append(("last 2 years", "Last 2 years",
to_str((dttime - datetime.timedelta(days=365 * 2)).replace(day=1)),
to_str((dttime + datetime.timedelta(days=31)).replace(day=1)),
"month", dtformat, "%b"))
if diff_year >= 5:
spans.append(("last 5 years", "Last 5 years",
to_str((dttime - datetime.timedelta(days=365 * 5)).replace(day=1)),
to_str((dttime + datetime.timedelta(days=31)).replace(day=1)),
"year", dtformat, "%Y"))
if diff_year >= 10:
spans.append(("last 10 years", "Last 10 years",
to_str((dttime - datetime.timedelta(days=365 * 10)).replace(day=1)),
to_str((dttime + datetime.timedelta(days=31)).replace(day=1)),
"year", dtformat, "%Y"))
spans.append(("full history", "Full history", str(year1), str(year2 + 1),
"year", "%Y", "%Y"))
spans.extend([(str(x), str(x), str(x), str(x + 1), "month", "%Y", "%b")
for x in range(year2, year1 - 1, -1)])
spans.append(("select date", "Select date...", "", "",
"hour", dtformat, "%H"))
return spans
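The `d_in_m` helper above packs a year-wrap check into an and/or one-liner. An expanded, standalone sketch of the same computation (the function name is illustrative):

```python
import calendar
import datetime

def days_in_month_ago(dttime, x):
    """Number of days in the month x months before dttime, wrapping the
    year when the subtraction crosses January (as the d_in_m lambda does)."""
    year = dttime.year - 1 if dttime.month - x < 1 else dttime.year
    month = ((dttime.month - 1) - x) % 12 + 1
    return calendar.monthrange(year, month)[1]
```

In the module the helper is only called with x of 1 or 2, so the single-year rollback is sufficient.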
def _get_time_parameters(options, timespan):
"""
Returns the time parameters from the repository for a predefined timespan
@param options: A dictionary with the option lists
@type options: { parameter name: [(argument internal name, argument full name)]}
@param timespan: name of the chosen timespan
@type timespan: str
@return: Dictionary with keys t_fullname, t_start, t_end, granularity, t_format and xtic_format
@type: {str: str}
"""
if len(options['timespan']) == 2:
i = 1
else:
i = 2
_, t_fullname, t_start, t_end, granularity, t_format, xtic_format = \
options['timespan'][i][[x[0]
for x in options['timespan'][i]].index(timespan)]
return {'t_fullname': t_fullname, 't_start': t_start, 't_end': t_end,
'granularity': granularity, 't_format': t_format,
'xtic_format': xtic_format}
def _get_time_parameters_select_date(s_date, f_date):
"""
Returns the time parameters for a custom, user-selected timespan
@param s_date: start date for the graph
@type s_date: str %m/%d/%Y %H:%M
@param f_date: finish date for the graph
@type f_date: str %m/%d/%Y %H:%M
@return: Dictionary with keys t_fullname, t_start, t_end, granularity, t_format and xtic_format
@type: {str: str}
"""
t_fullname = "%s-%s" % (s_date, f_date)
dt_start = datetime.datetime(*(time.strptime(s_date, "%m/%d/%Y %H:%M")[0:6]))
dt_end = datetime.datetime(*(time.strptime(f_date, "%m/%d/%Y %H:%M")[0:6]))
if dt_end - dt_start <= timedelta(hours=1):
xtic_format = "%M:%S"
granularity = 'second'
elif dt_end - dt_start <= timedelta(days=1):
xtic_format = "%H:%M"
granularity = 'minute'
elif dt_end - dt_start <= timedelta(days=7):
xtic_format = "%H"
granularity = 'hour'
elif dt_end - dt_start <= timedelta(days=60):
xtic_format = "%a"
granularity = 'day'
elif dt_end - dt_start <= timedelta(days=730):
xtic_format = "%d"
granularity = 'month'
else:
xtic_format = "%H"
granularity = 'hour'
t_format = "%Y-%m-%d %H:%M:%S"
t_start = dt_start.strftime("%Y-%m-%d %H:%M:%S")
t_end = dt_end.strftime("%Y-%m-%d %H:%M:%S")
return {'t_fullname': t_fullname, 't_start': t_start, 't_end': t_end,
'granularity': granularity, 't_format': t_format,
'xtic_format': xtic_format}
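The granularity ladder above maps the length of the selected range to a plotting granularity. This standalone sketch mirrors it, including the code's fall-back to 'hour' for spans longer than roughly two years (the function name is illustrative):

```python
from datetime import timedelta

def pick_granularity(span):
    """Map the length of a custom date range to a plotting granularity,
    mirroring the ladder in _get_time_parameters_select_date."""
    if span <= timedelta(hours=1):
        return 'second'
    if span <= timedelta(days=1):
        return 'minute'
    if span <= timedelta(days=7):
        return 'hour'
    if span <= timedelta(days=60):
        return 'day'
    if span <= timedelta(days=730):
        return 'month'
    # The module falls back to hourly granularity for longer spans
    return 'hour'
```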
def _get_formats(with_dump=False):
"""
Helper function to retrieve an Invenio-friendly list of all possible
output types (displaying and exporting) from the central repository, as
stored in TYPE_REPOSITORY at the top of this module.
@param with_dump: Optionally include the custom-event-only type 'asciidump'
@type with_dump: bool
@return: [(Internal name, Readable name)]
@type [(str, str)]
"""
# The third tuple value is internal
if with_dump:
return [(x[0], x[1]) for x in TYPE_REPOSITORY]
else:
return [(x[0], x[1]) for x in TYPE_REPOSITORY if x[0] != 'asciidump']
def _get_customevent_cols(event_id=""):
"""
Lists all the different column names in custom events.
@return: {id: [(internal name, readable name)]}
@type: {str: [(str, str)]}
"""
sql_str = "SELECT id,cols FROM staEVENT"
sql_param = []
if event_id:
sql_str += " WHERE id = %s"
sql_param.append(event_id)
cols = {}
for event in run_sql(sql_str, sql_param):
if event[0]:
if event[1]:
cols[event[0]] = [(name, name) for name
in cPickle.loads(event[1])]
else:
cols[event[0]] = []
return cols
def _is_type_export(typename):
"""
Helper function that consults the central repository of types to determine
whether the input parameter represents an export type.
@param typename: Internal type name
@type typename: str
@return: Information whether a certain type exports data
@type: bool
"""
return len(TYPE_REPOSITORY[[x[0] for x in
TYPE_REPOSITORY].index(typename)]) == 3
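The length check above relies on the convention that display-only entries in TYPE_REPOSITORY are 2-tuples, while export entries carry the export closure as a third element. A minimal sketch with invented example entries (not the real repository contents):

```python
# Invented example entries: display types are 2-tuples,
# export types carry an export closure as a third element.
EXAMPLE_TYPE_REPOSITORY = [
    ('flot', 'Flot chart'),
    ('csv', 'CSV export', lambda data, req: None),
]

def is_export_type(typename, repository):
    """True when the repository entry for typename has an export closure."""
    return len(repository[[t[0] for t in repository].index(typename)]) == 3
```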
def _get_export_closure(typename):
"""
Helper function that, for a given type, returns the corresponding export
closure.
@param typename: Internal type name
@type typename: str
@return: Closure that exports data to the type's format
@type: function
"""
return TYPE_REPOSITORY[[x[0] for x in TYPE_REPOSITORY].index(typename)][2]
def _get_file_using_cache(filename, closure, force=False, allow_refresh=True):
"""
Uses the Invenio cache, i.e. the tempdir, to see if there's a recent
cached version of the sought-after file in there. If not, it uses the closure to
compute a new one and returns that instead. Relies on the Invenio configuration
parameter WEBSTAT_CACHE_INTERVAL.
@param filename: The name of the file that might be cached
@type filename: str
@param closure: A function, that executed will return data to be cached. The
function should return either a string, or something that
makes sense after being interpreted with str().
@type closure: function
@param force: Override cache default value.
@type force: bool
"""
# Absolute path to cached files, might not exist.
filename = os.path.normpath(WEBSTAT_RAWDATA_DIRECTORY + filename)
# Get the modification time of the cached file (if any).
try:
mtime = os.path.getmtime(filename)
except OSError:
# No cached version of this particular file exists, thus the
# modification time is set to 0 for easy logic below.
mtime = 0
# Consider refreshing cache if FORCE or NO CACHE AT ALL,
# or CACHE EXIST AND REFRESH IS ALLOWED.
if force or mtime == 0 or (mtime > 0 and allow_refresh):
# Is the file modification time recent enough?
if force or (time.time() - mtime > WEBSTAT_CACHE_INTERVAL):
# No! Use closure to compute new content
content = closure(False)
# Cache the data (close the file promptly so the read below sees it)
with open(filename, 'w') as cache_file:
cache_file.write(str(content))
# Return the (perhaps just) cached file
return open(filename, 'r')
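The refresh decision in `_get_file_using_cache` combines three conditions. A standalone sketch of just that decision, with `cache_interval` standing in for WEBSTAT_CACHE_INTERVAL (names here are illustrative):

```python
import time

def needs_refresh(mtime, force, allow_refresh, cache_interval, now=None):
    """Decide whether a cached file should be regenerated, mirroring
    _get_file_using_cache: refresh when forced, when no cached copy exists
    (mtime == 0), or when refreshing is allowed and the copy is older than
    cache_interval seconds."""
    if now is None:
        now = time.time()
    if not (force or mtime == 0 or (mtime > 0 and allow_refresh)):
        return False
    return force or (now - mtime > cache_interval)
```

Note that a missing cache file (mtime of 0) is refreshed even when allow_refresh is False, matching the original flow.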
def _is_scheduled_for_cacheing(event_id):
"""
@param event_id: The event id
@type event_id: str
@return: Whether the event id is scheduled for BibSched execution.
@type: bool
"""
if not is_task_scheduled('webstatadmin'):
return False
# Get the task id
try:
task_id = get_task_ids_by_descending_date('webstatadmin',
['RUNNING', 'WAITING'])[0]
except IndexError:
return False
else:
args = get_task_options(task_id)
return event_id in (args['keyevents'] + args['customevents'])
diff --git a/invenio/legacy/webstat/engine.py b/invenio/legacy/webstat/engine.py
index e16fc0310..e85fb6cf8 100644
--- a/invenio/legacy/webstat/engine.py
+++ b/invenio/legacy/webstat/engine.py
@@ -1,2865 +1,2865 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
__lastupdated__ = "$Date$"
import calendar, commands, datetime, time, os, cPickle, random, cgi
from operator import itemgetter
from invenio.config import CFG_TMPDIR, \
CFG_SITE_URL, \
CFG_SITE_NAME, \
CFG_BINDIR, \
CFG_CERN_SITE, \
CFG_BIBCIRCULATION_ITEM_STATUS_CANCELLED, \
CFG_BIBCIRCULATION_ITEM_STATUS_CLAIMED, \
CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS, \
CFG_BIBCIRCULATION_ITEM_STATUS_NOT_ARRIVED, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER, \
CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, \
CFG_BIBCIRCULATION_ITEM_STATUS_OPTIONAL, \
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE, \
CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import CFG_JOURNAL_TAG
from invenio.utils.url import redirect_to_url
from invenio.legacy.search_engine import perform_request_search, \
get_collection_reclist, \
get_most_popular_field_values, \
search_pattern
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.legacy.dbquery import run_sql, \
wash_table_column_name
-from invenio.websubmitadmin_dblayer import get_docid_docname_alldoctypes
-from invenio.bibcirculation_utils import book_title_from_MARC, \
+from invenio.legacy.websubmit.admin_dblayer import get_docid_docname_alldoctypes
+from invenio.legacy.bibcirculation.utils import book_title_from_MARC, \
book_information_from_MARC
-from invenio.bibcirculation_dblayer import get_id_bibrec, \
+from invenio.legacy.bibcirculation.db_layer import get_id_bibrec, \
get_borrower_data
-from invenio.websearch_webcoll import CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE
+from invenio.legacy.websearch.webcoll import CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE
from invenio.utils.date import convert_datetext_to_datestruct, convert_datestruct_to_dategui
WEBSTAT_SESSION_LENGTH = 48 * 60 * 60 # seconds
WEBSTAT_GRAPH_TOKENS = '-=#+@$%&XOSKEHBC'
# KEY EVENT TREND SECTION
def get_keyevent_trend_collection_population(args, return_sql=False):
"""
Returns the quantity of documents in Invenio for
the given timestamp range.
@param args['collection']: A collection name
@type args['collection']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# collect action dates
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
if args.get('collection', 'All') == 'All':
sql_query_g = _get_sql_query("creation_date", args['granularity'],
"bibrec")
sql_query_i = "SELECT COUNT(id) FROM bibrec WHERE creation_date < %s"
initial_quantity = run_sql(sql_query_i, (lower, ))[0][0]
return _get_keyevent_trend(args, sql_query_g, initial_quantity=initial_quantity,
return_sql=return_sql, sql_text=
"Previous count: %s<br />Current count: %%s" % (sql_query_i),
acumulative=True)
else:
ids = get_collection_reclist(args['collection'])
if len(ids) == 0:
return []
g = get_keyevent_trend_new_records(args, return_sql, True)
sql_query_i = "SELECT id FROM bibrec WHERE creation_date < %s"
if return_sql:
return "Previous count: %s<br />Current count: %s" % (sql_query_i % lower, g)
initial_quantity = len(filter(lambda x: x[0] in ids, run_sql(sql_query_i, (lower, ))))
return _get_trend_from_actions(g, initial_quantity, args['t_start'],
args['t_end'], args['granularity'], args['t_format'], acumulative=True)
def get_keyevent_trend_new_records(args, return_sql=False, only_action=False):
"""
Returns the number of new records uploaded during the given timestamp range.
@param args['collection']: A collection name
@type args['collection']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
if args.get('collection', 'All') == 'All':
return _get_keyevent_trend(args, _get_sql_query("creation_date", args['granularity'],
"bibrec"),
return_sql=return_sql)
else:
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
ids = get_collection_reclist(args['collection'])
if len(ids) == 0:
return []
sql = _get_sql_query("creation_date", args["granularity"], "bibrec",
extra_select=", id", group_by=False, count=False)
if return_sql:
return sql % (lower, upper)
recs = run_sql(sql, (lower, upper))
if recs:
def add_count(i_list, element):
""" Reduce function to build [date, count] pairs,
counting the ids for each date """
if i_list and element == i_list[-1][0]:
i_list[-1][1] += 1
else:
i_list.append([element, 1])
return i_list
action_dates = reduce(add_count,
map(lambda x: x[0], filter(lambda x: x[1] in ids, recs)),
[])
else:
action_dates = []
if only_action:
return action_dates
return _get_trend_from_actions(action_dates, 0, args['t_start'],
args['t_end'], args['granularity'], args['t_format'])
def get_keyevent_trend_search_frequency(args, return_sql=False):
"""
Returns the number of searches (of any kind) carried out
during the given timestamp range.
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
return _get_keyevent_trend(args, _get_sql_query("date", args["granularity"],
"query INNER JOIN user_query ON id=id_query"),
return_sql=return_sql)
def get_keyevent_trend_comments_frequency(args, return_sql=False):
"""
Returns the number of comments (of any kind) carried out
during the given timestamp range.
@param args['collection']: A collection name
@type args['collection']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
if args.get('collection', 'All') == 'All':
sql = _get_sql_query("date_creation", args["granularity"],
"cmtRECORDCOMMENT")
else:
sql = _get_sql_query("date_creation", args["granularity"],
"cmtRECORDCOMMENT", conditions=
_get_collection_recids_for_sql_query(args['collection']))
return _get_keyevent_trend(args, sql, return_sql=return_sql)
def get_keyevent_trend_search_type_distribution(args, return_sql=False):
"""
Returns the number of searches carried out during the given
timestamp range, partitioned by type: Simple and Advanced.
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# SQL to determine all simple searches:
simple = _get_sql_query("date", args["granularity"],
"query INNER JOIN user_query ON id=id_query",
conditions="urlargs LIKE '%%p=%%'")
# SQL to determine all advanced searches:
advanced = _get_sql_query("date", args["granularity"],
"query INNER JOIN user_query ON id=id_query",
conditions="urlargs LIKE '%%as=1%%'")
# Compute the trend for both types
s_trend = _get_keyevent_trend(args, simple,
return_sql=return_sql, sql_text="Simple: %s")
a_trend = _get_keyevent_trend(args, advanced,
return_sql=return_sql, sql_text="Advanced: %s")
# Assemble, according to return type
if return_sql:
return "%s <br /> %s" % (s_trend, a_trend)
return [(s_trend[i][0], (s_trend[i][1], a_trend[i][1]))
for i in range(len(s_trend))]
def get_keyevent_trend_download_frequency(args, return_sql=False):
"""
Returns the number of full text downloads carried out
during the given timestamp range.
@param args['collection']: A collection name
@type args['collection']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# Collect list of timestamps of insertion in the specific collection
if args.get('collection', 'All') == 'All':
return _get_keyevent_trend(args, _get_sql_query("download_time",
args["granularity"], "rnkDOWNLOADS"), return_sql=return_sql)
else:
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
ids = get_collection_reclist(args['collection'])
if len(ids) == 0:
return []
sql = _get_sql_query("download_time", args["granularity"], "rnkDOWNLOADS",
extra_select=", GROUP_CONCAT(id_bibrec)")
if return_sql:
return sql % (lower, upper)
action_dates = []
for result in run_sql(sql, (lower, upper)):
count = result[1]
for id in result[2].split(","):
if id == '' or not int(id) in ids:
count -= 1
action_dates.append((result[0], count))
return _get_trend_from_actions(action_dates, 0, args['t_start'],
args['t_end'], args['granularity'], args['t_format'])
def get_keyevent_trend_number_of_loans(args, return_sql=False):
"""
Returns the number of loans carried out
during the given timestamp range.
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
return _get_keyevent_trend(args, _get_sql_query("loaned_on",
args["granularity"], "crcLOAN"), return_sql=return_sql)
def get_keyevent_trend_web_submissions(args, return_sql=False):
"""
Returns the quantity of websubmissions in Invenio for
the given timestamp range.
@param args['doctype']: A doctype name
@type args['doctype']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
if args['doctype'] == 'all':
sql = _get_sql_query("cd", args["granularity"], "sbmSUBMISSIONS",
conditions="action='SBI' AND status='finished'")
res = _get_keyevent_trend(args, sql, return_sql=return_sql)
else:
sql = _get_sql_query("cd", args["granularity"], "sbmSUBMISSIONS",
conditions="doctype=%s AND action='SBI' AND status='finished'")
res = _get_keyevent_trend(args, sql, extra_param=[args['doctype']],
return_sql=return_sql)
return res
def get_keyevent_loan_statistics(args, return_sql=False):
"""
Data:
- Number of documents (=records) loaned
- Number of items loaned on the total number of items
- Number of items never loaned on the total number of items
- Average time between the date of the record creation and the date of the first loan
Filter by
- in a specified time span
- by UDC (see MARC field 080__a - list to be submitted)
- by item status (available, missing)
- by date of publication (MARC field 260__c)
- by date of the record creation in the database
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['udc']: MARC field 080__a
@type args['udc']: str
@param args['item_status']: available, missing...
@type args['item_status']: str
@param args['publication_date']: MARC field 260__c
@type args['publication_date']: str
@param args['creation_date']: date of the record creation in the database
@type args['creation_date']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# collect action dates
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcLOAN l "
sql_where = "WHERE loaned_on > %s AND loaned_on < %s "
param = [lower, upper]
if 'udc' in args and args['udc'] != '':
sql_where += "AND l." + _check_udc_value_where()
param.append(_get_udc_truncated(args['udc']))
if 'item_status' in args and args['item_status'] != '':
sql_from += ", crcITEM i "
sql_where += "AND l.barcode = i.barcode AND i.status = %s "
param.append(args['item_status'])
if 'publication_date' in args and args['publication_date'] != '':
sql_where += "AND l.id_bibrec IN ( SELECT brb.id_bibrec \
FROM bibrec_bib26x brb, bib26x b WHERE brb.id_bibxxx = b.id AND tag='260__c' \
AND value LIKE %s)"
param.append('%%%s%%' % args['publication_date'])
if 'creation_date' in args and args['creation_date'] != '':
sql_from += ", bibrec br "
sql_where += "AND br.id=l.id_bibrec AND br.creation_date LIKE %s "
param.append('%%%s%%' % args['creation_date'])
param = tuple(param)
# Number of loans:
loans_sql = "SELECT COUNT(DISTINCT l.id_bibrec) " + sql_from + sql_where
items_loaned_sql = "SELECT COUNT(DISTINCT l.barcode) " + sql_from + sql_where
# Only the CERN site wants the items of the collection "Books & Proceedings"
if CFG_CERN_SITE:
items_in_book_coll = _get_collection_recids_for_sql_query("Books & Proceedings")
if items_in_book_coll == "":
total_items_sql = 0
else:
total_items_sql = "SELECT COUNT(*) FROM crcITEM WHERE %s" % \
items_in_book_coll
else: # The rest take all the items
total_items_sql = "SELECT COUNT(*) FROM crcITEM"
# Average time between the date of the record creation and the date of the first loan
avg_sql = "SELECT AVG(DATEDIFF(loaned_on, br.creation_date)) " + sql_from
if not ('creation_date' in args and args['creation_date'] != ''):
avg_sql += ", bibrec br "
avg_sql += sql_where
if not ('creation_date' in args and args['creation_date'] != ''):
avg_sql += "AND br.id=l.id_bibrec "
if return_sql:
return "<ol><li>%s</li><li>Items loaned * 100 / Number of items <ul><li>\
Items loaned: %s </li><li>Number of items: %s</li></ul></li><li>100 - Items \
loaned on total number of items</li><li>%s</li></ol>" % \
(loans_sql % param, items_loaned_sql % param, total_items_sql, avg_sql % param)
loans = run_sql(loans_sql, param)[0][0]
items_loaned = run_sql(items_loaned_sql, param)[0][0]
if total_items_sql:
total_items = run_sql(total_items_sql)[0][0]
else:
total_items = 0
if total_items == 0:
loaned_on_total = 0
never_loaned_on_total = 0
else:
# Number of items loaned on the total number of items:
loaned_on_total = float(items_loaned) * 100 / float(total_items)
# Number of items never loaned on the total number of items:
never_loaned_on_total = 100.0 - loaned_on_total
avg = run_sql(avg_sql, param)[0][0]
if avg:
avg = float(avg)
else:
avg = 0.0
return ((loans, ), (loaned_on_total, ), (never_loaned_on_total, ), (avg, ))
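# The statistics functions in this section all build their WHERE clause the
# same way: start from the date range, then append an "AND ..." fragment plus
# a bound parameter for each optional filter that is present and non-empty.
# A minimal sketch of that pattern (hypothetical names, not this module's API):
def _build_filters(args, mapping):
    """For each (arg_name, sql_fragment) pair in `mapping`, append the
    fragment and its value when the argument is present and non-empty."""
    sql_where = "WHERE loaned_on > %s AND loaned_on < %s "
    param = [args['t_start'], args['t_end']]
    for name, fragment in mapping:
        if args.get(name, '') != '':
            sql_where += "AND " + fragment + " "
            param.append(args[name])
    return sql_where, tuple(param)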
def get_keyevent_loan_lists(args, return_sql=False, limit=50):
"""
Lists:
- List of documents (= records) never loaned
- List of most loaned documents (columns: number of loans,
number of copies and the creation date of the record, in
order to calculate the number of loans by copy), sorted
by decreasing order (50 items)
Filter by
- in a specified time span
- by UDC (see MARC field 080__a - list to be submitted)
- by loan period (4 week loan, one week loan...)
- by a certain number of loans
- by date of publication (MARC field 260__c)
- by date of the record creation in the database
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['udc']: MARC field 080__a
@type args['udc']: str
@param args['loan_period']: 4 week loan, one week loan...
@type args['loan_period']: str
@param args['min_loans']: minimum number of loans
@type args['min_loans']: int
@param args['max_loans']: maximum number of loans
@type args['max_loans']: int
@param args['publication_date']: MARC field 260__c
@type args['publication_date']: str
@param args['creation_date']: date of the record creation in the database
@type args['creation_date']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_where = []
param = []
sql_from = ""
if 'udc' in args and args['udc'] != '':
sql_where.append("i." + _check_udc_value_where())
param.append(_get_udc_truncated(args['udc']))
if 'loan_period' in args and args['loan_period'] != '':
sql_where.append("loan_period = %s")
param.append(args['loan_period'])
if 'publication_date' in args and args['publication_date'] != '':
sql_where.append("i.id_bibrec IN ( SELECT brb.id_bibrec \
FROM bibrec_bib26x brb, bib26x b WHERE brb.id_bibxxx = b.id AND tag='260__c' \
AND value LIKE %s)")
param.append('%%%s%%' % args['publication_date'])
if 'creation_date' in args and args['creation_date'] != '':
sql_from += ", bibrec br"
sql_where.append("br.id=i.id_bibrec AND br.creation_date LIKE %s")
param.append('%%%s%%' % args['creation_date'])
if sql_where:
sql_where = "WHERE %s AND" % " AND ".join(sql_where)
else:
sql_where = "WHERE"
param = tuple(param + [lower, upper])
# SQL for both queries
check_num_loans = "HAVING "
if 'min_loans' in args and args['min_loans'] != '':
check_num_loans += "COUNT(*) >= %s" % args['min_loans']
if 'max_loans' in args and args['max_loans'] != '' and args['max_loans'] != 0:
if check_num_loans != "HAVING ":
check_num_loans += " AND "
check_num_loans += "COUNT(*) <= %s" % args['max_loans']
# Optimized to get all the data in only one query (not call get_fieldvalues several times)
mldocs_sql = "SELECT i.id_bibrec, COUNT(*) \
FROM crcLOAN l, crcITEM i%s %s l.barcode=i.barcode AND type = 'normal' AND \
loaned_on > %%s AND loaned_on < %%s GROUP BY i.id_bibrec %s" % \
(sql_from, sql_where, check_num_loans)
limit_n = ""
if limit > 0:
limit_n = "LIMIT %d" % limit
nldocs_sql = "SELECT id_bibrec, COUNT(*) FROM crcITEM i%s %s \
barcode NOT IN (SELECT id_bibrec FROM crcLOAN WHERE loaned_on > %%s AND \
loaned_on < %%s AND type = 'normal') GROUP BY id_bibrec ORDER BY COUNT(*) DESC %s" % \
(sql_from, sql_where, limit_n)
items_sql = "SELECT id_bibrec, COUNT(*) items FROM crcITEM GROUP BY id_bibrec"
creation_date_sql = "SELECT creation_date FROM bibrec WHERE id=%s"
authors_sql = "SELECT bx.value FROM bib10x bx, bibrec_bib10x bibx \
WHERE bx.id = bibx.id_bibxxx AND bx.tag LIKE '100__a' AND bibx.id_bibrec=%s"
title_sql = "SELECT GROUP_CONCAT(bx.value SEPARATOR ' ') value FROM bib24x bx, bibrec_bib24x bibx \
WHERE bx.id = bibx.id_bibxxx AND bx.tag LIKE %s AND bibx.id_bibrec=%s GROUP BY bibx.id_bibrec"
edition_sql = "SELECT bx.value FROM bib25x bx, bibrec_bib25x AS bibx \
WHERE bx.id = bibx.id_bibxxx AND bx.tag LIKE '250__a' AND bibx.id_bibrec=%s"
if return_sql:
return "Most loaned: %s<br />Never loaned: %s" % \
(mldocs_sql % param, nldocs_sql % param)
mldocs = run_sql(mldocs_sql, param)
items = dict(run_sql(items_sql))
order_m = []
for mldoc in mldocs:
order_m.append([mldoc[0], mldoc[1], items[mldoc[0]], \
float(mldoc[1]) / float(items[mldoc[0]])])
order_m = sorted(order_m, key=itemgetter(3))
order_m.reverse()
# Check limit values
if limit > 0:
order_m = order_m[:limit]
res = [("", "Title", "Author", "Edition", "Number of loans",
"Number of copies", "Date of creation of the record")]
for mldoc in order_m:
res.append(("Most loaned documents",
_check_empty_value(run_sql(title_sql, ('245__%%', mldoc[0], ))),
_check_empty_value(run_sql(authors_sql, (mldoc[0], ))),
_check_empty_value(run_sql(edition_sql, (mldoc[0], ))),
mldoc[1], mldoc[2],
_check_empty_value(run_sql(creation_date_sql, (mldoc[0], )))))
nldocs = run_sql(nldocs_sql, param)
for nldoc in nldocs:
res.append(("Not loaned documents",
_check_empty_value(run_sql(title_sql, ('245__%%', nldoc[0], ))),
_check_empty_value(run_sql(authors_sql, (nldoc[0], ))),
_check_empty_value(run_sql(edition_sql, (nldoc[0], ))),
0, items[nldoc[0]],
_check_empty_value(run_sql(creation_date_sql, (nldoc[0], )))))
return res
def get_keyevent_renewals_lists(args, return_sql=False, limit=50):
"""
Lists:
- List of most renewed items stored by decreasing order (50 items)
Filter by
- in a specified time span
- by UDC (see MARC field 080__a - list to be submitted)
- by collection
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['udc']: MARC field 080__a
@type args['udc']: str
@param args['collection']: collection of the record
@type args['collection']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcLOAN l, crcITEM i "
sql_where = "WHERE loaned_on > %s AND loaned_on < %s AND i.barcode = l.barcode "
param = [lower, upper]
if 'udc' in args and args['udc'] != '':
sql_where += "AND l." + _check_udc_value_where()
param.append(_get_udc_truncated(args['udc']))
filter_coll = False
if 'collection' in args and args['collection'] != '':
filter_coll = True
recid_list = get_collection_reclist(args['collection'])
param = tuple(param)
if limit > 0:
limit = "LIMIT %d" % limit
else:
limit = ""
sql = "SELECT i.id_bibrec, SUM(number_of_renewals) %s %s \
GROUP BY i.id_bibrec ORDER BY SUM(number_of_renewals) DESC %s" \
% (sql_from, sql_where, limit)
if return_sql:
return sql % param
# Results:
res = [("Title", "Author", "Edition", "Number of renewals")]
for rec, renewals in run_sql(sql, param):
if filter_coll and rec not in recid_list:
continue
author = get_fieldvalues(rec, "100__a")
if len(author) > 0:
author = author[0]
else:
author = ""
edition = get_fieldvalues(rec, "250__a")
if len(edition) > 0:
edition = edition[0]
else:
edition = ""
res.append((book_title_from_MARC(rec), author, edition, int(renewals)))
return res
def get_keyevent_returns_table(args, return_sql=False):
"""
Data:
- Number of overdue returns in a timespan
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
# Overdue returns:
sql = "SELECT COUNT(*) FROM crcLOAN l WHERE loaned_on > %s AND loaned_on < %s AND \
due_date < NOW() AND (returned_on IS NULL OR returned_on > due_date)"
if return_sql:
return sql % (lower, upper)
return ((run_sql(sql, (lower, upper))[0][0], ), )
def get_keyevent_trend_returns_percentage(args, return_sql=False):
"""
Returns the number of overdue returns and the total number of returns
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# SQL to determine overdue returns:
overdue = _get_sql_query("due_date", args["granularity"], "crcLOAN",
conditions="due_date < NOW() AND due_date IS NOT NULL \
AND (returned_on IS NULL OR returned_on > due_date)",
dates_range_param="loaned_on")
# SQL to determine all returns:
total = _get_sql_query("due_date", args["granularity"], "crcLOAN",
conditions="due_date < NOW() AND due_date IS NOT NULL",
dates_range_param="loaned_on")
# Compute the trend for both types
o_trend = _get_keyevent_trend(args, overdue,
return_sql=return_sql, sql_text="Overdue: %s")
t_trend = _get_keyevent_trend(args, total,
return_sql=return_sql, sql_text="Total: %s")
# Assemble, according to return type
if return_sql:
return "%s <br /> %s" % (o_trend, t_trend)
return [(o_trend[i][0], (o_trend[i][1], t_trend[i][1]))
for i in range(len(o_trend))]
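# The percentage trends above pair the two series point by point: each entry
# keeps the timestamp of the first series and a (numerator, denominator)
# value tuple. A standalone sketch of that pairing step (hypothetical name):
def _pair_trends(numerator, denominator):
    """Combine two equally long (timestamp, value) series into
    (timestamp, (num_value, den_value)) pairs."""
    return [(numerator[i][0], (numerator[i][1], denominator[i][1]))
            for i in range(len(numerator))]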
def get_keyevent_ill_requests_statistics(args, return_sql=False):
"""
Data:
- Number of ILL requests
- Number of satisfied ILL requests 2 weeks after the date of request
creation on a timespan
- Average time between the date and hour of the ILL request and
the date and hour the item was delivered to the user, over a
timespan
- Average time between the date and hour the ILL request was
sent to the supplier and the date and hour the item was
delivered, over a timespan
Filter by
- in a specified time span
- by type of document (book or article)
- by status of the request (= new, sent, etc.)
- by supplier
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['doctype']: type of document (book or article)
@type args['doctype']: str
@param args['status']: status of the request (= new, sent, etc.)
@type args['status']: str
@param args['supplier']: supplier
@type args['supplier']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcILLREQUEST ill "
sql_where = "WHERE period_of_interest_from > %s AND period_of_interest_from < %s "
param = [lower, upper]
if 'doctype' in args and args['doctype'] != '':
sql_where += "AND ill.request_type=%s"
param.append(args['doctype'])
if 'status' in args and args['status'] != '':
sql_where += "AND ill.status = %s "
param.append(args['status'])
else:
sql_where += "AND ill.status != %s "
param.append(CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED)
if 'supplier' in args and args['supplier'] != '':
sql_from += ", crcLIBRARY lib "
sql_where += "AND lib.id=ill.id_crcLIBRARY AND lib.name=%s "
param.append(args['supplier'])
param = tuple(param)
requests_sql = "SELECT COUNT(*) %s %s" % (sql_from, sql_where)
satrequests_sql = "SELECT COUNT(*) %s %s \
AND arrival_date IS NOT NULL AND \
DATEDIFF(arrival_date, period_of_interest_from) < 14 " % (sql_from, sql_where)
avgdel_sql = "SELECT AVG(TIMESTAMPDIFF(DAY, period_of_interest_from, arrival_date)) %s %s \
AND arrival_date IS NOT NULL" % (sql_from, sql_where)
avgsup_sql = "SELECT AVG(TIMESTAMPDIFF(DAY, request_date, arrival_date)) %s %s \
AND arrival_date IS NOT NULL \
AND request_date IS NOT NULL" % (sql_from, sql_where)
if return_sql:
return "<ol><li>%s</li><li>%s</li><li>%s</li><li>%s</li></ol>" % \
(requests_sql % param, satrequests_sql % param,
avgdel_sql % param, avgsup_sql % param)
# Number of requests:
requests = run_sql(requests_sql, param)[0][0]
# Number of satisfied ILL requests 2 weeks after the date of request creation:
satrequests = run_sql(satrequests_sql, param)[0][0]
# Average time between the date and hour of the ILL request and
# the date and hour the item was delivered to the user
avgdel = run_sql(avgdel_sql, param)[0][0]
if avgdel:
avgdel = float(avgdel)
else:
avgdel = 0
# Average time between the date and hour the ILL request was sent to
# the supplier and the date and hour the item was delivered
avgsup = run_sql(avgsup_sql, param)[0][0]
if avgsup:
avgsup = float(avgsup)
else:
avgsup = 0
return ((requests, ), (satrequests, ), (avgdel, ), (avgsup, ))
def get_keyevent_ill_requests_lists(args, return_sql=False, limit=50):
"""
Lists:
- List of ILL requests
Filter by
- in a specified time span
- by type of request (article or book)
- by supplier
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['doctype']: type of request (article or book)
@type args['doctype']: str
@param args['supplier']: supplier
@type args['supplier']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcILLREQUEST ill "
sql_where = "WHERE status != '%s' AND request_date > %%s AND request_date < %%s " \
% CFG_BIBCIRCULATION_ITEM_STATUS_CANCELLED
param = [lower, upper]
if 'doctype' in args and args['doctype'] != '':
sql_where += "AND ill.request_type=%s "
param.append(args['doctype'])
if 'supplier' in args and args['supplier'] != '':
sql_from += ", crcLIBRARY lib "
sql_where += "AND lib.id=ill.id_crcLIBRARY AND lib.name=%s "
param.append(args['supplier'])
param = tuple(param)
if limit > 0:
limit = "LIMIT %d" % limit
else:
limit = ""
sql = "SELECT ill.id, item_info %s %s %s" % (sql_from, sql_where, limit)
if return_sql:
return sql % param
# Results:
res = [("Id", "Title", "Author", "Edition")]
for req_id, item_info in run_sql(sql, param):
item_info = eval(item_info)
try:
res.append((req_id, item_info['title'], item_info['authors'], item_info['edition']))
except KeyError:
pass
return res
def get_keyevent_trend_satisfied_ill_requests_percentage(args, return_sql=False):
"""
Returns the number of satisfied ILL requests 2 weeks after the date of request
creation and the total number of ILL requests
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['doctype']: type of document (book or article)
@type args['doctype']: str
@param args['status']: status of the request (= new, sent, etc.)
@type args['status']: str
@param args['supplier']: supplier
@type args['supplier']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
sql_from = "crcILLREQUEST ill "
sql_where = ""
param = []
if 'doctype' in args and args['doctype'] != '':
sql_where += "AND ill.request_type=%s"
param.append(args['doctype'])
if 'status' in args and args['status'] != '':
sql_where += "AND ill.status = %s "
param.append(args['status'])
else:
sql_where += "AND ill.status != %s "
param.append(CFG_BIBCIRCULATION_ILL_STATUS_CANCELLED)
if 'supplier' in args and args['supplier'] != '':
sql_from += ", crcLIBRARY lib "
sql_where += "AND lib.id=ill.id_crcLIBRARY AND lib.name=%s "
param.append(args['supplier'])
# SQL to determine satisfied ILL requests:
satisfied = _get_sql_query("request_date", args["granularity"], sql_from,
conditions="ADDDATE(request_date, 14) < NOW() AND \
(arrival_date IS NULL OR arrival_date < ADDDATE(request_date, 14)) " + sql_where)
# SQL to determine all ILL requests:
total = _get_sql_query("request_date", args["granularity"], sql_from,
conditions="ADDDATE(request_date, 14) < NOW() "+ sql_where)
# Compute the trend for both types
s_trend = _get_keyevent_trend(args, satisfied, extra_param=param,
return_sql=return_sql, sql_text="Satisfied: %s")
t_trend = _get_keyevent_trend(args, total, extra_param=param,
return_sql=return_sql, sql_text="Total: %s")
# Assemble, according to return type
if return_sql:
return "%s <br /> %s" % (s_trend, t_trend)
return [(s_trend[i][0], (s_trend[i][1], t_trend[i][1]))
for i in range(len(s_trend))]
def get_keyevent_items_statistics(args, return_sql=False):
"""
Data:
- The total number of items
- Total number of new items added in last year
Filter by
- in a specified time span
- by collection
- by UDC (see MARC field 080__a - list to be submitted)
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['udc']: MARC field 080__a
@type args['udc']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcITEM i "
sql_where = "WHERE "
param = []
if 'udc' in args and args['udc'] != '':
sql_where += "i." + _check_udc_value_where()
param.append(_get_udc_truncated(args['udc']))
# Number of items:
if sql_where == "WHERE ":
sql_where = ""
items_sql = "SELECT COUNT(i.id_bibrec) %s %s" % (sql_from, sql_where)
# Number of new items:
if sql_where == "":
sql_where = "WHERE creation_date > %s AND creation_date < %s "
else:
sql_where += " AND creation_date > %s AND creation_date < %s "
new_items_sql = "SELECT COUNT(i.id_bibrec) %s %s" % (sql_from, sql_where)
if return_sql:
return "Total: %s <br />New: %s" % (items_sql % tuple(param), new_items_sql % tuple(param + [lower, upper]))
return ((run_sql(items_sql, tuple(param))[0][0], ), (run_sql(new_items_sql, tuple(param + [lower, upper]))[0][0], ))
def get_keyevent_items_lists(args, return_sql=False, limit=50):
"""
Lists:
- The list of items
Filter by
- by library (=physical location of the item)
- by status (=on loan, available, requested, missing...)
@param args['library']: physical location of the item
@type args['library']: str
@param args['status']: on loan, available, requested, missing...
@type args['status']: str
"""
sql_from = "FROM crcITEM i "
sql_where = "WHERE "
param = []
if 'library' in args and args['library'] != '':
sql_from += ", crcLIBRARY li "
sql_where += "li.id=i.id_crcLIBRARY AND li.name=%s "
param.append(args['library'])
if 'status' in args and args['status'] != '':
if sql_where != "WHERE ":
sql_where += "AND "
sql_where += "i.status = %s "
param.append(args['status'])
param = tuple(param)
# Results:
res = [("Title", "Author", "Edition", "Barcode", "Publication date")]
if sql_where == "WHERE ":
sql_where = ""
if limit > 0:
limit = "LIMIT %d" % limit
else:
limit = ""
sql = "SELECT i.barcode, i.id_bibrec %s %s %s" % (sql_from, sql_where, limit)
if len(param) == 0:
sqlres = run_sql(sql)
else:
sqlres = run_sql(sql, tuple(param))
sql = sql % param
if return_sql:
return sql
for barcode, rec in sqlres:
author = get_fieldvalues(rec, "100__a")
if len(author) > 0:
author = author[0]
else:
author = ""
edition = get_fieldvalues(rec, "250__a")
if len(edition) > 0:
edition = edition[0]
else:
edition = ""
res.append((book_title_from_MARC(rec),
author, edition, barcode,
book_information_from_MARC(int(rec))[1]))
return res
def get_keyevent_loan_request_statistics(args, return_sql=False):
"""
Data:
- Number of hold requests, one week after the date of request creation
- Number of successful hold requests transactions
- Average time between the hold request date and the date the document was delivered, in a year
Filter by
- in a specified time span
- by item status (available, missing)
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['item_status']: available, missing...
@type args['item_status']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcLOANREQUEST lr "
sql_where = "WHERE request_date > %s AND request_date < %s "
param = [lower, upper]
if 'item_status' in args and args['item_status'] != '':
sql_from += ", crcITEM i "
sql_where += "AND lr.barcode = i.barcode AND i.status = %s "
param.append(args['item_status'])
param = tuple(param)
custom_table = get_customevent_table("loanrequest")
# Number of hold requests, one week after the date of request creation:
holds = "SELECT COUNT(*) %s, %s ws %s AND ws.request_id=lr.id AND \
DATEDIFF(ws.creation_time, lr.request_date) >= 7" % (sql_from, custom_table, sql_where)
# Number of successful hold request transactions
successful_holds = "SELECT COUNT(*) %s %s AND lr.status='%s'" % (sql_from, sql_where,
CFG_BIBCIRCULATION_REQUEST_STATUS_DONE)
# Average time between the hold request date and the date the document was delivered, in a year
avg_sql = "SELECT AVG(DATEDIFF(ws.creation_time, lr.request_date)) \
%s, %s ws %s AND ws.request_id=lr.id" % (sql_from, custom_table, sql_where)
if return_sql:
return "<ol><li>%s</li><li>%s</li><li>%s</li></ol>" % \
(holds % param, successful_holds % param, avg_sql % param)
avg = run_sql(avg_sql, param)[0][0]
if avg is not None:
avg = int(avg)
else:
avg = 0
return ((run_sql(holds, param)[0][0], ),
(run_sql(successful_holds, param)[0][0], ), (avg, ))
def get_keyevent_loan_request_lists(args, return_sql=False, limit=50):
"""
Lists:
- List of the most requested items
Filter by
- in a specified time span
- by UDC (see MARC field 080__a - list to be submitted)
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['udc']: MARC field 080__a
@type args['udc']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from = "FROM crcLOANREQUEST lr "
sql_where = "WHERE request_date > %s AND request_date < %s "
param = [lower, upper]
if 'udc' in args and args['udc'] != '':
sql_where += "AND lr." + _check_udc_value_where()
param.append(_get_udc_truncated(args['udc']))
if limit > 0:
limit = "LIMIT %d" % limit
else:
limit = ""
sql = "SELECT lr.barcode %s %s GROUP BY barcode \
ORDER BY COUNT(*) DESC %s" % (sql_from, sql_where, limit)
if return_sql:
return sql
res = [("Title", "Author", "Edition", "Barcode")]
# Most requested items:
for barcode in run_sql(sql, param):
rec = get_id_bibrec(barcode[0])
author = get_fieldvalues(rec, "100__a")
if len(author) > 0:
author = author[0]
else:
author = ""
edition = get_fieldvalues(rec, "250__a")
if len(edition) > 0:
edition = edition[0]
else:
edition = ""
res.append((book_title_from_MARC(rec), author, edition, barcode[0]))
return res
def get_keyevent_user_statistics(args, return_sql=False):
"""
Data:
- Total number of active users (to be defined = at least one transaction in the past year)
Filter by
- in a specified time span
- by registration date
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
sql_from_ill = "FROM crcILLREQUEST ill "
sql_from_loan = "FROM crcLOAN l "
sql_where_ill = "WHERE request_date > %s AND request_date < %s "
sql_where_loan = "WHERE loaned_on > %s AND loaned_on < %s "
param = (lower, upper, lower, upper)
# Total number of active users:
users = "SELECT COUNT(DISTINCT user) FROM ((SELECT id_crcBORROWER user %s %s) \
UNION (SELECT id_crcBORROWER user %s %s)) res" % \
(sql_from_ill, sql_where_ill, sql_from_loan, sql_where_loan)
if return_sql:
return users % param
return ((run_sql(users, param)[0][0], ), )
def get_keyevent_user_lists(args, return_sql=False, limit=50):
"""
Lists:
- List of most intensive users (ILL requests + Loan)
Filter by
- in a specified time span
- by registration date
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
param = (lower, upper, lower, upper)
if limit > 0:
limit = "LIMIT %d" % limit
else:
limit = ""
sql = "SELECT user, SUM(trans) FROM \
((SELECT id_crcBORROWER user, COUNT(*) trans FROM crcILLREQUEST ill \
WHERE request_date > %%s AND request_date < %%s GROUP BY id_crcBORROWER) UNION \
(SELECT id_crcBORROWER user, COUNT(*) trans FROM crcLOAN l WHERE loaned_on > %%s AND \
loaned_on < %%s GROUP BY id_crcBORROWER)) res GROUP BY user ORDER BY SUM(trans) DESC \
%s" % limit
if return_sql:
return sql % param
res = [("Name", "Address", "Mailbox", "E-mail", "Number of transactions")]
# List of most intensive users (ILL requests + Loan):
for borrower_id, trans in run_sql(sql, param):
name, address, mailbox, email = get_borrower_data(borrower_id)
res.append((name, address, mailbox, email, int(trans)))
return res
# KEY EVENT SNAPSHOT SECTION
def get_keyevent_snapshot_uptime_cmd():
"""
A specific implementation of get_current_event().
@return: The std-out from the UNIX command 'uptime'.
@type: str
"""
return _run_cmd('uptime').strip().replace('  ', ' ')
def get_keyevent_snapshot_apache_processes():
"""
A specific implementation of get_current_event().
@return: The number of Apache processes (root + children).
@type: str
"""
# The number of Apache processes (root+children)
return _run_cmd('ps -e | grep apache2 | grep -v grep | wc -l')
def get_keyevent_snapshot_bibsched_status():
"""
A specific implementation of get_current_event().
@return: Information about the number of tasks in the different status modes.
@type: [(str, int)]
"""
sql = "SELECT status, COUNT(status) FROM schTASK GROUP BY status"
return [(x[0], int(x[1])) for x in run_sql(sql)]
def get_keyevent_snapshot_sessions():
"""
A specific implementation of get_current_event().
@return: The current number of website visitors (guests, logged in)
@type: (int, int)
"""
# SQL to retrieve sessions in the Guests
sql = "SELECT COUNT(session_expiry) " + \
"FROM session INNER JOIN user ON uid=id " + \
"WHERE email = '' AND " + \
"session_expiry-%d < unix_timestamp() AND " \
% WEBSTAT_SESSION_LENGTH + \
"unix_timestamp() < session_expiry"
guests = run_sql(sql)[0][0]
# SQL to retrieve sessions in the Logged in users
sql = "SELECT COUNT(session_expiry) " + \
"FROM session INNER JOIN user ON uid=id " + \
"WHERE email <> '' AND " + \
"session_expiry-%d < unix_timestamp() AND " \
% WEBSTAT_SESSION_LENGTH + \
"unix_timestamp() < session_expiry"
logged_ins = run_sql(sql)[0][0]
# Assemble, according to return type
return (guests, logged_ins)
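# Both session queries above count a session as active when "now" falls in
# the final WEBSTAT_SESSION_LENGTH seconds before its expiry timestamp, i.e.
# session_expiry - LENGTH < unix_timestamp() < session_expiry. The same test,
# sketched in plain Python (hypothetical helper, not part of this module):
def _is_active(session_expiry, now, session_length):
    """A session counts as active when `now` lies inside the final
    `session_length` seconds before its expiry."""
    return session_expiry - session_length < now < session_expiry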
def get_keyevent_bibcirculation_report(freq='yearly'):
"""
Monthly and yearly report with the total number of circulation
transactions (loans, renewals, returns, ILL requests, hold request).
@param freq: yearly or monthly
@type freq: str
@return: loans, renewals, returns, ILL requests, hold request
@type: (int, int, int, int, int)
"""
if freq == 'monthly':
datefrom = datetime.date.today().strftime("%Y-%m-01 00:00:00")
else: #yearly
datefrom = datetime.date.today().strftime("%Y-01-01 00:00:00")
loans, renewals = run_sql("SELECT COUNT(*), \
SUM(number_of_renewals) \
FROM crcLOAN WHERE loaned_on > %s", (datefrom, ))[0]
returns = run_sql("SELECT COUNT(*) FROM crcLOAN \
WHERE returned_on!='0000-00-00 00:00:00' and loaned_on > %s", (datefrom, ))[0][0]
illrequests = run_sql("SELECT COUNT(*) FROM crcILLREQUEST WHERE request_date > %s",
(datefrom, ))[0][0]
holdrequest = run_sql("SELECT COUNT(*) FROM crcLOANREQUEST WHERE request_date > %s",
(datefrom, ))[0][0]
return (loans, renewals, returns, illrequests, holdrequest)
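# The monthly/yearly window selection above can be sketched as a small
# standalone helper. `report_start` is a hypothetical name, not part of
# this module; it only mirrors the `datefrom` logic for illustration.
def report_start(freq, today=None):
    """Sketch of the date window used by get_keyevent_bibcirculation_report():
    'monthly' starts on the first of the current month, anything else falls
    back to January 1st of the current year."""
    today = today or datetime.date.today()
    if freq == 'monthly':
        return today.strftime("%Y-%m-01 00:00:00")
    return today.strftime("%Y-01-01 00:00:00")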
def get_last_updates():
"""
    List the date/time when the last updates were done (easy-to-read format).
    @return: last indexing, last ranking, last sorting, last webcoll run
@type: (datetime, datetime, datetime, datetime)
"""
try:
last_index = convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(run_sql('SELECT last_updated FROM idxINDEX WHERE \
name="global"')[0][0])))
last_rank = convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(run_sql('SELECT last_updated FROM rnkMETHOD ORDER BY \
last_updated DESC LIMIT 1')[0][0])))
last_sort = convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(run_sql('SELECT last_updated FROM bsrMETHODDATA ORDER BY \
last_updated DESC LIMIT 1')[0][0])))
file_coll_last_update = open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, 'r')
last_coll = convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(file_coll_last_update.read())))
file_coll_last_update.close()
# database not filled
except IndexError:
return ("", "", "", "")
return (last_index, last_rank, last_sort, last_coll)
def get_list_link(process, category=None):
"""
Builds the link for the list of records not indexed, ranked, sorted or
collected.
@param process: kind of process the records are waiting for (index, rank,
sort, collect)
@type process: str
@param category: specific sub-category of the process.
Index: global, collection, abstract, author, keyword,
reference, reportnumber, title, fulltext, year,
journal, collaboration, affiliation, exactauthor,
                       caption, firstauthor, exactfirstauthor, authorcount
Rank: wrd, demo_jif, citation, citerank_citation_t,
citerank_pagerank_c, citerank_pagerank_t
Sort: latest first, title, author, report number,
most cited
Collect: Empty / None
@type category: str
@return: link text
@type: string
"""
if process == "index":
list_registers = run_sql('SELECT id FROM bibrec WHERE \
modification_date > (SELECT last_updated FROM \
idxINDEX WHERE name=%s)', (category,))
elif process == "rank":
list_registers = run_sql('SELECT id FROM bibrec WHERE \
modification_date > (SELECT last_updated FROM \
rnkMETHOD WHERE name=%s)', (category,))
elif process == "sort":
list_registers = run_sql('SELECT id FROM bibrec WHERE \
modification_date > (SELECT last_updated FROM \
bsrMETHODDATA WHERE id_bsrMETHOD=(SELECT id \
FROM bsrMETHOD WHERE name=%s))', (category,))
elif process == "collect":
file_coll_last_update = open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, 'r')
coll_last_update = file_coll_last_update.read()
file_coll_last_update.close()
list_registers = run_sql('SELECT id FROM bibrec WHERE \
modification_date > %s', (coll_last_update,))
# build the link
if list_registers == ():
return "Up to date"
link = '<a href="' + CFG_SITE_URL + '/search?p='
for register in list_registers:
link += 'recid%3A' + str(register[0]) + '+or+'
    # delete the trailing '+or+'
    link = link[:-4]
link += '">' + str(len(list_registers)) + '</a>'
return link
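# The link assembly above can be isolated into a pure helper for clarity.
# `build_recid_search_link` is a hypothetical name, not part of this module;
# it ORs together URL-encoded "recid:<id>" terms and labels the anchor with
# the number of pending records, exactly as get_list_link() does inline.
def build_recid_search_link(site_url, recids):
    """Sketch of the pending-records link built in get_list_link()."""
    if not recids:
        return "Up to date"
    query = '+or+'.join('recid%3A' + str(r) for r in recids)
    return '<a href="' + site_url + '/search?p=' + query + '">' \
        + str(len(recids)) + '</a>'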
def get_search_link(record_id):
"""
    Auxiliary; builds the direct link for a given record.
@param record_id: record's id number
@type record_id: int
@return: link text
@type: string
"""
link = '<a href="' + CFG_SITE_URL + '/record/' + \
str(record_id) + '">Record [' + str(record_id) + ']</a>'
return link
def get_ingestion_matching_records(request=None, limit=25):
"""
    Fetches all the records matching a given pattern, arranges them by last
    modification date and returns a list.
@param request: requested pattern to match
@type request: str
@return: list of records matching a pattern,
(0,) if no request,
(-1,) if the request was invalid
@type: list
"""
    if request is None or request == "":
return (0,)
try:
records = list(search_pattern(p=request))
except:
return (-1,)
if records == []:
return records
# order by most recent modification date
query = 'SELECT id FROM bibrec WHERE '
for r in records:
query += 'id="' + str(r) + '" OR '
    query = query[:-4]
query += ' ORDER BY modification_date DESC LIMIT %s'
list_records = run_sql(query, (limit,))
final_list = []
for lr in list_records:
final_list.append(lr[0])
return final_list
def get_record_ingestion_status(record_id):
"""
Returns the amount of ingestion methods not updated yet to a given record.
If 0, the record is up to date.
@param record_id: record id number
@type record_id: int
@return: number of methods not updated for the record
@type: int
"""
counter = 0
counter += run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT last_updated FROM \
idxINDEX WHERE name="global")', (record_id, ))[0][0]
counter += run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT last_updated FROM \
rnkMETHOD ORDER BY last_updated DESC LIMIT 1)', \
(record_id, ))[0][0]
    counter += run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT last_updated FROM \
bsrMETHODDATA ORDER BY last_updated DESC LIMIT 1)', \
(record_id, ))[0][0]
file_coll_last_update = open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, 'r')
last_coll = file_coll_last_update.read()
file_coll_last_update.close()
counter += run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND \
modification_date >\
%s', (record_id, last_coll,))[0][0]
return counter
def get_specific_ingestion_status(record_id, process, method=None):
"""
Returns whether a record is or not up to date for a given
process and method.
@param record_id: identification number of the record
@type record_id: int
@param process: kind of process the records may be waiting for (index,
rank, sort, collect)
@type process: str
@param method: specific sub-method of the process.
Index: global, collection, abstract, author, keyword,
reference, reportnumber, title, fulltext, year,
journal, collaboration, affiliation, exactauthor,
caption, firstauthor, exactfirstauthor, authorcount
Rank: wrd, demo_jif, citation, citerank_citation_t,
citerank_pagerank_c, citerank_pagerank_t
Sort: latest first, title, author, report number,
most cited
Collect: Empty / None
    @type method: str
@return: text: None if the record is up to date
Last time the method was updated if it is waiting
@type: date/time string
"""
exist = run_sql('SELECT COUNT(*) FROM bibrec WHERE id=%s', (record_id, ))
if exist[0][0] == 0:
return "REG not in DB"
if process == "index":
list_registers = run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT \
last_updated FROM idxINDEX WHERE name=%s)',
(record_id, method,))
last_time = run_sql ('SELECT last_updated FROM idxINDEX WHERE \
name=%s', (method,))[0][0]
elif process == "rank":
list_registers = run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT \
last_updated FROM rnkMETHOD WHERE name=%s)',
(record_id, method,))
last_time = run_sql ('SELECT last_updated FROM rnkMETHOD WHERE \
name=%s', (method,))[0][0]
elif process == "sort":
list_registers = run_sql('SELECT COUNT(*) FROM bibrec WHERE \
id=%s AND modification_date > (SELECT \
last_updated FROM bsrMETHODDATA WHERE \
id_bsrMETHOD=(SELECT id FROM bsrMETHOD \
WHERE name=%s))', (record_id, method,))
last_time = run_sql ('SELECT last_updated FROM bsrMETHODDATA WHERE \
id_bsrMETHOD=(SELECT id FROM bsrMETHOD \
WHERE name=%s)', (method,))[0][0]
elif process == "collect":
file_coll_last_update = open(CFG_CACHE_LAST_UPDATED_TIMESTAMP_FILE, 'r')
last_time = file_coll_last_update.read()
file_coll_last_update.close()
list_registers = run_sql('SELECT COUNT(*) FROM bibrec WHERE id=%s \
AND modification_date > %s',
(record_id, last_time,))
# no results means the register is up to date
if list_registers[0][0] == 0:
return None
else:
return convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(last_time)))
def get_title_ingestion(record_id, last_modification):
"""
    Auxiliary; builds a direct link for a given record, with its last
    modification date.
@param record_id: id number of the record
@type record_id: string
@param last_modification: date/time of the last modification
@type last_modification: string
@return: link text
@type: string
"""
return '<h3><a href="%s/record/%s">Record [%s] last modification: %s</a></h3>' \
% (CFG_SITE_URL, record_id, record_id, last_modification)
def get_record_last_modification (record_id):
"""
Returns the date/time of the last modification made to a given record.
@param record_id: id number of the record
@type record_id: int
@return: date/time of the last modification
@type: string
"""
return convert_datestruct_to_dategui(convert_datetext_to_datestruct \
(str(run_sql('SELECT modification_date FROM bibrec \
WHERE id=%s', (record_id,))[0][0])))
def get_general_status():
"""
    Returns an approximate number of ingestion processes not yet applied to
    new or updated records, using the "global" category.
@return: number of processes not updated
@type: int
"""
return run_sql('SELECT COUNT(*) FROM bibrec WHERE \
modification_date > (SELECT last_updated FROM \
idxINDEX WHERE name="global")')[0][0]
# ERROR LOG STATS
def update_error_log_analyzer():
"""Creates splitted files for today's errors"""
_run_cmd('bash %s/webstat -e -is' % CFG_BINDIR)
def get_invenio_error_log_ranking():
""" Returns the ranking of the errors in the invenio log"""
return _run_cmd('bash %s/webstat -e -ir' % CFG_BINDIR)
def get_invenio_last_n_errors(nerr):
"""Returns the last nerr errors in the invenio log (without details)"""
return _run_cmd('bash %s/webstat -e -il %d' % (CFG_BINDIR, nerr))
def get_invenio_error_details(error):
"""Returns the complete text of the invenio error."""
out = _run_cmd('bash %s/webstat -e -id %s' % (CFG_BINDIR, error))
return out
def get_apache_error_log_ranking():
""" Returns the ranking of the errors in the apache log"""
return _run_cmd('bash %s/webstat -e -ar' % CFG_BINDIR)
# CUSTOM EVENT SECTION
def get_customevent_trend(args):
"""
Returns trend data for a custom event over a given
timestamp range.
@param args['event_id']: The event id
@type args['event_id']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
    @param args['cols']: Columns and the content to filter on; if absent
                         or empty, all columns are included
    @type args['cols']: [ [ str, str ], ]
"""
# Get a MySQL friendly date
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
tbl_name = get_customevent_table(args['event_id'])
col_names = get_customevent_args(args['event_id'])
where = []
sql_param = [lower, upper]
for col_bool, col_title, col_content in args['cols']:
        if col_title not in col_names:
continue
if col_content:
if col_bool == "" or not where:
where.append(wash_table_column_name(col_title))
elif col_bool == "and":
where.append("AND %s"
% wash_table_column_name(col_title))
elif col_bool == "or":
where.append("OR %s"
% wash_table_column_name(col_title))
elif col_bool == "and_not":
where.append("AND NOT %s"
% wash_table_column_name(col_title))
else:
continue
where.append(" LIKE %s")
sql_param.append("%" + col_content + "%")
sql = _get_sql_query("creation_time", args['granularity'], tbl_name, " ".join(where))
return _get_trend_from_actions(run_sql(sql, tuple(sql_param)), 0,
args['t_start'], args['t_end'],
args['granularity'], args['t_format'])
def get_customevent_dump(args):
"""
    Similar to a get_event_trend implementation, but no refining aka frequency
    handling is carried out whatsoever. This is just a raw dump.
@param args['event_id']: The event id
@type args['event_id']: str
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
    @param args['cols']: Columns and the content to filter on; if absent
                         or empty, all columns are included
    @type args['cols']: [ [ str, str ], ]
"""
# Get a MySQL friendly date
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
# Get customevents
# events_list = [(creation_time, event, [arg1, arg2, ...]), ...]
event_list = []
event_cols = {}
    for i, event_id in enumerate(args['ids']):
        i = str(i)  # used as the suffix of the per-event 'cols<i>' argument
# Get all the event arguments and creation times
tbl_name = get_customevent_table(event_id)
col_names = get_customevent_args(event_id)
sql_query = ["SELECT * FROM %s WHERE creation_time > '%%s'" % wash_table_column_name(tbl_name), (lower,)] # kwalitee: disable=sql
sql_query.append("AND creation_time < '%s'" % upper)
sql_param = []
for col_bool, col_title, col_content in args['cols' + i]:
            if col_title not in col_names:
continue
if col_content:
if col_bool == "and" or col_bool == "":
sql_query.append("AND %s" % \
wash_table_column_name(col_title))
elif col_bool == "or":
sql_query.append("OR %s" % \
wash_table_column_name(col_title))
elif col_bool == "and_not":
sql_query.append("AND NOT %s" % \
wash_table_column_name(col_title))
else:
continue
sql_query.append(" LIKE %s")
sql_param.append("%" + col_content + "%")
sql_query.append("ORDER BY creation_time DESC")
sql = ' '.join(sql_query)
res = run_sql(sql, tuple(sql_param))
for row in res:
event_list.append((row[1], event_id, row[2:]))
# Get the event col names
try:
event_cols[event_id] = cPickle.loads(run_sql(
"SELECT cols FROM staEVENT WHERE id = %s",
(event_id, ))[0][0])
except TypeError:
event_cols[event_id] = ["Unnamed"]
event_list.sort()
output = []
for row in event_list:
temp = [row[1], row[0].strftime('%Y-%m-%d %H:%M:%S')]
arguments = ["%s: %s" % (event_cols[row[1]][i],
row[2][i]) for i in range(len(row[2]))]
temp.extend(arguments)
output.append(tuple(temp))
return output
def get_customevent_table(event_id):
"""
    Helper function that, for a certain event id, retrieves the corresponding
    event table name.
"""
res = run_sql(
"SELECT CONCAT('staEVENT', number) FROM staEVENT WHERE id = %s", (event_id, ))
try:
return res[0][0]
except IndexError:
# No such event table
return None
def get_customevent_args(event_id):
"""
    Helper function that, for a certain event id, retrieves the corresponding
    event argument (column) names.
"""
res = run_sql("SELECT cols FROM staEVENT WHERE id = %s", (event_id, ))
try:
if res[0][0]:
return cPickle.loads(res[0][0])
else:
return []
except IndexError:
# No such event table
return None
# CUSTOM SUMMARY SECTION
def get_custom_summary_data(query, tag):
"""Returns the annual report data for the specified year
@param query: Search query to make customized report
@type query: str
@param tag: MARC tag for the output
@type tag: str
"""
# Check arguments
if tag == '':
tag = CFG_JOURNAL_TAG.replace("%", "p")
# First get records of the year
recids = perform_request_search(p=query, of="id", wl=0)
# Then return list by tag
pub = get_most_popular_field_values(recids, tag)
if len(pub) == 0:
return []
if CFG_CERN_SITE:
total = sum([x[1] for x in pub])
else:
others = 0
total = 0
first_other = -1
for elem in pub:
total += elem[1]
if elem[1] < 2:
if first_other == -1:
first_other = pub.index(elem)
others += elem[1]
        if first_other != -1:
            del pub[first_other:]
if others != 0:
pub.append(('Others', others))
pub.append(('TOTAL', total))
return pub
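# The non-CERN aggregation above can be sketched as a standalone function.
# `bucket_small_values` is a hypothetical name; unlike the in-place deletion
# in get_custom_summary_data() it filters by a threshold, which gives the
# same result under the assumption that `pub` is sorted by count, descending.
def bucket_small_values(pub, threshold=2):
    """Collapse entries below `threshold` into an 'Others' bucket and
    append a TOTAL row, mirroring get_custom_summary_data()."""
    total = sum(count for _, count in pub)
    kept = [(name, count) for name, count in pub if count >= threshold]
    others = total - sum(count for _, count in kept)
    result = list(kept)
    if others:
        result.append(('Others', others))
    result.append(('TOTAL', total))
    return result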
def create_custom_summary_graph(data, path, title):
"""
Creates a pie chart with the information from the custom summary and
saves it in the file specified by the path argument
"""
# If no input, we don't bother about anything
if len(data) == 0:
return
os.environ['HOME'] = CFG_TMPDIR
try:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
except ImportError:
return
# make a square figure and axes
matplotlib.rcParams['font.size'] = 8
labels = [x[0] for x in data]
numb_elem = len(labels)
width = 6 + float(numb_elem) / 7
gfile = plt.figure(1, figsize=(width, 6))
plt.axes([0.1, 0.1, 4.2 / width, 0.7])
numb = [x[1] for x in data]
total = sum(numb)
fracs = [x * 100 / total for x in numb]
colors = []
random.seed()
for i in range(numb_elem):
col = 0.5 + float(i) / (float(numb_elem) * 2.0)
rand = random.random() / 2.0
if i % 3 == 0:
red = col
green = col + rand
blue = col - rand
if green > 1.0:
green = 1
elif i % 3 == 1:
red = col - rand
green = col
blue = col + rand
if blue > 1.0:
blue = 1
elif i % 3 == 2:
red = col + rand
green = col - rand
blue = col
if red > 1.0:
red = 1
colors.append((red, green, blue))
patches = plt.pie(fracs, colors=tuple(colors), labels=labels,
autopct='%1i%%', pctdistance=0.8, shadow=True)[0]
ttext = plt.title(title)
plt.setp(ttext, size='xx-large', color='b', family='monospace', weight='extra bold')
legend_keywords = {"prop": {"size": "small"}}
plt.figlegend(patches, labels, 'lower right', **legend_keywords)
plt.savefig(path)
plt.close(gfile)
# GRAPHER
def create_graph_trend(trend, path, settings):
"""
Creates a graph representation out of data produced from get_event_trend.
@param trend: The trend data
@type trend: [(str, str|int|(str|int,...))]
@param path: Where to store the graph
@type path: str
@param settings: Dictionary of graph parameters
@type settings: dict
"""
# If no input, we don't bother about anything
    if not trend:
return
# If no filename is given, we'll assume STD-out format and ASCII.
if path == '':
settings["format"] = 'asciiart'
if settings["format"] == 'asciiart':
create_graph_trend_ascii_art(trend, path, settings)
elif settings["format"] == 'gnuplot':
create_graph_trend_gnu_plot(trend, path, settings)
elif settings["format"] == "flot":
create_graph_trend_flot(trend, path, settings)
def create_graph_trend_ascii_art(trend, path, settings):
"""Creates the graph trend using ASCII art"""
out = ""
if settings["multiple"] is not None:
# Tokens that will represent the different data sets (maximum 16 sets)
# Set index (=100) to the biggest of the histogram sums
index = max([sum(x[1]) for x in trend])
# Print legend box
out += "Legend: %s\n\n" % ", ".join(["%s (%s)" % x
for x in zip(settings["multiple"], WEBSTAT_GRAPH_TOKENS)])
else:
index = max([x[1] for x in trend])
width = 82
# Figure out the max length of the xtics, in order to left align
xtic_max_len = max([len(_to_datetime(x[0]).strftime(
settings["xtic_format"])) for x in trend])
for row in trend:
# Print the xtic
xtic = _to_datetime(row[0]).strftime(settings["xtic_format"])
out_row = xtic + ': ' + ' ' * (xtic_max_len - len(xtic)) + '|'
try:
col_width = (1.0 * width / index)
except ZeroDivisionError:
col_width = 0
if settings["multiple"] is not None:
# The second value of the row-tuple, represents the n values from
# the n data sets. Each set, will be represented by a different
# ASCII character, chosen from the randomized string
# 'WEBSTAT_GRAPH_TOKENS'.
# NOTE: Only up to 16 (len(WEBSTAT_GRAPH_TOKENS)) data
# sets are supported.
total = sum(row[1])
for i in range(len(row[1])):
col = row[1][i]
try:
out_row += WEBSTAT_GRAPH_TOKENS[i] * int(1.0 * col * col_width)
except ZeroDivisionError:
break
if len([i for i in row[1] if type(i) is int and i > 0]) - 1 > 0:
out_row += out_row[-1]
else:
total = row[1]
try:
out_row += '-' * int(1.0 * total * col_width)
except ZeroDivisionError:
break
# Print sentinel, and the total
out += out_row + '>' + ' ' * (xtic_max_len + 4 +
width - len(out_row)) + str(total) + '\n'
# Write to destination file
if path == '':
print out
else:
open(path, 'w').write(out)
def create_graph_trend_gnu_plot(trend, path, settings):
"""Creates the graph trend using the GNU plot library"""
try:
import Gnuplot
except ImportError:
return
gnup = Gnuplot.Gnuplot()
gnup('set style data steps')
if 'size' in settings:
gnup('set terminal png tiny size %s' % settings['size'])
else:
gnup('set terminal png tiny')
gnup('set output "%s"' % path)
if settings["title"] != '':
gnup.title(settings["title"].replace("\"", ""))
if settings["xlabel"] != '':
gnup.xlabel(settings["xlabel"])
if settings["ylabel"] != '':
gnup.ylabel(settings["ylabel"])
if settings["xtic_format"] != '':
xtics = 'set xtics ('
xtics += ', '.join(['"%s" %d' %
(_to_datetime(trend[i][0], '%Y-%m-%d \
%H:%M:%S').strftime(settings["xtic_format"]), i)
for i in range(len(trend))]) + ')'
gnup(xtics)
gnup('set format y "%.0f"')
# If we have multiple data sets, we need to do
# some magic to make Gnuplot eat it,
# This is basically a matrix transposition,
# and the addition of index numbers.
if settings["multiple"] is not None:
cols = len(trend[0][1])
rows = len(trend)
plot_items = []
y_max = 0
y_min = 0
for col in range(cols):
data = []
for row in range(rows):
data.append([row, trend[row][1][col]])
data.append([rows, trend[-1][1][col]])
plot_items.append(Gnuplot.PlotItems
.Data(data, title=settings["multiple"][col]))
tmp_max = max([x[col] for x in data])
tmp_min = min([x[col] for x in data])
if tmp_max > y_max:
y_max = tmp_max
if tmp_min < y_min:
y_min = tmp_min
if y_max - y_min < 5 and y_min != 0:
gnup('set ytic %d, 1, %d' % (y_min - 1, y_max + 2))
elif y_max < 5:
gnup('set ytic 1')
gnup.plot(*plot_items)
else:
data = [x[1] for x in trend]
data.append(trend[-1][1])
y_max = max(data)
y_min = min(data)
if y_max - y_min < 5 and y_min != 0:
gnup('set ytic %d, 1, %d' % (y_min - 1, y_max + 2))
elif y_max < 5:
gnup('set ytic 1')
gnup.plot(data)
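# The "matrix transposition" performed above for multiple data sets can be
# sketched in isolation. `transpose_trend` is a hypothetical helper: it turns
# row-major trend tuples into one [row-index, value] series per column, with
# the last point repeated so Gnuplot's step style closes the final step.
def transpose_trend(trend):
    """Sketch of the per-column series built in create_graph_trend_gnu_plot()."""
    n_cols = len(trend[0][1])
    n_rows = len(trend)
    series = []
    for col in range(n_cols):
        data = [[row, trend[row][1][col]] for row in range(n_rows)]
        data.append([n_rows, trend[-1][1][col]])  # repeat last point
        series.append(data)
    return series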
def create_graph_trend_flot(trend, path, settings):
"""Creates the graph trend using the flot library"""
size = settings.get("size", "500,400").split(",")
title = cgi.escape(settings["title"].replace(" ", "")[:10])
out = """<!--[if IE]><script language="javascript" type="text/javascript"
src="%(site)s/js/excanvas.min.js"></script><![endif]-->
<script language="javascript" type="text/javascript" src="%(site)s/js/jquery.flot.min.js"></script>
<script language="javascript" type="text/javascript" src="%(site)s/js/jquery.flot.selection.min.js"></script>
<script id="source" language="javascript" type="text/javascript">
document.write('<div style="float:left"><div id="placeholder%(title)s" style="width:%(width)spx;height:%(height)spx"></div></div>'+
'<div id="miniature%(title)s" style="float:left;margin-left:20px;margin-top:50px">' +
'<div id="overview%(title)s" style="width:%(hwidth)dpx;height:%(hheigth)dpx"></div>' +
'<p id="overviewLegend%(title)s" style="margin-left:10px"></p>' +
'</div>');
$(function () {
function parseDate%(title)s(sdate){
var div1 = sdate.split(' ');
var day = div1[0].split('-');
var hour = div1[1].split(':');
return new Date(day[0], day[1]-1, day[2], hour[0], hour[1], hour[2]).getTime() - (new Date().getTimezoneOffset() * 60 * 1000) ;
}
function getData%(title)s() {""" % \
{'site': CFG_SITE_URL, 'width': size[0], 'height': size[1], 'hwidth': int(size[0]) / 2,
'hheigth': int(size[1]) / 2, 'title': title}
    if len(trend) > 1:
granularity_td = (_to_datetime(trend[1][0], '%Y-%m-%d %H:%M:%S') -
_to_datetime(trend[0][0], '%Y-%m-%d %H:%M:%S'))
else:
granularity_td = datetime.timedelta()
# Create variables with the format dn = [[x1,y1], [x2,y2]]
minx = trend[0][0]
maxx = trend[0][0]
if settings["multiple"] is not None:
cols = len(trend[0][1])
rows = len(trend)
first = 0
for col in range(cols):
out += """var d%d = [""" % (col)
for row in range(rows):
                if first == 0:
first = 1
else:
out += ", "
if trend[row][0] < minx:
minx = trend[row][0]
if trend[row][0] > maxx:
maxx = trend[row][0]
out += '[parseDate%s("%s"),%d]' % \
(title, _to_datetime(trend[row][0], '%Y-%m-%d \
%H:%M:%S'), trend[row][1][col])
out += ", [parseDate%s('%s'), %d]];\n" % (title,
_to_datetime(maxx, '%Y-%m-%d %H:%M:%S')+ granularity_td,
trend[-1][1][col])
out += "return [\n"
first = 0
for col in range(cols):
if first == 0:
first = 1
else:
out += ", "
out += '{data : d%d, label : "%s"}' % \
(col, settings["multiple"][col])
out += "];\n}\n"
else:
out += """var d1 = ["""
rows = len(trend)
first = 0
for row in range(rows):
if trend[row][0] < minx:
minx = trend[row][0]
if trend[row][0] > maxx:
maxx = trend[row][0]
if first == 0:
first = 1
else:
out += ', '
out += '[parseDate%s("%s"),%d]' % \
(title, _to_datetime(trend[row][0], '%Y-%m-%d %H:%M:%S'),
trend[row][1])
out += """, [parseDate%s("%s"), %d]];
return [d1];
}
""" % (title, _to_datetime(maxx, '%Y-%m-%d %H:%M:%S') +
granularity_td, trend[-1][1])
# Set options
tics = """yaxis: {
tickDecimals : 0
},"""
if settings["xtic_format"] != '':
        current = _to_datetime(maxx, '%Y-%m-%d %H:%M:%S')
        next_tick = current + granularity_td
        if (granularity_td.seconds + granularity_td.days * 24 * 3600) > 2592000:
            # Monthly granularity or coarser: snap to the end of the current
            # month (day=31 would raise ValueError for shorter months).
            import calendar
            next_tick = current.replace(
                day=calendar.monthrange(current.year, current.month)[1])
        tics += 'xaxis: { mode:"time",min:parseDate%s("%s"),max:parseDate%s("%s")},'\
                % (title, _to_datetime(minx, '%Y-%m-%d %H:%M:%S'), title, next_tick)
out += """var options%s ={
series: {
lines: { steps: true, fill: true},
points: { show: false }
},
legend: {show: false},
%s
grid: { hoverable: true, clickable: true },
selection: { mode: "xy" }
};
""" % (title, tics, )
# Write the plot method in javascript
out += """var startData%(title)s = getData%(title)s();
var plot%(title)s = $.plot($("#placeholder%(title)s"), startData%(title)s, options%(title)s);
// setup overview
var overview%(title)s = $.plot($("#overview%(title)s"), startData%(title)s, {
legend: { show: true, container: $("#overviewLegend%(title)s") },
series: {
lines: { steps: true, fill: true, lineWidth: 1},
shadowSize: 0
},
%(tics)s
grid: { color: "#999" },
selection: { mode: "xy" }
});
""" % {"title": title, "tics": tics}
# Tooltip and zoom
out += """
function showTooltip%(title)s(x, y, contents) {
$('<div id="tooltip%(title)s">' + contents + '</div>').css( {
position: 'absolute',
display: 'none',
top: y - 5,
left: x + 10,
border: '1px solid #fdd',
padding: '2px',
'background-color': '#fee',
opacity: 0.80
}).appendTo("body").fadeIn(200);
}
var previousPoint%(title)s = null;
$("#placeholder%(title)s").bind("plothover", function (event, pos, item) {
if (item) {
if (previousPoint%(title)s != item.datapoint) {
previousPoint%(title)s = item.datapoint;
$("#tooltip%(title)s").remove();
var y = item.datapoint[1];
showTooltip%(title)s(item.pageX, item.pageY, y);
}
}
else {
$("#tooltip%(title)s").remove();
previousPoint%(title)s = null;
}
});
$("#placeholder%(title)s").bind("plotclick", function (event, pos, item) {
if (item) {
plot%(title)s.highlight(item.series, item.datapoint);
}
});
// now connect the two
$("#placeholder%(title)s").bind("plotselected", function (event, ranges) {
// clamp the zooming to prevent eternal zoom
if (ranges.xaxis.to - ranges.xaxis.from < 0.00001){
ranges.xaxis.to = ranges.xaxis.from + 0.00001;}
if (ranges.yaxis.to - ranges.yaxis.from < 0.00001){
ranges.yaxis.to = ranges.yaxis.from + 0.00001;}
// do the zooming
plot%(title)s = $.plot($("#placeholder%(title)s"), getData%(title)s(ranges.xaxis.from, ranges.xaxis.to),
$.extend(true, {}, options%(title)s, {
xaxis: { min: ranges.xaxis.from, max: ranges.xaxis.to },
yaxis: { min: ranges.yaxis.from, max: ranges.yaxis.to }
}));
// don't fire event on the overview to prevent eternal loop
overview%(title)s.setSelection(ranges, true);
});
$("#overview%(title)s").bind("plotselected", function (event, ranges) {
plot%(title)s.setSelection(ranges);
});
});
</script>
    <noscript>Your browser does not support JavaScript!
    Please select another output format.</noscript>""" % {'title' : title}
open(path, 'w').write(out)
def get_numeric_stats(data, multiple):
""" Returns average, max and min values for data """
data = [x[1] for x in data]
if data == []:
return (0, 0, 0)
if multiple:
lists = []
for i in range(len(data[0])):
lists.append([x[i] for x in data])
return ([float(sum(x)) / len(x) for x in lists], [max(x) for x in lists],
[min(x) for x in lists])
else:
return (float(sum(data)) / len(data), max(data), min(data))
def create_graph_table(data, path, settings):
"""
Creates a html table representation out of data.
@param data: The data
@type data: (str,...)
@param path: Where to store the graph
@type path: str
@param settings: Dictionary of table parameters
@type settings: dict
"""
out = """<table border="1">
"""
if settings['rows'] == []:
for row in data:
out += """<tr>
"""
for value in row:
out += """<td>%s</td>
""" % value
out += "</tr>"
else:
for dta, value in zip(settings['rows'], data):
out += """<tr>
<td>%s</td>
<td>
""" % dta
for vrow in value:
out += """%s<br />
""" % vrow
out = out[:-6] + "</td></tr>"
out += "</table>"
open(path, 'w').write(out)
def create_graph_dump(dump, path):
"""
Creates a graph representation out of data produced from get_event_trend.
@param dump: The dump data
@type dump: [(str|int,...)]
@param path: Where to store the graph
@type path: str
"""
out = ""
if len(dump) == 0:
out += "No actions for this custom event " + \
"are registered in the given time range."
else:
# Make every row in dump equally long, insert None if appropriate.
max_len = max([len(x) for x in dump])
events = [tuple(list(x) + [None] * (max_len - len(x))) for x in dump]
cols = ["Event", "Date and time"] + ["Argument %d" % i
for i in range(max_len - 2)]
column_widths = [max([len(str(x[i])) \
for x in events + [cols]]) + 3 for i in range(len(events[0]))]
for i in range(len(cols)):
out += cols[i] + ' ' * (column_widths[i] - len(cols[i]))
out += "\n"
for i in range(len(cols)):
out += '=' * (len(cols[i])) + ' ' * (column_widths[i] - len(cols[i]))
out += "\n\n"
for action in dump:
for i in range(len(action)):
if action[i] is None:
temp = ''
else:
temp = action[i]
out += str(temp) + ' ' * (column_widths[i] - len(str(temp)))
out += "\n"
# Write to destination file
if path == '':
print out
else:
open(path, 'w').write(out)
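# The fixed-width layout used above can be sketched as a pure function.
# `format_dump_table` is a hypothetical name; it pads each column to its
# widest cell (header included) plus three spaces, as create_graph_dump() does.
def format_dump_table(cols, rows):
    """Sketch of the plain-text table layout from create_graph_dump()."""
    widths = [max(len(str(r[i])) for r in rows + [cols]) + 3
              for i in range(len(cols))]
    out = ''.join(str(c).ljust(w) for c, w in zip(cols, widths)) + '\n'
    out += ''.join(('=' * len(str(c))).ljust(w)
                   for c, w in zip(cols, widths)) + '\n\n'
    for row in rows:
        out += ''.join(str(cell).ljust(w)
                       for cell, w in zip(row, widths)) + '\n'
    return out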
# EXPORT DATA TO SLS
def get_search_frequency(day=datetime.datetime.now().date()):
"""Returns the number of searches performed in the chosen day"""
searches = get_keyevent_trend_search_type_distribution(get_args(day))
return sum(searches[0][1])
def get_total_records(day=datetime.datetime.now().date()):
"""Returns the total number of records which existed in the chosen day"""
tomorrow = (datetime.datetime.now() +
datetime.timedelta(days=1)).strftime("%Y-%m-%d")
args = {'collection': CFG_SITE_NAME, 't_start': day.strftime("%Y-%m-%d"),
't_end': tomorrow, 'granularity': "day", 't_format': "%Y-%m-%d"}
try:
return get_keyevent_trend_collection_population(args)[0][1]
except IndexError:
return 0
def get_new_records(day=datetime.datetime.now().date()):
"""Returns the number of new records submitted in the chosen day"""
args = {'collection': CFG_SITE_NAME,
't_start': (day - datetime.timedelta(days=1)).strftime("%Y-%m-%d"),
't_end': day.strftime("%Y-%m-%d"), 'granularity': "day",
't_format': "%Y-%m-%d"}
try:
return (get_total_records(day) -
get_keyevent_trend_collection_population(args)[0][1])
except IndexError:
return 0
def get_download_frequency(day=datetime.datetime.now().date()):
"""Returns the number of downloads during the chosen day"""
return get_keyevent_trend_download_frequency(get_args(day))[0][1]
def get_comments_frequency(day=datetime.datetime.now().date()):
"""Returns the number of comments during the chosen day"""
return get_keyevent_trend_comments_frequency(get_args(day))[0][1]
def get_loans_frequency(day=datetime.datetime.now().date()):
"""Returns the number of comments during the chosen day"""
return get_keyevent_trend_number_of_loans(get_args(day))[0][1]
def get_web_submissions(day=datetime.datetime.now().date()):
"""Returns the number of web submissions during the chosen day"""
args = get_args(day)
args['doctype'] = 'all'
return get_keyevent_trend_web_submissions(args)[0][1]
def get_alerts(day=datetime.datetime.now().date()):
"""Returns the number of alerts during the chosen day"""
args = get_args(day)
args['cols'] = [('', '', '')]
args['event_id'] = 'alerts'
return get_customevent_trend(args)[0][1]
def get_journal_views(day=datetime.datetime.now().date()):
"""Returns the number of journal displays during the chosen day"""
args = get_args(day)
args['cols'] = [('', '', '')]
args['event_id'] = 'journals'
return get_customevent_trend(args)[0][1]
def get_basket_views(day=datetime.datetime.now().date()):
"""Returns the number of basket displays during the chosen day"""
args = get_args(day)
args['cols'] = [('', '', '')]
args['event_id'] = 'baskets'
return get_customevent_trend(args)[0][1]
def get_args(day):
"""Returns the most common arguments for the exporting to SLS methods"""
return {'t_start': day.strftime("%Y-%m-%d"),
't_end': (day + datetime.timedelta(days=1)).strftime("%Y-%m-%d"),
'granularity': "day", 't_format': "%Y-%m-%d"}
# EXPORTER
def export_to_python(data, req):
"""
Exports the data to Python code.
@param data: The Python data that should be exported
@type data: []
@param req: The Apache request object
@type req:
"""
_export("text/x-python", str(data), req)
def export_to_csv(data, req):
"""
Exports the data to CSV.
@param data: The Python data that should be exported
@type data: []
@param req: The Apache request object
@type req:
"""
csv_list = [""""%s",%s""" % (x[0], ",".join([str(y) for y in \
((type(x[1]) is tuple) and x[1] or (x[1], ))])) for x in data]
_export('text/csv', '\n'.join(csv_list), req)
def export_to_file(data, req):
"""
Exports the data to a file.
@param data: The Python data that should be exported
@type data: []
@param req: The Apache request object
@type req:
"""
try:
import xlwt
book = xlwt.Workbook(encoding="utf-8")
sheet1 = book.add_sheet('Sheet 1')
for row in range(0, len(data)):
for col in range(0, len(data[row])):
sheet1.write(row, col, "%s" % data[row][col])
filename = CFG_TMPDIR + "/webstat_export_" + \
str(time.time()).replace('.', '') + '.xls'
book.save(filename)
redirect_to_url(req, '%s/stats/export?filename=%s&mime=%s' \
% (CFG_SITE_URL, os.path.basename(filename), 'application/vnd.ms-excel'))
except ImportError:
csv_list = []
for row in data:
row = ['"%s"' % str(col) for col in row]
csv_list.append(",".join(row))
_export('text/csv', '\n'.join(csv_list), req)
# INTERNAL
def _export(mime, content, req):
"""
Helper function to pass on the export call. Create a
temporary file in which the content is stored, then let
redirect to the export web interface.
"""
filename = CFG_TMPDIR + "/webstat_export_" + \
str(time.time()).replace('.', '')
open(filename, 'w').write(content)
redirect_to_url(req, '%s/stats/export?filename=%s&mime=%s' \
% (CFG_SITE_URL, os.path.basename(filename), mime))
def _get_trend_from_actions(action_dates, initial_value,
t_start, t_end, granularity, dt_format, acumulative=False):
"""
    Given a list of dates reflecting some sort of action/event, and some
    additional parameters, returns the data in an internal format. When
    'initial_value' is set to zero, the frequency is counted per span
    rather than accumulated.
@param action_dates: A list of dates, indicating some sort of action/event.
@type action_dates: [datetime.datetime]
@param initial_value: The numerical offset the first action's value should make use of.
@type initial_value: int
@param t_start: Start time for the time domain in dt_format
@type t_start: str
@param t_end: End time for the time domain in dt_format
@type t_end: str
@param granularity: The granularity of the time domain, span between values.
Possible values are [year,month,day,hour,minute,second].
@type granularity: str
    @param dt_format: Format of the 't_start' and 't_end' parameters
@type dt_format: str
@return: A list of tuples zipping a time-domain and a value-domain
@type: [(str, int)]
"""
    # Work on a mutable copy; datetime.max serves as a sentinel once the list is exhausted
action_dates = list(action_dates)
    # Compute the stop time (one second before t_end)
stop_at = _to_datetime(t_end, dt_format) - datetime.timedelta(seconds=1)
vector = [(None, initial_value)]
try:
upcoming_action = action_dates.pop()
        # Do not count null values (when year, month or day is 0)
if granularity in ("year", "month", "day") and upcoming_action[0] == 0:
upcoming_action = action_dates.pop()
except IndexError:
upcoming_action = (datetime.datetime.max, 0)
# Create an iterator running from the first day of activity
for current in _get_datetime_iter(t_start, granularity, dt_format):
        # Counter of action_dates in the current span; carry over the
        # previous value only in accumulative mode.
if acumulative:
actions_here = vector[-1][1]
else:
actions_here = 0
# Check to see if there's an action date in the current span
if upcoming_action[0] == {"year": current.year,
"month": current.month,
"day": current.day,
"hour": current.hour,
"minute": current.minute,
"second": current.second
}[granularity]:
actions_here += upcoming_action[1]
try:
upcoming_action = action_dates.pop()
except IndexError:
upcoming_action = (datetime.datetime.max, 0)
vector.append((current.strftime('%Y-%m-%d %H:%M:%S'), actions_here))
# Make sure to stop the iteration at the end time
if {"year": current.year >= stop_at.year,
"month": current.month >= stop_at.month and current.year == stop_at.year,
"day": current.day >= stop_at.day and current.month == stop_at.month,
"hour": current.hour >= stop_at.hour and current.day == stop_at.day,
"minute": current.minute >= stop_at.minute and current.hour == stop_at.hour,
"second": current.second >= stop_at.second and current.minute == stop_at.minute
}[granularity]:
break
# Remove the first bogus tuple, and return
return vector[1:]
def _get_keyevent_trend(args, sql, initial_quantity=0, extra_param=[],
return_sql=False, sql_text='%s', acumulative=False):
"""
Returns the trend for the sql passed in the given timestamp range.
@param args['t_start']: Date and time of start point
@type args['t_start']: str
@param args['t_end']: Date and time of end point
@type args['t_end']: str
@param args['granularity']: Granularity of date and time
@type args['granularity']: str
@param args['t_format']: Date and time formatting string
@type args['t_format']: str
"""
# collect action dates
lower = _to_datetime(args['t_start'], args['t_format']).isoformat()
upper = _to_datetime(args['t_end'], args['t_format']).isoformat()
param = tuple([lower, upper] + extra_param)
if return_sql:
sql = sql % param
return sql_text % sql
return _get_trend_from_actions(run_sql(sql, param), initial_quantity, args['t_start'],
args['t_end'], args['granularity'], args['t_format'], acumulative)
def _get_datetime_iter(t_start, granularity='day',
dt_format='%Y-%m-%d %H:%M:%S'):
"""
Returns an iterator over datetime elements starting at an arbitrary time,
with granularity of a [year,month,day,hour,minute,second].
@param t_start: An arbitrary starting time in format %Y-%m-%d %H:%M:%S
@type t_start: str
    @param granularity: The span between iterable elements, default is 'day'.
Possible values are [year,month,day,hour,minute,second].
@type granularity: str
@param dt_format: Format of the 't_start' parameter
@type dt_format: str
@return: An iterator of points in time
@type: iterator over datetime elements
"""
    tim = _to_datetime(t_start, dt_format)
    # Make a time increment depending on the granularity and the current time
    # (the length of years and months vary over time)
    while True:
        yield tim
        if granularity == "year":
            span = {"days": calendar.isleap(tim.year) and 366 or 365}
        elif granularity == "month":
            span = {"days": calendar.monthrange(tim.year, tim.month)[1]}
        elif granularity == "day":
            span = {"days": 1}
        elif granularity == "hour":
            span = {"hours": 1}
        elif granularity == "minute":
            span = {"minutes": 1}
        elif granularity == "second":
            span = {"seconds": 1}
        else:
            # Default just in case
            span = {"days": 1}
        tim += datetime.timedelta(**span)
def _to_datetime(dttime, dt_format='%Y-%m-%d %H:%M:%S'):
"""
Transforms a string into a datetime
"""
return datetime.datetime(*time.strptime(dttime, dt_format)[:6])
def _run_cmd(command):
"""
Runs a certain command and returns the string output. If the command is
not found a string saying so will be returned. Use with caution!
@param command: The UNIX command to execute.
@type command: str
@return: The std-out from the command.
@type: str
"""
return commands.getoutput(command)
def _get_doctypes():
"""Returns all the possible doctypes of a new submission"""
doctypes = [("all", "All")]
for doctype in get_docid_docname_alldoctypes():
doctypes.append(doctype)
return doctypes
def _get_item_statuses():
"""Returns all the possible status of an item"""
return [(CFG_BIBCIRCULATION_ITEM_STATUS_CANCELLED, "Cancelled"),
(CFG_BIBCIRCULATION_ITEM_STATUS_CLAIMED, "Claimed"),
(CFG_BIBCIRCULATION_ITEM_STATUS_IN_PROCESS, "In process"),
(CFG_BIBCIRCULATION_ITEM_STATUS_NOT_ARRIVED, "Not arrived"),
(CFG_BIBCIRCULATION_ITEM_STATUS_ON_LOAN, "On loan"),
(CFG_BIBCIRCULATION_ITEM_STATUS_ON_ORDER, "On order"),
(CFG_BIBCIRCULATION_ITEM_STATUS_ON_SHELF, "On shelf")] + \
[(status, status) for status in CFG_BIBCIRCULATION_ITEM_STATUS_OPTIONAL]
def _get_item_doctype():
"""Returns all the possible types of document for an item"""
dts = []
for dat in run_sql("""SELECT DISTINCT(request_type)
FROM crcILLREQUEST ORDER BY request_type ASC"""):
dts.append((dat[0], dat[0]))
return dts
def _get_request_statuses():
"""Returns all the possible statuses for an ILL request"""
dts = []
for dat in run_sql("SELECT DISTINCT(status) FROM crcILLREQUEST ORDER BY status ASC"):
dts.append((dat[0], dat[0]))
return dts
def _get_libraries():
"""Returns all the possible libraries"""
dts = []
for dat in run_sql("SELECT name FROM crcLIBRARY ORDER BY name ASC"):
if not CFG_CERN_SITE or not "CERN" in dat[0]: # do not add internal libraries for CERN site
dts.append((dat[0], dat[0]))
return dts
def _get_loan_periods():
"""Returns all the possible loan periods for an item"""
dts = []
for dat in run_sql("SELECT DISTINCT(loan_period) FROM crcITEM ORDER BY loan_period ASC"):
dts.append((dat[0], dat[0]))
return dts
def _get_tag_name(tag):
"""
For a specific MARC tag, it returns the human-readable name
"""
res = run_sql("SELECT name FROM tag WHERE value LIKE %s", ('%' + tag + '%',))
if res:
return res[0][0]
res = run_sql("SELECT name FROM tag WHERE value LIKE %s", ('%' + tag[:-1] + '%',))
if res:
return res[0][0]
return ''
def _get_collection_recids_for_sql_query(coll):
ids = get_collection_reclist(coll).tolist()
if len(ids) == 0:
return ""
return "id_bibrec IN %s" % str(ids).replace('[', '(').replace(']', ')')
def _check_udc_value_where():
return "id_bibrec IN (SELECT brb.id_bibrec \
FROM bibrec_bib08x brb, bib08x b WHERE brb.id_bibxxx = b.id AND tag='080__a' \
AND value LIKE %s) "
def _get_udc_truncated(udc):
if udc[-1] == '*':
return "%s%%" % udc[:-1]
if udc[0] == '*':
return "%%%s" % udc[1:]
return "%s" % udc
def _check_empty_value(value):
if len(value) == 0:
return ""
else:
return value[0][0]
def _get_granularity_sql_functions(granularity):
try:
return {
"year": ("YEAR",),
"month": ("YEAR", "MONTH",),
"day": ("MONTH", "DAY",),
"hour": ("DAY", "HOUR",),
"minute": ("HOUR", "MINUTE",),
"second": ("MINUTE", "SECOND")
}[granularity]
except KeyError:
return ("MONTH", "DAY",)
def _get_sql_query(creation_time_name, granularity, tables_from, conditions="",
extra_select="", dates_range_param="", group_by=True, count=True):
if len(dates_range_param) == 0:
dates_range_param = creation_time_name
conditions = "%s > %%s AND %s < %%s %s" % (dates_range_param, dates_range_param,
len(conditions) > 0 and "AND %s" % conditions or "")
values = {'creation_time_name': creation_time_name,
'granularity_sql_function': _get_granularity_sql_functions(granularity)[-1],
'count': count and ", COUNT(*)" or "",
'tables_from': tables_from,
'conditions': conditions,
'extra_select': extra_select,
'group_by': ""}
if group_by:
values['group_by'] = "GROUP BY "
for fun in _get_granularity_sql_functions(granularity):
values['group_by'] += "%s(%s), " % (fun, creation_time_name)
values['group_by'] = values['group_by'][:-2]
return "SELECT %(granularity_sql_function)s(%(creation_time_name)s) %(count)s %(extra_select)s \
FROM %(tables_from)s WHERE %(conditions)s \
%(group_by)s \
ORDER BY %(creation_time_name)s DESC" % values
diff --git a/invenio/legacy/webstat/templates.py b/invenio/legacy/webstat/templates.py
index 0c6d04249..ac8e89545 100644
--- a/invenio/legacy/webstat/templates.py
+++ b/invenio/legacy/webstat/templates.py
@@ -1,1005 +1,1005 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
__lastupdated__ = "$Date$"
import datetime, cgi, urllib, os
import re
from invenio.config import \
CFG_WEBDIR, \
CFG_SITE_URL, \
CFG_SITE_LANG, \
CFG_SITE_NAME
from invenio.legacy.search_engine import get_coll_sons
-from invenio.webstat_engine import get_invenio_error_details
+from invenio.legacy.webstat.engine import get_invenio_error_details
class Template:
def tmpl_welcome(self, ln=CFG_SITE_LANG):
"""
Generates a welcome page for the Webstat module.
"""
return """<p>On these pages, you can review measurements of Invenio usage
and performance. Output is available in several formats, and its
                  raw data can be exported for offline processing. Furthermore, a general
                  overview is presented below under the label Current System Health.</p>"""
def tmpl_system_health_list(self, recount, ln=CFG_SITE_LANG):
"""
        Generates a box with current information from the system, giving the
        administrator an easy way of reviewing the 'health', i.e. the current
        performance/efficiency, of the system.
"""
temp_out = """<h3>Current system health</h3>
<p>Please, choose the information you want to review.</p>
<ul>
"""
if recount == 0:
temp_out += """
<li class="warninggreen">
<a href="%s/stats/system_health%s">
Current system health</a>
(all records up to date!)</li>
""" % \
(CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) or '')
else:
temp_out += """
<li class="warningred">
<a href="%s/stats/system_health%s">
Current system health</a>
(around %s records pending)</li>
""" % \
(CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) \
or '', str(recount))
temp_out += """
<li><a href="%s/stats/ingestion_health%s">
Check ingestion health for specific records</a></li>
</ul>
""" % \
(CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) or '')
return temp_out
def tmpl_system_health(self, health_statistics, ln=CFG_SITE_LANG):
"""
Generates the system health list of parameters.
"""
temp_out = ""
for statistic in health_statistics:
if statistic is None:
temp_out += '\n'
elif statistic[1] is None:
temp_out += statistic[0] + '\n'
else:
# regular expression, just to extract the text
                # inside the link and measure its length
aux = re.search('.*>(.*)<.*', str(statistic[1]))
if aux is None:
temp_out += statistic[0] + " " + '.' * \
(85 - len(str(statistic[0])) - \
len(str(statistic[1]))) + " " + \
str(statistic[1]) + '\n'
else:
temp_out += statistic[0] + " " + '.' * \
(85 - len(str(statistic[0])) - \
len(str(aux.group(1)))) + " " + \
str(statistic[1]) + '\n'
temp_out = "<pre>" + temp_out + "</pre>"
temp_out += '<a href="%s/stats/ingestion_health%s">Check ingestion \
health for specific records pending</a>' % (CFG_SITE_URL,
(CFG_SITE_LANG != ln and '?ln=' + ln) or '')
return temp_out
def tmpl_keyevent_list(self, ln=CFG_SITE_LANG):
"""
Generates a list of available key statistics.
"""
return """<h3>Key statistics</h3>
<p>Please choose a statistic from below to review it in detail.</p>
<ul>
<li><a href="%(CFG_SITE_URL)s/stats/collection_population%(ln_link)s">Collection population</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/new_records%(ln_link)s">New records</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/search_frequency%(ln_link)s">Search frequency</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/search_type_distribution%(ln_link)s">Search type distribution</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/download_frequency%(ln_link)s">Download frequency</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/comments_frequency%(ln_link)s">Comments frequency</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/number_of_loans%(ln_link)s">Number of circulation loans</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/web_submissions%(ln_link)s">Web submissions</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/loans_stats%(ln_link)s">Circulation loan statistics</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/loans_lists%(ln_link)s">Circulation loan lists</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/renewals_lists%(ln_link)s">Circulation renewals lists</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/returns_table%(ln_link)s">Number of circulation overdue returns</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/returns_graph%(ln_link)s">Percentage of circulation overdue returns</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/ill_requests_stats%(ln_link)s">Circulation ILL Requests statistics</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/ill_requests_lists%(ln_link)s">Circulation ILL Requests list</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/ill_requests_graph%(ln_link)s">Percentage of satisfied circulation ILL requests</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/items_stats%(ln_link)s">Circulation items statistics</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/items_list%(ln_link)s">Circulation items list</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/loans_requests%(ln_link)s">Circulation hold requests statistics</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/loans_request_lists%(ln_link)s">Circulation hold requests lists</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/user_stats%(ln_link)s">Circulation user statistics</a></li>
<li><a href="%(CFG_SITE_URL)s/stats/user_lists%(ln_link)s">Circulation user lists</a></li>
</ul>""" % {'CFG_SITE_URL': CFG_SITE_URL,
'ln_link': (CFG_SITE_LANG != ln and '?ln=' + ln) or ''}
def tmpl_customevent_list(self, customevents, ln=CFG_SITE_LANG):
"""
Generates a list of available custom statistics.
"""
out = """<h3>Custom events</h3>
                 <p>The Webstat module supplies a means for the administrators of Invenio
                 to define their own custom events, more abstract than the Key Statistics above.
                 A technical walk-through on how to create these is available <a href="%s/stats/customevent_help">here</a>.
When a custom event has been made available, it is displayed below.</p>
""" % CFG_SITE_URL
temp_out = ""
for event in customevents:
temp_out += """<li><a href="%s/stats/customevent?ids=%s">%s</a></li>""" \
% (CFG_SITE_URL, event[0], (event[1] is None) and event[0] or event[1])
if len(customevents) == 0:
out += self.tmpl_error("There are currently no custom events available.", ln=ln)
else:
out += "<ul>" + temp_out + "</ul>"
return out
def tmpl_loans_statistics(self, ln=CFG_SITE_LANG):
"""
Generates the tables with the bibcirculation statistics
"""
out = """<h3>Bibcirculation stats</h3>"""
return out
def tmpl_error_log_statistics_list(self, ln=CFG_SITE_LANG):
"""
Link to error log analyzer
"""
return """<h3>Error log statistics</h3>
<p>Displays statistics about the last errors in the Invenio and Apache logs</p>
<ul><li><a href="%s/stats/error_log%s">Error log analyzer</a></li>
</ul>""" % (CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) or '')
def tmpl_error_log_analyzer(self, invenio_ranking, invenio_last_errors, apache_ranking):
"""
Generates the statistics of the last errors
"""
out = """<h4>Invenio error log</h4>
<h5>Ranking</h5>
<pre>%s</pre>
<h5>Last errors</h5>
""" % (cgi.escape(invenio_ranking))
lines = invenio_last_errors.splitlines()
error_number = len(lines)
for line in lines:
out += """<div>
%(line)s<button id="bt_toggle%(error_number)s">Toggle</button>
<pre id="txt_error%(error_number)s">%(error_details)s</pre>
</div>
<script>
$("#txt_error%(error_number)s").slideToggle("fast");
$("#bt_toggle%(error_number)s").click(function () {
$("#txt_error%(error_number)s").slideToggle("fast");
});
</script>
""" % {'line': cgi.escape(line),
'error_number': error_number,
'error_details': cgi.escape(get_invenio_error_details(error_number))}
error_number -= 1
out += """<h4>Apache error log</h4>
<pre>%s</pre>""" % apache_ranking
return out
def tmpl_custom_summary(self, ln=CFG_SITE_LANG):
"""
Link to custom annual report.
"""
return """<h3>Library report</h3>
<ul><li><a href="%s/stats/custom_summary">Custom query summary</a></li></ul>
""" % CFG_SITE_URL
def tmpl_yearly_report_list(self, ln=CFG_SITE_LANG):
"""
Link to yearly report
"""
return """<h3>Yearly report</h3>
<p>Yearly report with the total number of circulation transactions
                  (loans, renewals, returns, ILL requests, hold requests).</p>
<ul><li><a href="%s/stats/yearly_report">Yearly report</a></li></ul>
""" % CFG_SITE_URL
def tmpl_yearly_report(self, year_report, ln=CFG_SITE_LANG):
"""
Display yearly report with the total number of circulation transactions.
"""
temp_out = ""
for info in year_report:
if info is None:
temp_out += '\n'
elif info[1] is None:
temp_out += info[0] + '\n'
else:
temp_out += info[0] + " " + \
'.' * (85 - len(str(info[0])) - \
len(str(info[1]))) + " " + str(info[1]) + '\n'
return "<pre>" + temp_out + "</pre>"
def tmpl_ingestion_health(self, general, req_ingestion=None, stats=None, ln=CFG_SITE_LANG):
"""
Display the record status search box and the results of the last
request, if done.
"""
# Introduction and search box
temp_out = """
<p>Check the ingestion health for records pending or go to
<a href="%s/stats/system_health%s">current system health</a>.</p>
""" % \
(CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) or '')
if general == 0:
temp_out += """
<p class="warninggreen">(all records up to date!)</p>
"""
else:
temp_out += """
<p class="warningred">(around %s records pending)</p>
""" % str(general)
temp_out += """
<form method="get"> <input type="hidden" name="ln" value="%s" />
""" % ln
        if req_ingestion is None:
temp_out += self._tmpl_text_box("pattern", "", ln=ln)
else:
temp_out += self._tmpl_text_box("pattern", req_ingestion, ln=ln)
temp_out += """
<input class="formbutton" type="submit" />
</form>
"""
        if stats is None:
return temp_out
else:
# results of the last request
info_stats = self.tmpl_ingestion_list(stats, ln=ln)
return temp_out + info_stats
def tmpl_ingestion_list(self, ingestion_statistics, ln=CFG_SITE_LANG):
"""
Generates the system ingestion health.
"""
temp_out = ""
for statistic in ingestion_statistics:
if statistic is None:
temp_out += '\n'
elif statistic[1] is None:
temp_out += statistic[0] + '\n'
else:
# regular expression, just to extract the text
                # inside the link and measure its length
aux = re.search('.*>(.*)<.*', str(statistic[0]))
if aux is None:
temp_out += statistic[0] + " " + '.' * \
(85 - len(str(statistic[0])) - \
len(str(statistic[1]))) + " " + \
str(statistic[1]) + '\n'
else:
temp_out += statistic[0] + " " + '.' * \
(85 - len(str(aux.group(1))) - \
len(str(statistic[1]))) + " " + \
str(statistic[1]) + '\n'
temp_out = "<pre>" + temp_out + "</pre>"
return temp_out
def tmpl_collection_stats_main_list(self, ln=CFG_SITE_LANG):
"""
Generates a list of available collections statistics.
"""
out = """<h3>Collections stats</h3>
<ul>"""
for coll in get_coll_sons(CFG_SITE_NAME):
out += """<li><a href="%s/stats/collections?%s">%s</a></li>""" \
% (CFG_SITE_URL, urllib.urlencode({'collection': coll}) +
((CFG_SITE_LANG != ln and '&ln=' + ln) or ''), coll)
out += """<li><a href="%s/stats/collection_stats%s">Other collections</a></li>""" \
% (CFG_SITE_URL, (CFG_SITE_LANG != ln and '?ln=' + ln) or '')
return out + "</ul>"
def tmpl_collection_stats_complete_list(self, collections, ln=CFG_SITE_LANG):
if len(collections) == 0:
return self.tmpl_error("There are currently no collections available.", ln=ln)
temp_out = """<h4>Collections stats</h4>
<ul>"""
for coll in collections:
temp_out += """<li><a href="%s/stats/collections?%s">%s</a></li>""" \
% (CFG_SITE_URL, urllib.urlencode({'collection': coll[0]}), coll[1])
return temp_out + "</ul>"
def tmpl_customevent_help(self, ln=CFG_SITE_LANG):
"""
Display help for custom events.
"""
return """<h3>General overview</h3>
        <p>A custom event is a measure indicating the frequency of some kind of
        "action", such as the number of advanced searches carried out using
        the Swedish language interface. The custom event functionality is intended
        to give administrators a means to log abstract activity, as opposed to
        trivial measurements like "collection population" and "search frequency".
        Thus, a custom event is fully customizable and defined by an administrator,
        but it is important to understand that the Webstat module merely supplies
        the means to register an action and associate it with a predefined custom event,
        while the actual use case leading up to the very registration of the action
        is left to the user.</p>
<p>After a custom event has been created and the process of collecting data
has started, the event is accessible for review through the Webstat webpage.</p>
<h3>How to create a new custom event</h3>
<ol>
<li>Edit <strong>/opt/invenio/etc/webstat/webstat.cfg</strong> adding
the definition of the customevent:
<pre>
[webstat_custom_event_1]
name = baskets
param1 = action
param2 = basket
param3 = user</pre>
</li>
<li>The title must be <em>webstat_custom_event_(num)</em> where <em>(num)</em>
        is a number. The number cannot be repeated in two different customevents.
</li>
<li>The option <em>name</em> is the name of the customevent.</li>
<li>Each param in the customevent must be given as <em>param(num)</em> where
        <em>(num)</em> is a unique number.</li>
</ol>"""
def tmpl_error(self, msg, ln=CFG_SITE_LANG):
"""
Provides a common way of outputting error messages.
"""
return """<div class="important">%s</div>""" % msg
def tmpl_keyevent_box(self, options, order, choosed, ln=CFG_SITE_LANG, list=False):
"""
Generates a FORM box with dropdowns for keyevents.
@param options: { parameter name: [(argument internal, argument full)]}
@type options: { str: [(str, str)]}
@param order: A permutation of the keys in options, for design purpose.
@type order: [str]
        @param choosed: The selected parameters, and their values.
        @type choosed: { str: str }
"""
# Create the FORM's header
formheader = """<form method="get">
<input type="hidden" name="ln" value="%s" />""" % ln
# Create the headers using the options permutation
headers = [[options[param][1] for param in order]]
headers[0].append("")
# Create all SELECT boxes
sels = [[]]
for param in order:
if choosed[param] == 'select date':
sels[0].append(self._tmpl_select_box(options[param][2], # SELECT box data
" - select " + options[param][1], # first item info
param, # name
[choosed['s_date'], choosed['f_date']], # selected value (perhaps several)
True, # multiple box?
ln=ln))
elif options[param][0] == 'combobox':
sels[0].append(self._tmpl_select_box(options[param][2], # SELECT box data
" - select " + options[param][1], # first item info
param, # name
choosed[param], # selected value (perhaps several)
                                                  type(choosed[param]) is type([]), # multiple box? (builtin 'list' is shadowed by the 'list' parameter)
ln=ln))
elif options[param][0] == 'textbox':
sels[0].append(self._tmpl_text_box(param, # name
choosed[param], # selected value
ln=ln))
# Create button
sels[0].append("""<input class="formbutton" type="submit"
name="action_gen" value="Generate"/>""")
# Export option
if list:
sels[0].append("""<input class="formbutton" type="submit"
name="format" value="Full list"/>""")
# Create form footer
formfooter = """</form>"""
return self._tmpl_box(formheader, formfooter, ["keyevent_table"],
headers, sels, [""], ln=ln)
def tmpl_customevent_box(self, options, choosed, ln=CFG_SITE_LANG):
"""
Generates a FORM box with dropdowns for customevents.
@param options: { parameter name: (header, [(argument internal, argument full)]) or
{param father: [(argument internal, argument full)]}}
                        The dictionary form is for options that depend on another
                        option; it is used for 'cols'.
                        With "param father"="__header": the headers.
                        With "param father"="__none": the default arguments.
@type options: { str: (str, [(str, str)])|{str: [(str, str)]}}
        @param choosed: The selected parameters, and their values.
@type choosed: { str: str }
"""
if choosed['ids'] == []:
choosed['ids'] = [""]
choosed['cols'] = [[("", "", "")]]
num_ids = len(choosed['ids'])
operators = [('and', 'AND'), ('or', 'OR'), ('and_not', 'AND NOT')]
        # Create the ids of the tables
table_id = ["time_format"]
table_id.extend(['cols' + str(i) for i in range(num_ids)])
# Create the headers using the options permutation
headers = [(options['timespan'][0], options['format'][0])]
headers.extend([(options['ids'][0], "", options['cols']['__header'], "value")
for event_id in choosed['ids']])
# Create all SELECT boxes
sels = [[]]
for param in ['timespan', 'format']:
if choosed[param] == 'select date':
sels[0].append(self._tmpl_select_box(options[param][1], # SELECT box data
" - select " + options[param][0], # first item info
param, # name
[choosed['s_date'], choosed['f_date']], # selected value (perhaps several)
True, # multiple box?
ln=ln))
else:
sels[0].append(self._tmpl_select_box(options[param][1], # SELECT box data
" - select " + options[param][0], # first item info
param, # name
choosed[param], # selected value (perhaps several)
type(choosed[param]) is list, # multiple box?
ln=ln))
for event_id, i in zip(choosed['ids'], range(num_ids)):
select_table = []
select_row = [self._tmpl_select_box(options['ids'][1],
" - select " + options['ids'][0],
'ids',
event_id,
attribute='onChange="javascript: \
changed_customevent(customevent[\'ids\'],%d);"' % i,
ln=ln)]
is_first_loop = True
row = 0
if len(choosed['cols']) <= i:
choosed['cols'].append([("", "", "")])
if choosed['cols'][i] == []:
choosed['cols'][i] = [("", "", "")]
for _, col, value in choosed['cols'][i]:
select_row.append("")
if not is_first_loop:
select_row.append(self._tmpl_select_box(operators, "", "bool%d" % i, bool))
if event_id:
select_row.append(self._tmpl_select_box(options['cols'][event_id],
" - select " + options['cols']['__header'],
'cols' + str(i),
col,
ln=ln))
else:
select_row.append(self._tmpl_select_box(options['cols']['__none'],
"Choose CustomEvent",
'cols' + str(i),
"",
ln=ln))
if is_first_loop:
select_row.append("<input name=\"col_value%d\" value=\"%s\">" % (i, value))
else:
select_row.append("""<input name="col_value%d" value="%s">
<a href="javascript:;" onclick="delrow(%d,%d);">Remove row</a>""" \
% (i, value, i, row))
select_table.append(select_row)
select_row = []
if is_first_loop:
is_first_loop = False
row += 1
sels.append(select_table)
# javascript for add col selectors
sels_col = []
sels_col.append(self._tmpl_select_box(options['ids'][1], " - select "
+ options['ids'][0], 'ids', "",
False,
attribute='onChange="javascript: \
changed_customevent(customevent[\\\'ids\\\'],\' + col + \');"',
ln=ln))
sels_col.append("")
sels_col.append(self._tmpl_select_box(options['cols']['__none'], "Choose CustomEvent",
'cols\' + col + \'', "", False, ln=ln))
sels_col.append("""<input name="col_value' + col + '">""")
col_table = self._tmpl_box("", "", ["cols' + col + '"], headers[1:], [sels_col],
["""<a id="add' + col + '" href="javascript:;"
onclick="addcol(\\'cols' + col + '\\', ' + col + ');">Add more arguments</a>
<a id="del' + col + '" href="javascript:;" onclick="delblock(' + col + ');">
Remove block</a>"""], ln=ln)
col_table = col_table.replace('\n', '')
formheader = """<script type="text/javascript">
var col = %d;
var col_select = new Array(%s,0);
var block_pos_max = %d;
var block_pos = new Array(%s,0);
var rows_pos_max = [%s];
var rows_pos = [%s];
function addcol(id, num){
col_select[num]++;
var table = document.getElementById(id);
var body = table.getElementsByTagName('tbody')[0];
var row = document.createElement('tr');
var cel0 = document.createElement('td');
row.appendChild(cel0);
var cel1 = document.createElement('td');
cel1.innerHTML = '<select name="bool' + num + '"> <option value="and">AND</option> <option value="or">OR</option> <option value="and_not">AND NOT</option> </select>';
row.appendChild(cel1);
var cel2 = document.createElement('td');
cel2.innerHTML = '%s';
row.appendChild(cel2);
var cel3 = document.createElement('td');
cel3.innerHTML = '%s';
row.appendChild(cel3);
body.appendChild(row);
// Change arguments
var args = document['customevent']['cols' + num];
if (col_select[1] == 0) {
value = document['customevent']['ids'].value;
} else {
value = document['customevent']['ids'][block_pos[num]].value;
}
_change_select_options(args[args.length - 1], get_argument_list(value), '');
rows_pos[num][col_select[num]-1] = rows_pos_max[num];
rows_pos_max[num]++;
} """ % (num_ids,
','.join([str(len(choosed['cols'][i])) for i in range(num_ids)]),
num_ids,
','.join([str(i) for i in range(num_ids)]),
','.join([str(len(block)) for block in choosed['cols']]),
','.join([str(range(len(block))) for block in choosed['cols']]),
sels_col[2].replace("' + col + '", "' + num + '"),
sels_col[3].replace("' + col + '", "' + num + '") + \
""" <a href="javascript:;" onclick="delrow(' + num + ',' + (col_select[num]-1) + ');">Remove row</a>""")
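# The generated script above maintains four parallel JavaScript structures,
# one slot per event block:
#   col_select[i]   - number of argument rows currently in block i
#   block_pos[i]    - position of block i among the "ids" select boxes
#   rows_pos[i]     - table row index of each argument row of block i
#   rows_pos_max[i] - next free row index in block i's table
# addcol()/delrow() and addblock()/delblock() keep these in sync with the DOM.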
formheader += """
function addblock() {
col_select[col] = 1;
var ni = document.getElementById('block');
var newdiv = document.createElement('div');
newdiv.innerHTML = '%s';
ni.appendChild(newdiv);
block_pos[col] = block_pos_max;
block_pos_max++;
rows_pos[col] = [0];
rows_pos_max[col] = 1;
col++;
}""" % col_table
formheader += """
function delblock(id) {
var block = document.getElementById("cols" + id);
var add = document.getElementById("add" + id);
var del = document.getElementById("del" + id);
block.parentNode.removeChild(block);
add.parentNode.removeChild(add);
del.parentNode.removeChild(del);
for (var i = id+1; i < col_select.length; i++) {
block_pos[i]--;
}
block_pos_max--;
}
function delrow(table_id,row_num) {
var table = document.getElementById('cols' + table_id);
table.tBodies[0].deleteRow(rows_pos[table_id][row_num]);
col_select[table_id]--;
for (var i = row_num+1; i < rows_pos[table_id].length; i++) {
rows_pos[table_id][i]--;
}
rows_pos_max[table_id]--;
} """
formheader += """
function change_select_options(selectList, isList, optionArray, chooseDefault) {
if (isList) {
for (var select = 0; select < selectList.length; select++) {
_change_select_options(selectList[select], optionArray, chooseDefault);
}
} else {
_change_select_options(selectList, optionArray, chooseDefault);
}
}
function _change_select_options(select, optionArray, chooseDefault) {
select.options.length = 0;
for (var option = 0; option*2 < optionArray.length - 1; option++) {
if (chooseDefault == optionArray[option*2+1]) {
select.options[option] = new Option(optionArray[option*2], optionArray[option*2+1], true, true);
} else {
select.options[option] = new Option(optionArray[option*2], optionArray[option*2+1]);
}
}
}
function changed_customevent(select, num){
if (select.length) {
value = select[block_pos[num]].value;
} else {
value = select.value;
}
list = get_argument_list(value);
select_list = (col_select[num] > 1);
change_select_options(document['customevent']['cols' + num], select_list, list, '');
}
function get_argument_list(value) {
if (value == "") {
return ['Choose CustomEvent',''];"""
for event_id, cols in options['cols'].items():
if event_id not in ['__header', '__none']:
str_cols = "[' - select %s', ''," % options['cols']['__header']
for internal, full in cols:
str_cols += "'%s','%s'," % (full, internal)
str_cols = str_cols[:-1] + ']'
formheader += """
} else if (value == "%s") {
return %s;""" % (event_id, str_cols)
formheader += """
}
}
</script>"""
# Create the FORM's header
formheader += """<form method="get" name="customevent">
<input type="hidden" name="ln" value="%s" />""" % ln
# Create all footers
footers = []
footers.append("")
footers.append("""<a href="javascript:;" onclick="addcol('cols0', 0);">
Add more arguments</a>""")
for i in range(1, num_ids):
footers.append("""
<a id="add%(i)d" href="javascript:;" onclick="addcol('cols%(i)d', %(i)d);">Add more arguments</a>
<a id="del%(i)d" href="javascript:;" onclick="delblock(%(i)d);">Remove block</a>
""" % {'i': i})
footers[-1] += """<div id="block"> </div>"""
# Create formfooter
formfooter = """<p><a href="javascript:;" onclick="addblock();">Add more events</a>
<input class="formbutton" type="submit" name="action_gen" value="Generate"></p>
</form>"""
return self._tmpl_box(formheader, formfooter, table_id, headers, sels, footers, ln=ln)
def tmpl_display_event_trend_ascii(self, title, filename, ln=CFG_SITE_LANG):
"""Displays an ASCII graph representing a trend"""
try:
return self.tmpl_display_trend(title, "<div><pre>%s</pre></div>" %
open(filename, 'r').read(), ln=ln)
except IOError:
return "No data found"
def tmpl_display_event_trend_image(self, title, filename, ln=CFG_SITE_LANG):
"""Displays an image graph representing a trend"""
if os.path.isfile(filename):
return self.tmpl_display_trend(title, """<div><img src="%s" /></div>""" %
filename.replace(CFG_WEBDIR, CFG_SITE_URL), ln=ln)
else:
return "No data found"
def tmpl_display_event_trend_text(self, title, filename, ln=CFG_SITE_LANG):
"""Displays text representing a trend"""
try:
return self.tmpl_display_trend(title, "<div>%s</div>" %
open(filename, 'r').read(), ln=ln)
except IOError:
return "No data found"
def tmpl_display_numeric_stats(self, titles, avgs, maxs, mins):
"""Display average, max and min values"""
if titles:
out = ""
for i in range(len(titles)):
out += """<em>%s</em><br />
<b>Average:</b> %d<br />
<b>Max:</b> %d<br />
<b>Min:</b> %d<br />""" % (cgi.escape(titles[i]),
avgs[i], maxs[i], mins[i])
return out
else:
return """<b>Average:</b> %d<br />
<b>Max:</b> %d<br />
<b>Min:</b> %d<br />""" % (avgs, maxs, mins)
def tmpl_display_custom_summary(self, tag_name, data, title, query, tag,
path, ln=CFG_SITE_LANG):
"""Display the custom summary (annual report)"""
# Create the FORM's header
formheader = """<form method="get">
<input type="hidden" name="ln" value="%s" />""" % ln
# Create the headers
headers = [("Chart title", "Query", "Output tag", "")]
# Create the body (text boxes and button)
fields = (("""<input type="text" name="title" value="%s" size="20"/>""" % cgi.escape(title),
"""<input type="text" name="query" value="%s" size="35"/>""" % cgi.escape(query),
"""<input type="text" name="tag" value="%s" size="10"/>""" % cgi.escape(tag),
"""<input class="formbutton" type="submit" name="action_gen" value="Generate"/>"""), )
# Create form footer
formfooter = """</form>"""
out = self._tmpl_box(formheader, formfooter, [("custom_summary_table", )],
headers, fields, [""], ln=ln)
out += """<div>
<table border>
<tr>
<td colspan=2>
<center><b>
Distribution across %s
</b></center>
</td>
</tr>
<tr>
<td align="right"><b>Nb.</b></td>
<td><b>%s</b></td>
</tr>
""" % (cgi.escape(tag_name), cgi.escape(tag_name[0].capitalize() + tag_name[1:]))
if len(query) > 0:
query += " and "
for title, number in data:
if title in ('Others', 'TOTAL'):
out += """<tr>
<td align="right">%d</td>
<td>%s</td>
</tr>
""" % (number, cgi.escape(title))
else:
out += """<tr>
<td align="right"><a href="%s/search?p=%s&ln=%s">%d</a></td>
<td>%s</td>
</tr>
""" % (CFG_SITE_URL, cgi.escape(urllib.quote(query + " " + tag + ':"' + title + '"')), ln, number, cgi.escape(title))
out += """</table></div>
<div><img src="%s" /></div>""" % cgi.escape(path.replace(CFG_WEBDIR, CFG_SITE_URL))
return out
# INTERNALS
def tmpl_display_trend(self, title, html, ln=CFG_SITE_LANG):
"""
Generates a generic display box for showing graphs (ASCII and images)
alongside some meta-information boxes.
"""
return """<table class="narrowsearchbox">
<thead><tr><th colspan="2" class="narrowsearchboxheader" align="left">%s</th></tr></thead>
<tbody><tr><td class="narrowsearchboxbody" valign="top">%s</td></tr></tbody>
</table> """ % (title, html)
def _tmpl_box(self, formheader, formfooter, table_id, headers, selectboxes,
footers, ln=CFG_SITE_LANG):
"""
Aggregates the given parameters in order to generate the
corresponding customevent box.
@param formheader: Start tag for the FORM element.
@type formheader: str
@param formfooter: End tag for the FORM element.
@type formfooter: str
@param table_id: id for each table
@type table_id: list<str>
@param headers: Headers for the SELECT boxes
@type headers: list<list<str>>
@param selectboxes: The actual HTML drop-down boxes, with appropriate content.
@type selectboxes: list<list<str>>|list<list<list<str>>>
@param footers: footer for each table
@type footers: list<str>
@return: HTML describing a particular FORM box.
@rtype: str
"""
out = formheader
for table in range(len(table_id)):
out += """<table id="%s" class="searchbox">
<thead>
<tr>""" % table_id[table]
# Append the headers
for header in headers[table]:
out += """<th class="searchboxheader">%s</th>""" % header
out += """</tr>
</thead>
<tbody>"""
# Append the SELECT boxes
is_first_loop = True
out += """<tr valign="bottom">"""
for selectbox in selectboxes[table]:
if type(selectbox) is list:
if is_first_loop:
is_first_loop = False
else:
out += """</tr>
<tr valign="bottom">"""
for select in selectbox:
out += """<td class="searchboxbody" valign="top">%s</td>""" % select
else:
out += """<td class="searchboxbody" valign="top">%s</td>""" % selectbox
out += """
</tr>"""
out += """
</tbody>
</table>"""
# Append footer
out += footers[table]
out += formfooter
return out
def _tmpl_select_box(self, iterable, explaination, name, preselected,
multiple=False, attribute="", ln=CFG_SITE_LANG):
"""
Generates an HTML SELECT drop-down menu.
@param iterable: A list of values and tag content to be used in the SELECT list
@type iterable: [(str, str)]
@param explaination: An explanatory string put as the tag content for the first OPTION.
@type explaination: str
@param name: The name of the SELECT tag. Important for FORM-parsing.
@type name: str
@param preselected: The value, or list of values, of the OPTION that should be
preselected. Blank or empty list for none.
@type preselected: str | []
@param multiple: Optionally sets the SELECT box to accept multiple entries.
@type multiple: bool
@param attribute: Optionally add attributes to the SELECT tag.
@type attribute: str
"""
if attribute:
sel = """<select name="%s" %s>""" % (name, attribute)
else:
if name == "timespan":
sel = """<script type="text/javascript">
function changeTimeSpanDates(val){
if(val == "select date"){
document.getElementById("selectDateTxt").style.display='block';}
else{
document.getElementById("selectDateTxt").style.display='none';}
}
</script>
<select name="timespan" id="timespan"
onchange="javascript: changeTimeSpanDates(this.value);">"""
else:
sel = """<select name="%s">""" % name
if multiple is True and name != "timespan":
sel = sel.replace("<select ", """<select multiple="multiple" size="5" """)
elif explaination:
sel += """<option value="">%s</option>""" % explaination
for realname, printname in [(x[0], x[1]) for x in iterable]:
if printname is None:
printname = realname
option = """<option value="%s">%s</option>""" % (realname, printname)
if realname == preselected or (type(preselected) is list
and realname in preselected) or (name == "timespan" and
realname == 'select date' and multiple):
option = option.replace('">', '" selected="selected">')
sel += option
sel += "</select>"
if name == "timespan":
if multiple:
s_date = preselected[0]
f_date = preselected[1]
else:
s_date = datetime.datetime.today().date().strftime("%m/%d/%Y %H:%M")
f_date = datetime.datetime.now().strftime("%m/%d/%Y %H:%M")
sel += """<link rel="stylesheet" href="%(CFG_SITE_URL)s/img/jquery-ui.css"
type="text/css" />
<script language="javascript" type="text/javascript" src="%(CFG_SITE_URL)s/js/jquery-ui.min.js"></script>
<script type="text/javascript" src="%(CFG_SITE_URL)s/js/jquery-ui-timepicker-addon.js"></script>
<div id="selectDateTxt" style="position:relative;display:none">
<table align="center">
<tr align="center">
<td align="right" class="searchboxheader">From: </td>
<td align="left"><input type="text" name="s_date" id="s_date" value="%(s_date)s" size="14" /></td>
</tr>
<tr align="center">
<td align="right" class="searchboxheader">To: </td>
<td align="left"><input type="text" name="f_date" id="f_date" value="%(f_date)s" size="14" /></td>
</tr>
</table>
</div>
<script type="text/javascript">
$('#s_date').datetimepicker();
$('#f_date').datetimepicker({
hour: 23,
minute: 59
});
if(document.getElementById("timespan").value == "select date"){
document.getElementById("selectDateTxt").style.display='block';
} </script>""" % {'CFG_SITE_URL': CFG_SITE_URL,
's_date': cgi.escape(s_date),
'f_date': cgi.escape(f_date)}
return sel
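# Illustrative usage of _tmpl_select_box (hypothetical values; each option is
# a (value, label) pair, and the special name "timespan" also triggers the
# date-picker markup above):
#
#   html = self._tmpl_select_box([("today", "Today"), ("this week", "This week")],
#                                " - select period", "timespan", "today")
#
# This yields a <select name="timespan"> element whose first, empty-valued
# OPTION is " - select period", with "today" preselected.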
def _tmpl_text_box(self, name, preselected, ln=CFG_SITE_LANG):
"""
Generates an HTML text input box.
@param name: The name attribute of the text box.
@type name: str
@param preselected: The value that should be prefilled. Blank or empty
list for none.
@type preselected: str | []
"""
if name == 'min_loans' or name == 'max_loans':
return """<script type="text/javascript">
function checkNumber(input){
var num = input.value.replace(/\,/g,'');
var newtext = parseInt(num);
if(isNaN(newtext)){
alert('You may enter only numbers in this field!');
input.value = 0;
}
else {
input.value = newtext;
}
}
</script>
<input type="text" name="%s" onchange="checkNumber(this)" value="%s" />""" % (name, preselected)
else:
return """<input type="text" name="%s" value="%s" />""" % (name, preselected)
diff --git a/invenio/legacy/webstat/webinterface.py b/invenio/legacy/webstat/webinterface.py
index b4217c615..d657d0177 100644
--- a/invenio/legacy/webstat/webinterface.py
+++ b/invenio/legacy/webstat/webinterface.py
@@ -1,1071 +1,1071 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
__lastupdated__ = "$Date$"
import os, sys
from urllib import unquote
from invenio.utils import apache
from invenio.config import \
CFG_TMPDIR, \
CFG_SITE_URL, \
CFG_SITE_LANG
from invenio.bibindex_tokenizers.BibIndexJournalTokenizer import CFG_JOURNAL_TAG
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.legacy.webpage import page
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.legacy.search_engine import collection_restricted_p
from invenio.legacy.webuser import collect_user_info, page_not_authorized
from invenio.utils.url import redirect_to_url
-from invenio.webstat import perform_request_index, \
+from invenio.legacy.webstat.api import perform_request_index, \
perform_display_keyevent, \
perform_display_customevent, \
perform_display_customevent_help, \
perform_display_error_log_analyzer, \
register_customevent, \
perform_display_custom_summary, \
perform_display_stats_per_coll, \
perform_display_current_system_health, \
perform_display_yearly_report, \
perform_display_coll_list, \
perform_display_ingestion_status
def detect_suitable_graph_format():
"""
Return the suitable default graph format. Nowadays it is always "flot";
previously, gnuplot was used when available, otherwise asciiart.
"""
return "flot"
# try:
# import Gnuplot
# suitable_graph_format = "gnuplot"
# except ImportError:
# suitable_graph_format = "asciiart"
# return suitable_graph_format
SUITABLE_GRAPH_FORMAT = detect_suitable_graph_format()
class WebInterfaceStatsPages(WebInterfaceDirectory):
"""Defines the set of stats pages."""
_exports = ['', 'system_health', 'systemhealth', 'yearly_report', 'ingestion_health',
'collection_population', 'new_records', 'search_frequency', 'search_type_distribution',
'download_frequency', 'comments_frequency', 'number_of_loans', 'web_submissions',
'loans_stats', 'loans_lists', 'renewals_lists', 'returns_table', 'returns_graph',
'ill_requests_stats', 'ill_requests_lists', 'ill_requests_graph', 'items_stats',
'items_list', 'loans_requests', 'loans_request_lists', 'user_stats',
'user_lists', 'error_log', 'customevent', 'customevent_help',
'customevent_register', 'custom_summary', 'collections' , 'collection_stats',
'export']
navtrail = """<a class="navtrail" href="%s/stats/%%(ln_link)s">Statistics</a>""" % CFG_SITE_URL
def __call__(self, req, form):
"""Index page."""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='index',
ln=ln)
return page(title="Statistics",
body=perform_request_index(ln=ln),
description="Invenio, Statistics",
keywords="Invenio, statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='stats',
language=ln)
# CURRENT SYSTEM HEALTH
def system_health(self, req, form):
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='current system health',
ln=ln)
return page(title="Current system health",
body=perform_display_current_system_health(ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Current system health",
keywords="Invenio, statistics, current system health",
req=req,
lastupdated=__lastupdated__,
navmenuid='current system health',
language=ln)
def systemhealth(self, req, form):
"""Redirect from the old URL."""
return redirect_to_url(req, "%s/stats/system_health" % CFG_SITE_URL)
def yearly_report(self, req, form):
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='yearly report',
ln=ln)
return page(title="Yearly report",
body=perform_display_yearly_report(ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Yearly report",
keywords="Invenio, statistics, yearly report",
req=req,
lastupdated=__lastupdated__,
navmenuid='yearly report',
language=ln)
def ingestion_health(self, req, form):
argd = wash_urlargd(form, { 'pattern': (str, None),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
req_ingestion = argd['pattern']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='ingestion status',
ln=ln)
return page(title="Check ingestion health",
body=perform_display_ingestion_status(req_ingestion, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Ingestion health",
keywords="Invenio, statistics, Ingestion health",
req=req,
lastupdated=__lastupdated__,
navmenuid='ingestion health',
language=ln)
# KEY EVENT SECTION
def collection_population(self, req, form):
"""Collection population statistics page."""
argd = wash_urlargd(form, {'collection': (str, "All"),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='collection population',
ln=ln)
return page(title="Collection population",
body=perform_display_keyevent('collection population', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Collection population",
keywords="Invenio, statistics, collection population",
req=req,
lastupdated=__lastupdated__,
navmenuid='collection population',
language=ln)
def new_records(self, req, form):
"""New records statistics page."""
argd = wash_urlargd(form, {'collection': (str, "All"),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='new records',
ln=ln)
return page(title="New records",
body=perform_display_keyevent('new records', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, New records",
keywords="Invenio, statistics, new records",
req=req,
lastupdated=__lastupdated__,
navmenuid='new records',
language=ln)
def search_frequency(self, req, form):
"""Search frequency statistics page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='search frequency',
ln=ln)
return page(title="Search frequency",
body=perform_display_keyevent('search frequency', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Search frequency",
keywords="Invenio, statistics, search frequency",
req=req,
lastupdated=__lastupdated__,
navmenuid='search frequency',
language=ln)
def comments_frequency(self, req, form):
"""Comments frequency statistics page."""
argd = wash_urlargd(form, {'collection': (str, "All"),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='comments frequency',
ln=ln)
return page(title="Comments frequency",
body=perform_display_keyevent('comments frequency', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Comments frequency",
keywords="Invenio, statistics, Comments frequency",
req=req,
lastupdated=__lastupdated__,
navmenuid='comments frequency',
language=ln)
def search_type_distribution(self, req, form):
"""Search type distribution statistics page."""
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='search type distribution',
ln=ln)
return page(title="Search type distribution",
body=perform_display_keyevent('search type distribution', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Search type distribution",
keywords="Invenio, statistics, search type distribution",
req=req,
lastupdated=__lastupdated__,
navmenuid='search type distribution',
language=ln)
def download_frequency(self, req, form):
"""Download frequency statistics page."""
argd = wash_urlargd(form, {'collection': (str, "All"),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='download frequency',
ln=ln)
return page(title="Download frequency",
body=perform_display_keyevent('download frequency', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Download frequency",
keywords="Invenio, statistics, download frequency",
req=req,
lastupdated=__lastupdated__,
navmenuid='download frequency',
language=ln)
def number_of_loans(self, req, form):
"""Number of loans statistics page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='number of circulation loans',
ln=ln)
return page(title="Number of circulation loans",
body=perform_display_keyevent('number of loans', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Number of circulation loans",
keywords="Invenio, statistics, Number of circulation loans",
req=req,
lastupdated=__lastupdated__,
navmenuid='number of circulation loans',
language=ln)
def web_submissions(self, req, form):
"""Web submissions statistics page."""
argd = wash_urlargd(form, {'doctype': (str, "all"),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='web submissions',
ln=ln)
return page(title="Web submissions",
body=perform_display_keyevent('web submissions', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Web submissions",
keywords="Invenio, statistics, websubmissions",
req=req,
lastupdated=__lastupdated__,
navmenuid='web submissions',
language=ln)
def loans_stats(self, req, form):
"""Circulation loans statistics page."""
argd = wash_urlargd(form, {'udc': (str, ""),
'item_status': (str, ""),
'publication_date': (str, ""),
'creation_date': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation loans statistics',
ln=ln)
return page(title="Circulation loans statistics",
body=perform_display_keyevent('loans statistics', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation loans statistics",
keywords="Invenio, statistics, Circulation loans statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation loans statistics',
language=ln)
def loans_lists(self, req, form):
"""Circulation loans lists page."""
argd = wash_urlargd(form, {'udc': (str, ""),
'loan_period': (str, ""),
'min_loans': (int, 0),
'max_loans': (int, sys.maxint),
'publication_date': (str, ""),
'creation_date': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
argd['min_loans'] = str(argd['min_loans'])
argd['max_loans'] = str(argd['max_loans'])
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation loans lists',
ln=ln)
return page(title="Circulation loans lists",
body=perform_display_keyevent('loans lists', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation loans lists",
keywords="Invenio, statistics, Circulation loans lists",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation loans lists',
language=ln)
def renewals_lists(self, req, form):
"""Renewed items lists page."""
argd = wash_urlargd(form, {'udc': (str, ""),
'collection': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation renewals lists',
ln=ln)
return page(title="Circulation renewals lists",
body=perform_display_keyevent('renewals', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation renewals lists",
keywords="Invenio, statistics, Circulation renewals lists",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation renewals lists',
language=ln)
def returns_table(self, req, form):
"""Number of returns table page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='Circulation returns table',
ln=ln)
return page(title="Circulation returns table",
body=perform_display_keyevent('number returns', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation returns table",
keywords="Invenio, statistics, Circulation returns table",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation returns table',
language=ln)
def returns_graph(self, req, form):
"""Percentage of returns graph page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation returns graph',
ln=ln)
return page(title="Circulation returns graph",
body=perform_display_keyevent('percentage returns', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation returns graph",
keywords="Invenio, statistics, Circulation returns graph",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation returns graph',
language=ln)
def ill_requests_stats(self, req, form):
"""ILL Requests statistics page."""
argd = wash_urlargd(form, {'doctype': (str, ""),
'status': (str, ""),
'supplier': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation ill requests statistics',
ln=ln)
return page(title="Circulation ILL Requests statistics",
body=perform_display_keyevent('ill requests statistics', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation ILL Requests statistics",
keywords="Invenio, statistics, Circulation ILL Requests statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation ill requests statistics',
language=ln)
def ill_requests_lists(self, req, form):
"""ILL requests lists page."""
argd = wash_urlargd(form, {'doctype': (str, ""),
'supplier': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation ill requests list',
ln=ln)
return page(title="Circulation ILL Requests list",
body=perform_display_keyevent('ill requests list', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation ILL Requests list",
keywords="Invenio, statistics, Circulation ILL Requests list",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation ill requests list',
language=ln)
def ill_requests_graph(self, req, form):
"""Percentage of satisfied ILL requests graph page."""
argd = wash_urlargd(form, {'doctype': (str, ""),
'status': (str, ""),
'supplier': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='percentage circulation satisfied ill requests',
ln=ln)
return page(title="Percentage of circulation satisfied ILL requests",
body=perform_display_keyevent('percentage satisfied ill requests',
argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Percentage of circulation satisfied ILL requests",
keywords="Invenio, statistics, Percentage of circulation satisfied ILL requests",
req=req,
lastupdated=__lastupdated__,
navmenuid='percentage circulation satisfied ill requests',
language=ln)
def items_stats(self, req, form):
"""Items statistics page."""
argd = wash_urlargd(form, {'udc': (str, ""),
'collection': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation items stats',
ln=ln)
return page(title="Circulation items statistics",
body=perform_display_keyevent('items stats', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation items statistics",
keywords="Invenio, statistics, Circulation items statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation items stats',
language=ln)
def items_list(self, req, form):
"""Items list page."""
argd = wash_urlargd(form, {'library': (str, ""),
'status': (str, ""),
'format': (str, ""),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation items list',
ln=ln)
return page(title="Circulation items list",
body=perform_display_keyevent('items list', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation items list",
keywords="Invenio, statistics, Circulation items list",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation items list',
language=ln)
def loans_requests(self, req, form):
"""Number of loans statistics page."""
argd = wash_urlargd(form, {'item_status': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation loan request statistics',
ln=ln)
return page(title="Circulation hold requests statistics",
body=perform_display_keyevent('loan request statistics', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation hold requests statistics",
keywords="Invenio, statistics, Circulation hold requests statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation loan request statistics',
language=ln)
def loans_request_lists(self, req, form):
"""Number of loans request lists page."""
argd = wash_urlargd(form, {'udc': (str, ""),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation hold request lists',
ln=ln)
return page(title="Circulation loans request lists",
body=perform_display_keyevent('loan request lists', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation hold request lists",
keywords="Invenio, statistics, Circulation hold request lists",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation hold request lists',
language=ln)
def user_stats(self, req, form):
"""User statistics page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation user statistics',
ln=ln)
return page(title="Circulation users statistics",
body=perform_display_keyevent('user statistics', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation users statistics",
keywords="Invenio, statistics, Circulation users statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation user statistics',
language=ln)
def user_lists(self, req, form):
"""User lists page."""
argd = wash_urlargd(form, {'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'sql': (int, 0),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='circulation users lists',
ln=ln)
return page(title="Circulation users lists",
body=perform_display_keyevent('user lists', argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Circulation users lists",
keywords="Invenio, statistics, Circulation users lists",
req=req,
lastupdated=__lastupdated__,
navmenuid='circulation users lists',
language=ln)
# CUSTOM EVENT SECTION
def customevent(self, req, form):
"""Custom event statistics page"""
arg_format = {'ids': (list, []),
'timespan': (str, "today"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, SUITABLE_GRAPH_FORMAT),
'ln': (str, CFG_SITE_LANG)}
for key in form.keys():
if key[:4] == 'cols':
i = key[4:]
arg_format['cols' + i] = (list, [])
arg_format['col_value' + i] = (list, [])
arg_format['bool' + i] = (list, [])
argd = wash_urlargd(form, arg_format)
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='custom event',
ln=ln)
body = perform_display_customevent(argd['ids'], argd, req=req, ln=ln)
return page(title="Custom event",
body=body,
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Custom event",
keywords="Invenio, statistics, custom event",
req=req,
lastupdated=__lastupdated__,
navmenuid='custom event',
language=ln)
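The loop at the top of `customevent()` extends the `wash_urlargd` format dict dynamically: for every submitted `cols<i>` key it registers the matching `cols<i>`/`col_value<i>`/`bool<i>` triples so they survive argument washing. A sketch of that step as a pure function (hypothetical helper name, not part of the Invenio API):

```python
def build_custom_arg_format(form_keys, base_format):
    """Hypothetical helper reproducing the loop in customevent(): for every
    submitted 'cols<i>' key, register matching cols/col_value/bool entries
    in a wash_urlargd-style format dict of (type, default) pairs."""
    arg_format = dict(base_format)  # leave the caller's base dict untouched
    for key in form_keys:
        if key[:4] == 'cols':
            i = key[4:]  # the numeric suffix, kept as a string
            arg_format['cols' + i] = (list, [])
            arg_format['col_value' + i] = (list, [])
            arg_format['bool' + i] = (list, [])
    return arg_format
```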
def error_log(self, req, form):
"""Error log analyzer page."""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='error log analyzer',
ln=ln)
return page(title="Error log analyzer",
body=perform_display_error_log_analyzer(ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Error log analyzer",
keywords="Invenio, statistics, Error log analyzer",
req=req,
lastupdated=__lastupdated__,
navmenuid='error log analyzer',
language=ln)
def customevent_help(self, req, form):
"""Custom event help page"""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='custom event help',
ln=ln)
return page(title="Custom event help",
body=perform_display_customevent_help(ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Custom event help",
keywords="Invenio, statistics, custom event help",
req=req,
lastupdated=__lastupdated__,
navmenuid='custom event help',
language=ln)
def customevent_register(self, req, form):
"""Register a custom event and redirect to the defined URL."""
argd = wash_urlargd(form, {'event_id': (str, ""),
'arg': (str, ""),
'url': (str, ""),
'ln': (str, CFG_SITE_LANG)})
params = argd['arg'].split(',')
if "WEBSTAT_IP" in params:
index = params.index("WEBSTAT_IP")
params[index] = str(req.remote_ip)
register_customevent(argd['event_id'], params)
return redirect_to_url(req, unquote(argd['url']), apache.HTTP_MOVED_PERMANENTLY)
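The `WEBSTAT_IP` handling above substitutes a placeholder in the comma-split argument list with the client's real IP before registering the event. Isolated as a sketch (hypothetical helper name):

```python
def substitute_webstat_ip(params, client_ip):
    """Hypothetical helper isolating the placeholder substitution in
    customevent_register(): replace the WEBSTAT_IP marker in the
    comma-split argument list with the client's real IP address."""
    params = list(params)  # do not mutate the caller's list
    if "WEBSTAT_IP" in params:
        params[params.index("WEBSTAT_IP")] = str(client_ip)
    return params
```

Note that `list.index` finds only the first occurrence, matching the original code's behavior when the placeholder appears more than once.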
# CUSTOM REPORT SECTION
def custom_summary(self, req, form):
"""Custom report page"""
argd = wash_urlargd(form, {'query': (str, ""),
'tag': (str, CFG_JOURNAL_TAG.replace("%", "p")),
'title': (str, "Publications"),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='custom query summary',
ln=ln)
return page(title="Custom query summary",
body=perform_display_custom_summary(argd, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Custom Query Summary",
keywords="Invenio, statistics, custom query summary",
req=req,
lastupdated=__lastupdated__,
navmenuid='custom query summary',
language=ln)
# COLLECTIONS SECTION
def collection_stats(self, req, form):
"""Collection statistics list page"""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
navmenuid='collections list',
text=auth_msg,
ln=ln)
return page(title="Collection statistics",
body=perform_display_coll_list(req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Collection statistics",
keywords="Invenio, statistics",
req=req,
lastupdated=__lastupdated__,
navmenuid='collections list',
language=ln)
def collections(self, req, form):
"""Collections statistics page"""
argd = wash_urlargd(form, {'collection': (str, "All"),
'timespan': (str, "this month"),
's_date': (str, ""),
'f_date': (str, ""),
'format': (str, "flot"),
'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
navmenuid='collections',
text=auth_msg,
ln=ln)
if collection_restricted_p(argd['collection']):
(auth_code_coll, auth_msg_coll) = acc_authorize_action(user_info, VIEWRESTRCOLL, collection=argd['collection'])
if auth_code_coll:
return page_not_authorized(req,
navmenuid='collections',
text=auth_msg_coll,
ln=ln)
return page(title="Statistics of %s" % argd['collection'],
body=perform_display_stats_per_coll(argd, req, ln=ln),
navtrail="""<a class="navtrail" href="%s/stats/%s">Statistics</a>""" % \
(CFG_SITE_URL, (ln != CFG_SITE_LANG and '?ln=' + ln) or ''),
description="Invenio, Statistics, Collection %s" % argd['collection'],
keywords="Invenio, statistics, %s" % argd['collection'],
req=req,
lastupdated=__lastupdated__,
navmenuid='collections',
language=ln)
# EXPORT SECTION
def export(self, req, form):
"""Exports data"""
argd = wash_urlargd(form, {'ln': (str, CFG_SITE_LANG)})
ln = argd['ln']
user_info = collect_user_info(req)
(auth_code, auth_msg) = acc_authorize_action(user_info, 'runwebstatadmin')
if auth_code:
return page_not_authorized(req,
navtrail=self.navtrail % {'ln_link': (ln != CFG_SITE_LANG and '?ln=' + ln) or ''},
text=auth_msg,
navmenuid='export',
ln=ln)
argd = wash_urlargd(form, {"filename": (str, ""),
"mime": (str, "")})
# Check that the particular file exists and that it's OK to export
webstat_files = [x for x in os.listdir(CFG_TMPDIR) if x.startswith("webstat")]
if argd["filename"] not in webstat_files:
return "Bad file."
# Set correct header type
req.content_type = argd["mime"]
req.send_http_header()
# Rebuild path, send it to the user, and clean up.
filename = CFG_TMPDIR + '/' + argd["filename"]
req.sendfile(filename)
os.remove(filename)
index = __call__
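The `export()` method above guards file serving by checking the requested name against a listing of `webstat*` files in `CFG_TMPDIR`. A sketch of that whitelist pattern (hypothetical helper, not Invenio API):

```python
import os
import tempfile  # used only by the test/usage below

def safe_webstat_path(tmpdir, filename, prefix="webstat"):
    """Hypothetical helper for the whitelist check in export(): serve a file
    only if it is literally one of the names present in `tmpdir` with the
    expected prefix. This also rules out path traversal, since a name like
    '../etc/passwd' can never appear in a directory listing."""
    allowed = [x for x in os.listdir(tmpdir) if x.startswith(prefix)]
    if filename not in allowed:
        return None
    return os.path.join(tmpdir, filename)
```

Checking membership in an actual directory listing, rather than sanitizing the input string, is what makes the original check safe despite the naive `CFG_TMPDIR + '/' + filename` path construction that follows it.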
diff --git a/invenio/legacy/webstyle/httptest_webinterface.py b/invenio/legacy/webstyle/httptest_webinterface.py
index 45095721d..5a1617473 100644
--- a/invenio/legacy/webstyle/httptest_webinterface.py
+++ b/invenio/legacy/webstyle/httptest_webinterface.py
@@ -1,138 +1,138 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
HTTP Test web interface. This is the place where to put helpers for
regression tests related to HTTP (or WSGI or SSO).
"""
__revision__ = \
"$Id$"
__lastupdated__ = """$Date$"""
import cgi
from invenio.config import CFG_SITE_URL, CFG_TMPDIR
from invenio.legacy.webpage import page
from invenio.ext.legacy.handler import WebInterfaceDirectory, wash_urlargd
from invenio.utils.url import redirect_to_url
class WebInterfaceHTTPTestPages(WebInterfaceDirectory):
_exports = ["", "post1", "post2", "sso", "dumpreq", "complexpost", "whatismyip", "oraclefriendly"]
def __call__(self, req, form):
redirect_to_url(req, CFG_SITE_URL + '/httptest/post1')
index = __call__
def _lookup(self, component, path):
if component == 'hello':
name = '/'.join(path)
def hello(req, form):
return "Hello %s!" % name
return hello, []
return None, []
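The `_lookup` hook above implements dynamic URL dispatch: given the next path component and the remaining segments, it returns a handler plus any unconsumed path, or `(None, [])` for unknown components. A standalone sketch of the same protocol:

```python
def lookup(component, path):
    """Sketch of the WebInterfaceDirectory._lookup protocol shown above:
    map a URL component and the remaining path segments to a
    (handler, unconsumed_path) pair, or (None, []) when unknown."""
    if component == 'hello':
        name = '/'.join(path)  # capture the rest of the URL in a closure
        def hello(req=None, form=None):
            return "Hello %s!" % name
        return hello, []
    return None, []
```

So a request to `/httptest/hello/world/there` ends up calling a handler closed over `"world/there"`.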
def sso(self, req, form):
""" For testing single sign-on """
req.add_common_vars()
sso_env = {}
for var, value in req.subprocess_env.iteritems():
if var.startswith('HTTP_ADFS_'):
sso_env[var] = value
out = "<html><head><title>SSO test</title></head>"
out += "<body><table>"
for var, value in sso_env.iteritems():
out += "<tr><td><strong>%s</strong></td><td>%s</td></tr>" % (var, value)
out += "</table></body></html>"
return out
def dumpreq(self, req, form):
"""
Dump a textual representation of the request object.
"""
return "<pre>%s</pre>" % cgi.escape(str(req))
def post1(self, req, form):
"""
This is used by WSGI regression test, to test if it's possible
to upload a file and retrieve it correctly.
"""
if req.method == 'POST':
if 'file' in form:
for row in form['file']:#.file:
req.write(row)
return ''
else:
body = """
<form method="post" enctype="multipart/form-data">
<input type="file" name="file" />
<input type="submit" />
</form>"""
return page("test1", body=body, req=req)
def post2(self, req, form):
"""
This is to test L{handle_file_post} function.
"""
from invenio.legacy.wsgi.utils import handle_file_post
- from invenio.bibdocfile import stream_file
+ from invenio.legacy.bibdocfile.api import stream_file
argd = wash_urlargd(form, {"save": (str, "")})
if req.method != 'POST':
body = """<p>Please send a file via POST.</p>"""
return page("test2", body=body, req=req)
path, mimetype = handle_file_post(req)
if argd['save'] and argd['save'].startswith(CFG_TMPDIR):
open(argd['save'], "w").write(open(path).read())
return stream_file(req, path, mime=mimetype)
def oraclefriendly(self, req, form):
"""
This is specifically for the batchuploader with the oracle-friendly patch.
"""
from invenio.legacy.wsgi.utils import handle_file_post
- from invenio.bibdocfile import stream_file
+ from invenio.legacy.bibdocfile.api import stream_file
argd = wash_urlargd(form, {"save": (str, ""), "results": (str, "")})
if req.method != 'POST':
body = """<p>Please send a FORM via POST.</p>"""
return page("test2", body=body, req=req)
if argd['save'] and argd['save'].startswith(CFG_TMPDIR):
open(argd['save'], "w").write(argd['results'])
return argd['results']
def complexpost(self, req, form):
body = """
<form action="/httptest/dumpreq" method="POST">
A file: <input name="file1" type="file" /><br />
Another file: <input name="file2" type="file" /><br />
<select name="cars" multiple="multiple">
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="fiat" selected="selected">Fiat</option>
<option value="audi">Audi</option>
</select>
<input type="submit" />
</form>"""
return page("Complex POST", body=body, req=req)
def whatismyip(self, req, form):
"""
Return the client IP as seen by the server (useful for testing e.g. Robot authentication)
"""
req.content_type = "text/plain"
return req.remote_ip
diff --git a/invenio/legacy/websubmit/admin_dblayer.py b/invenio/legacy/websubmit/admin_dblayer.py
index e63c2ad0a..d8f99a5a5 100644
--- a/invenio/legacy/websubmit/admin_dblayer.py
+++ b/invenio/legacy/websubmit/admin_dblayer.py
@@ -1,3225 +1,3225 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
from invenio.legacy.dbquery import run_sql
-from invenio.websubmitadmin_config import *
+from invenio.legacy.websubmit.admin_config import *
from random import randint
## Functions related to the organisation of catalogues:
def insert_submission_collection(collection_name):
qstr = """INSERT INTO sbmCOLLECTION (name) VALUES (%s)"""
qres = run_sql(qstr, (collection_name,))
return int(qres)
def update_score_of_collection_child_of_submission_collection_at_scorex(id_father, old_score, new_score):
qstr = """UPDATE sbmCOLLECTION_sbmCOLLECTION """ \
"""SET catalogue_order=%s WHERE id_father=%s AND catalogue_order=%s"""
qres = run_sql(qstr, (new_score, id_father, old_score))
return 0
def update_score_of_collection_child_of_submission_collection_with_colid_and_scorex(id_father,
id_son,
old_score,
new_score):
qstr = """UPDATE sbmCOLLECTION_sbmCOLLECTION """ \
"""SET catalogue_order=%s """ \
"""WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
qres = run_sql(qstr, (new_score, id_father, id_son, old_score))
return 0
def update_score_of_doctype_child_of_submission_collection_at_scorex(id_father, old_score, new_score):
qstr = """UPDATE sbmCOLLECTION_sbmDOCTYPE """ \
"""SET catalogue_order=%s WHERE id_father=%s AND catalogue_order=%s"""
qres = run_sql(qstr, (new_score, id_father, old_score))
return 0
def update_score_of_doctype_child_of_submission_collection_with_doctypeid_and_scorex(id_father,
id_son,
old_score,
new_score):
qstr = """UPDATE sbmCOLLECTION_sbmDOCTYPE """ \
"""SET catalogue_order=%s """ \
"""WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
qres = run_sql(qstr, (new_score, id_father, id_son, old_score))
return 0
def get_id_father_of_collection(collection_id):
qstr = """SELECT id_father FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_son=%s """ \
"""LIMIT 1"""
qres = run_sql(qstr, (collection_id,))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_maximum_catalogue_score_of_collection_children_of_submission_collection(collection_id):
qstr = """SELECT IFNULL(MAX(catalogue_order), 0) """ \
"""FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_father=%s"""
qres = int(run_sql(qstr, (collection_id,))[0][0])
return qres
def get_score_of_collection_child_of_submission_collection(id_father, id_son):
qstr = """SELECT catalogue_order FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_son=%s and id_father=%s """ \
"""LIMIT 1"""
qres = run_sql(qstr, (id_son, id_father))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_score_of_previous_collection_child_above(id_father, score):
qstr = """SELECT MAX(catalogue_order) """ \
"""FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_father=%s and catalogue_order < %s"""
qres = run_sql(qstr, (id_father, score))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_score_of_next_collection_child_below(id_father, score):
qstr = """SELECT MIN(catalogue_order) """ \
"""FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_father=%s and catalogue_order > %s"""
qres = run_sql(qstr, (id_father, score))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
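The pairs of `UPDATE` helpers together with `get_score_of_previous_collection_child_above()` implement reordering by swapping `catalogue_order` values between a child and its neighbor. A pure-Python sketch of the net effect (hypothetical helper, operating on an id-to-score dict instead of the database):

```python
def swap_with_previous(scores, child):
    """Pure-Python sketch (hypothetical helper, not Invenio API) of what the
    paired UPDATE queries achieve: move `child` one position up by swapping
    its catalogue_order with the sibling holding the next lower score,
    mirroring get_score_of_previous_collection_child_above()."""
    current = scores[child]
    lower_scores = [s for s in scores.values() if s < current]
    if not lower_scores:
        return scores  # already first in the catalogue
    prev_score = max(lower_scores)
    prev_child = next(c for c, s in scores.items() if s == prev_score)
    scores[child], scores[prev_child] = prev_score, current
    return scores
```

The three-step SQL dance (move the neighbor to a temporary score, move the child, then fix the neighbor) exists only because both rows live in the same table; the logical operation is this single swap.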
def get_catalogue_score_of_doctype_child_of_submission_collection(id_father, id_son):
qstr = """SELECT catalogue_order FROM sbmCOLLECTION_sbmDOCTYPE """ \
"""WHERE id_son=%s and id_father=%s """ \
"""LIMIT 1"""
qres = run_sql(qstr, (id_son, id_father))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_score_of_previous_doctype_child_above(id_father, score):
qstr = """SELECT MAX(catalogue_order) """ \
"""FROM sbmCOLLECTION_sbmDOCTYPE """ \
"""WHERE id_father=%s and catalogue_order < %s"""
qres = run_sql(qstr, (id_father, score))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_score_of_next_doctype_child_below(id_father, score):
qstr = """SELECT MIN(catalogue_order) """ \
"""FROM sbmCOLLECTION_sbmDOCTYPE """ \
"""WHERE id_father=%s and catalogue_order > %s"""
qres = run_sql(qstr, (id_father, score))
try:
return int(qres[0][0])
except (TypeError, IndexError):
return None
def get_maximum_catalogue_score_of_doctype_children_of_submission_collection(collection_id):
qstr = """SELECT IFNULL(MAX(catalogue_order), 0) """ \
"""FROM sbmCOLLECTION_sbmDOCTYPE """ \
"""WHERE id_father=%s"""
qres = int(run_sql(qstr, (collection_id,))[0][0])
return qres
def insert_collection_child_for_submission_collection(id_father, id_son, score):
qstr = """INSERT INTO sbmCOLLECTION_sbmCOLLECTION (id_father, id_son, catalogue_order) """ \
"""VALUES (%s, %s, %s)"""
qres = run_sql(qstr, (id_father, id_son, score))
def insert_doctype_child_for_submission_collection(id_father, id_son, score):
qstr = """INSERT INTO sbmCOLLECTION_sbmDOCTYPE (id_father, id_son, catalogue_order) """ \
"""VALUES (%s, %s, %s)"""
qres = run_sql(qstr, (id_father, id_son, score))
def get_doctype_children_of_collection(id_father):
"""Get details of all 'doctype' children of a given collection. For each doctype, get:
* doctype ID
* doctype long-name
* doctype catalogue-order
The document type children retrieved are ordered in ascending order of 'catalogue order'.
@param id_father: (integer) - the ID of the parent collection for which doctype children are
to be retrieved.
@return: (tuple) of tuples. Each tuple is a row giving the following details of a doctype:
(doctype_id, doctype_longname, doctype_catalogue_order)
"""
## query to retrieve details of doctypes attached to a given collection:
qstr_doctype_children = """SELECT col_doctype.id_son, doctype.ldocname, col_doctype.catalogue_order """ \
"""FROM sbmCOLLECTION_sbmDOCTYPE AS col_doctype """ \
"""INNER JOIN sbmDOCTYPE AS doctype """ \
"""ON col_doctype.id_son = doctype.sdocname """ \
"""WHERE id_father=%s ORDER BY catalogue_order ASC"""
res_doctype_children = run_sql(qstr_doctype_children, (id_father,))
## return the result of this query:
return res_doctype_children
def get_collection_children_of_collection(id_father):
"""Get the collection ids of all 'collection' children of a given collection.
@param id_father: (integer) the ID of the parent collection for which collection children are to
be retrieved.
@return: (tuple) of tuples. Each tuple is a row containing the collection ID of a 'collection' child
of the given parent collection.
"""
## query to retrieve IDs of collections attached to a given collection:
qstr_collection_children = """SELECT id_son FROM sbmCOLLECTION_sbmCOLLECTION WHERE id_father=%s ORDER BY catalogue_order ASC"""
res_collection_children = run_sql(qstr_collection_children, (id_father,))
## return the result of this query:
return res_collection_children
def get_id_and_score_of_collection_children_of_collection(id_father):
"""Get the collection ids and catalogue score positions of all 'collection' children of
a given collection.
@param id_father: (integer) the ID of the parent collection for which collection children are to
be retrieved.
@return: (tuple) of tuples. Each tuple is a row containing the collection ID and the catalogue-score
position of a 'collection' child of the given parent collection: (id, catalogue-score)
"""
## query to retrieve IDs of collections attached to a given collection:
qstr_collection_children = """SELECT id_son, catalogue_order """ \
"""FROM sbmCOLLECTION_sbmCOLLECTION """ \
"""WHERE id_father=%s ORDER BY catalogue_order ASC"""
res_collection_children = run_sql(qstr_collection_children, (id_father,))
## return the result of this query:
return res_collection_children
def get_number_of_rows_for_submission_collection_as_submission_tree_branch(collection_id):
"""Get the number of rows found for a submission-collection as a branch of the
submission tree.
@param collection_id: (integer) - the id of the submission-collection.
@return: (integer) - number of rows found by the query.
"""
qstr = """SELECT COUNT(*) FROM sbmCOLLECTION_sbmCOLLECTION WHERE id_son=%s"""
return int(run_sql(qstr, (collection_id,))[0][0])
def get_number_of_rows_for_submission_collection(collection_id):
"""Get the number of rows found for a submission-collection.
@param collection_id: (integer) - the id of the submission-collection.
@return: (integer) - number of rows found by the query.
"""
qstr = """SELECT COUNT(*) FROM sbmCOLLECTION WHERE id=%s"""
return int(run_sql(qstr, (collection_id,))[0][0])
def delete_submission_collection_details(collection_id):
"""Delete the details of a submission-collection from the database.
@param collection_id: (integer) - the ID of the submission-collection whose details
are to be deleted from the WebSubmit database.
@return: (integer) - error code: 0 on successful delete; 1 on failure to delete.
"""
qstr = """DELETE FROM sbmCOLLECTION WHERE id=%s"""
run_sql(qstr, (collection_id,))
## check to see if submission-collection details deleted:
numrows_submission_collection = get_number_of_rows_for_submission_collection(collection_id)
if numrows_submission_collection == 0:
## everything OK - no doctype-children remain for this submission-collection
return 0
else:
## everything NOT OK - still rows remaining for this submission-collection
## make a last attempt to delete them:
run_sql(qstr, (collection_id,))
## once more, check the number of rows remaining for this submission-collection:
numrows_submission_collection = get_number_of_rows_for_submission_collection(collection_id)
if numrows_submission_collection == 0:
## Everything OK - submission-collection deleted
return 0
else:
## still could not delete the submission-collection
return 1
def delete_submission_collection_from_submission_tree(collection_id):
"""Delete a submission-collection from the submission tree.
@param collection_id: (integer) - the ID of the submission-collection whose details
are to be deleted from the WebSubmit database.
@return: (integer) - error code: 0 on successful delete; 1 on failure to delete.
"""
qstr = """DELETE FROM sbmCOLLECTION_sbmCOLLECTION WHERE id_son=%s"""
run_sql(qstr, (collection_id,))
## check to ensure that the submission-collection was deleted from the tree:
numrows_collection = \
get_number_of_rows_for_submission_collection_as_submission_tree_branch(collection_id)
if numrows_collection == 0:
## everything OK - this submission-collection does not exist as a branch on the submission tree
return 0
else:
## submission-collection still exists as a branch of the submission tree
## try once more to delete it:
run_sql(qstr, (collection_id,))
numrows_collection = \
get_number_of_rows_for_submission_collection_as_submission_tree_branch(collection_id)
if numrows_collection == 0:
## deleted successfully this time:
return 0
else:
## Still unable to delete
return 1
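## Illustration (not part of the original module): the two delete helpers above
## follow the same "delete, verify, retry once" pattern. A minimal generic
## sketch of that pattern, using hypothetical delete/count callables:

```python
def delete_with_retry(delete_fn, count_fn, key):
    """Run delete_fn(key); if count_fn(key) still finds rows, retry once.
    Return 0 on success, 1 if rows remain after the second attempt."""
    for _attempt in (1, 2):
        delete_fn(key)
        if count_fn(key) == 0:
            return 0
    return 1
```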
def get_collection_name(collection_id):
"""Get the name of a given collection.
@param collection_id: (integer) - the ID of the collection whose name is to be retrieved
@return: (string or None) the name of the collection if it exists, None if no rows were returned
"""
collection_name = None
## query to retrieve the name of a given collection:
qstr_collection_name = """SELECT name FROM sbmCOLLECTION WHERE id=%s"""
## get the name of this collection:
res_collection_name = run_sql(qstr_collection_name, (collection_id,))
try:
collection_name = res_collection_name[0][0]
except IndexError:
pass
## return the collection name:
return collection_name
def delete_doctype_children_from_submission_collection(collection_id):
"""Delete all doctype-children of a submission-collection.
@param collection_id: (integer) - the ID of the submission-collection from which
the doctype-children are to be deleted.
@return: (integer) - error code: 0 on successful delete; 1 on failure to delete.
"""
qstr = """DELETE FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_father=%s"""
run_sql(qstr, (collection_id,))
## check to see if doctype-children still remain attached to submission-collection:
num_doctype_children = get_number_of_doctype_children_of_submission_collection(collection_id)
if num_doctype_children == 0:
## everything OK - no doctype-children remain for this submission-collection
return 0
else:
## everything NOT OK - still doctype-children remaining for this submission-collection
## make a last attempt to delete them:
run_sql(qstr, (collection_id,))
## once more, check the number of doctype-children remaining
num_doctype_children = get_number_of_doctype_children_of_submission_collection(collection_id)
if num_doctype_children == 0:
## Everything OK - all doctype-children deleted this time
return 0
else:
## still could not delete the doctype-children from this submission
return 1
def get_details_of_all_submission_collections():
"""Get the id and name of all submission-collections.
@return: (tuple) of tuples - (collection-id, collection-name)
"""
qstr_collections = """SELECT id, name FROM sbmCOLLECTION ORDER BY id ASC"""
res_collections = run_sql(qstr_collections)
return res_collections
def get_count_of_doctype_instances_at_score_for_collection(doctypeid, id_father, catalogue_score):
"""Get the number of rows found for a given doctype as attached to a given position on a query tree.
@param doctypeid: (string) - the identifier for the given document type.
@param id_father: (integer) - the id of the submission-collection to which the doctype is attached.
@param catalogue_score: (integer) - the score of the document type for that catalogue connection.
@return: (integer) - number of rows found by the query.
"""
qstr = """SELECT COUNT(*) FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
return int(run_sql(qstr, (id_father, doctypeid, catalogue_score))[0][0])
def get_number_of_doctype_children_of_submission_collection(collection_id):
"""Get the number of rows found for doctype-children as attached to a given submission-collection.
@param collection_id: (integer) - the id of the submission-collection to which the doctype-children are attached.
@return: (integer) - number of rows found by the query.
"""
qstr = """SELECT COUNT(*) FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_father=%s"""
return int(run_sql(qstr, (collection_id,))[0][0])
def delete_doctype_from_position_on_submission_page(doctypeid, id_father, catalogue_score):
"""Delete a document type from a given score position of a given submission-collection.
@param doctypeid: (string) - the ID of the document type that is to be deleted from the submission-collection.
@param id_father: (integer) - the ID of the submission-collection from which the document type
is to be deleted.
@param catalogue_score: (integer) - the score of the submission-collection at which the
document type to be deleted is connected.
@return: (integer) - error code: 0 if delete was successful; 1 if delete failed;
"""
qstr = """DELETE FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
run_sql(qstr, (id_father, doctypeid, catalogue_score))
## check to see whether this doctype was deleted:
numrows_doctype = get_count_of_doctype_instances_at_score_for_collection(doctypeid, id_father, catalogue_score)
if numrows_doctype == 0:
## delete successful
return 0
else:
## unsuccessful delete - try again
run_sql(qstr, (id_father, doctypeid, catalogue_score))
numrows_doctype = get_count_of_doctype_instances_at_score_for_collection(doctypeid, id_father, catalogue_score)
if numrows_doctype == 0:
## delete successful
return 0
else:
## unable to delete
return 1
def update_score_of_doctype_child_of_collection(id_father, id_son, old_catalogue_score, new_catalogue_score):
"""Update the score of a given doctype child of a submission-collection.
@param id_father: (integer) - the ID of the submission-collection whose child's score is to be updated
@param id_son: (string) - the ID of the document type to be updated
@param old_catalogue_score: (integer) - the score of the submission-collection that the doctype is found
at before update
@param new_catalogue_score: (integer) - the new value of the doctype's score for the submission-collection
@return: (integer) - 0
"""
qstr = """UPDATE sbmCOLLECTION_sbmDOCTYPE SET catalogue_order=%s """ \
"""WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
run_sql(qstr, (new_catalogue_score, id_father, id_son, old_catalogue_score))
return 0
def update_score_of_collection_child_of_collection(id_father, id_son, old_catalogue_score, new_catalogue_score):
"""Update the score of a given collection child ofa submission-collection.
@param id_father: (integer) - the ID of the submission-collection whose child's score is to be updated
@param id_son: (integer) - the ID of the collection type to be updated
@param old_catalogue_score: (integer) - the score of the submission-collection that the collection is found
at before update
@param new_catalogue_score: (integer) - the new value of the collection's score for the submission-collection
@return: (integer) - 0
"""
qstr = """UPDATE sbmCOLLECTION_sbmCOLLECTION SET catalogue_order=%s """ \
"""WHERE id_father=%s AND id_son=%s AND catalogue_order=%s"""
run_sql(qstr, (new_catalogue_score, id_father, id_son, old_catalogue_score))
return 0
def normalize_scores_of_doctype_children_for_submission_collection(collection_id):
"""Normalize the scores of the doctype-children of a given submission-collection.
I.e. set them into the format (1, 2, 3, 4, 5, [...]).
@param collection_id: (integer) - the ID of the submission-collection whose
doctype-children's scores are to be normalized.
@return: None
"""
## Get all document types attached to the collection, ordered by score:
doctypes = get_doctype_children_of_collection(collection_id)
num_doctypes = len(doctypes)
normal_score = 1
## for each document type, if score does not fit with counter, update it:
for idx in xrange(0, num_doctypes):
this_doctype_id = doctypes[idx][0]
this_doctype_score = int(doctypes[idx][2])
if this_doctype_score != normal_score:
## Score of doctype is not good - correct it:
update_score_of_doctype_child_of_collection(collection_id, this_doctype_id, \
this_doctype_score, normal_score)
normal_score += 1
return
def normalize_scores_of_collection_children_of_collection(collection_id):
"""Normalize the scores of the collection-children of a given submission-collection.
I.e. set them into the format (1, 2, 3, 4, 5, [...]).
@param collection_id: (integer) - the ID of the submission-collection whose
collection-children's scores are to be normalized.
@return: None
"""
## Get all document types attached to the collection, ordered by score:
collections = get_id_and_score_of_collection_children_of_collection(collection_id)
num_collections = len(collections)
normal_score = 1
## for each collection, if score does not fit with counter, update it:
for idx in xrange(0, num_collections):
this_collection_id = collections[idx][0]
this_collection_score = int(collections[idx][1])
if this_collection_score != normal_score:
## Score of collection is not good - correct it:
update_score_of_collection_child_of_collection(collection_id, this_collection_id, \
this_collection_score, normal_score)
normal_score += 1
return
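## Illustration (not part of the original module): both normalization helpers
## above renumber their children's scores onto 1..N. The renumbering decision
## in isolation (hypothetical helper, operating on a plain list of scores):

```python
def scores_needing_update(scores):
    """Given child scores in catalogue order, return (old_score, new_score)
    pairs for every entry whose score differs from its 1-based position."""
    return [(s, i + 1) for i, s in enumerate(scores) if s != i + 1]
```

## e.g. for scores (1, 3, 4) only the last two children would be updated,
## to scores 2 and 3 respectively.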
## Functions relating to WebSubmit ACTIONS, their addition, and their modification:
def update_action_details(actid, actname, working_dir, status_text):
"""Update the details of an action in the websubmit database IF there was only one action
with that actid (sactname).
@param actid: unique action id (sactname)
@param actname: action name (lactname)
@param working_dir: directory action works from (dir)
@param status_text: text string indicating action status (statustext)
@return: 0 (ZERO) if update is performed; 1 (ONE) if update not performed because the number of rows
found for the given action was not exactly one.
"""
# Check that exactly one record exists with code 'actid':
numrows_actid = get_number_actions_with_actid(actid)
if numrows_actid == 1:
q ="""UPDATE sbmACTION SET lactname=%s, dir=%s, statustext=%s, md=CURDATE() WHERE sactname=%s"""
run_sql(q, (actname, working_dir, status_text, actid))
return 0 # Everything is OK
else:
return 1 # Everything not OK: Either no rows or more than one row for action "actid"
def get_action_details(actid):
"""Get and return a tuple of tuples for all actions with the sactname "actid".
@param actid: Action Identifier Code (sactname).
@return: tuple of tuples (one tuple per action row): (sactname,lactname,dir,statustext,cd,md).
"""
q = """SELECT act.sactname, act.lactname, act.dir, act.statustext, act.cd, act.md FROM sbmACTION AS act WHERE act.sactname=%s"""
return run_sql(q, (actid,))
def get_actid_actname_allactions():
"""Get and return a tuple of tuples containing the "action id" and "action name" for each action
in the WebSubmit database.
@return: tuple of tuples: (actid,actname)
"""
q = """SELECT sactname,lactname FROM sbmACTION ORDER BY sactname ASC"""
return run_sql(q)
def get_number_actions_with_actid(actid):
"""Return the number of actions found for a given action id.
@param actid: action id (sactname) to query for
@return: an integer count of the number of actions in the websubmit database for this actid.
"""
q = """SELECT COUNT(sactname) FROM sbmACTION WHERE sactname=%s"""
return int(run_sql(q, (actid,))[0][0])
def insert_action_details(actid, actname, working_dir, status_text):
"""Insert details of a new action into the websubmit database IF there are not already actions
with the same actid (sactname).
@param actid: unique action id (sactname)
@param actname: action name (lactname)
@param working_dir: directory action works from (dir)
@param status_text: text string indicating action status (statustext)
@return: 0 (ZERO) if insert is performed; 1 (ONE) if insert not performed due to rows existing for
given action name.
"""
# Check record with code 'actid' does not already exist:
numrows_actid = get_number_actions_with_actid(actid)
if numrows_actid == 0:
# insert new action:
q = """INSERT INTO sbmACTION (lactname,sactname,dir,cd,md,actionbutton,statustext) VALUES (%s,%s,%s,CURDATE(),CURDATE(),NULL,%s)"""
run_sql(q, (actname, actid, working_dir, status_text))
return 0 # Everything is OK
else:
return 1 # Everything not OK: rows may already exist for action with 'actid'
## Functions relating to WebSubmit Form Element JavaScript CHECKING FUNCTIONS, their addition, and their
## modification:
def get_number_jschecks_with_chname(chname):
"""Return the number of Checks found for a given check name/id.
@param chname: Check name/id (chname) to query for
@return: an integer count of the number of Checks in the WebSubmit database for this chname.
"""
q = """SELECT COUNT(chname) FROM sbmCHECKS where chname=%s"""
return int(run_sql(q, (chname,))[0][0])
def get_all_jscheck_names():
"""Return a list of the names of all WebSubmit JSChecks"""
q = """SELECT DISTINCT(chname) FROM sbmCHECKS ORDER BY chname ASC"""
res = run_sql(q)
return [str(row[0]) for row in res]
def get_chname_alljschecks():
"""Get and return a tuple of tuples containing the "check name" (chname) for each JavaScript Check
in the WebSubmit database.
@return: tuple of tuples: (chname)
"""
q = """SELECT chname FROM sbmCHECKS ORDER BY chname ASC"""
return run_sql(q)
def get_jscheck_details(chname):
"""Get and return a tuple of tuples for all Checks with the check id/name "chname".
@param chname: Check name/Identifier Code (chname).
@return: tuple of tuples (one tuple per check row): (chname,chdesc,cd,md).
"""
q = """SELECT ch.chname, ch.chdesc, ch.cd, ch.md FROM sbmCHECKS AS ch WHERE ch.chname=%s"""
return run_sql(q, (chname,))
def insert_jscheck_details(chname, chdesc):
"""Insert details of a new JavaScript Check into the WebSubmit database IF there are not already Checks
with the same Check-name (chname).
@param chname: unique check id/name (chname)
@param chdesc: Check description (the JavaScript code body that is the Check) (chdesc)
@return: 0 (ZERO) if insert is performed; 1 (ONE) if insert not performed due to rows existing for
given Check name/id.
"""
# Check record with code 'chname' does not already exist:
numrows_chname = get_number_jschecks_with_chname(chname)
if numrows_chname == 0:
# insert new Check:
q = """INSERT INTO sbmCHECKS (chname,chdesc,cd,md,chefi1,chefi2) VALUES (%s,%s,CURDATE(),CURDATE(),NULL,NULL)"""
run_sql(q, (chname, chdesc))
return 0 # Everything is OK
else:
return 1 # Everything not OK: rows may already exist for Check with 'chname'
def update_jscheck_details(chname, chdesc):
"""Update the details of a Check in the WebSubmit database IF there was only one Check
with that check id/name (chname).
@param chname: unique Check id/name (chname)
@param chdesc: Check description (the JavaScript code body that is the Check) (chdesc)
@return: 0 (ZERO) if update is performed; 1 (ONE) if update not performed because the number of rows
found for the given Check was not exactly one.
"""
# Check that exactly one record exists with code 'chname':
numrows_chname = get_number_jschecks_with_chname(chname)
if numrows_chname == 1:
q = """UPDATE sbmCHECKS SET chdesc=%s, md=CURDATE() WHERE chname=%s"""
run_sql(q, (chdesc, chname))
return 0 # Everything is OK
else:
return 1 # Everything not OK: Either no rows or more than one row for check "chname"
## Functions relating to WebSubmit FUNCTIONS, their addition, and their modification:
def get_function_description(function):
"""Get and return a tuple containing the function description (description) for
the function with the name held in the "function" parameter.
@return: tuple of tuple (for one function): ((description,))
"""
q = """SELECT description FROM sbmALLFUNCDESCR where function=%s"""
return run_sql(q, (function,))
def get_function_parameter_vals_doctype(doctype, paramlist):
"""Get the (name, value) pairs of the given parameters for a document type.
@param doctype: (string) - the unique ID of a document type
@param paramlist: (list) - the names of the parameters whose values are to be retrieved
@return: (list) of (name, value) tuples; a parameter with no stored value is returned as (name, "")
"""
res = []
q = """SELECT name, value FROM sbmPARAMETERS WHERE doctype=%s AND name=%s"""
for par in paramlist:
r = run_sql(q, (doctype, par))
if len(r) > 0:
res.append(r[0])
else:
res.append((par, ""))
return res
def get_function_parameters(function):
"""Get the list of paremeters for a given function
@param function: the function name
@return: tuple of tuple ((param,))
"""
q = """SELECT param FROM sbmFUNDESC WHERE function=%s ORDER BY param ASC"""
return run_sql(q, (function,))
def get_number_parameters_with_paramname_funcname(funcname, paramname):
"""Return the number of parameters found for a given function name and parameter name. I.e. count the
number of times a given parameter appears for a given function.
@param funcname: Function name (function) to query for.
@param paramname: name of the parameter whose instances for the given function are to be counted.
@return: an integer count of the number of parameters matching the criteria.
"""
q = """SELECT COUNT(param) FROM sbmFUNDESC WHERE function=%s AND param=%s"""
return int(run_sql(q, (funcname, paramname))[0][0])
def get_distinct_paramname_all_function_parameters():
"""Get the names of all function parameters.
@return: tuple of tuples: (param,)
"""
q = """SELECT DISTINCT(param) FROM sbmFUNDESC ORDER BY param ASC"""
return run_sql(q)
def get_distinct_paramname_all_websubmit_parameters():
"""Get the names of all WEBSUBMIT parameters (i.e. parameters that are used somewhere by WebSubmit actions.
@return: tuple of tuples (param,)
"""
q = """SELECT DISTINCT(name) FROM sbmPARAMETERS ORDER BY name ASC"""
return run_sql(q)
def get_distinct_paramname_all_websubmit_function_parameters():
"""Get and return a tuple of tuples containing the names of all parameters in the WebSubmit system.
@return: tuple of tuples: ((param,),(param,))
"""
param_names = {}
all_params_list = []
all_function_params = get_distinct_paramname_all_function_parameters()
all_websubmit_params = get_distinct_paramname_all_websubmit_parameters()
for func_param in all_function_params:
param_names[func_param[0]] = None
for websubmit_param in all_websubmit_params:
param_names[websubmit_param[0]] = None
all_params_names = param_names.keys()
all_params_names.sort()
for param in all_params_names:
all_params_list.append((param,))
return all_params_list
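## Illustration (not part of the original module): the merge above uses a dict
## purely for de-duplication before sorting. The same result with sets
## (hypothetical helper; assumes rows of one-element tuples as returned above):

```python
def merged_param_names(function_params, websubmit_params):
    """Merge two sequences of (name,) rows into one sorted, de-duplicated
    list of (name,) tuples."""
    names = set(row[0] for row in function_params)
    names.update(row[0] for row in websubmit_params)
    return [(name,) for name in sorted(names)]
```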
def regulate_score_of_all_functions_in_step_to_ascending_multiples_of_10_for_submission(doctype, action, step):
"""Within a step of a submission, regulate the scores of all functions to multiples of 10. For example, for
the following:
Submission Func Step Score
SBITEST Print 2 10
SBITEST Run 2 11
SBITEST Alert 2 20
SBITEST End 2 50
...regulate the scores like this:
Submission Func Step Score
SBITEST Print 2 10
SBITEST Run 2 20
SBITEST Alert 2 30
SBITEST End 2 40
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the number of the step in which functions scores are to be regulated
@return: None
@exceptions raised:
InvenioWebSubmitAdminWarningDeleteFailed - in the case that it wasn't possible to delete functions
"""
functnres = get_name_step_score_of_all_functions_in_step_of_submission(doctype=doctype, action=action, step=step)
i = 1
score_order_broken = 0
for functn in functnres:
cur_functn_score = int(functn[2])
if cur_functn_score != i * 10:
## this score is not a correct multiple of 10 for its place in the order
score_order_broken = 1
i += 1
if score_order_broken == 1:
## the function scores were not good.
## delete the functions within this step
try:
delete_all_functions_in_step_of_submission(doctype=doctype, action=action, step=step)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## unable to delete some or all functions
## pass the exception back up to the caller
raise
## re-insert them with the correct scores
i = 10
for functn in functnres:
insert_functn_name = functn[0]
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=insert_functn_name,
step=step, score=i)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
## tried to insert a function that doesn't exist in WebSubmit DB
## TODO : LOG ERROR
## continue onto next loop iteration - don't increment value of I
continue
i += 10
return
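## Illustration (not part of the original module): the regulation performed
## above rewrites scores as ascending multiples of 10. A pure-Python sketch
## of that rewriting, independent of the database layer:

```python
def regulated_scores(scores):
    """Return the scores rewritten as 10, 20, 30, ..., preserving order."""
    return [10 * (i + 1) for i in range(len(scores))]

def scores_need_regulation(scores):
    """True if any score is not the correct multiple of 10 for its position."""
    return any(s != 10 * (i + 1) for i, s in enumerate(scores))
```

## e.g. the docstring's example (10, 11, 20, 50) regulates to (10, 20, 30, 40).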
def get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype, action, function, step, score):
"""Get the number or rows for a particular function at a given step and score of a doctype submission"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS where doctype=%s AND action=%s AND function=%s AND step=%s AND score=%s"""
return int(run_sql(q, (doctype, action, function, step, score))[0][0])
def get_number_functions_doctypesubmission_step_score(doctype, action, step, score):
"""Get the number or rows for a particular function at a given step and score of a doctype submission"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS where doctype=%s AND action=%s AND step=%s AND score=%s"""
return int(run_sql(q, (doctype, action, step, score))[0][0])
def update_step_score_doctypesubmission_function(doctype, action, function, oldstep, oldscore, newstep, newscore):
numrows_function = get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype, action=action,
function=function, step=oldstep, score=oldscore)
if numrows_function == 1:
q = """UPDATE sbmFUNCTIONS SET step=%s, score=%s WHERE doctype=%s AND action=%s AND function=%s AND step=%s AND score=%s"""
run_sql(q, (newstep, newscore, doctype, action, function, oldstep, oldscore))
return 0 ## Everything OK
else:
## Everything NOT OK - perhaps this function doesn't exist at this posn - cannot update
return 1
def move_position_submissionfunction_up(doctype, action, function, funccurstep, funccurscore):
functions_above = get_functionname_step_score_allfunctions_beforereference_doctypesubmission(doctype=doctype,
action=action,
step=funccurstep,
score=funccurscore)
numrows_functions_above = len(functions_above)
if numrows_functions_above < 1:
## there are no functions above this - nothing to do
return 0 ## Everything OK
## get the details of the function above this one:
name_function_above = functions_above[numrows_functions_above-1][0]
step_function_above = int(functions_above[numrows_functions_above-1][1])
score_function_above = int(functions_above[numrows_functions_above-1][2])
if step_function_above < int(funccurstep):
## the function above the function to be moved is in a lower step. Put the function to be moved in the same step
## as the one above, but set its score to be greater by 10 than the one above
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=function,
oldstep=funccurstep,
oldscore=funccurscore,
newstep=step_function_above,
newscore=int(score_function_above)+10)
return error_code
else:
## the function above is in the same step as the function to be moved. just switch them around (scores)
## first, delete the function above:
error_code = delete_function_doctypesubmission_step_score(doctype=doctype,
action=action,
function=name_function_above,
step=step_function_above,
score=score_function_above)
if error_code == 0:
## now update the function to be moved with the step and score of the function that was above it
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=function,
oldstep=funccurstep,
oldscore=funccurscore,
newstep=step_function_above,
newscore=score_function_above)
if error_code == 0:
## now insert the function that *was* above, into the position of the function that we have just moved
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=name_function_above,
step=funccurstep,
score=funccurscore)
return 0
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
return 1
else:
## could not update the function that was to be moved! Try to re-insert that which was deleted
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=name_function_above,
step=step_function_above,
score=score_function_above)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
pass
return 1 ## Returning an ERROR code to signal that the move did not work
else:
## Unable to delete the function above that which we want to move. Cannot move the function then.
## Return an error code to signal that things went wrong
return 1
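## Illustration (not part of the original module): within a single step, the
## move-up operation above amounts to swapping a function with its predecessor.
## An in-memory analogue of that same-step case (hypothetical helper):

```python
def move_up(items, idx):
    """Return a copy of items with items[idx] swapped with items[idx - 1];
    a no-op (copy only) when idx is already at the top."""
    out = list(items)
    if idx > 0:
        out[idx - 1], out[idx] = out[idx], out[idx - 1]
    return out
```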
def add_10_to_score_of_all_functions_in_step_of_submission(doctype, action, step):
"""Add 10 to the score of all functions within a particular step of a submission.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the step in which all function scores are to be incremented by 10
@return: None
"""
q = """UPDATE sbmFUNCTIONS SET score=score+10 WHERE doctype=%s AND action=%s AND step=%s"""
run_sql(q, (doctype, action, step))
return
def update_score_of_allfunctions_from_score_within_step_in_submission_reduce_by_val(doctype, action, step, fromscore, val):
q = """UPDATE sbmFUNCTIONS SET score=score-%s WHERE doctype=%s AND action=%s AND step=%s AND score >= %s"""
run_sql(q, (val, doctype, action, step, fromscore))
return
def add_10_to_score_of_all_functions_in_step_of_submission_and_with_score_equalto_or_above_val(doctype, action, step, fromscore):
"""Add 10 to the score of all functions within a particular step of a submission, but with a score equal-to,
or higher than a given value (fromscore).
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the step in which all function scores are to be incremented by 10
@param fromscore: (integer) the score from which all scores are incremented by 10
@return: None
"""
q = """UPDATE sbmFUNCTIONS SET score=score+10 WHERE doctype=%s AND action=%s AND step=%s AND score >= %s"""
run_sql(q, (doctype, action, step, fromscore))
return
def get_number_of_submission_functions_in_step_between_two_scores(doctype, action, step, score1, score2):
"""Return the number of submission functions found within a particular step of a submission, and between
two scores.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the number of the step
@param score1: (integer) the first score boundary
@param score2: (integer) the second score boundary
@return: (integer) the number of functions found
"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND step=%s AND (score BETWEEN %s AND %s)"""
## order the boundaries so that BETWEEN always receives (low, high):
return int(run_sql(q, (doctype, action, step, min(score1, score2), max(score1, score2)))[0][0])
def move_submission_function_from_one_position_to_another_position(doctype, action, movefuncname, movefuncfromstep,
movefuncfromscore, movefunctostep, movefunctoscore):
"""Move a submission function from one score/step to another position.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param movefuncname: (string) the name of the function to be moved
@param movefuncfromstep: (integer) the step in which the function to be moved is located
@param movefuncfromscore: (integer) the score at which the function to be moved is located
@param movefunctostep: (integer) the step to which the function is to be moved
@param movefunctoscore: (integer) the score to which the function is to be moved
@return: None
@exceptions raised:
InvenioWebSubmitAdminWarningDeleteFailed - when unable to delete functions when regulating their scores
InvenioWebSubmitAdminWarningNoRowsFound - when the function to be moved is not found
InvenioWebSubmitAdminWarningInsertFailed - when regulating the scores of functions, and unable to insert
a function
InvenioWebSubmitAdminWarningReferentialIntegrityViolation - when the function to be inserted does not
exist in WebSubmit
InvenioWebSubmitAdminWarningNoUpdate - when the function was not moved because there would have been no
change in its position, or because the function could not be moved for some reason
"""
## first check that there is a function "movefuncname"->"movefuncfromstep";"movefuncfromscore"
numrows_movefunc = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=movefuncname,
step=movefuncfromstep,
score=movefuncfromscore)
if numrows_movefunc < 1:
## the function to move doesn't exist
msg = """Could not move function [%s] at step [%s], score [%s] in submission [%s] to another position. """\
"""This function does not exist at this position."""\
% (movefuncname, movefuncfromstep, movefuncfromscore, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningNoRowsFound(msg)
## check that the function is not being moved to the same position:
if movefuncfromstep == movefunctostep:
num_functs_between_old_and_new_posn =\
get_number_of_submission_functions_in_step_between_two_scores(doctype=doctype,
action=action,
step=movefuncfromstep,
score1=movefuncfromscore,
score2=movefunctoscore)
if num_functs_between_old_and_new_posn < 3 and (movefuncfromscore <= movefunctoscore):
## moving the function to the same position - no point
msg = """The function [%s] of the submission [%s] was not moved from step [%s], score [%s] to """\
"""step [%s], score [%s] as there would have been no change in position."""\
% (movefuncname, "%s%s" % (action, doctype), movefuncfromstep,
movefuncfromscore, movefunctostep, movefunctoscore)
raise InvenioWebSubmitAdminWarningNoUpdate(msg)
## delete the function that is being moved:
try:
delete_the_function_at_step_and_score_from_a_submission(doctype=doctype, action=action,
function=movefuncname, step=movefuncfromstep,
score=movefuncfromscore)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## unable to delete the function - cannot perform the move.
msg = """Unable to move function [%s] at step [%s], score [%s] of submission [%s] - couldn't """\
"""delete the function from its current position."""\
% (movefuncname, movefuncfromstep, movefuncfromscore, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningNoUpdate(msg)
## now insert the function into its new position and correct the order of all functions within that step:
insert_function_into_submission_at_step_and_score_then_regulate_scores_of_functions_in_step(doctype=doctype,
action=action,
function=movefuncname,
step=movefunctostep,
score=movefunctoscore)
## regulate the scores of the functions in the step from which the function was moved
try:
regulate_score_of_all_functions_in_step_to_ascending_multiples_of_10_for_submission(doctype=doctype,
action=action,
step=movefuncfromstep)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## couldn't delete some or all functions
msg = """Moved function [%s] to step [%s], score [%s] of submission [%s]. However, when trying to regulate"""\
""" scores of functions in step [%s], failed to delete some functions. Check that they have not been lost."""\
% (movefuncname, movefunctostep, movefunctoscore, "%s%s" % (action, doctype), movefuncfromstep)
raise InvenioWebSubmitAdminWarningDeleteFailed(msg)
## finished
return
def move_position_submissionfunction_fromposn_toposn(doctype, action, movefuncname, movefuncfromstep,
movefuncfromscore, movefunctoname, movefunctostep,
movefunctoscore):
"""Move a function from one step/score position within a submission into the position
currently occupied by another function.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param movefuncname: (string) the name of the function to be moved
@param movefuncfromstep: (integer) the step in which the function to be moved is located
@param movefuncfromscore: (integer) the score at which the function to be moved is located
@param movefunctoname: (string) the name of the function whose position is to be taken
@param movefunctostep: (integer) the step to which the function is to be moved
@param movefunctoscore: (integer) the score to which the function is to be moved
@return: 0 if the move was performed; 1 in case of error
"""
## first check that there is a function "movefuncname"->"movefuncfromstep";"movefuncfromscore"
numrows_movefunc = get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=movefuncname,
step=movefuncfromstep,
score=movefuncfromscore)
if numrows_movefunc < 1:
## the function to move does not exist!
return 1
## now check that there is a function "movefunctoname"->"movefunctostep";"movefunctoscore"
numrows_movefunctoposn = get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=movefunctoname,
step=movefunctostep,
score=movefunctoscore)
if numrows_movefunctoposn < 1:
## the function in the position to move to does not exist!
return 1
##
functions_above = get_functionname_step_score_allfunctions_beforereference_doctypesubmission(doctype=doctype,
action=action,
step=movefunctostep,
score=movefunctoscore)
numrows_functions_above = len(functions_above)
if numrows_functions_above >= 1:
function_above_name = functions_above[numrows_functions_above-1][0]
function_above_step = int(functions_above[numrows_functions_above-1][1])
function_above_score = int(functions_above[numrows_functions_above-1][2])
## Check that the place to which we are moving our function is NOT the same place that it is currently
## situated!
if (numrows_functions_above < 1) or (int(functions_above[numrows_functions_above-1][1]) < int(movefunctostep)): ### NICK SEPARATE THESE 2 OUT
## EITHER: there are no functions above the destination position; -OR- the function immediately above the
## destination position function is in a lower step.
## So, it is not important to care about any functions above for the move
if ((numrows_functions_above < 1) and (int(movefunctoscore) > 10)):
## There are no functions above the destination position, and the destination score is
## greater than 10, so there is room below it. Set the new score for the moved function
## as the score of the function whose place it is taking in the order, minus 10
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=movefuncname,
oldstep=movefuncfromstep,
oldscore=movefuncfromscore,
newstep=movefunctostep,
newscore=int(movefunctoscore)-10)
return error_code
elif (int(movefunctoscore) - 10 > function_above_score):
## There is a space of 10 or more between the score of the function into whose place we are moving
## a function, and the one above it. Set the new function score for the moved function as the
## score of the function whose place it is taking in the order - 10
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=movefuncname,
oldstep=movefuncfromstep,
oldscore=movefuncfromscore,
newstep=movefunctostep,
newscore=int(movefunctoscore)-10)
return error_code
else:
## There is not a space of 10 or more in the scores of the function into whose position we are moving
## a function and the function above it. It is necessary to augment the score of all functions
## within the step of the one into whose position our function will be moved, from that position onwards,
## by 10; then the function to be moved can be inserted into the newly created space
## First, delete the function to be moved so that it is not changed during any augmentation:
error_code = delete_function_doctypesubmission_step_score(doctype=doctype,
action=action,
function=movefuncname,
step=movefuncfromstep,
score=movefuncfromscore)
if error_code == 0:
## deletion successful
## now augment the relevant scores:
add_10_to_score_of_all_functions_in_step_of_submission_and_with_score_equalto_or_above_val(doctype=doctype,
action=action,
step=movefunctostep,
fromscore=movefunctoscore)
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=movefuncname,
step=movefunctostep,
score=movefunctoscore)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
return 1
return 0
else:
## could not delete it - cannot continue:
return 1
else:
## there are functions above the destination position function and they are in the same step as it.
if int(movefunctoscore) - 10 > function_above_score:
## the function above has a score that is more than 10 below that into whose position we are moving
## a function. It is therefore possible to set the new score as movefunctoscore - 10:
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=movefuncname,
oldstep=movefuncfromstep,
oldscore=movefuncfromscore,
newstep=movefunctostep,
newscore=int(movefunctoscore)-10)
return error_code
else:
## there is not a space of 10 or more in the scores of the function into whose position our function
## is to be moved and the function above it. It is necessary to augment the score of all functions
## within the step of the one into whose position our function will be moved, from that position onwards,
## by 10; then the function to be moved can be inserted into the newly created space
## First, delete the function to be moved so that it is not changed during any augmentation:
error_code = delete_function_doctypesubmission_step_score(doctype=doctype,
action=action,
function=movefuncname,
step=movefuncfromstep,
score=movefuncfromscore)
if error_code == 0:
## deletion successful
## now augment the relevant scores:
add_10_to_score_of_all_functions_in_step_of_submission_and_with_score_equalto_or_above_val(doctype=doctype,
action=action,
step=movefunctostep,
fromscore=movefunctoscore)
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=movefuncname,
step=movefunctostep,
score=movefunctoscore)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
return 1
return 0
else:
## could not delete it - cannot continue:
return 1
def move_position_submissionfunction_down(doctype, action, function, funccurstep, funccurscore):
"""Move a function down one position in the order of the functions of a submission.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param function: (string) the name of the function to be moved
@param funccurstep: (integer) the step in which the function is currently located
@param funccurscore: (integer) the score at which the function is currently located
@return: 0 if the move was performed (or was unnecessary); 1 in case of error
"""
functions_below = get_functionname_step_score_allfunctions_afterreference_doctypesubmission(doctype=doctype,
action=action,
step=funccurstep,
score=funccurscore)
numrows_functions_below = len(functions_below)
if numrows_functions_below < 1:
## there are no functions below this - nothing to do
return 0 ## Everything OK
## get the details of the function below this one:
name_function_below = functions_below[0][0]
step_function_below = int(functions_below[0][1])
score_function_below = int(functions_below[0][2])
if step_function_below > int(funccurstep):
## the function below is in a higher step: update all functions in that step with their score += 10,
## then place the function to be moved into that step with a score of that which the function below had
if score_function_below <= 10:
## the score of the function below is 10 or less: add 10 to the score of all functions in that step
add_10_to_score_of_all_functions_in_step_of_submission(doctype=doctype, action=action, step=step_function_below)
numrows_function_stepscore_moveto = get_number_functions_doctypesubmission_step_score(doctype=doctype,
action=action,
step=step_function_below,
score=score_function_below)
if numrows_function_stepscore_moveto == 0:
## the score of the step that the function will be moved to is empty - it's safe to move the function there:
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=function,
oldstep=funccurstep,
oldscore=funccurscore,
newstep=step_function_below,
newscore=score_function_below)
return error_code
else:
## the destination position is still occupied - cannot move this function there
return 1
else:
## the function below is already on a score higher than 10 - just move the function into score 10 in that step
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=function,
oldstep=funccurstep,
oldscore=funccurscore,
newstep=step_function_below,
newscore=10)
return error_code
else:
## the function below is in the same step. Switch it with this function
## first, delete the function below:
error_code = delete_function_doctypesubmission_step_score(doctype=doctype,
action=action,
function=name_function_below,
step=step_function_below,
score=score_function_below)
if error_code == 0:
## now update the function to be moved with the step and score of the function that was below it
error_code = update_step_score_doctypesubmission_function(doctype=doctype,
action=action,
function=function,
oldstep=funccurstep,
oldscore=funccurscore,
newstep=step_function_below,
newscore=score_function_below)
if error_code == 0:
## now insert the function that *was* below, into the position of the function that has just been moved
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=name_function_below,
step=funccurstep, score=funccurscore)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
return 1
return 0
else:
## could not update the function that was to be moved! Try to re-insert that which was deleted
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=name_function_below,
step=step_function_below,
score=score_function_below)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
pass
return 1 ## Returning an ERROR code to signal that the move did not work
else:
## Unable to delete the function below that which we want to move. Cannot move the function then.
## Return an error code to signal that things went wrong
return 1
def get_names_of_all_functions():
"""Return a list of the names of all WebSubmit functions (as strings).
The function names will be sorted in ascending alphabetical order.
@return: a list of strings
"""
q = """SELECT function FROM sbmALLFUNCDESCR ORDER BY function ASC"""
res = run_sql(q)
return map(lambda x: str(x[0]), res)
def get_funcname_funcdesc_allfunctions():
"""Get and return a tuple of tuples containing the "function name" (function) and function textual
description (description) for each WebSubmit function in the WebSubmit database.
@return: tuple of tuples: ((function,description),(function,description)[,...])
"""
q = """SELECT function, description FROM sbmALLFUNCDESCR ORDER BY function ASC"""
return run_sql(q)
def get_function_usage_details(function):
"""Get the details of a function's usage in WebSubmit.
This means get the following usage details:
- doctype: the unique ID of the document type with which the usage is associated
- docname: the long-name of the document type
- action id: the unique ID of the action of the doctype, with which the usage is associated
- action name: the long name of this action
- function step: the step in which the instance of function usage occurs
- function score: the score (of the above-mentioned step) at which the function is called
@param function: (string) the name of the function whose WebSubmit usage is to be examined.
@return: tuple of tuples whereby each tuple represents one instance of the function's usage:
(doctype, docname, action id, action name, function-step, function-score)
"""
q = """SELECT fun.doctype, dt.ldocname, fun.action, actn.lactname, fun.step, fun.score """ +\
"""FROM sbmDOCTYPE AS dt LEFT JOIN sbmFUNCTIONS AS fun ON (fun.doctype=dt.sdocname) """ +\
"""LEFT JOIN sbmIMPLEMENT as imp ON (fun.action=imp.actname AND fun.doctype=imp.docname) """ +\
"""LEFT JOIN sbmACTION AS actn ON (actn.sactname=imp.actname) WHERE fun.function=%s """ +\
"""ORDER BY dt.sdocname ASC, fun.action ASC, fun.step ASC, fun.score ASC"""
return run_sql(q, (function,))
def get_number_of_functions_with_funcname(funcname):
"""Return the number of Functions found in the WebSubmit DB for a given function name.
@param funcname: (string) the name of the function
@return: an integer count of the number of Functions in the WebSubmit database for this function name.
"""
q = """SELECT COUNT(function) FROM sbmALLFUNCDESCR where function=%s"""
return int(run_sql(q, (funcname,))[0][0])
def insert_function_details(function, fundescr):
""""""
numrows_function = get_number_of_functions_with_funcname(function)
if numrows_function == 0:
## Insert new function
q = """INSERT INTO sbmALLFUNCDESCR (function, description) VALUES (%s, %s)"""
run_sql(q, (function, fundescr))
return 0 # Everything is OK
else:
return 1 # Everything not OK: rows may already exist for function with name 'function'
def update_function_description(funcname, funcdescr):
"""Update the description of function "funcname", with string contained in "funcdescr".
Function description will be updated only if one row was found for the function in the DB.
@param funcname: the unique function name of the function whose description is to be updated
@param funcdescr: the new, updated description of the function
@return: error code (0 is OK, 1 is BAD insert)
"""
numrows_function = get_number_of_functions_with_funcname(funcname)
if numrows_function == 1:
## perform update of description
q = """UPDATE sbmALLFUNCDESCR SET description=%s WHERE function=%s"""
run_sql(q, ( (funcdescr != "" and funcdescr) or (None), funcname ) )
return 0 ## Everything OK
else:
return 1 ## Everything not OK: either no rows, or more than 1 row for function "funcname"
def delete_function_parameter(function, parameter_name):
"""Delete a given parameter from a from a given function.
@param function: name of the function from which the parameter is to be deleted.
@param parameter_name: name of the parameter to be deleted from the function.
@return: error-code. 0 means successful deletion of the parameter; 1 means deletion failed because
the parameter did not exist for the given function.
"""
numrows_function_parameter = get_number_parameters_with_paramname_funcname(funcname=function, paramname=parameter_name)
if numrows_function_parameter >= 1:
## perform deletion of parameter(s)
q = """DELETE FROM sbmFUNDESC WHERE function=%s AND param=%s"""
run_sql(q, (function, parameter_name))
return 0 ## Everything OK
else:
return 1 ## Everything not OK: no rows - this parameter doesn't exist for this function
def add_function_parameter(function, parameter_name):
"""Add a parameter (parameter_name) to a given function.
@param function: name of the function to which the parameter is to be added.
@param parameter_name: name of the parameter to be added to the function.
@return: error-code. 0 means successful addition of the parameter; 1 means addition failed because
the parameter already existed for the given function.
"""
numrows_function_parameter = get_number_parameters_with_paramname_funcname(funcname=function, paramname=parameter_name)
if numrows_function_parameter == 0:
## perform addition of parameter
q = """INSERT INTO sbmFUNDESC (function, param) VALUES (%s, %s)"""
run_sql(q, (function, parameter_name))
return 0 ## Everything OK
else:
return 1 ## Everything NOT OK: parameter already exists for function
## Functions relating to WebSubmit ELEMENTS, their addition, and their modification:
def get_number_elements_with_elname(elname):
"""Return the number of Elements found for a given element name/id.
@param elname: Element name/id (name) to query for
@return: an integer count of the number of Elements in the WebSubmit database for this elname.
"""
q = """SELECT COUNT(name) FROM sbmFIELDDESC where name=%s"""
return int(run_sql(q, (elname,))[0][0])
def get_doctype_action_pagenb_for_submissions_using_element(elname):
"""Get and return a tuple of tuples containing the doctype, the action, and the
page number (pagenb) for the instances of use of the element identified by "elname".
I.e. get the information about which submission pages the element is used on.
@param elname: The unique identifier for an element ("name" in "sbmFIELDDESC",
"fidesc" in "sbmFIELD").
@return: tuple of tuples (doctype, action, pagenb)
"""
q = """SELECT subm.docname, subm.actname, sf.pagenb FROM sbmIMPLEMENT AS subm LEFT JOIN sbmFIELD AS sf ON sf.subname=CONCAT(subm.actname, subm.docname) WHERE sf.fidesc=%s ORDER BY sf.subname ASC, sf.pagenb ASC"""
return run_sql(q, (elname,))
def get_subname_pagenb_element_use(elname):
"""Get and return a tuple of tuples containing the "submission name" (subname) and the
page number (pagenb) for the instances of use of the element identified by "elname".
I.e. get the information about which submission pages the element is used on.
@param elname: The unique identifier for an element ("name" in "sbmFIELDDESC",
"fidesc" in "sbmFIELD").
@return: tuple of tuples (subname, pagenb)
"""
q = """SELECT sf.subname, sf.pagenb FROM sbmFIELD AS sf WHERE sf.fidesc=%s ORDER BY sf.subname ASC, sf.pagenb ASC"""
return run_sql(q, (elname,))
def get_elename_allelements():
"""Get and return a tuple of tuples containing the "element name" (name) for each WebSubmit
element in the WebSubmit database.
@return: tuple of tuples: (name)
"""
q = """SELECT name FROM sbmFIELDDESC ORDER BY name"""
return run_sql(q)
def get_all_element_names():
"""Return a list of the names of all "elements" in the WebSubmit DB.
@return: a list of strings, where each string is a WebSubmit element
"""
q = """SELECT DISTINCT(name) FROM sbmFIELDDESC ORDER BY name"""
res = run_sql(q)
return map(lambda x: str(x[0]), res)
def get_element_details(elname):
"""Get and return a tuple of tuples for all ELEMENTS with the element name "elname".
@param elname: ELEMENT name (elname).
@return: tuple of tuples (one tuple per element): (marccode,type,size,rows,cols,maxlength,
val,fidesc,cd,md,modifytext)
"""
q = "SELECT el.marccode, el.type, el.size, el.rows, el.cols, el.maxlength, " + \
"el.val, el.fidesc, el.cd, el.md, el.modifytext FROM sbmFIELDDESC AS el WHERE el.name=%s"
return run_sql(q, (elname,))
def update_element_details(elname, elmarccode, eltype, elsize, elrows, elcols, elmaxlength, \
elval, elfidesc, elmodifytext):
"""Update the details of an ELEMENT in the WebSubmit database IF there was only one Element
with that element id/name (name).
@param elname: unique Element id/name (name)
@param elmarccode: element's MARC code
@param eltype: type of element
@param elsize: size of element
@param elrows: number of rows in element
@param elcols: number of columns in element
@param elmaxlength: element maximum length
@param elval: element default value
@param elfidesc: element description
@param elmodifytext: element's modification text
@return: 0 (ZERO) if update is performed; 1 (ONE) if update not performed because the number of
rows found for the given Element was not exactly one.
"""
# Check that exactly one Element row exists with the name 'elname':
numrows_elname = get_number_elements_with_elname(elname)
if numrows_elname == 1:
q = """UPDATE sbmFIELDDESC SET marccode=%s, type=%s, size=%s, rows=%s, cols=%s, maxlength=%s, """ +\
"""val=%s, fidesc=%s, modifytext=%s, md=CURDATE() WHERE name=%s"""
run_sql(q, ( elmarccode,
(eltype != "" and eltype) or (None),
(elsize != "" and elsize) or (None),
(elrows != "" and elrows) or (None),
(elcols != "" and elcols) or (None),
(elmaxlength != "" and elmaxlength) or (None),
(elval != "" and elval) or (None),
(elfidesc != "" and elfidesc) or (None),
(elmodifytext != "" and elmodifytext) or (None),
elname
) )
return 0 # Everything is OK
else:
return 1 # Everything not OK: Either no rows or more than one row for element "elname"
def insert_element_details(elname, elmarccode, eltype, elsize, elrows, elcols, \
elmaxlength, elval, elfidesc, elmodifytext):
"""Insert details of a new Element into the WebSubmit database IF there are not already elements
with the same element name (name).
@param elname: unique Element id/name (name)
@param elmarccode: element's MARC code
@param eltype: type of element
@param elsize: size of element
@param elrows: number of rows in element
@param elcols: number of columns in element
@param elmaxlength: element maximum length
@param elval: element default value
@param elfidesc: element description
@param elmodifytext: element's modification text
@return: 0 (ZERO) if insert is performed; 1 (ONE) if insert not performed due to rows existing for
given Element.
"""
# Check element record with code 'elname' does not already exist:
numrows_elname = get_number_elements_with_elname(elname)
if numrows_elname == 0:
# insert the new Element:
q = """INSERT INTO sbmFIELDDESC (name, alephcode, marccode, type, size, rows, cols, """ +\
"""maxlength, val, fidesc, cd, md, modifytext, fddfi2) VALUES(%s, NULL, """ +\
"""%s, %s, %s, %s, %s, %s, %s, %s, CURDATE(), CURDATE(), %s, NULL)"""
run_sql(q, ( elname,
elmarccode,
(eltype != "" and eltype) or (None),
(elsize != "" and elsize) or (None),
(elrows != "" and elrows) or (None),
(elcols != "" and elcols) or (None),
(elmaxlength != "" and elmaxlength) or (None),
(elval != "" and elval) or (None),
(elfidesc != "" and elfidesc) or (None),
(elmodifytext != "" and elmodifytext) or (None)
) )
return 0 # Everything is OK
else:
return 1 # Everything not OK: rows may already exist for Element with 'elname'
# Functions relating to WebSubmit DOCUMENT TYPES:
def get_docid_docname_alldoctypes():
"""Get and return a tuple of tuples containing the "doctype id" (sdocname) and
"doctype name" (ldocname) for each action in the WebSubmit database.
@return: tuple of tuples: (docid,docname)
"""
q = """SELECT sdocname, ldocname FROM sbmDOCTYPE ORDER BY ldocname ASC"""
return run_sql(q)
def get_docid_docname_and_docid_alldoctypes():
"""Get and return a tuple of tuples containing the "doctype id" (sdocname) and
"doctype name" (ldocname) for each action in the WebSubmit database.
@return: tuple of tuples: (docid,docname)
"""
q = """SELECT sdocname, CONCAT(ldocname, " [", sdocname, "]") FROM sbmDOCTYPE ORDER BY ldocname ASC"""
return run_sql(q)
def get_number_doctypes_docid(docid):
"""Return the number of DOCUMENT TYPES found for a given document type id (sdocname).
@param docid: unique ID of document type whose instances are to be counted.
@return: an integer count of the number of document types in the WebSubmit database for this doctype id.
"""
q = """SELECT COUNT(sdocname) FROM sbmDOCTYPE where sdocname=%s"""
return int(run_sql(q, (docid,))[0][0])
def get_number_functions_doctype(doctype):
"""Return the number of FUNCTIONS found for a given DOCUMENT TYPE.
@param doctype: unique ID of doctype for which the number of functions are to be counted
@return: an integer count of the number of functions in the WebSubmit database for this doctype.
"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS where doctype=%s"""
return int(run_sql(q, (doctype,))[0][0])
def get_number_functions_action_doctype(doctype, action):
"""Return the number of FUNCTIONS found for a given ACTION of a given DOCUMENT TYPE.
@param doctype: unique ID of doctype for which the number of functions are to be counted
@param action: the action (of the document type "doctype") that owns the functions to be counted
@return: an integer count of the number of functions in the WebSubmit database for this doctype/action.
"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS where doctype=%s AND action=%s"""
return int(run_sql(q, (doctype, action))[0][0])
def get_number_of_functions_in_step_of_submission(doctype, action, step):
"""Return the number of FUNCTIONS within a step of a submission.
@param doctype: (string) unique ID of a doctype
@param action: (string) unique ID of an action
@param step: (integer) the number of the step in which the functions to be counted are situated
@return: an integer count of the number of functions found within the step of the submission
"""
q = """SELECT COUNT(doctype) FROM sbmFUNCTIONS where doctype=%s AND action=%s AND step=%s"""
return int(run_sql(q, (doctype, action, step))[0][0])
def get_number_categories_doctype(doctype):
"""Return the number of CATEGORIES (used to distinguish between submissions) found for a given DOCUMENT TYPE.
@param doctype: unique ID of doctype for which submission categories are to be counted
@return: an integer count of the number of categories in the WebSubmit database for this doctype.
"""
q = """SELECT COUNT(doctype) FROM sbmCATEGORIES where doctype=%s"""
return int(run_sql(q, (doctype,))[0][0])
def get_number_categories_doctype_category(doctype, categ):
"""Return the number of CATEGORIES (used to distinguish between submissions) found for a given
DOCUMENT TYPE/CATEGORY NAME. Basically, test to see whether a given category already exists
for a given document type.
@param doctype: unique ID of doctype for which the submission category is to be tested
@param categ: the category ID of the category to be tested for
@return: an integer count of the number of categories in the WebSubmit database for this doctype.
"""
q = """SELECT COUNT(sname) FROM sbmCATEGORIES where doctype=%s and sname=%s"""
return int(run_sql(q, (doctype, categ))[0][0])
def get_number_parameters_doctype(doctype):
"""Return the number of PARAMETERS (used by functions) found for a given DOCUMENT TYPE.
@param doctype: unique ID of doctype whose parameters are to be counted
@return: an integer count of the number of parameters in the WebSubmit database for this doctype.
"""
q = """SELECT COUNT(name) FROM sbmPARAMETERS where doctype=%s"""
return int(run_sql(q, (doctype,))[0][0])
def get_number_submissionfields_submissionnames(submission_names):
"""Return the number of SUBMISSION FIELDS found for a given list of submissions.
A doctype can have several submissions, and each submission can have many fields making up
its interface. Using this function, the fields owned by several submissions can be counted.
If the submissions in the list are all owned by one doctype, then it is possible to count the
submission fields owned by one doctype.
@param submission_names: unique IDs of all submissions whose fields are to be counted. If this
value is a string, it will be classed as a single submission name. Otherwise, a list/tuple of
strings must be passed - where each string is a submission name.
@return: an integer count of the number of fields in the WebSubmit database for these submission(s)
"""
q = """SELECT COUNT(subname) FROM sbmFIELD WHERE subname=%s"""
if type(submission_names) in (str, unicode):
submission_names = (submission_names,)
number_submissionnames = len(submission_names)
if number_submissionnames == 0:
return 0
if number_submissionnames > 1:
for i in range(1,number_submissionnames):
## Extend the query with an OR clause for each additional submission name:
q += """ OR subname=%s"""
return int(run_sql(q, map(lambda x: str(x), submission_names))[0][0])
def get_doctypeid_doctypes_implementing_action(action):
q = """SELECT doc.sdocname, CONCAT("[", doc.sdocname, "] ", doc.ldocname) FROM sbmDOCTYPE AS doc """\
"""LEFT JOIN sbmIMPLEMENT AS subm ON """\
"""subm.docname = doc.sdocname """\
"""WHERE subm.actname=%s """\
"""ORDER BY doc.sdocname ASC"""
return run_sql(q, (action,))
def get_number_submissions_doctype(doctype):
"""Return the number of SUBMISSIONS found for a given document type
@param doctype: the unique ID of the document type for which submissions are to be counted
@return: an integer count of the number of submissions owned by this doctype
"""
q = """SELECT COUNT(subname) FROM sbmIMPLEMENT WHERE docname=%s"""
return int(run_sql(q, (doctype,))[0][0])
def get_number_submissions_doctype_action(doctype, action):
"""Return the number of SUBMISSIONS found for a given document type/action
@param doctype: the unique ID of the document type for which submissions are to be counted
@param action: the unique ID of the action that the submission implements, that is to be counted
@return: an integer count of the number of submissions found for this doctype/action ID
"""
q = """SELECT COUNT(subname) FROM sbmIMPLEMENT WHERE docname=%s and actname=%s"""
return int(run_sql(q, (doctype, action))[0][0])
def get_number_collection_doctype_entries_doctype(doctype):
"""Return the number of collection_doctype entries found for a given doctype
@param doctype: the document type for which the collection-doctypes are to be counted
@return: an integer count of the number of collection-doctype entries found for the
given document type
"""
q = """SELECT COUNT(id_father) FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_son=%s"""
return int(run_sql(q, (doctype,))[0][0])
def get_all_category_details_for_doctype(doctype):
"""Return all details (short-name, long-name, position number) of all CATEGORIES found for a
given document type. If the position number is NULL, it will be assigned a value of zero.
Categories will be ordered primarily by ascending position number and then by ascending
alphabetical order of short-name.
@param doctype: (string) The document type for which categories are to be retrieved.
@return: (tuple) of tuples whereby each tuple is a row containing 3 items:
(short-name, long-name, position)
"""
q = """SELECT sname, lname, score FROM sbmCATEGORIES where doctype=%s ORDER BY score ASC,""" \
""" lname ASC"""
return run_sql(q, (doctype,))
def get_all_categories_sname_lname_for_doctype_categsname(doctype, categsname):
"""Return the short and long names of all CATEGORIES found for a given DOCUMENT TYPE.
@param doctype: unique ID of doctype for which submission categories are to be counted
@return: a tuple of tuples: (sname, lname)
"""
q = """SELECT sname, lname FROM sbmCATEGORIES where doctype=%s AND sname=%s"""
return run_sql(q, (doctype, categsname) )
def get_all_submissionnames_doctype(doctype):
"""Get and return a tuple of tuples containing the "submission name" (subname) of all
submissions for the document type identified by "doctype".
In other words, get a list of the submissions that document type "doctype" has.
@param doctype: unique ID of the document type whose submissions are to be retrieved
@return: tuple of tuples (subname,)
"""
q = """SELECT subname FROM sbmIMPLEMENT WHERE docname=%s ORDER BY subname ASC"""
return run_sql(q, (doctype,))
def get_actname_all_submissions_doctype(doctype):
"""Get and return a tuple of tuples containing the "action name" (actname) of all
submissions for the document type identified by "doctype".
In other words, get a list of the action IDs of the submissions implemented by document type "doctype".
@param doctype: unique ID of the document type whose actions are to be retrieved
@return: tuple of tuples (actname,)
"""
q = """SELECT actname FROM sbmIMPLEMENT WHERE docname=%s ORDER BY actname ASC"""
return run_sql(q, (doctype,))
def get_submissiondetails_doctype_action(doctype, action):
"""Get the details of all submissions for a given document type, ordered by the action name.
@param doctype: details of the document type for which the details of all submissions are to be
retrieved.
       @return: a tuple of tuples, each tuple containing the details of a submission:
        (subname, docname, actname, displayed, nbpg, cd, md, buttonorder, statustext, level, score,
        stpage, endtxt)
"""
q = """SELECT subname, docname, actname, displayed, nbpg, cd, md, buttonorder, statustext, level, """ \
"""score, stpage, endtxt FROM sbmIMPLEMENT WHERE docname=%s AND actname=%s"""
return run_sql(q, (doctype, action))
def get_all_categories_of_doctype_ordered_by_score_lname(doctype):
"""Return a tuple containing all categories of a given document type, ordered by
ascending order of score, and ascending order of category long-name.
@param doctype: (string) the document type ID.
       @return: (tuple) of tuples, whereby each tuple is a row representing a category, with
        the following structure: (sname, lname, score)
"""
qstr = """SELECT sname, lname, score FROM sbmCATEGORIES WHERE doctype=%s ORDER BY score ASC, lname ASC"""
res = run_sql(qstr, (doctype,))
return res
def update_score_of_doctype_category(doctype, categid, newscore):
"""Update the score of a given category of a given document type.
@param doctype: (string) the document type id
@param categid: (string) the category id
@param newscore: (integer) the score that the category is to be given
@return: (integer) - 0 on update of row; 1 on failure to update.
"""
qstr = """UPDATE sbmCATEGORIES SET score=%s WHERE doctype=%s AND sname=%s"""
res = run_sql(qstr, (newscore, doctype, categid))
if int(res) > 0:
## row(s) were updated
return 0
else:
## no rows were updated
return 1
def normalize_doctype_category_scores(doctype):
"""Get details of all categories of a given document type, ordered by score and long name;
Loop through each category and check its score vs a counter in the result-set; if the score
does not match the counter number, update the score of that category to match that of the
counter. In this way, the category scores will be normalized sequentially. E.g.:
categories numbered [1,4,6,8,9] will be allocated normalized scores [1,2,3,4,5]. I.e. the
order won't change, but the scores will be corrected.
@param doctype: (string) the document type id
@return: (None)
"""
all_categs = get_all_categories_of_doctype_ordered_by_score_lname(doctype)
num_categs = len(all_categs)
for row_idx in xrange(0, num_categs):
## Get the details of the current categories:
cur_row_score = row_idx + 1
cur_categ_id = all_categs[row_idx][0]
cur_categ_lname = all_categs[row_idx][1]
cur_categ_score = int(all_categs[row_idx][2])
## Check the score of the categ vs its position in the list:
if cur_categ_score != cur_row_score:
## update this score:
update_score_of_doctype_category(doctype=doctype,
categid=cur_categ_id, newscore=cur_row_score)
def move_category_to_new_score(doctype, sourcecateg, destinationcatg):
"""Move a category of a document type from one score, to another.
       @param doctype: (string) - the ID of the document type whose categories are to be moved.
@param sourcecateg: (string) - the category ID of the category to be moved.
@param destinationcatg: (string) - the category ID of the category to whose position sourcecateg
is to be moved.
@return: (integer) 0 - successfully moved category; 1 - failed to correctly move category.
"""
qstr_increment_scores_from_scorex = """UPDATE sbmCATEGORIES SET score=score+1 WHERE doctype=%s AND score >= %s"""
    move_categ_from_score = move_categ_to_score = -1
## get the (categid, lname, score) of all categories for this document type:
res_all_categs = get_all_categories_of_doctype_ordered_by_score_lname(doctype=doctype)
num_categs = len(res_all_categs)
## if the category scores are not ordered properly (1,2,3,4,...), correct them.
## Also, get the row-count (therefore score-position) of the categ to be moved, and the destination score:
for row_idx in xrange(0, num_categs):
current_row_score = row_idx + 1
current_categid = res_all_categs[row_idx][0]
current_categ_score = int(res_all_categs[row_idx][2])
## Check the score of the categ vs its position in the list:
if current_categ_score != current_row_score:
## bad score - fix it:
update_score_of_doctype_category(doctype=doctype,
categid=current_categid,
newscore=current_row_score)
if current_categid == sourcecateg:
## this is the place from which the category is being jumped-out:
move_categ_from_score = current_row_score
elif current_categid == destinationcatg:
## this is the place into which the categ is being jumped:
move_categ_to_score = current_row_score
## If couldn't find the scores of both 'sourcecateg' and 'destinationcatg', return error:
    if -1 in (move_categ_from_score, move_categ_to_score) or \
       move_categ_from_score == move_categ_to_score:
## either trying to move a categ to the same place or can't find both the source and destination categs:
return 1
## add 1 to score of all categories from the score position into which the sourcecateg is to be moved:
    run_sql(qstr_increment_scores_from_scorex, (doctype, move_categ_to_score))
## update the score of the category to be moved:
update_score_of_doctype_category(doctype=doctype, categid=sourcecateg, newscore=move_categ_to_score)
## now re-order all category scores correctly:
normalize_doctype_category_scores(doctype)
return 0 ## return success
def move_category_by_one_place_in_score(doctype, categsname, direction):
"""Move a category up or down in score by one place.
@param doctype: (string) - the ID of the document type to which the category belongs.
@param categsname: (string) - the ID of the category to be moved.
@param direction: (string) - the direction in which to move the category ('up' or 'down').
@return: (integer) - 0 on successful move of category; 1 on failure to properly move category.
"""
qstr_update_score = """UPDATE sbmCATEGORIES SET score=%s WHERE doctype=%s AND score=%s"""
move_categ_score = -1
## get the (categid, lname, score) of all categories for this document type:
res_all_categs = get_all_categories_of_doctype_ordered_by_score_lname(doctype=doctype)
num_categs = len(res_all_categs)
## if the category scores are not ordered properly (1,2,3,4,...), correct them
## Also, get the row-count (therefore score-position) of the categ to be moved
for row_idx in xrange(0, num_categs):
current_row_score = row_idx + 1
current_categid = res_all_categs[row_idx][0]
current_categ_score = int(res_all_categs[row_idx][2])
## Check the score of the categ vs its position in the list:
if current_categ_score != current_row_score:
## bad score - fix it:
update_score_of_doctype_category(doctype=doctype,
categid=current_categid,
newscore=current_row_score)
if current_categid == categsname:
## this is the category to be moved:
move_categ_score = current_row_score
## move the category:
if direction.lower() == "up":
## Moving the category upwards (reducing its score):
if num_categs > 1 and move_categ_score > 1:
## move the category above down by one place:
run_sql(qstr_update_score, (move_categ_score, doctype, (move_categ_score - 1)))
## move the chosen category up:
update_score_of_doctype_category(doctype=doctype,
categid=categsname, newscore=(move_categ_score - 1))
## return success
return 0
else:
## return error - not enough categs, or categ already in first posn
return 1
elif direction.lower() == "down":
## move the category downwards (increasing its score):
if num_categs > 1 and move_categ_score < num_categs:
## move category below, up by one place:
run_sql(qstr_update_score, (move_categ_score, doctype, (move_categ_score + 1)))
## move the chosen category down:
update_score_of_doctype_category(doctype=doctype,
categid=categsname, newscore=(move_categ_score + 1))
## return success
return 0
else:
## return error - not enough categs, or categ already in last posn
return 1
else:
## invalid move direction - no action
return 1
def update_submissiondetails_doctype_action(doctype, action, displayed, buttonorder,
statustext, level, score, stpage, endtxt):
"""Update the details of a submission.
@param doctype: the document type for which the submission details are to be updated
@param action: the action ID of the submission to be modified
@param displayed: displayed on main submission page? (Y/N)
@param buttonorder: button order
@param statustext: statustext
@param level: level
@param score: score
@param stpage: stpage
@param endtxt: endtxt
@return: an integer error code: 0 for successful update; 1 for update failure.
"""
numrows_submission = get_number_submissions_doctype_action(doctype, action)
if numrows_submission == 1:
## there is only one row for this submission - can update
q = """UPDATE sbmIMPLEMENT SET md=CURDATE(), displayed=%s, buttonorder=%s, statustext=%s, level=%s, """\
"""score=%s, stpage=%s, endtxt=%s WHERE docname=%s AND actname=%s"""
run_sql(q, (displayed,
((str(buttonorder).isdigit() and int(buttonorder) >= 0) and buttonorder) or (None),
statustext,
level,
((str(score).isdigit() and int(score) >= 0) and score) or (""),
((str(stpage).isdigit() and int(stpage) >= 0) and stpage) or (""),
endtxt,
doctype,
action
) )
return 0 ## Everything OK
else:
## Everything NOT OK - either multiple rows exist for submission, or submission doesn't exist
return 1
def update_doctype_details(doctype, doctypename, doctypedescr):
"""Update a document type's details. In effect the document type name (ldocname) and the description
are updated, as is the last modification date (md).
@param doctype: the ID of the document type to be updated
@param doctypename: the new/updated name of the document type
@param doctypedescr: the new/updated description of the document type
@return: Integer error code: 0 = update successful; 1 = update failed
"""
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype == 1:
## doctype exists - perform update
q = """UPDATE sbmDOCTYPE SET ldocname=%s, description=%s, md=CURDATE() WHERE sdocname=%s"""
run_sql(q, (doctypename, doctypedescr, doctype))
return 0 ## Everything OK
else:
        ## Everything NOT OK - either the doctype does not exist, or multiple rows exist for it
return 1
def get_submissiondetails_all_submissions_doctype(doctype):
"""Get the details of all submissions for a given document type, ordered by the action name.
@param doctype: details of the document type for which the details of all submissions are to be
retrieved.
@return: a tuple of tuples, each tuple containing the details of a submission:
        (subname, docname, actname, displayed, nbpg, cd, md, buttonorder, statustext, level, score,
        stpage, endtxt)
"""
q = """SELECT subname, docname, actname, displayed, nbpg, cd, md, buttonorder, statustext, level, """ \
"""score, stpage, endtxt FROM sbmIMPLEMENT WHERE docname=%s ORDER BY actname ASC"""
return run_sql(q, (doctype,))
def delete_doctype(doctype):
"""Delete a document type's details from the document types table (sbmDOCTYPE).
Effectively, this means that the document type has been deleted, but this function
should be called after other functions that delete all of the other components of a
       document type (such as "delete_all_submissions_doctype" to delete the doctype's submissions,
       "delete_all_functions_doctype" to delete its functions, etc.).
@param doctype: the unique ID of the document type to be deleted.
@return: 0 (ZERO) if doctype was deleted successfully; 1 (ONE) if doctype remains after the
deletion attempt.
"""
q = """DELETE FROM sbmDOCTYPE WHERE sdocname=%s"""
run_sql(q, (doctype,))
numrows_doctype = get_number_doctypes_docid(doctype)
if numrows_doctype == 0:
## everything OK - deleted this doctype
return 0
else:
## everything NOT OK - could not delete all entries for this doctype
## make a last attempt:
run_sql(q, (doctype,))
if get_number_doctypes_docid(doctype) == 0:
## everything OK this time - could delete doctype
return 0
else:
## everything still NOT OK - could not delete the doctype
return 1
def delete_collection_doctype_entry_doctype(doctype):
"""Delete a document type's entry from the collection-doctype list
@param doctype: the unique ID of the document type to be deleted from the
collection-doctypes list
@return: 0 (ZERO) if doctype was deleted successfully from collection-doctypes list;
1 (ONE) if doctype remains in the collection-doctypes list after the deletion attempt
"""
q = """DELETE FROM sbmCOLLECTION_sbmDOCTYPE WHERE id_son=%s"""
run_sql(q, (doctype,))
numrows_coll_doctype_doctype = get_number_collection_doctype_entries_doctype(doctype)
if numrows_coll_doctype_doctype == 0:
## everything OK - deleted the document type from the collection-doctype list
return 0
else:
## everything NOT OK - could not delete the doctype from the collection-doctype list
## try once more
run_sql(q, (doctype,))
if get_number_collection_doctype_entries_doctype(doctype) == 0:
## everything now OK - could delete this time
return 0
else:
## everything still NOT OK - could not delete
return 1
def delete_all_submissions_doctype(doctype):
"""Delete all SUBMISSIONS (actions) for a given document type
       @param doctype: the document type from which the submissions are to be deleted
@return: 0 (ZERO) if all submissions are deleted successfully; 1 (ONE) if submissions remain after the
delete has been performed (i.e. all submissions could not be deleted for some reason)
"""
q = """DELETE FROM sbmIMPLEMENT WHERE docname=%s"""
run_sql(q, (doctype,))
numrows_submissionsdoctype = get_number_submissions_doctype(doctype)
if numrows_submissionsdoctype == 0:
## everything OK - no submissions remain for this doctype
return 0
else:
## everything NOT OK - still submissions remaining for this doctype
## make a last attempt to delete them:
run_sql(q, (doctype,))
## last check to see whether submissions remain:
if get_number_submissions_doctype(doctype) == 0:
## Everything OK - all submissions deleted this time
return 0
else:
## Everything NOT OK - still could not delete the submissions
return 1
def delete_all_parameters_doctype(doctype):
"""Delete all PARAMETERS (as used by functions) for a given document type
@param doctype: the doctype for which all function-parameters are to be deleted
@return: 0 (ZERO) if all parameters are deleted successfully; 1 (ONE) if parameters remain after the
delete has been performed (i.e. all parameters could not be deleted for some reason)
"""
q = """DELETE FROM sbmPARAMETERS WHERE doctype=%s"""
run_sql(q, (doctype,))
numrows_paramsdoctype = get_number_parameters_doctype(doctype)
if numrows_paramsdoctype == 0:
## Everything OK - no parameters remain for this doctype
return 0
else:
## Everything NOT OK - still some parameters remaining for doctype
## make a last attempt to delete them:
run_sql(q, (doctype,))
## check once more to see if parameters remain:
if get_number_parameters_doctype(doctype) == 0:
## Everything OK - all parameters were deleted successfully this time
return 0
else:
## still unable to recover - could not delete all parameters
return 1
def get_functionname_step_score_allfunctions_afterreference_doctypesubmission(doctype, action, step, score):
    """Return the details (function name, step, score) of all functions of a submission that
       come after the reference position given by (step, score), ordered by step then score.
    """
    q = """SELECT function, step, score FROM sbmFUNCTIONS WHERE (doctype=%s AND action=%s) AND ((step=%s AND score > %s)""" \
        """ OR (step > %s)) ORDER BY step ASC, score ASC"""
return run_sql(q, (doctype, action, step, score, step))
def get_functionname_step_score_allfunctions_beforereference_doctypesubmission(doctype, action, step, score):
    """Return the details (function name, step, score) of all functions of a submission that
       come before the reference position given by (step, score), ordered by step then score.
    """
    q = """SELECT function, step, score FROM sbmFUNCTIONS WHERE (doctype=%s AND action=%s) AND ((step=%s AND score < %s)"""
if step > 1:
q += """ OR (step < %s)"""
q += """) ORDER BY step ASC, score ASC"""
if step > 1:
return run_sql(q, (doctype, action, step, score, step))
else:
return run_sql(q, (doctype, action, step, score))
def get_functionname_step_score_allfunctions_doctypesubmission(doctype, action):
    """Return the details (function name, step, score) of all functions belonging to the submission (action) of
       doctype.
       @param doctype: unique ID of doctype for which the details of the functions of the given submission
        are to be retrieved
       @param action: the action ID of the submission whose function details are to be retrieved
@return: a tuple of tuples: ((function, step, score),(function, step, score),[...])
"""
q = """SELECT function, step, score FROM sbmFUNCTIONS where doctype=%s AND action=%s ORDER BY step ASC, score ASC"""
return run_sql(q, (doctype, action))
def get_name_step_score_of_all_functions_in_step_of_submission(doctype, action, step):
"""Return a list of the details of all functions within a given step of a submission.
The functions will be ordered in ascending order of score.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the step in which the functions are located
@return: a tuple of tuples (function-name, step, score)
"""
q = """SELECT function, step, score FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND step=%s ORDER BY score ASC"""
res = run_sql(q, (doctype, action, step))
return res
def delete_function_doctypesubmission_step_score(doctype, action, function, step, score):
"""Delete a given function at a particular step/score for a given doctype submission"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND function=%s AND step=%s AND score=%s"""
run_sql(q, (doctype, action, function, step, score))
numrows_function_doctypesubmission_step_score = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=function,
step=step,
score=score)
if numrows_function_doctypesubmission_step_score == 0:
## Everything OK - function deleted
return 0
else:
## Everything NOT OK - still some functions remaining for doctype/action
## make a last attempt to delete them:
run_sql(q, (doctype, action, function, step, score))
## check once more to see if functions remain:
        if get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype, action=action,
                                                                                     function=function, step=step,
                                                                                     score=score) == 0:
            ## Everything OK - the function was deleted successfully this time
            return 0
else:
## still unable to recover - could not delete all functions for this doctype/action
return 1
def delete_the_function_at_step_and_score_from_a_submission(doctype, action, function, step, score):
## THIS SHOULD REPLACE "delete_function_doctypesubmission_step_score(doctype, action, function, step, score)"
"""Delete a given function at a particular step/score for a given submission"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND function=%s AND step=%s AND score=%s"""
run_sql(q, (doctype, action, function, step, score))
numrows_deletedfunc = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=function,
step=step,
score=score)
if numrows_deletedfunc == 0:
## Everything OK - function deleted
return
else:
## Everything NOT OK - still some functions remaining for doctype/action
## make a last attempt to delete them:
run_sql(q, (doctype, action, function, step, score))
## check once more to see if functions remain:
numrows_deletedfunc = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=function,
step=step,
score=score)
if numrows_deletedfunc == 0:
## Everything OK - all functions for this doctype/action were deleted successfully this time
return
else:
## still unable to recover - could not delete all functions for this doctype/action
msg = """Failed to delete the function [%s] at score [%s] of step [%s], from submission [%s]"""\
% (function, score, step, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningDeleteFailed(msg)
def delete_function_at_step_and_score_from_submission(doctype, action, function, step, score):
"""Delete the function at a particular step/score from a submission.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param function: (string) the name of the function to be deleted
@param step: (integer) the step in which the function to be deleted is found
@param score: (integer) the score at which the function to be deleted is found
@return: None
@Exceptions raised:
InvenioWebSubmitAdminWarningDeleteFailed - when unable to delete the function
"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND function=%s AND step=%s AND score=%s"""
run_sql(q, (doctype, action, function, step, score))
numrows_function_at_stepscore = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=function,
step=step,
score=score)
if numrows_function_at_stepscore == 0:
## Everything OK - function deleted
return
else:
## Everything NOT OK - still some functions remaining for doctype/action
## make a last attempt to delete them:
run_sql(q, (doctype, action, function, step, score))
## check once more to see if functions remain:
numrows_function_at_stepscore = \
get_number_of_functions_with_functionname_in_submission_at_step_and_score(doctype=doctype,
action=action,
function=function,
step=step,
score=score)
if numrows_function_at_stepscore == 0:
## Everything OK - all functions for this doctype/action were deleted successfully this time
return
else:
## still unable to recover - could not delete all functions for this doctype/action
msg = """Failed to delete function [%s] from step [%s] and score [%s] from submission [%s]""" \
% (function, step, score, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningDeleteFailed(msg)
def delete_all_functions_in_step_of_submission(doctype, action, step):
"""Delete all functions from a given step of a submission.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param step: (integer) the number of the step in which the functions are to be deleted
@return: None
@Exceptions raised:
InvenioWebSubmitAdminWarningDeleteFailed - when unable to delete some or all of the functions
"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s AND step=%s"""
run_sql(q, (doctype, action, step))
numrows_functions_in_step = get_number_of_functions_in_step_of_submission(doctype=doctype,
action=action,
step=step)
if numrows_functions_in_step == 0:
## all functions in step of submission deleted
return
else:
## couldn't delete all of the functions - try again
run_sql(q, (doctype, action, step))
numrows_functions_in_step = get_number_of_functions_in_step_of_submission(doctype=doctype,
action=action,
step=step)
if numrows_functions_in_step == 0:
## success this time
return
else:
msg = """Failed to delete all functions in step [%s] of submission [%s]""" % (step,
"%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningDeleteFailed(msg)
def delete_all_functions_foraction_doctype(doctype, action):
"""Delete all FUNCTIONS for a given action, belonging to a given doctype.
@param doctype: the document type for which the functions are to be deleted
@param action: the action that owns the functions to be deleted
@return: 0 (ZERO) if all functions for the doctype/action are deleted successfully;
1 (ONE) if functions for the doctype/action remain after the delete has been performed (i.e.
the functions could not be deleted for some reason)
"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s"""
run_sql(q, (doctype, action))
numrows_functions_actiondoctype = get_number_functions_action_doctype(doctype=doctype, action=action)
if numrows_functions_actiondoctype == 0:
## Everything OK - no functions remain for this doctype/action
return 0
else:
## Everything NOT OK - still some functions remaining for doctype/action
## make a last attempt to delete them:
run_sql(q, (doctype, action))
## check once more to see if functions remain:
if get_number_functions_action_doctype(doctype=doctype, action=action) == 0:
## Everything OK - all functions for this doctype/action were deleted successfully this time
return 0
else:
## still unable to recover - could not delete all functions for this doctype/action
return 1
def delete_all_functions_doctype(doctype):
"""Delete all FUNCTIONS for a given document type.
@param doctype: the document type for which all functions are to be deleted
@return: 0 (ZERO) if all functions are deleted successfully; 1 (ONE) if functions remain after the
delete has been performed (i.e. all functions could not be deleted for some reason)
"""
q = """DELETE FROM sbmFUNCTIONS WHERE doctype=%s"""
run_sql(q, (doctype,))
numrows_functionsdoctype = get_number_functions_doctype(doctype)
if numrows_functionsdoctype == 0:
## Everything OK - no functions remain for this doctype
return 0
else:
## Everything NOT OK - still some functions remaining for doctype
## make a last attempt to delete them:
run_sql(q, (doctype,))
## check once more to see if functions remain:
if get_number_functions_doctype(doctype) == 0:
## Everything OK - all functions were deleted successfully this time
return 0
else:
## still unable to recover - could not delete all functions
return 1
def clone_submissionfields_from_doctypesubmission_to_doctypesubmission(fromsub, tosub):
    """Clone all submission-fields (rows of sbmFIELD) from the submission "fromsub" into
       the submission "tosub", after first deleting any existing fields of "tosub".
       @param fromsub: the name of the submission whose fields are to be cloned
       @param tosub: the name of the submission into which the fields are to be cloned
       @return: integer error code: 0 = successful clone; 1 = could not delete "tosub"'s
        existing fields; 2 = could not clone all fields
    """
error_code = delete_all_submissionfields_submission(tosub)
if error_code == 0:
## there are no fields for the submission "tosubm" - clone from "fromsub"
q = """INSERT INTO sbmFIELD (subname, pagenb, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md, """ \
"""fiefi1, fiefi2) """\
"""(SELECT %s, pagenb, fieldnb, fidesc, fitext, level, sdesc, checkn, CURDATE(), CURDATE(), NULL, NULL """ \
"""FROM sbmFIELD WHERE subname=%s)"""
## get number of submission fields for submission fromsub:
numfields_fromsub = get_number_submissionfields_submissionnames(submission_names=fromsub)
run_sql(q, (tosub, fromsub))
## get number of submission fields for submission tosub (after cloning):
numfields_tosub = get_number_submissionfields_submissionnames(submission_names=tosub)
if numfields_fromsub == numfields_tosub:
## successful clone
return 0
else:
## didn't manage to clone all fields - return 2
return 2
else:
## cannot delete "tosub"s fields - cannot clone - return 1 to signal this
return 1
def clone_categories_fromdoctype_todoctype(fromdoctype, todoctype):
    """Clone all categories of the document type "fromdoctype" into the document type
       "todoctype", after first deleting any existing categories of "todoctype".
       @param fromdoctype: the ID of the document type whose categories are to be cloned
       @param todoctype: the ID of the document type into which the categories are to be cloned
       @return: integer error code: 0 = successful clone; 1 = could not delete "todoctype"'s
        existing categories; 2 = could not clone all categories
    """
## first, if categories exist for "todoctype", delete them
error_code = delete_all_categories_doctype(todoctype)
if error_code == 0:
## all categories were deleted - now clone those of "fromdoctype"
## first, count "fromdoctype"s categories:
numcategs_fromdoctype = get_number_categories_doctype(fromdoctype)
## now perform the cloning:
q = """INSERT INTO sbmCATEGORIES (doctype, sname, lname, score) (SELECT %s, sname, lname, score """\
"""FROM sbmCATEGORIES WHERE doctype=%s)"""
run_sql(q, (todoctype, fromdoctype))
## get number categories for "todoctype" (should be the same as "fromdoctype" if the cloning was successful):
numcategs_todoctype = get_number_categories_doctype(todoctype)
if numcategs_fromdoctype == numcategs_todoctype:
## successful clone
return 0
else:
## did not manage to clone all categories - return 2 to indicate this
return 2
else:
## cannot delete "todoctype"s categories - return error code of 1 to signal this
return 1
def insert_function_into_submission_at_step_and_score_then_regulate_scores_of_functions_in_step(doctype, action,
function, step, score):
"""Insert a function into a submission at a particular score within a particular step, then regulate the scores
of all functions within that step to spaces of 10.
@param doctype: (string)
@param action: (string)
@param function: (string)
@param step: (integer)
@param score: (integer)
@return: None
"""
## check whether function exists in WebSubmit DB:
numrows_function = get_number_of_functions_with_funcname(funcname=function)
if numrows_function < 1:
msg = """Failed to insert the function [%s] into submission [%s] at step [%s] and score [%s] - """\
"""Could not find function [%s] in WebSubmit DB""" % (function, "%s%s" % (action, doctype),
step, score, function)
raise InvenioWebSubmitAdminWarningReferentialIntegrityViolation(msg)
## add 10 to the score of all functions at or below the position of this new function and within the same step
## (this ensures there is a vacant slot where the function is to be added)
add_10_to_score_of_all_functions_in_step_of_submission_and_with_score_equalto_or_above_val(doctype=doctype,
action=action,
step=step,
fromscore=score)
## now insert the new function into its position:
try:
insert_function_into_submission_at_step_and_score(doctype=doctype, action=action,
function=function, step=step, score=score)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
## The function doesn't exist in WebSubmit and therefore cannot be used in the submission
## regulate the scores of all functions within the step, to correct the "hole" that was made
try:
regulate_score_of_all_functions_in_step_to_ascending_multiples_of_10_for_submission(doctype=doctype,
action=action,
step=step)
except InvenioWebSubmitAdminWarningDeleteFailed, f:
## can't regulate the functions' scores - couldn't delete some or all of them before re-inserting
## them in the correct position. Cannot fix this - report that some functions may have been lost.
msg = """It wasn't possible to add the function [%s] to submission [%s] at step [%s], score [%s]."""\
""" Firstly, the function doesn't exist in WebSubmit. Secondly, when trying to correct the """\
"""score of the functions within step [%s], it was not possible to delete some or all of them."""\
""" Some functions may have been lost - please check."""\
% (function, "%s%s" % (action, doctype), step, score, step)
raise InvenioWebSubmitAdminWarningInsertFailed(msg)
raise
## try to regulate the scores of the functions in the step that the new function was just inserted into:
try:
regulate_score_of_all_functions_in_step_to_ascending_multiples_of_10_for_submission(doctype=doctype,
action=action,
step=step)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## could not correctly regulate the functions - could not delete all functions in the step
        msg = """Could not regulate the scores of all functions within step [%s] of submission [%s]."""\
              """ It was not possible to delete some or all of them. Some functions may have been lost -"""\
              """ please check.""" % (step, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningDeleteFailed(msg)
## success
return
def insert_function_into_submission_at_step_and_score(doctype, action, function, step, score):
"""Insert a function into a submission, at the position dictated by step/score.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param function: (string) the unique name of a function
@param step: (integer) the step into which the function should be inserted
@param score: (integer) the score at which the function should be inserted
       @return: None
"""
## check that the function exists in WebSubmit:
numrows_function = get_number_of_functions_with_funcname(function)
if numrows_function > 0:
## perform the insert
q = """INSERT INTO sbmFUNCTIONS (doctype, action, function, step, score) VALUES(%s, %s, %s, %s, %s)"""
run_sql(q, (doctype, action, function, step, score))
return
else:
## function doesnt exist - cannot insert a row for it in a submission!
msg = """Failed to insert the function [%s] into submission [%s] at step [%s] and score [%s] - """\
"""Could not find function [%s] in WebSubmit DB""" % (function, "%s%s" % (action, doctype),
step, score, function)
raise InvenioWebSubmitAdminWarningReferentialIntegrityViolation(msg)
def clone_functions_foraction_fromdoctype_todoctype(fromdoctype, todoctype, action):
## delete all functions that "todoctype" has for the given action:
error_code = delete_all_functions_foraction_doctype(doctype=todoctype, action=action)
if error_code == 0:
## all functions for todoctype/action deleted - now clone those of "fromdoctype"
## count fromdoctype's functions for the given action
numrows_functions_action_fromdoctype = get_number_functions_action_doctype(doctype=fromdoctype, action=action)
## perform the cloning:
q = """INSERT INTO sbmFUNCTIONS (doctype, action, function, score, step) (SELECT %s, action, function, """ \
"""score, step FROM sbmFUNCTIONS WHERE doctype=%s AND action=%s)"""
run_sql(q, (todoctype, fromdoctype, action))
## get number of functions for todoctype/action (these have just been cloned from fromdoctype/action, so
## the counts should be the same)
numrows_functions_action_todoctype = get_number_functions_action_doctype(doctype=todoctype, action=action)
if numrows_functions_action_fromdoctype == numrows_functions_action_todoctype:
## successful clone:
return 0
else:
## could not clone all functions from fromdoctype/action for todoctype/action
return 2
else:
## unable to delete "todoctype"'s functions for action
return 1
def get_number_functionparameters_for_action_doctype(action, doctype):
"""Get the number of parameters associated with a given action of a given document type.
@param action: the action of the doctype, with which the parameters are associated
@param doctype: the doctype with which the parameters are associated.
@return: an integer count of the number of parameters associated with the given action
of the given document type
"""
q = """SELECT COUNT(DISTINCT(par.name)) FROM sbmFUNDESC AS fundesc """ \
"""LEFT JOIN sbmPARAMETERS AS par ON fundesc.param = par.name """ \
"""LEFT JOIN sbmFUNCTIONS AS func ON par.doctype = func.doctype AND fundesc.function = func.function """ \
"""WHERE par.doctype=%s AND func.action=%s"""
return int(run_sql(q, (doctype, action))[0][0])
def delete_functionparameters_doctype_submission(doctype, action):
def _get_list_params_to_delete(potential_delete_params, keep_params):
del_params = []
for param in potential_delete_params:
if param[0] not in keep_params and param[0] != "":
## this parameter is not used by the other actions - it can be deleted
del_params.append(param[0])
return del_params
## get the parameters belonging to the given submission of the doctype:
params_doctype_action = get_functionparameternames_doctype_action(doctype=doctype, action=action)
## get all parameters for the given doctype that belong to submissions OTHER than the submission for which we must
## delete parameters:
params_doctype_other_actions = get_functionparameternames_doctype_not_action(doctype=doctype, action=action)
## "params_doctype_not_action" is a tuple of tuples, where each tuple contains only the parameter name: ((param,),(param,))
## make a tuple of strings, instead of this tuple of tuples:
params_to_keep = map(lambda x: (type(x[0]) in (str, unicode) and x[0]) or (""), params_doctype_other_actions)
delete_params = _get_list_params_to_delete(potential_delete_params=params_doctype_action, keep_params=params_to_keep)
## now, if there are parameters to delete, do it:
if len(delete_params) > 0:
q = """DELETE FROM sbmPARAMETERS WHERE doctype=%s AND (name=%s"""
if len(delete_params) > 1:
for i in range(1, len(delete_params)):
q += """ OR name=%s"""
q += """)"""
run_sql(q, [doctype,] + delete_params)
params_remaining_doctype_action = get_functionparameternames_doctype_action(doctype=doctype, action=action)
if len(_get_list_params_to_delete(potential_delete_params=params_remaining_doctype_action, keep_params=params_to_keep)) == 0:
## Everything OK - all parameters deleted
return 0
else:
## Everything NOT OK - some parameters remain: try one final time to delete them
run_sql(q, [doctype,] + delete_params)
params_remaining_doctype_action = get_functionparameternames_doctype_action(doctype=doctype, action=action)
if len(_get_list_params_to_delete(potential_delete_params=params_remaining_doctype_action, keep_params=params_to_keep)) == 0:
## Everything OK - deleted successfully this time
return 0
else:
## Still unable to delete - give up
return 1
## no parameters to delete
return 0
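The parameter-deletion logic above grows a parameterized ``OR`` list at run time, one ``name=%s`` placeholder per parameter to delete. A minimal standalone sketch of that query-building pattern (the helper name is hypothetical and not part of this module):

```python
def build_delete_query(table, doctype, names):
    ## Build a DELETE statement with one %s placeholder per parameter name,
    ## mirroring how delete_functionparameters_doctype_submission appends
    ## " OR name=%s" for each extra parameter. Returns (sql, args) for a
    ## DB-API driver; placeholders keep the values safely parameterized.
    if not names:
        return None, []
    q = "DELETE FROM " + table + " WHERE doctype=%s AND (name=%s"
    q += " OR name=%s" * (len(names) - 1)
    q += ")"
    return q, [doctype] + list(names)
```

The argument list passed alongside the SQL string matches the placeholders positionally, exactly as ``run_sql(q, [doctype,] + delete_params)`` does above.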
def update_value_of_function_parameter_for_doctype(doctype, paramname, paramval):
"""Update the value of a parameter as used by a document type.
@param doctype: (string) the unique ID of a document type
@param paramname: (string) the name of the parameter whose value is to be updated
@param paramval: (string) the new value for the parameter
@Exceptions raised:
InvenioWebSubmitAdminWarningTooManyRows - when multiple rows found for parameter
InvenioWebSubmitAdminWarningNoRowsFound - when no rows found for parameter
"""
q = """UPDATE sbmPARAMETERS SET value=%s WHERE doctype=%s AND name=%s"""
## get number of rows found for the parameter:
numrows_param = get_numberparams_doctype_paramname(doctype=doctype, paramname=paramname)
if numrows_param == 1:
run_sql(q, (paramval, doctype, paramname))
return
elif numrows_param > 1:
## multiple rows found for the parameter - not safe to edit
msg = """When trying to update the [%s] parameter for the [%s] document type, [%s] rows were found for the parameter """\
"""- not safe to update""" % (paramname, doctype, numrows_param)
raise InvenioWebSubmitAdminWarningTooManyRows(msg)
else:
## no row for parameter found
insert_parameter_doctype(doctype=doctype, paramname=paramname, paramval=paramval)
numrows_param = get_numberparams_doctype_paramname(doctype=doctype, paramname=paramname)
if numrows_param != 1:
msg = """When trying to update the [%s] parameter for the [%s] document type, could not insert a new value"""\
% (paramname, doctype)
raise InvenioWebSubmitAdminWarningNoRowsFound(msg)
return
def get_parameters_name_and_value_for_function_of_doctype(doctype, function):
"""Get the names and values of all parameters of a given function, as they have been set for a particular document
type.
@param doctype: (string) the unique ID of a document type
@param function: the name of the function from which the parameters names/values are to be retrieved
@return: a tuple of 2-celled tuples, each tuple containing 2 strings: (parameter-name, parameter-value)
"""
q = """SELECT param.name, param.value FROM sbmPARAMETERS AS param """\
"""LEFT JOIN sbmFUNDESC AS func ON func.param=param.name """\
"""WHERE func.function=%s AND param.doctype=%s """\
"""ORDER BY param.name ASC"""
return run_sql(q, (function, doctype))
def get_value_of_parameter_for_doctype(doctype, parameter):
q = """SELECT value FROM sbmPARAMETERS WHERE doctype=%s AND name=%s"""
res = run_sql(q, (doctype, parameter))
if len(res) > 0:
return res[0][0]
else:
return None
def get_functionparameternames_doctype_action(doctype, action):
"""Get the unique NAMES function parameters for a given action of a given doctype.
@param doctype: the document type with which the parameters are associated
@param action: the action (of "doctype") with which the parameters are associated
@return: a tuple of tuples, where each tuple represents a parameter name:
(parameter name, parameter value, doctype)
"""
q = """SELECT DISTINCT(par.name) FROM sbmFUNDESC AS fundesc """ \
"""LEFT JOIN sbmPARAMETERS AS par ON fundesc.param = par.name """ \
"""LEFT JOIN sbmFUNCTIONS AS func ON par.doctype = func.doctype AND fundesc.function = func.function """ \
"""WHERE par.doctype=%s AND func.action=%s """\
"""GROUP BY par.name """ \
"""ORDER BY fundesc.function ASC, par.name ASC"""
return run_sql(q, (doctype, action))
def get_functionparameternames_doctype_not_action(doctype, action):
"""Get the unique NAMES function parameters for a given action of a given doctype.
@param doctype: the document type with which the parameters are associated
@param action: the action (of "doctype") with which the parameters are associated
@return: a tuple of tuples, where each tuple represents a parameter name:
(parameter name, parameter value, doctype)
"""
q = """SELECT DISTINCT(par.name) FROM sbmFUNDESC AS fundesc """ \
"""LEFT JOIN sbmPARAMETERS AS par ON fundesc.param = par.name """ \
"""LEFT JOIN sbmFUNCTIONS AS func ON par.doctype = func.doctype AND fundesc.function = func.function """ \
"""WHERE par.doctype=%s AND func.action <> %s """\
"""GROUP BY par.name """ \
"""ORDER BY fundesc.function ASC, par.name ASC"""
return run_sql(q, (doctype, action))
def get_functionparameters_for_action_doctype(action, doctype):
"""Get the details of all function parameter values for a given action of a given doctype.
@param doctype: the document type with which the parameter values are associated
@param action: the action (of "doctype") with which the parameter values are associated
@return: a tuple of tuples, where each tuple represents a parameter/value:
(parameter name, parameter value, doctype)
"""
q = """SELECT DISTINCT(par.name), par.value, par.doctype FROM sbmFUNDESC AS fundesc """ \
"""LEFT JOIN sbmPARAMETERS AS par ON fundesc.param = par.name """ \
"""LEFT JOIN sbmFUNCTIONS AS func ON par.doctype = func.doctype AND fundesc.function = func.function """ \
"""WHERE par.doctype=%s AND func.action=%s """\
"""GROUP BY par.name """ \
"""ORDER BY fundesc.function ASC, par.name ASC"""
return run_sql(q, (doctype, action))
def get_numberparams_doctype_paramname(doctype, paramname):
"""Return a count of the number of rows found for a given parameter of a given doctype.
@param doctype: the doctype with which the parameter is associated
@param paramname: the parameter to be counted
@return: an integer count of the number of times this parameter is found for the document type
"doctype"
"""
q = """SELECT COUNT(name) FROM sbmPARAMETERS WHERE doctype=%s AND name=%s"""
return int(run_sql(q, (doctype, paramname))[0][0])
def get_doctype_docname_descr_cd_md_fordoctype(doctype):
q = """SELECT sdocname, ldocname, description, cd, md FROM sbmDOCTYPE WHERE sdocname=%s"""
return run_sql(q, (doctype,))
def get_actions_sname_lname_not_linked_to_doctype(doctype):
q = """SELECT actn.sactname, CONCAT("[", actn.sactname, "] ", actn.lactname) FROM sbmACTION AS actn """ \
"""LEFT JOIN sbmIMPLEMENT AS subm ON subm.docname=%s AND actn.sactname=subm.actname """ \
"""WHERE subm.actname IS NULL"""
return run_sql(q, (doctype,))
def insert_parameter_doctype(doctype, paramname, paramval):
"""Insert a new parameter and its value into the parameters table (sbmPARAMETERS) for a given
document type.
@param doctype: the document type for which the parameter is to be inserted
@param paramname: the name of the parameter to be inserted
@param paramval: the value of the parameter to be inserted
@return: 0 (ZERO) if the parameter was inserted; 1 (ONE) if it already existed for the doctype (no insert performed)
"""
q = """INSERT INTO sbmPARAMETERS (doctype, name, value) VALUES (%s, %s, %s)"""
numrows_paramdoctype = get_numberparams_doctype_paramname(doctype=doctype, paramname=paramname)
if numrows_paramdoctype == 0:
## go ahead and insert
run_sql(q, (doctype, paramname, paramval))
return 0 ## Everything is OK
else:
return 1 ## Everything NOT OK - this param already exists, so not inserted
def clone_functionparameters_foraction_fromdoctype_todoctype(fromdoctype, todoctype, action):
## get a list of all function-parameters/values for fromdoctype/action
functionparams_action_fromdoctype = get_functionparameters_for_action_doctype(action=action, doctype=fromdoctype)
numrows_functionparams_action_fromdoctype = len(functionparams_action_fromdoctype)
## for each param, test whether "todoctype" already has a value for it, and if not, clone it:
for docparam in functionparams_action_fromdoctype:
docparam_name = docparam[0]
docparam_val = docparam[1]
insert_parameter_doctype(doctype=todoctype, paramname=docparam_name, paramval=docparam_val)
numrows_functionparams_action_todoctype = get_number_functionparameters_for_action_doctype(action=action, doctype=todoctype)
if numrows_functionparams_action_fromdoctype == numrows_functionparams_action_todoctype:
## All is OK - the action on both document types has the same number of parameters
return 0
else:
## everything NOT OK - the action on both document types has a different number of parameters
## probably some could not be cloned. return 2 to signal that cloning not 100% successful
return 2
def update_category_description_doctype_categ(doctype, categ, categdescr):
"""Update the description of the category "categ", belonging to the document type "doctype".
Set the description of this category equal to "categdescr".
@param doctype: the document type for which the given category description is to be updated
@param categ: the name/ID of the category whose description is to be updated
@param categdescr: the new description for the category
@return: integer error code (0 is OK, 1 is BAD update)
"""
numrows_category_doctype = get_number_categories_doctype_category(doctype=doctype, categ=categ)
if numrows_category_doctype == 1:
## perform update of description
q = """UPDATE sbmCATEGORIES SET lname=%s WHERE doctype=%s AND sname=%s"""
run_sql(q, (categdescr, doctype, categ))
return 0 ## Everything OK
else:
return 1 ## Everything not OK: either no rows, or more than 1 row for category
def insert_category_into_doctype(doctype, categ, categdescr):
"""Insert a category for a document type. It will be inserted into the last position.
If the category already exists for that document type, the insert will fail.
@param doctype: (string) - the document type ID.
@param categ: (string) - the ID of the new category.
@param categdescr: (string) - the new category's description.
@return: (integer) An error code: 0 on successful insert; 1 on failure to insert.
"""
qstr = """INSERT INTO sbmCATEGORIES (doctype, sname, lname, score) """\
"""(SELECT %s, %s, %s, COUNT(sname)+1 FROM sbmCATEGORIES WHERE doctype=%s)"""
## does this category already exist for this document type?
numrows_categ = get_number_categories_doctype_category(doctype=doctype, categ=categ)
if numrows_categ == 0:
## it doesn't exist for this doctype - go ahead and insert it:
run_sql(qstr, (doctype, categ, categdescr, doctype))
return 0
else:
## the category already existed for this doctype - cannot insert
return 1
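``insert_category_into_doctype`` places the new category in the last position by computing ``score`` as ``COUNT(sname)+1`` inside an ``INSERT ... SELECT``, so no second round trip is needed to find the next score. A pure-Python sketch of the same append-with-next-score idea (illustrative only; the in-memory list of tuples stands in for the ``sbmCATEGORIES`` table):

```python
def append_category(categories, doctype, sname, lname):
    ## Append (doctype, sname, lname, score) with score = current count + 1
    ## for that doctype, mirroring the COUNT(sname)+1 trick in the SQL.
    ## Returns 0 on success, 1 if the category already exists (as the
    ## real function does).
    existing = [c for c in categories if c[0] == doctype]
    if any(c[1] == sname for c in existing):
        return 1  # category already present for this doctype - no insert
    categories.append((doctype, sname, lname, len(existing) + 1))
    return 0
```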
def delete_category_doctype(doctype, categ):
"""Delete a given CATEGORY from a document type.
@param doctype: the document type from which the category is to be deleted
@param categ: the name/ID of the category to be deleted from doctype
@return: 0 (ZERO) if the category was successfully deleted from this doctype; 1 (ONE) if not.
"""
q = """DELETE FROM sbmCATEGORIES WHERE doctype=%s and sname=%s"""
run_sql(q, (doctype, categ))
## check to see whether this category still exists for the doctype:
numrows_categorydoctype = get_number_categories_doctype_category(doctype=doctype, categ=categ)
if numrows_categorydoctype == 0:
## Everything OK - category deleted
## now re-order all category scores correctly:
normalize_doctype_category_scores(doctype)
return 0
else:
## Everything NOT OK - category still present
## make a last attempt to delete it:
run_sql(q, (doctype, categ))
## check once more to see if category remains:
if get_number_categories_doctype_category(doctype=doctype, categ=categ) == 0:
## Everything OK - category was deleted successfully this time
## now re-order all category scores correctly:
normalize_doctype_category_scores(doctype)
return 0
else:
## still unable to recover - could not delete category
return 1
def delete_all_categories_doctype(doctype):
"""Delete all CATEGORIES for a given document type.
@param doctype: the document type for which all submission-categories are to be deleted
@return: 0 (ZERO) if all categories for this doctype are deleted successfully; 1 (ONE) if categories
remain after the delete has been performed (i.e. all categories could not be deleted for some reason)
"""
q = """DELETE FROM sbmCATEGORIES WHERE doctype=%s"""
run_sql(q, (doctype,))
numrows_categoriesdoctype = get_number_categories_doctype(doctype)
if numrows_categoriesdoctype == 0:
## Everything OK - no submission categories remain for this doctype
return 0
else:
## Everything NOT OK - still some submission categories remaining for doctype
## make a last attempt to delete them:
run_sql(q, (doctype,))
## check once more to see if categories remain:
if get_number_categories_doctype(doctype) == 0:
## Everything OK - all categories were deleted successfully this time
return 0
else:
## still unable to recover - could not delete all categories
return 1
def delete_all_submissionfields_submission(subname):
"""Delete all FIELDS (i.e. field elements used on a document type's submission pages - these are the
instances of WebSubmit elements throughout the system) for a given submission. This means delete all
fields used by a given action of a given doctype.
@param subname: the unique name/ID of the submission from which all field elements are to be deleted.
@return: 0 (ZERO) if all submission fields could be deleted for the given submission; 1 (ONE) if some
fields remain after the deletion was performed (i.e. for some reason it was not possible to delete
all fields for the submission).
"""
q = """DELETE FROM sbmFIELD WHERE subname=%s"""
run_sql(q, (subname,))
numrows_submissionfields_subname = get_number_submissionfields_submissionnames(subname)
if numrows_submissionfields_subname == 0:
## all submission fields have been deleted for this submission
return 0
else:
## all fields not deleted. try once more:
run_sql(q, (subname,))
numrows_submissionfields_subname = get_number_submissionfields_submissionnames(subname)
if numrows_submissionfields_subname == 0:
## OK this time - all deleted
return 0
else:
## still unable to delete all submission fields for this submission - give up
return 1
def delete_all_submissionfields_doctype(doctype):
"""Delete all FIELDS (i.e. field elements used on a document type's submission pages - these are the instances
of "WebSubmit Elements" throughout the system).
@param doctype: the document type for which all submission fields are to be deleted
@return: 0 (ZERO) if all submission fields for this doctype are deleted successfully; 1 (ONE) if submission-
fields remain after the delete has been performed (i.e. all fields could not be deleted for some reason)
"""
all_submissions_doctype = get_all_submissionnames_doctype(doctype=doctype)
number_submissions_doctype = len(all_submissions_doctype)
if number_submissions_doctype > 0:
## for each of the submissions, delete the submission fields
q = """DELETE FROM sbmFIELD WHERE subname=%s"""
if number_submissions_doctype > 1:
for i in range(1,number_submissions_doctype):
## Ensure that we delete all elements used by all submissions for the doctype in question:
q += """ OR subname=%s"""
run_sql(q, map(lambda x: str(x[0]), all_submissions_doctype))
## get a count of the number of fields remaining for these submissions after deletion.
numrows_submissions = get_number_submissionfields_submissionnames(submission_names=map(lambda x: str(x[0]), all_submissions_doctype))
if numrows_submissions == 0:
## Everything is OK - no submission fields left for this doctype
return 0
else:
## Everything is NOT OK - some submission fields remain for this doctype - try one more time to delete them:
run_sql(q, map(lambda x: str(x[0]), all_submissions_doctype))
numrows_submissions = get_number_submissionfields_submissionnames(submission_names=map(lambda x: str(x[0]), all_submissions_doctype))
if numrows_submissions == 0:
## everything OK this time
return 0
else:
## still could not delete all fields
return 1
else:
## there were no submissions to delete - therefore there should be no submission fields
## cannot check, so just return OK
return 0
def delete_submissiondetails_doctype(doctype, action):
"""Delete a SUBMISSION (action) for a given document type
@param doctype: the doument type from which the submission is to be deleted
@param action: the action name for the submission that is to be deleted
@return: 0 (ZERO) if all submissions are deleted successfully; 1 (ONE) if submissions remain after the
delete has been performed (i.e. all submissions could not be deleted for some reason)
"""
q = """DELETE FROM sbmIMPLEMENT WHERE docname=%s AND actname=%s"""
run_sql(q, (doctype, action))
numrows_submissiondoctype = get_number_submissions_doctype_action(doctype, action)
if numrows_submissiondoctype == 0:
## everything OK - the submission has been deleted
return 0
else:
## everything NOT OK - could not delete submission. retry.
run_sql(q, (doctype, action))
if get_number_submissions_doctype_action(doctype, action) == 0:
return 0 ## success this time
else:
return 1 ## still unable to delete submission
def insert_doctype_details(doctype, doctypename, doctypedescr):
"""Insert the details of a new document type into WebSubmit.
@param doctype: the ID code of the new document type
@param doctypename: the name of the new document type
@param doctypedescr: the description of the new document type
@return: integer (0/1). 0 when insert performed; 1 when doctype already existed, so no insert performed.
"""
numrows_doctype = get_number_doctypes_docid(doctype)
if numrows_doctype == 0:
# insert new document type:
q = """INSERT INTO sbmDOCTYPE (ldocname, sdocname, cd, md, description) VALUES (%s, %s, CURDATE(), CURDATE(), %s)"""
run_sql(q, (doctypename, doctype, (doctypedescr != "" and doctypedescr) or (None)))
return 0 # Everything is OK
else:
return 1 # Everything not OK: rows may already exist for document type doctype
def insert_submission_details_clonefrom_submission(addtodoctype, action, clonefromdoctype):
numrows_submission_addtodoctype = get_number_submissions_doctype_action(addtodoctype, action)
if numrows_submission_addtodoctype == 0:
## submission does not exist for "addtodoctype" - insert it
q = """INSERT INTO sbmIMPLEMENT (docname, actname, displayed, subname, nbpg, cd, md, buttonorder, statustext, level, """ \
"""score, stpage, endtxt) (SELECT %s, %s, displayed, %s, nbpg, CURDATE(), CURDATE(), IFNULL(buttonorder, 100), statustext, level, """ \
"""score, stpage, endtxt FROM sbmIMPLEMENT WHERE docname=%s AND actname=%s LIMIT 1)"""
run_sql(q, (addtodoctype, action, "%s%s" % (action, addtodoctype), clonefromdoctype, action))
return 0 ## cloning executed - everything OK
else:
## submission already exists for "addtodoctype" - cannot insert it again!
return 1
def insert_submission_details(doctype, action, displayed, nbpg, buttonorder, statustext, level, score, stpage, endtext):
"""Insert the details of a new submission of a given document type into WebSubmit.
@param doctype: the doctype ID (string)
@param action: the action ID (string)
@param displayed: the value of displayed (char)
@param nbpg: the value of nbpg (integer)
@param buttonorder: the value of buttonorder (integer)
@param statustext: the value of statustext (string)
@param level: the value of level (char)
@param score: the value of score (integer)
@param stpage: the value of stpage (integer)
@param endtext: the value of endtext (string)
@return: integer (0/1). 0 when insert performed; 1 when submission already existed for doctype, so no insert performed.
"""
numrows_submission = get_number_submissions_doctype_action(doctype, action)
if numrows_submission == 0:
## this submission does not exist for doctype - insert it
q = """INSERT INTO sbmIMPLEMENT (docname, actname, displayed, subname, nbpg, cd, md, buttonorder, statustext, level, """ \
"""score, stpage, endtxt) VALUES(%s, %s, %s, %s, %s, CURDATE(), CURDATE(), %s, %s, %s, %s, %s, %s)"""
run_sql(q, (doctype,
action,
displayed,
"%s%s" % (action, doctype),
((str(nbpg).isdigit() and int(nbpg) >= 0) and nbpg) or ("0"),
((str(buttonorder).isdigit() and int(buttonorder) >= 0) and buttonorder) or (None),
statustext,
level,
((str(score).isdigit() and int(score) >= 0) and score) or (""),
((str(stpage).isdigit() and int(stpage) >= 0) and stpage) or (""),
endtext
) )
return 0 ## insert performed
else:
## this submission already exists for the doctype - do not insert it
return 1
def get_cd_md_numbersubmissionpages_doctype_action(doctype, action):
"""Return the creation date (cd), the modification date (md), and the number of submission pages
for a given submission (action) of a given document type (doctype).
@param doctype: the document type for which the number of pages of a given submission is to be
determined.
@param action: the submission (action) for which the number of pages is to be determined.
@return: a tuple of tuples, where each tuple contains the creation date, the modification date, and
the number of pages for the given submission: ((cd, md, nbpg), (cd, md, nbpg)[,...])
"""
q = """SELECT cd, md, nbpg FROM sbmIMPLEMENT WHERE docname=%s AND actname=%s LIMIT 1"""
return run_sql(q, (doctype, action))
def get_numbersubmissionpages_doctype_action(doctype, action):
"""Return the number of submission pages belonging to a given submission (action) of a document type
(doctype) as an integer. In the case that the submission does not exist, 0 (ZERO) will be returned.
In the case that an error occurs, -1 will be returned.
@param doctype: (string) the unique ID of a document type.
@param action: (string) the unique name/ID of an action.
@return: an integer - the number of pages found for the submission
"""
q = """SELECT nbpg FROM sbmIMPLEMENT WHERE docname=%s AND actname=%s LIMIT 1"""
res = run_sql(q, (doctype, action))
if len(res) > 0:
try:
return int(res[0][0])
except (IndexError, ValueError):
## unexpected result
return -1
else:
return 0
def get_numberfields_submissionpage_doctype_action(doctype, action, pagenum):
"""Return the number of fields on a given page of a given submission.
@param doctype: (string) the unique ID of the document type to which the submission belongs
@param action: (string) the unique name/ID of the action
@param pagenum: (integer) the number of the page on which fields are to be counted
@return: (integer) the number of fields found on the page
"""
q = """SELECT COUNT(subname) FROM sbmFIELD WHERE pagenb=%s AND subname=%s"""
return int(run_sql(q, (pagenum, """%s%s""" % (action, doctype)))[0][0])
def get_number_of_fields_on_submissionpage_at_positionx(doctype, action, pagenum, positionx):
"""Return the number of fields at positionx on a given page of a given submission.
@param doctype: (string) the unique ID of the document type to which the submission belongs
@param action: (string) the unique name/ID of the action
@param pagenum: (integer) the number of the page on which fields are to be counted
@param positionx: (integer) the field position (field number) at which fields are to be counted
@return: (integer) the number of fields found at that position on the page
"""
q = """SELECT COUNT(subname) FROM sbmFIELD WHERE pagenb=%s AND subname=%s AND fieldnb=%s"""
return int(run_sql(q, (pagenum, """%s%s""" % (action, doctype), positionx))[0][0])
def swap_elements_adjacent_pages_doctype_action(doctype, action, page1, page2):
## get number pages belonging to submission:
num_pages = get_numbersubmissionpages_doctype_action(doctype=doctype, action=action)
tmp_page = num_pages + randint(3,10)
if page1 - page2 not in (1, -1):
## pages are not adjacent - cannot swap
return 1
if page1 > num_pages or page2 > num_pages or page1 < 1 or page2 < 1:
## at least one page is out of range of legal pages:
return 2
q = """UPDATE sbmFIELD SET pagenb=%s WHERE subname=%s AND pagenb=%s"""
## move fields from p1 to tmp
run_sql(q, (tmp_page, "%s%s" % (action, doctype), page1))
num_fields_p1 = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=page1)
if num_fields_p1 != 0:
## problem moving some fields from page 1 - move them back from tmp
run_sql(q, (page1, "%s%s" % (action, doctype), tmp_page))
return 3
## move fields from p2 to p1
run_sql(q, (page1, "%s%s" % (action, doctype), page2))
num_fields_p2 = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=page2)
if num_fields_p2 != 0:
## problem moving some fields from page 2 to page 1 - try to move everything back
run_sql(q, (page2, "%s%s" % (action, doctype), page1))
run_sql(q, (page1, "%s%s" % (action, doctype), tmp_page))
return 4
## move fields from tmp_page to page2:
run_sql(q, (page2, "%s%s" % (action, doctype), tmp_page))
num_fields_tmp_page = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=tmp_page)
if num_fields_tmp_page != 0:
## problem moving some fields from tmp_page to page 2
## stop - this problem should be examined by admin
return 5
## success - update modification date for all fields on the swapped pages
update_modificationdate_fields_submissionpage(doctype=doctype, action=action, subpage=page1)
update_modificationdate_fields_submissionpage(doctype=doctype, action=action, subpage=page2)
return 0
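``swap_elements_adjacent_pages_doctype_action`` avoids page-number collisions by routing one page through a temporary page number: page1 -> tmp_page, page2 -> page1, tmp_page -> page2. The same three-step swap on an in-memory list of (fieldname, pagenb) pairs (a sketch, not used by the module):

```python
def swap_page_numbers(fields, page1, page2, tmp_page):
    ## Swap two page numbers via a temporary page, exactly as the three
    ## UPDATE statements above do, so no field ever collides with a field
    ## that already carries the destination page number.
    def move(src, dst):
        for i, (name, page) in enumerate(fields):
            if page == src:
                fields[i] = (name, dst)
    move(page1, tmp_page)  # step 1: page1 -> tmp
    move(page2, page1)     # step 2: page2 -> page1
    move(tmp_page, page2)  # step 3: tmp -> page2
    return fields
```

Picking ``tmp_page`` above the highest legal page number (as the real function does with ``num_pages + randint(3, 10)``) guarantees step 1 cannot clash with existing fields.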
def update_modificationdate_fields_submissionpage(doctype, action, subpage):
q = """UPDATE sbmFIELD SET md=CURDATE() WHERE subname=%s AND pagenb=%s"""
run_sql(q, ("%s%s" % (action, doctype), subpage))
return 0
def update_modificationdate_of_field_on_submissionpage(doctype, action, subpage, fieldnb):
q = """UPDATE sbmFIELD SET md=CURDATE() WHERE subname=%s AND pagenb=%s AND fieldnb=%s"""
run_sql(q, ("%s%s" % (action, doctype), subpage, fieldnb))
return 0
def decrement_by_one_pagenumber_submissionelements_abovepage(doctype, action, frompage):
q = """UPDATE sbmFIELD SET pagenb=pagenb-1, md=CURDATE() WHERE subname=%s AND pagenb > %s"""
run_sql(q, ("%s%s" % (action, doctype), frompage))
return 0
def get_details_and_description_of_all_fields_on_submissionpage(doctype, action, pagenum):
"""Get the details and descriptions of all fields on a given submission page, ordered by ascending field number.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the page on which the fields to be displayed are found
@return: a tuple of tuples. Each tuple represents one field on the page.
(fieldname, field-label, check-name, field-type, size, rows, cols, field-description, field-default-value)
"""
q = """SELECT field.fidesc, field.fitext, field.checkn, el.type, el.size, el.rows, el.cols, el.fidesc, IFNULL(el.val,"") """\
"""FROM sbmFIELD AS field """\
"""LEFT JOIN sbmFIELDDESC AS el ON el.name=field.fidesc """\
"""WHERE field.subname=%s AND field.pagenb=%s """\
"""ORDER BY field.fieldnb ASC"""
res = run_sql(q, ("%s%s" % (action, doctype), pagenum))
return res
def insert_field_onto_submissionpage(doctype, action, pagenum, fieldname, fieldtext, fieldlevel, fieldshortdesc, fieldcheck):
"""Insert a field onto a given submission page, in the last position.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the page onto which the field is to be added
@param fieldname: (string) the "element name" of the field to be added to the page
@param fieldtext: (string) the label to be displayed for the field on a submission page
@param fieldlevel: (char) the level of a field ('M' or 'O') - Mandatory or Optional
@param fieldshortdesc: (string) the short description for a field
@param fieldcheck: (string) the name of a check to be associated with a field
@return: None
@Exceptions raised:
InvenioWebSubmitAdminWarningInsertFailed - raised if it was not possible to insert the row for the field
"""
## get the number of fields on the page onto which the new field is to be inserted:
numfields_preinsert = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=pagenum)
q = """INSERT INTO sbmFIELD (subname, pagenb, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md, """ \
"""fiefi1, fiefi2) """\
"""(SELECT %s, %s, COUNT(subname)+1, %s, %s, %s, %s, %s, CURDATE(), CURDATE(), NULL, NULL FROM sbmFIELD """ \
"""WHERE subname=%s AND pagenb=%s)"""
run_sql(q, ("%s%s" % (action, doctype), pagenum, fieldname, fieldtext,
fieldlevel, fieldshortdesc, fieldcheck, "%s%s" % (action, doctype), pagenum))
numfields_postinsert = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=pagenum)
if not (numfields_postinsert > numfields_preinsert):
## seems as though the new field was not inserted:
msg = """Failed when trying to add a new field to page %s of submission %s""" % (pagenum, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningInsertFailed(msg)
return
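## Illustrative sketch (not part of the original module): the INSERT ... SELECT
## statement above computes the new field's position in a single statement.
## For a page that already holds 2 fields, the sub-query's COUNT(subname)+1
## evaluates to 3, so the new row is appended at position 3 without a separate
## SELECT round-trip.  A call with hypothetical values might look like:
##   insert_field_onto_submissionpage(doctype="TEST", action="SBI", pagenum=1,
##                                    fieldname="DEMO_TITLE", fieldtext="Title:",
##                                    fieldlevel="M", fieldshortdesc="Title",
##                                    fieldcheck="")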
def delete_a_field_from_submissionpage(doctype, action, pagenum, fieldposn):
q = """DELETE FROM sbmFIELD WHERE subname=%s AND pagenb=%s AND fieldnb=%s"""
run_sql(q, ("""%s%s""" % (action, doctype), pagenum, fieldposn))
## check number of fields at deleted field's position. If 0, promote all fields below it by 1 posn;
## If field(s) still exists at deleted field's posn, report error.
numfields_deletedfieldposn = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=fieldposn)
if numfields_deletedfieldposn == 0:
## everything OK - field was successfully deleted
return 0
else:
## everything NOT OK - couldn't delete field - retry
run_sql(q, ("""%s%s""" % (action, doctype), pagenum, fieldposn))
numfields_deletedfieldposn = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=fieldposn)
if numfields_deletedfieldposn == 0:
## success this time
return 0
else:
## still unable to delete all fields - return fail code
return 1
def update_details_of_a_field_on_a_submissionpage(doctype, action, pagenum, fieldposn,
fieldtext, fieldlevel, fieldshortdesc, fieldcheck):
"""Update the details of one field, as found at a given location on a given submission page.
@param doctype: (string) unique ID for a document type
@param action: (string) unique ID for an action
@param pagenum: (integer) number of page on which field is found
@param fieldposn: (integer) number of field on page
@param fieldtext: (string) text label for field on page
@param fieldlevel: (char) level of field (should be 'M' or 'O' - mandatory or optional)
@param fieldshortdesc: (string) short description of field
@param fieldcheck: (string) name of JavaScript Check to be applied to field
@return: None
@Exceptions raised:
InvenioWebSubmitAdminWarningTooManyRows - when multiple rows found for field
InvenioWebSubmitAdminWarningNoRowsFound - when no rows found for field
"""
q = """UPDATE sbmFIELD SET fitext=%s, level=%s, sdesc=%s, checkn=%s, md=CURDATE() WHERE subname=%s AND pagenb=%s AND fieldnb=%s"""
queryargs = (fieldtext, fieldlevel, fieldshortdesc, fieldcheck, "%s%s" % (action, doctype), pagenum, fieldposn)
## get number of rows found for field:
numrows_field = get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action,
pagenum=pagenum, positionx=fieldposn)
if numrows_field == 1:
run_sql(q, queryargs)
return
elif numrows_field > 1:
## multiple rows found for the field at this position - not safe to edit
msg = """When trying to update the field in position %s on page %s of the submission %s, %s rows were found for the field""" \
% (fieldposn, pagenum, "%s%s" % (action, doctype), numrows_field)
raise InvenioWebSubmitAdminWarningTooManyRows(msg)
else:
## no row for field found
msg = """When trying to update the field in position %s on page %s of the submission %s, no rows were found for the field""" \
% (fieldposn, pagenum, "%s%s" % (action, doctype))
raise InvenioWebSubmitAdminWarningNoRowsFound(msg)
def delete_a_field_from_submissionpage_then_reorder_fields_below_to_fill_vacant_position(doctype,
action,
pagenum,
fieldposn):
"""Delete a submission field from a given page of a given document-type submission.
E.g. Delete the field in position 3, from page 2 of the "SBI" submission of the
"TEST" document-type.
@param doctype: (string) the unique ID of the document type
@param action: (string) the unique name/ID of the submission/action
@param pagenum: (integer) the number of the page from which the field is to be
deleted
@param fieldposn: (integer) the number of the field to be deleted (e.g. field at position
number 1, or number 2, etc.)
@return: 0 (ZERO) if the field was deleted and the fields below it reordered successfully;
a non-zero integer error code otherwise.
"""
delete_res = delete_a_field_from_submissionpage(doctype=doctype, action=action, pagenum=pagenum, fieldposn=fieldposn)
if delete_res == 0:
## deletion was successful - demote fields below deleted field into gap:
update_res = decrement_position_of_all_fields_atposition_greaterthan_positionx_on_submissionpage(doctype=doctype,
action=action,
pagenum=pagenum,
positionx=fieldposn,
decrement=1)
## update the modification date of the page:
update_modification_date_for_submission(doctype=doctype, action=action)
return 0
else:
## could not delete field! return an appropriate error message
return delete_res
def update_modification_date_for_submission(doctype, action):
"""Update the "last-modification" date for a submission to the current date (today).
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@return: None
"""
q = """UPDATE sbmIMPLEMENT SET md=CURDATE() WHERE docname=%s AND actname=%s"""
run_sql(q, (doctype, action))
return
def move_field_on_submissionpage_from_positionx_to_positiony(doctype, action, pagenum, movefieldfrom, movefieldto):
## get number of fields on submission page:
try:
movefieldfrom = int(movefieldfrom)
movefieldto = int(movefieldto)
except ValueError:
return 1
#return 'WRN_WEBSUBMITADMIN_INVALID_FIELD_NUMBERS_SUPPLIED_WHEN_TRYING_TO_MOVE_FIELD_ON_SUBMISSION_PAGE'
numfields_page = get_numberfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=pagenum)
if movefieldfrom > numfields_page or movefieldto > numfields_page or movefieldfrom < 1 or \
movefieldto < 1 or movefieldfrom == movefieldto:
## invalid move-field coordinates:
return 1
#return 'WRN_WEBSUBMITADMIN_INVALID_FIELD_NUMBERS_SUPPLIED_WHEN_TRYING_TO_MOVE_FIELD_ON_SUBMISSION_PAGE'
q = """UPDATE sbmFIELD SET fieldnb=%s WHERE subname=%s AND pagenb=%s AND fieldnb=%s"""
## process movement:
if movefieldfrom - movefieldto in (1, -1):
## fields are adjacent - swap them around:
tmp_fieldnb = numfields_page + randint(3,10)
## move field from position 'movefieldfrom' to temporary position 'tmp_fieldnb':
run_sql(q, (tmp_fieldnb, "%s%s" % (action, doctype), pagenum, movefieldfrom))
num_fields_posn_movefieldfrom = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldfrom)
if num_fields_posn_movefieldfrom != 0:
## problem moving the field from its position to the temporary position
## try to move it back, and return with an error
return 2
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_TEMP_POSITION'
## move field from position 'movefieldto' to position 'movefieldfrom':
run_sql(q, (movefieldfrom, "%s%s" % (action, doctype), pagenum, movefieldto))
num_fields_posn_movefieldto = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldto)
if num_fields_posn_movefieldto != 0:
## problem moving the field at 'movefieldto' into the position 'movefieldfrom'
## try to reverse the changes made so far, then return with an error:
## move field at temporary posn back to 'movefieldfrom' position:
run_sql(q, (movefieldfrom, "%s%s" % (action, doctype), pagenum, tmp_fieldnb))
return 3
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD2_TO_FIELD1_POSITION'
## move field from temporary position 'tmp_fieldnb' to position 'movefieldto':
run_sql(q, (movefieldto, "%s%s" % (action, doctype), pagenum, tmp_fieldnb))
num_fields_posn_tmp_fieldnb = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=tmp_fieldnb)
if num_fields_posn_tmp_fieldnb != 0:
## problem moving the field from the temporary position to position 'movefieldto'
## stop - admin should examine and fix this problem
return 4
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_POSITION_FIELD2_FROM_TEMPORARY_POSITION'
## successfully swapped fields - update modification date of the swapped fields and of the submission
update_modificationdate_of_field_on_submissionpage(doctype=doctype, action=action, subpage=pagenum, fieldnb=movefieldfrom)
update_modificationdate_of_field_on_submissionpage(doctype=doctype, action=action, subpage=pagenum, fieldnb=movefieldto)
update_modification_date_for_submission(doctype=doctype, action=action)
return 0
else:
## fields not adjacent - perform a move:
tmp_fieldnb = 0 - randint(3,10)
## move field from position 'movefieldfrom' to temporary position 'tmp_fieldnb':
run_sql(q, (tmp_fieldnb, "%s%s" % (action, doctype), pagenum, movefieldfrom))
num_fields_posn_movefieldfrom = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldfrom)
if num_fields_posn_movefieldfrom != 0:
## problem moving the field from its position to the temporary position
## try to move it back, and return with an error
return 2
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_TEMP_POSITION'
## fill the gap created by the moved field by decrementing by one the position of all fields below it:
qres = decrement_position_of_all_fields_atposition_greaterthan_positionx_on_submissionpage(doctype=doctype, action=action,
pagenum=pagenum, positionx=movefieldfrom,
decrement=1)
if movefieldfrom < numfields_page:
## check that there is now a field in the position of "movefieldfrom":
num_fields_posn_movefieldfrom = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldfrom)
if num_fields_posn_movefieldfrom == 0:
## no field there - it was not possible to decrement the position of the fields below the field moved to 'tmp_fieldnb'
## try to move the field back from 'tmp_fieldnb'
run_sql(q, (movefieldfrom, "%s%s" % (action, doctype), pagenum, tmp_fieldnb))
## return an ERROR message
return 5
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_MOVE_FIELD_TO_NEW_POSITION_ON_SUBMISSION_PAGE_COULDNT_DECREMENT_POSITION_OF_FIELDS_BELOW_FIELD1'
## now increment (by one) the position of the fields at and below the field at position 'movefieldto':
qres = increment_position_of_all_fields_atposition_greaterthan_positionx_on_submissionpage(doctype=doctype, action=action,
pagenum=pagenum, positionx=movefieldto-1,
increment=1)
## there should now be an empty space at position 'movefieldto':
num_fields_posn_movefieldto = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldto)
if num_fields_posn_movefieldto != 0:
## there isn't! the increment of position has failed - return warning:
return 6
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_MOVE_FIELD_TO_NEW_POSITION_ON_SUBMISSION_PAGE_COULDNT_INCREMENT_POSITION_OF_FIELDS_AT_AND_BELOW_FIELD2'
## Move field from temporary position to position 'movefieldto':
run_sql(q, (movefieldto, "%s%s" % (action, doctype), pagenum, tmp_fieldnb))
num_fields_posn_movefieldto = \
get_number_of_fields_on_submissionpage_at_positionx(doctype=doctype, action=action, pagenum=pagenum, positionx=movefieldto)
if num_fields_posn_movefieldto == 0:
## failed to move field1 from temp posn to final posn
return 4
#return 'WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_POSITION_FIELD2_FROM_TEMPORARY_POSITION'
## successfully moved field - update modification date of the moved field and of the submission
update_modificationdate_of_field_on_submissionpage(doctype=doctype, action=action, subpage=pagenum, fieldnb=movefieldfrom)
update_modification_date_for_submission(doctype=doctype, action=action)
return 0
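## Illustrative sketch (not part of the original module) of the two movement
## strategies implemented above, for a page holding fields at positions
## [1, 2, 3, 4]:
##
##   Adjacent swap (movefieldfrom=2, movefieldto=3):
##     1. field 2 -> temporary position beyond the page (numfields + random offset)
##     2. field 3 -> position 2
##     3. field at the temporary position -> position 3
##
##   Non-adjacent move (movefieldfrom=4, movefieldto=2):
##     1. field 4 -> negative temporary position (cannot clash with real fields)
##     2. decrement the positions of the fields below the vacated slot (none here)
##     3. increment the positions of fields 2 and 3 to [3, 4], freeing slot 2
##     4. field at the negative temporary position -> position 2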
def increment_position_of_all_fields_atposition_greaterthan_positionx_on_submissionpage(doctype, action, pagenum, positionx, increment=1):
"""Increment (by the number provided via the "increment" parameter) the position of all fields (on a given submission page)
found at a position greater than that of positionx
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique name/ID of the action
@param pagenum: (integer) the number of the submission page on which the fields are situated
@param positionx: (integer) the position after which fields' positions are to be promoted
@param increment: (integer) the number by which to increment the field positions (defaults to 1)
@return: (integer) the number of rows updated; None if the update count could not be determined
"""
if type(increment) is not int:
increment = 1
q = """UPDATE sbmFIELD SET fieldnb=fieldnb+%s WHERE subname=%s AND pagenb=%s AND fieldnb > %s"""
res = run_sql(q, (increment, "%s%s" % (action, doctype), pagenum, positionx))
try:
return int(res)
except ValueError:
return None
def decrement_position_of_all_fields_atposition_greaterthan_positionx_on_submissionpage(doctype, action, pagenum, positionx, decrement=1):
"""Decrement (by the number provided via the "decrement" parameter) the position of all fields (on a given submission page)
found at a position greater than that of positionx
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique name/ID of the action
@param pagenum: (integer) the number of the submission page on which the fields are situated
@param positionx: (integer) the position after which fields' positions are to be decremented
@param decrement: (integer) the number by which to decrement the field positions (defaults to 1)
@return: (integer) the number of rows updated; None if the update count could not be determined
"""
if type(decrement) is not int:
decrement = 1
q = """UPDATE sbmFIELD SET fieldnb=fieldnb-%s WHERE subname=%s AND pagenb=%s AND fieldnb > %s"""
res = run_sql(q, (decrement, "%s%s" % (action, doctype), pagenum, positionx))
try:
return int(res)
except ValueError:
return None
def delete_allfields_submissionpage_doctype_action(doctype, action, pagenum):
q = """DELETE FROM sbmFIELD WHERE pagenb=%s AND subname=%s"""
run_sql(q, (pagenum, """%s%s""" % (action, doctype)))
numrows_fields = get_numberfields_submissionpage_doctype_action(doctype=doctype,
action=action, pagenum=pagenum)
if numrows_fields == 0:
## everything OK - all fields deleted
return 0
else:
## everything NOT OK - couldn't delete all fields for page
## retry
run_sql(q, (pagenum, """%s%s""" % (action, doctype)))
numrows_fields = get_numberfields_submissionpage_doctype_action(doctype=doctype,
action=action, pagenum=pagenum)
if numrows_fields == 0:
## success this time
return 0
else:
## still unable to delete all fields - return fail code
return 1
def get_details_allsubmissionfields_on_submission_page(doctype, action, pagenum):
"""Get the details of all submission elements belonging to a particular page of the submission.
Results are returned ordered by field number.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique name/ID of an action
@param pagenum: (string/integer) the number of the page for which element details are
to be retrieved
@return: a tuple of tuples: (subname, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md). Each
tuple contains the details of one element.
"""
q = """SELECT subname, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md FROM sbmFIELD """\
"""WHERE subname=%s AND pagenb=%s ORDER BY fieldnb ASC"""
return run_sql(q, ("%s%s" % (action, doctype), pagenum))
def get_details_of_field_at_positionx_on_submissionpage(doctype, action, pagenum, fieldposition):
"""Get the details of a particular field in a submission page.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique name/ID of an action
@param pagenum: (integer) the number of the submission page on which the field is found
@param fieldposition: (integer) the position on the submission page of the field for which details
are to be retrieved.
@return: a tuple of the field's details: (subname, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md);
an empty list if no field was found at the given position.
"""
fielddets = []
q = """SELECT subname, fieldnb, fidesc, fitext, level, sdesc, checkn, cd, md FROM sbmFIELD """\
"""WHERE subname=%s AND pagenb=%s AND fieldnb=%s LIMIT 1"""
res = run_sql(q, ("%s%s" % (action, doctype), pagenum, fieldposition))
if len(res) > 0:
fielddets = res[0]
return fielddets
def decrement_by_one_number_submissionpages_doctype_action(doctype, action):
numrows_submission = get_number_submissions_doctype_action(doctype, action)
if numrows_submission == 1:
## there is only one row for this submission - can update
q = """UPDATE sbmIMPLEMENT SET nbpg=IFNULL(nbpg, 1)-1, md=CURDATE() WHERE docname=%s AND actname=%s and IFNULL(nbpg, 1) > 0"""
run_sql(q, (doctype, action))
return 0 ## Everything OK
else:
## Everything NOT OK - either multiple rows exist for submission, or submission doesn't exist
return 1
def add_submission_page_doctype_action(doctype, action):
"""Increment the number of pages associated with a given submission by 1
@param doctype: the unique ID of the document type that owns the submission.
@param action: the action name/ID of the given submission of the document type, for which the number
of pages is to be incremented.
@return: an integer error code. 0 (ZERO) means that the update was performed without error; 1 (ONE) means
that there was a problem and the update could not be performed. Problems could be: multiple rows found for
the submission; no rows found for the submission.
"""
numrows_submission = get_number_submissions_doctype_action(doctype, action)
if numrows_submission == 1:
## there is only one row for this submission - can update
q = """UPDATE sbmIMPLEMENT SET nbpg=IFNULL(nbpg, 0)+1, md=CURDATE() WHERE docname=%s AND actname=%s"""
run_sql(q, (doctype, action))
return 0 ## Everything OK
else:
## Everything NOT OK - either multiple rows exist for submission, or submission doesn't exist
return 1
diff --git a/invenio/legacy/websubmit/admin_engine.py b/invenio/legacy/websubmit/admin_engine.py
index 3d95019b3..65018bb86 100644
--- a/invenio/legacy/websubmit/admin_engine.py
+++ b/invenio/legacy/websubmit/admin_engine.py
@@ -1,4246 +1,4246 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import re
from os.path import split, basename, isfile
from os import access, F_OK, R_OK, getpid, rename, unlink
from time import strftime, localtime
-from invenio.websubmitadmin_dblayer import *
-from invenio.websubmitadmin_config import *
+from invenio.legacy.websubmit.admin_dblayer import *
+from invenio.legacy.websubmit.admin_config import *
from invenio.legacy.websubmit.config import CFG_RESERVED_SUBMISSION_FILENAMES
from invenio.modules.access.control import acc_get_all_roles, acc_get_role_users, acc_delete_user_role
from invenio.config import CFG_SITE_LANG, CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR
from invenio.modules.access.engine import acc_authorize_action
from invenio.ext.logging import register_exception
-from invenio.websubmitadmin_config import InvenioWebSubmitWarning
+from invenio.legacy.websubmit.admin_config import InvenioWebSubmitWarning
from invenio.base.i18n import gettext_set_language
import invenio.legacy.template
try:
websubmitadmin_templates = invenio.legacy.template.load('websubmitadmin')
except:
pass
## utility functions:
def is_adminuser(req, role):
"""check if user is a registered administrator. """
return acc_authorize_action(req, role)
def check_user(req, role, adminarea=2, authorized=0):
(auth_code, auth_message) = is_adminuser(req, role)
if not authorized and auth_code != 0:
return ("false", auth_message)
return ("", auth_message)
def get_navtrail(ln=CFG_SITE_LANG):
"""gets the navtrail for title...
@param title: title of the page
@param ln: language
@return: HTML output
"""
navtrail = websubmitadmin_templates.tmpl_navtrail(ln)
return navtrail
def stringify_listvars(mylist):
"""Accept a list (or a list of lists) (or tuples).
Convert each item in the list, into a string (replace None with the empty
string "").
@param mylist: A list/tuple of values, or a list/tuple of value list/tuples.
@return: a tuple of string values or a tuple of string value tuples
"""
string_list = []
try:
if type(mylist[0]) in (tuple,list):
for row in mylist:
string_list.append(map(lambda x: x is not None and str(x) or "", row))
else:
string_list = map(lambda x: x is not None and str(x) or "", mylist)
except IndexError:
pass
return string_list
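## Illustrative examples (not part of the original module):
##   stringify_listvars([1, None, "a"])         ->  ['1', '', 'a']
##   stringify_listvars([(1, None), (2, 'x')])  ->  [['1', ''], ['2', 'x']]
##   stringify_listvars([])                     ->  []  (the IndexError is swallowed)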
def save_update_to_file(filepath, filecontent, notruncate=0, appendmode=0):
"""Save a string value to a file.
Save will create a new file if the file does not exist. Mode can be set to truncate an older file
or to refuse to create the file if it already exists. There is also a mode to "append" the string value
to a file.
@param filepath: (string) the full path to the file
@param filecontent: (string) the content to be written to the file
@param notruncate: (integer) should be 1 or 0, defaults to 0 (ZERO). If 0, existing file will be truncated;
if 1, file will not be written if it already exists
@param appendmode: (integer) should be 1 or 0, defaults to 0 (ZERO). If 1, data will be appended to the file
if it exists; if 0, file will be truncated (or not, depending on the notruncate mode) by new data.
@return: None
@exceptions raised:
- InvenioWebSubmitAdminWarningIOError: when operations involving writing to file failed.
"""
## sanity checking:
if notruncate not in (0, 1):
notruncate = 0
if appendmode not in (0, 1):
appendmode = 0
(fpath, fname) = split(filepath)
if fname == "":
## error opening file
msg = """Unable to open filepath [%s] - couldn't determine a valid filename""" % (filepath,)
raise InvenioWebSubmitAdminWarningIOError(msg)
## if fpath is not empty, append the trailing "/":
if fpath != "":
fpath += "/"
if appendmode == 0:
if notruncate != 0 and access("%s%s" % (fpath, fname), F_OK):
## in no-truncate mode, but file already exists!
msg = """Unable to write to file [%s] in "no-truncate mode" because file already exists"""\
% (fname,)
raise InvenioWebSubmitAdminWarningIOError(msg)
## write the content to a temporary file first, then rename it into place later
tmpfname = "%s_%s_%s" % (fname, strftime("%Y%m%d%H%M%S", localtime()), getpid())
## open temp file for writing:
try:
fp = open("%s%s" % (fpath, tmpfname), "w")
except IOError, e:
## cannot open file
msg = """Unable to write to file [%s%s] - cannot open file for writing""" % (fpath, fname)
raise InvenioWebSubmitAdminWarningIOError(msg)
## write contents to temp file:
try:
fp.write(filecontent)
fp.flush()
fp.close()
except IOError, e:
## could not write to temp file
msg = """Unable to write to file [%s]""" % (tmpfname,)
## remove the "temp file"
try:
fp.close()
unlink("%s%s" % (fpath, tmpfname))
except IOError:
pass
raise InvenioWebSubmitAdminWarningIOError(msg)
## rename temp file to final filename:
try:
rename("%s%s" % (fpath, tmpfname), "%s%s" % (fpath, fname))
except OSError:
## couldn't rename the temporary file to the final file name
msg = """Unable to write to file [%s] - created temporary file [%s], but could not then rename it to [%s]"""\
% (fname, tmpfname, fname)
raise InvenioWebSubmitAdminWarningIOError(msg)
else:
## append mode:
try:
fp = open("%s%s" % (fpath, fname), "a")
except IOError, e:
## cannot open file
msg = """Unable to write to file [%s] - cannot open file for writing in append mode""" % (fname,)
raise InvenioWebSubmitAdminWarningIOError(msg)
## write contents to temp file:
try:
fp.write(filecontent)
fp.flush()
fp.close()
except IOError, e:
## could not write to temp file
msg = """Unable to write to file [%s] in append mode""" % (fname,)
## close the file
try:
fp.close()
except IOError:
pass
raise InvenioWebSubmitAdminWarningIOError(msg)
return
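## Illustrative usage (not part of the original module; the path is hypothetical):
##   save_update_to_file("/tmp/demo.cfg", "key = value\n")          ## create/truncate
##   save_update_to_file("/tmp/demo.cfg", "extra\n", appendmode=1)  ## append
##   save_update_to_file("/tmp/demo.cfg", "x", notruncate=1)
##     ## raises InvenioWebSubmitAdminWarningIOError - file already exists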
def string_is_alphanumeric_including_underscore(txtstring):
p_txtstring = re.compile(r'^\w*$')
m_txtstring = p_txtstring.search(txtstring)
if m_txtstring is not None:
return 1
else:
return 0
def function_name_is_valid(fname):
p_fname = re.compile(r'^(_|[a-zA-Z])\w*$')
m_fname = p_fname.search(fname)
if m_fname is not None:
return 1
else:
return 0
def wash_single_urlarg(urlarg, argreqdtype, argdefault, maxstrlen=None, minstrlen=None, truncatestr=0):
"""Wash a single argument according to some specifications.
@param urlarg: the argument to be tested, as passed from the form/url, etc
@param argreqdtype: (a python type) the type that the argument should conform to (argument required
type)
@param argdefault: the default value that should be returned for the argument in the case that it
doesn't comply with the washing specifications
@param maxstrlen: (integer) the maximum length for a string argument; defaults to None, which means
that no maximum length is forced upon the string
@param minstrlen: (integer) the minimum length for a string argument; defaults to None, which means
that no minimum length is forced upon the string
@param truncatestr: (integer) should be 1 or 0 (ZERO). A flag used to determine whether a string
argument that exceeds the maximum length (if one is provided) should be truncated, or reset
to the default for the argument. 0 means reset the argument to its default; 1 means truncate
the string.
@return: the washed argument
@exceptions raised:
- ValueError: when it is not possible to cast an argument to the type passed as argreqdtype
"""
## sanity checking:
if maxstrlen is not None and type(maxstrlen) is not int:
maxstrlen = None
elif maxstrlen is not None and maxstrlen < 1:
maxstrlen = None
if minstrlen is not None and type(minstrlen) is not int:
minstrlen = None
elif minstrlen is not None and minstrlen < 1:
minstrlen = None
result = ""
arg_dst_type = argreqdtype
## if no urlarg, return the default for that argument:
if urlarg is None:
result = argdefault
return result
## get the type of the argument passed:
arg_src_type = type(urlarg)
value = urlarg
# First, handle the case where we want all the results. In
# this case, we need to ensure all the elements are strings,
# and not Field instances.
if arg_src_type in (list, tuple):
if arg_dst_type is list:
result = [str(x) for x in value]
return result
if arg_dst_type is tuple:
result = tuple([str(x) for x in value])
return result
# in all the other cases, we are only interested in the
# first value.
value = value[0]
# Maybe we already have what is expected? Then don't change
# anything.
if arg_src_type is arg_dst_type:
result = value
if arg_dst_type is str and maxstrlen is not None and len(result) > maxstrlen:
if truncatestr != 0:
result = result[0:maxstrlen]
else:
result = argdefault
elif arg_dst_type is str and minstrlen is not None and len(result) < minstrlen:
result = argdefault
return result
if arg_dst_type in (str, int):
try:
result = arg_dst_type(value)
if arg_dst_type is str and maxstrlen is not None and len(result) > maxstrlen:
if truncatestr != 0:
result = result[0:maxstrlen]
else:
result = argdefault
elif arg_dst_type is str and minstrlen is not None and len(result) < minstrlen:
result = argdefault
except:
result = argdefault
elif arg_dst_type is tuple:
result = (value,)
elif arg_dst_type is list:
result = [value]
elif arg_dst_type is dict:
result = {0: str(value)}
else:
raise ValueError('cannot cast form argument into type %r' % (arg_dst_type,))
return result
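## Illustrative examples (not part of the original module):
##   wash_single_urlarg("42", int, 0)                                   ->  42
##   wash_single_urlarg(None, str, "default")                           ->  "default"
##   wash_single_urlarg("abcdef", str, "", maxstrlen=3, truncatestr=1)  ->  "abc"
##   wash_single_urlarg("abcdef", str, "", maxstrlen=3)                 ->  ""  (reset to default)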
## Internal Business-Logic functions
## Functions for managing collection order, etc:
def build_submission_collection_tree(collection_id, has_brother_above=0, has_brother_below=0):
## get the name of this collection:
collection_name = get_collection_name(collection_id)
if collection_name is None:
collection_name = "Unknown Collection"
## make a data-structure containing the details of the collection:
collection_node = { 'collection_id' : collection_id, ## collection ID
'collection_name' : collection_name, ## collection Name
'collection_children' : [], ## list of 'collection' children nodes
'doctype_children' : [], ## list of 'doctype' children
'has_brother_above' : has_brother_above, ## has a sibling collection above in score
'has_brother_below' : has_brother_below, ## has a sibling collection below in score
}
## get the IDs and names of all doctypes attached to this collection:
res_doctype_children = get_doctype_children_of_collection(collection_id)
## for each child, add its details to the list of doctype children for this node:
for doctype in res_doctype_children:
doctype_node = { 'doctype_id' : doctype[0],
'doctype_lname' : doctype[1],
'catalogue_order' : doctype[2],
}
collection_node['doctype_children'].append(doctype_node)
## now get details of all collections attached to this one:
res_collection_children = get_collection_children_of_collection(collection_id)
num_collection_children = len(res_collection_children)
for child_num in xrange(0, num_collection_children):
brother_below = brother_above = 0
if child_num > 0:
## this is not the first brother - it has a brother above
brother_above = 1
if child_num < num_collection_children - 1:
## this is not the last brother - it has a brother below
brother_below = 1
collection_node['collection_children'].append(\
build_submission_collection_tree(collection_id=res_collection_children[child_num][0],
has_brother_above=brother_above,
has_brother_below=brother_below))
## return the built collection tree:
return collection_node
def _organise_submission_page_display_submission_tree(user_msg=""):
title = "Organise WebSubmit Main Page"
body = ""
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
## Get the submissions tree:
submission_collection_tree = build_submission_collection_tree(0)
## Get all 'submission collections':
submission_collections = get_details_of_all_submission_collections()
sub_col = [('0', 'Top Level')]
for collection in submission_collections:
sub_col.append((str(collection[0]), str(collection[1])))
## Get all document types:
doctypes = get_docid_docname_and_docid_alldoctypes()
## build the page:
body = websubmitadmin_templates.tmpl_display_submission_page_organisation(submission_collection_tree=submission_collection_tree,
submission_collections=sub_col,
doctypes=doctypes,
user_msg=user_msg)
return (title, body)
def _delete_submission_collection(sbmcolid):
"""Recursively calls itself to delete a submission-collection and all of its
attached children (and their children, etc) from the submission-tree.
@param sbmcolid: (integer) - the ID of the submission-collection to be deleted.
@return: None
@Exceptions raised: InvenioWebSubmitAdminWarningDeleteFailed when it was not
possible to delete the submission-collection or some of its children.
"""
## Get the collection-children of this submission-collection:
collection_children = get_collection_children_of_collection(sbmcolid)
## recursively move through each collection-child:
for collection_child in collection_children:
_delete_submission_collection(collection_child[0])
## delete all document-types attached to this submission-collection:
error_code = delete_doctype_children_from_submission_collection(sbmcolid)
if error_code != 0:
## Unable to delete all doctype-children:
err_msg = "Unable to delete doctype children of submission-collection [%s]" % sbmcolid
raise InvenioWebSubmitAdminWarningDeleteFailed(err_msg)
## delete this submission-collection's entry from the sbmCOLLECTION_sbmCOLLECTION table:
error_code = delete_submission_collection_from_submission_tree(sbmcolid)
if error_code != 0:
## Unable to delete submission-collection from the submission-tree:
err_msg = "Unable to delete submission-collection [%s] from submission-tree" % sbmcolid
raise InvenioWebSubmitAdminWarningDeleteFailed(err_msg)
## Now delete this submission-collection's details:
error_code = delete_submission_collection_details(sbmcolid)
if error_code != 0:
## Unable to delete the details of the submission-collection:
err_msg = "Unable to delete details of submission-collection [%s]" % sbmcolid
raise InvenioWebSubmitAdminWarningDeleteFailed(err_msg)
## return
return
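A minimal sketch of the depth-first strategy used by `_delete_submission_collection`: children are deleted before their parent, and a failure anywhere aborts the whole operation with an exception. The dict-based tree and the `DeleteFailed` exception below are illustrative stand-ins for the real database tables and `InvenioWebSubmitAdminWarningDeleteFailed`, not the actual API.

```python
class DeleteFailed(Exception):
    """Illustrative stand-in for InvenioWebSubmitAdminWarningDeleteFailed."""

def delete_subtree(tree, node_id):
    """Depth-first delete: remove node_id and every descendant from tree.
    tree maps a collection id to the list of its child ids."""
    for child_id in tree.get(node_id, []):
        delete_subtree(tree, child_id)   ## children (and theirs) go first
    if node_id not in tree:
        raise DeleteFailed("unknown submission-collection [%s]" % node_id)
    del tree[node_id]                    ## then delete the node itself
```

For example, deleting "a" from `{"top": ["a", "b"], "a": ["a1"], "a1": [], "b": []}` removes both "a" and its child "a1", leaving "top" and "b" in place.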
def perform_request_organise_submission_page(doctype="",
sbmcolid="",
catscore="",
addsbmcollection="",
deletesbmcollection="",
addtosbmcollection="",
adddoctypes="",
movesbmcollectionup="",
movesbmcollectiondown="",
deletedoctypefromsbmcollection="",
movedoctypeupinsbmcollection="",
movedoctypedowninsbmcollection=""):
user_msg = []
body = ""
if "" not in (deletedoctypefromsbmcollection, sbmcolid, catscore, doctype):
## delete a document type from its position in the tree
error_code = delete_doctype_from_position_on_submission_page(doctype, sbmcolid, catscore)
if error_code == 0:
## doctype deleted - now normalize scores of remaining doctypes:
normalize_scores_of_doctype_children_for_submission_collection(sbmcolid)
user_msg.append("Document type successfully deleted from submissions tree")
else:
user_msg.append("Unable to delete document type from submission-collection")
## display submission-collections:
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (deletesbmcollection, sbmcolid):
## try to delete the submission-collection from the tree:
try:
_delete_submission_collection(sbmcolid)
user_msg.append("Submission-collection successfully deleted from submissions tree")
except InvenioWebSubmitAdminWarningDeleteFailed, excptn:
user_msg.append(str(excptn))
## re-display submission-collections:
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (movedoctypedowninsbmcollection, sbmcolid, doctype, catscore):
## move a doctype down in order for a submission-collection:
## normalize scores of all doctype-children of the submission-collection:
normalize_scores_of_doctype_children_for_submission_collection(sbmcolid)
## swap this doctype with that below it:
## Get score of doctype to move:
score_doctype_to_move = get_catalogue_score_of_doctype_child_of_submission_collection(sbmcolid, doctype)
## Get score of the doctype brother directly below the doctype to be moved:
score_brother_below = get_score_of_next_doctype_child_below(sbmcolid, score_doctype_to_move)
if None in (score_doctype_to_move, score_brother_below):
user_msg.append("Unable to move document type down")
else:
## update the brother below the doctype to be moved to have a score the same as the doctype to be moved:
update_score_of_doctype_child_of_submission_collection_at_scorex(sbmcolid, score_brother_below, score_doctype_to_move)
## Update the doctype to be moved to have a score of the brother directly below it:
update_score_of_doctype_child_of_submission_collection_with_doctypeid_and_scorex(sbmcolid,
doctype,
score_doctype_to_move,
score_brother_below)
user_msg.append("Document type moved down")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (movedoctypeupinsbmcollection, sbmcolid, doctype, catscore):
## move a doctype up in order for a submission-collection:
## normalize scores of all doctype-children of the submission-collection:
normalize_scores_of_doctype_children_for_submission_collection(sbmcolid)
## swap this doctype with that above it:
## Get score of doctype to move:
score_doctype_to_move = get_catalogue_score_of_doctype_child_of_submission_collection(sbmcolid, doctype)
## Get score of the doctype brother directly above the doctype to be moved:
score_brother_above = get_score_of_previous_doctype_child_above(sbmcolid, score_doctype_to_move)
if None in (score_doctype_to_move, score_brother_above):
user_msg.append("Unable to move document type up")
else:
## update the brother above the doctype to be moved to have a score the same as the doctype to be moved:
update_score_of_doctype_child_of_submission_collection_at_scorex(sbmcolid, score_brother_above, score_doctype_to_move)
## Update the doctype to be moved to have a score of the brother directly above it:
update_score_of_doctype_child_of_submission_collection_with_doctypeid_and_scorex(sbmcolid,
doctype,
score_doctype_to_move,
score_brother_above)
user_msg.append("Document type moved up")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (movesbmcollectiondown, sbmcolid):
## move a submission-collection down in order:
## Sanity checking:
try:
int(sbmcolid)
except ValueError:
sbmcolid = 0
if int(sbmcolid) != 0:
## Get father ID of submission-collection:
sbmcolidfather = get_id_father_of_collection(sbmcolid)
if sbmcolidfather is None:
user_msg.append("Unable to move submission-collection downwards")
else:
## normalize scores of all collection-children of the father submission-collection:
normalize_scores_of_collection_children_of_collection(sbmcolidfather)
## swap this collection with the one above it:
## get the score of the collection to move:
score_col_to_move = get_score_of_collection_child_of_submission_collection(sbmcolidfather, sbmcolid)
## get the score of the collection brother directly below the collection to be moved:
score_brother_below = get_score_of_next_collection_child_below(sbmcolidfather, score_col_to_move)
if None in (score_col_to_move, score_brother_below):
## Invalid movement
user_msg.append("Unable to move submission collection downwards")
else:
## update the brother below the collection to be moved to have a score the same as the collection to be moved:
update_score_of_collection_child_of_submission_collection_at_scorex(sbmcolidfather,
score_brother_below,
score_col_to_move)
## Update the collection to be moved to have a score of the brother directly below it:
update_score_of_collection_child_of_submission_collection_with_colid_and_scorex(sbmcolidfather,
sbmcolid,
score_col_to_move,
score_brother_below)
user_msg.append("Submission-collection moved downwards")
else:
## cannot move the master (0) collection
user_msg.append("Unable to move submission-collection downwards")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (movesbmcollectionup, sbmcolid):
## move a submission-collection up in order:
## Sanity checking:
try:
int(sbmcolid)
except ValueError:
sbmcolid = 0
if int(sbmcolid) != 0:
## Get father ID of submission-collection:
sbmcolidfather = get_id_father_of_collection(sbmcolid)
if sbmcolidfather is None:
user_msg.append("Unable to move submission-collection upwards")
else:
## normalize scores of all collection-children of the father submission-collection:
normalize_scores_of_collection_children_of_collection(sbmcolidfather)
## swap this collection with the one above it:
## get the score of the collection to move:
score_col_to_move = get_score_of_collection_child_of_submission_collection(sbmcolidfather, sbmcolid)
## get the score of the collection brother directly above the collection to be moved:
score_brother_above = get_score_of_previous_collection_child_above(sbmcolidfather, score_col_to_move)
if None in (score_col_to_move, score_brother_above):
## Invalid movement
user_msg.append("Unable to move submission collection upwards")
else:
## update the brother above the collection to be moved to have a score the same as the collection to be moved:
update_score_of_collection_child_of_submission_collection_at_scorex(sbmcolidfather,
score_brother_above,
score_col_to_move)
## Update the collection to be moved to have a score of the brother directly above it:
update_score_of_collection_child_of_submission_collection_with_colid_and_scorex(sbmcolidfather,
sbmcolid,
score_col_to_move,
score_brother_above)
user_msg.append("Submission-collection moved upwards")
else:
## cannot move the master (0) collection
user_msg.append("Unable to move submission-collection upwards")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (addsbmcollection, addtosbmcollection):
## Add a submission-collection, attached to a submission-collection:
## check that the collection to attach to exists:
parent_ok = 0
if int(addtosbmcollection) != 0:
parent_name = get_collection_name(addtosbmcollection)
if parent_name is not None:
parent_ok = 1
else:
parent_ok = 1
if parent_ok != 0:
## create the new collection:
id_son = insert_submission_collection(addsbmcollection)
## get the maximum catalogue score of the existing collection children:
max_child_score = \
get_maximum_catalogue_score_of_collection_children_of_submission_collection(addtosbmcollection)
## add it to the collection, at a higher score than the others have:
new_score = max_child_score + 1
insert_collection_child_for_submission_collection(addtosbmcollection, id_son, new_score)
user_msg.append("Submission-collection added to submissions tree")
else:
## Parent submission-collection does not exist:
user_msg.append("Unable to add submission-collection - parent unknown")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
elif "" not in (adddoctypes, addtosbmcollection):
## Add document type(s) to a submission-collection:
if isinstance(adddoctypes, str):
adddoctypes = [adddoctypes,]
## Does submission-collection exist?
num_collections_sbmcolid = get_number_of_rows_for_submission_collection(addtosbmcollection)
if num_collections_sbmcolid > 0:
for doctypeid in adddoctypes:
## Check that Doctype exists:
num_doctypes_doctypeid = get_number_doctypes_docid(doctypeid)
if num_doctypes_doctypeid < 1:
## Cannot connect an unknown doctype:
user_msg.append("Unable to connect unknown document-type [%s] to a submission-collection" \
% doctypeid)
continue
else:
## insert the submission-collection/doctype link:
## get the maximum catalogue score of the existing doctype children:
max_child_score = \
get_maximum_catalogue_score_of_doctype_children_of_submission_collection(addtosbmcollection)
## add it to the new doctype, at a higher score than the others have:
new_score = max_child_score + 1
insert_doctype_child_for_submission_collection(addtosbmcollection, doctypeid, new_score)
user_msg.append("Document-type added to submissions tree")
else:
## submission-collection didn't exist
user_msg.append("The selected submission-collection doesn't seem to exist")
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
else:
## default action - display submission-collections:
(title, body) = _organise_submission_page_display_submission_tree(user_msg=user_msg)
return (title, body)
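The move-up/move-down branches above all follow the same technique: normalize the catalogue scores of the siblings to a contiguous 1..n sequence, then swap the score of the item with that of its immediate neighbour. A hypothetical in-memory sketch of that reordering (the real code performs the same two score updates as database writes):

```python
def normalize_scores(children):
    """children: list of (item_id, score) pairs.
    Rewrite scores to 1..n, preserving the existing score order."""
    ordered = sorted(children, key=lambda pair: pair[1])
    return [(item_id, pos + 1) for pos, (item_id, _) in enumerate(ordered)]

def move_down(children, item_id):
    """Swap item_id's score with the sibling directly below it.
    Returns (new_children, moved); moved is False when the item is
    already last - the 'invalid movement' case handled above."""
    children = normalize_scores(children)
    ids = [item for item, _ in children]
    idx = ids.index(item_id)
    if idx == len(children) - 1:
        return children, False
    (a_id, a_score), (b_id, b_score) = children[idx], children[idx + 1]
    children[idx] = (b_id, a_score)      ## brother below takes our score
    children[idx + 1] = (a_id, b_score)  ## we take the brother's score
    return children, True
```

Moving "up" is symmetric, swapping with the sibling directly above instead.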
## Functions for adding a new catalogue to the DB:
def _add_new_action(actid,actname,working_dir,status_text):
"""Insert the details of a new action into the websubmit system database.
@param actid: unique action id (sactname)
@param actname: action name (lactname)
@param working_dir: directory action works from (dir)
@param status_text: text string indicating action status (statustext)
"""
(actid,actname,working_dir,status_text) = (str(actid).upper(),str(actname),str(working_dir),str(status_text))
err_code = insert_action_details(actid,actname,working_dir,status_text)
return err_code
def perform_request_add_function(funcname=None, funcdescr=None, funcaddcommit=""):
user_msg = []
body = ""
title = "Create New WebSubmit Function"
commit_error=0
## wash args:
if funcname is not None:
try:
funcname = wash_single_urlarg(urlarg=funcname, argreqdtype=str, argdefault="", maxstrlen=40, minstrlen=1)
if function_name_is_valid(fname=funcname) == 0:
funcname = ""
except ValueError, e:
funcname = ""
else:
funcname = ""
if funcdescr is not None:
try:
funcdescr = wash_single_urlarg(urlarg=funcdescr, argreqdtype=str, argdefault="")
except ValueError, e:
funcdescr = ""
else:
funcdescr = ""
## process request:
if funcaddcommit != "" and funcaddcommit is not None:
if funcname == "":
funcname = ""
user_msg.append("""Function name is mandatory and must be a string with no more than 40 characters""")
user_msg.append("""It must contain only alpha-numeric and underscore characters, beginning with a """\
"""letter or underscore""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
body = websubmitadmin_templates.tmpl_display_addfunctionform(funcdescr=funcdescr, user_msg=user_msg)
return (title, body)
## Add a new function definition - IF it is not already present
err_code = insert_function_details(funcname, funcdescr)
## Handle error code - redisplay form with warning about no DB commit, or display with options
## to edit function:
if err_code == 0:
user_msg.append("""'%s' Function Added to WebSubmit""" % (funcname,))
all_function_parameters = get_distinct_paramname_all_websubmit_function_parameters()
body = websubmitadmin_templates.tmpl_display_addfunctionform(funcname=funcname,
funcdescr=funcdescr,
all_websubmit_func_parameters=all_function_parameters,
perform_act="functionedit",
user_msg=user_msg)
else:
## Could not commit function to WebSubmit DB - redisplay form with function description:
user_msg.append("""Could Not Add '%s' Function to WebSubmit""" % (funcname,))
body = websubmitadmin_templates.tmpl_display_addfunctionform(funcdescr=funcdescr, user_msg=user_msg)
else:
## Display Web form for new function addition:
body = websubmitadmin_templates.tmpl_display_addfunctionform()
return (title, body)
def perform_request_add_action(actid=None, actname=None, working_dir=None, status_text=None, actcommit=""):
"""An interface for the addition of a new WebSubmit action.
If form fields filled, will insert new action into WebSubmit database, else will display
web form prompting for action details.
@param actid: unique id for new action
@param actname: name of new action
@param working_dir: action working directory for WebSubmit core
@param status_text: status text displayed at end of action
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Create New WebSubmit Action"
commit_error=0
## wash args:
if actid is not None:
try:
actid = wash_single_urlarg(urlarg=actid, argreqdtype=str, argdefault="", maxstrlen=3, minstrlen=3)
if string_is_alphanumeric_including_underscore(txtstring=actid) == 0:
actid = ""
except ValueError, e:
actid = ""
else:
actid = ""
if actname is not None:
try:
actname = wash_single_urlarg(urlarg=actname, argreqdtype=str, argdefault="")
except ValueError, e:
actname = ""
else:
actname = ""
if working_dir is not None:
try:
working_dir = wash_single_urlarg(urlarg=working_dir, argreqdtype=str, argdefault="")
except ValueError, e:
working_dir = ""
else:
working_dir = ""
if status_text is not None:
try:
status_text = wash_single_urlarg(urlarg=status_text, argreqdtype=str, argdefault="")
except ValueError, e:
status_text = ""
else:
status_text = ""
## process request:
if actcommit != "" and actcommit is not None:
if actid in ("", None):
actid = ""
user_msg.append("""Action ID is mandatory and must be a 3 letter string""")
commit_error = 1
if actname in ("", None):
actname = ""
user_msg.append("""Action description is mandatory""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
body = websubmitadmin_templates.tmpl_display_addactionform(actid=actid, actname=actname, working_dir=working_dir,\
status_text=status_text, user_msg=user_msg)
return (title, body)
## Commit new action to WebSubmit DB:
err_code = _add_new_action(actid,actname,working_dir,status_text)
## Handle error code - redisplay form with warning about no DB commit, or move to list
## of actions
if err_code == 0:
## Action added: show page listing WebSubmit actions
user_msg = """'%s' Action Added to WebSubmit""" % (actid,)
all_actions = get_actid_actname_allactions()
body = websubmitadmin_templates.tmpl_display_allactions(all_actions,user_msg=user_msg)
title = "Available WebSubmit Actions"
else:
## Could not commit action to WebSubmit DB: redisplay form with completed details and error message
## warnings.append(('ERR_WEBSUBMIT_ADMIN_ADDACTIONFAILDUPLICATE', actid)) ## TODO
user_msg.append("""Could Not Add '%s' Action to WebSubmit""" % (actid,))
body = websubmitadmin_templates.tmpl_display_addactionform(actid=actid, actname=actname, working_dir=working_dir, \
status_text=status_text, user_msg=user_msg)
else:
## Display Web form for new action details:
body = websubmitadmin_templates.tmpl_display_addactionform()
return (title, body)
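Every handler in this section opens with the same "wash args" preamble: each URL argument is coerced to its required type and (for strings) length-checked, and anything missing or invalid collapses to the empty string so the mandatory-field checks below can catch it. A simplified stand-in for that contract; `wash_single_urlarg` itself takes more parameters, and the helper names here are hypothetical:

```python
def wash_arg(value, reqd_type, maxstrlen=None, minstrlen=None):
    """Coerce value to reqd_type; raise ValueError when coercion fails
    or, for strings, when the length violates the given bounds."""
    washed = reqd_type(value)
    if reqd_type is str:
        if minstrlen is not None and len(washed) < minstrlen:
            raise ValueError("argument too short")
        if maxstrlen is not None and len(washed) > maxstrlen:
            raise ValueError("argument too long")
    return washed

def wash_or_blank(value, reqd_type, **bounds):
    """The pattern used by the handlers above: None or an
    unwashable value becomes the empty string."""
    if value is None:
        return ""
    try:
        return wash_arg(value, reqd_type, **bounds)
    except ValueError:
        return ""
```

So a three-letter action id would be washed with `minstrlen=3, maxstrlen=3`, and a non-numeric value for an integer field such as `elsize` simply becomes `""`.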
def perform_request_add_jscheck(chname=None, chdesc=None, chcommit=""):
"""An interface for the addition of a new WebSubmit JavaScript Check, as used on form elements.
If form fields filled, will insert new Check into WebSubmit database, else will display
Web form prompting for Check details.
@param chname: unique id/name for new Check
@param chdesc: description (JavaScript code body) of new Check
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Create New WebSubmit Checking Function"
commit_error=0
## wash args:
if chname is not None:
try:
chname = wash_single_urlarg(urlarg=chname, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if function_name_is_valid(fname=chname) == 0:
chname = ""
except ValueError, e:
chname = ""
else:
chname = ""
if chdesc is not None:
try:
chdesc = wash_single_urlarg(urlarg=chdesc, argreqdtype=str, argdefault="")
except ValueError, e:
chdesc = ""
else:
chdesc = ""
## process request:
if chcommit != "" and chcommit is not None:
if chname in ("", None):
chname = ""
user_msg.append("""Check name is mandatory and must be a string with no more than 15 characters""")
user_msg.append("""It must contain only alpha-numeric and underscore characters, beginning with a """\
"""letter or underscore""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
body = websubmitadmin_templates.tmpl_display_addjscheckform(chname=chname, chdesc=chdesc, user_msg=user_msg)
return (title, body)
## Commit new check to WebSubmit DB:
err_code = insert_jscheck_details(chname, chdesc)
## Handle error code - redisplay form with warning about no DB commit, or move to list
## of checks
if err_code == 0:
## Check added: show page listing WebSubmit JS Checks
user_msg.append("""'%s' Checking Function Added to WebSubmit""" % (chname,))
all_jschecks = get_chname_alljschecks()
body = websubmitadmin_templates.tmpl_display_alljschecks(all_jschecks, user_msg=user_msg)
title = "Available WebSubmit Checking Functions"
else:
## Could not commit Check to WebSubmit DB: redisplay form with completed details and error message
## TODO : Warning Message
user_msg.append("""Could Not Add '%s' Checking Function to WebSubmit""" % (chname,))
body = websubmitadmin_templates.tmpl_display_addjscheckform(chname=chname, chdesc=chdesc, user_msg=user_msg)
else:
## Display Web form for new check details:
body = websubmitadmin_templates.tmpl_display_addjscheckform()
return (title, body)
def perform_request_add_element(elname=None, elmarccode=None, eltype=None, elsize=None, elrows=None, \
elcols=None, elmaxlength=None, elval=None, elfidesc=None, \
elmodifytext=None, elcommit=""):
"""An interface for adding a new ELEMENT to the WebSubmit DB.
@param elname: (string) element name.
@param elmarccode: (string) element's MARC code.
@param eltype: (character) element type.
@param elsize: (integer) element size.
@param elrows: (integer) number of rows in element.
@param elcols: (integer) number of columns in element.
@param elmaxlength: (integer) maximum length of element
@param elval: (string) default value of element
@param elfidesc: (string) description of element
@param elmodifytext: (string) modification text of element
@param elcommit: (string) If this value is not empty, attempt to commit element details to WebSubmit DB
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Create New WebSubmit Element"
commit_error=0
## wash args:
if elname is not None:
try:
elname = wash_single_urlarg(urlarg=elname, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=elname) == 0:
elname = ""
except ValueError, e:
elname = ""
else:
elname = ""
if elmarccode is not None:
try:
elmarccode = wash_single_urlarg(urlarg=elmarccode, argreqdtype=str, argdefault="")
except ValueError, e:
elmarccode = ""
else:
elmarccode = ""
if eltype is not None:
try:
eltype = wash_single_urlarg(urlarg=eltype, argreqdtype=str, argdefault="", maxstrlen=1, minstrlen=1)
except ValueError, e:
eltype = ""
else:
eltype = ""
if elsize is not None:
try:
elsize = wash_single_urlarg(urlarg=elsize, argreqdtype=int, argdefault="")
except ValueError, e:
elsize = ""
else:
elsize = ""
if elrows is not None:
try:
elrows = wash_single_urlarg(urlarg=elrows, argreqdtype=int, argdefault="")
except ValueError, e:
elrows = ""
else:
elrows = ""
if elcols is not None:
try:
elcols = wash_single_urlarg(urlarg=elcols, argreqdtype=int, argdefault="")
except ValueError, e:
elcols = ""
else:
elcols = ""
if elmaxlength is not None:
try:
elmaxlength = wash_single_urlarg(urlarg=elmaxlength, argreqdtype=int, argdefault="")
except ValueError, e:
elmaxlength = ""
else:
elmaxlength = ""
if elval is not None:
try:
elval = wash_single_urlarg(urlarg=elval, argreqdtype=str, argdefault="")
except ValueError, e:
elval = ""
else:
elval = ""
if elfidesc is not None:
try:
elfidesc = wash_single_urlarg(urlarg=elfidesc, argreqdtype=str, argdefault="")
except ValueError, e:
elfidesc = ""
else:
elfidesc = ""
if elmodifytext is not None:
try:
elmodifytext = wash_single_urlarg(urlarg=elmodifytext, argreqdtype=str, argdefault="")
except ValueError, e:
elmodifytext = ""
else:
elmodifytext = ""
## process request:
if elcommit != "" and elcommit is not None:
if elname == "":
elname = ""
user_msg.append("""The element name is mandatory and must be a string with no more than 15 characters""")
user_msg.append("""It must contain only alpha-numeric and underscore characters""")
commit_error = 1
if eltype == "" or eltype not in ("D", "F", "H", "I", "R", "S", "T"):
eltype = ""
user_msg.append("""The element type is mandatory and must be selected from the list""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
body = websubmitadmin_templates.tmpl_display_addelementform(elname=elname,
elmarccode=elmarccode,
eltype=eltype,
elsize=str(elsize),
elrows=str(elrows),
elcols=str(elcols),
elmaxlength=str(elmaxlength),
elval=elval,
elfidesc=elfidesc,
elmodifytext=elmodifytext,
user_msg=user_msg,
)
return (title, body)
## Commit new element description to WebSubmit DB:
err_code = insert_element_details(elname=elname, elmarccode=elmarccode, eltype=eltype, \
elsize=elsize, elrows=elrows, elcols=elcols, \
elmaxlength=elmaxlength, elval=elval, elfidesc=elfidesc, \
elmodifytext=elmodifytext)
if err_code == 0:
## Element added: show page listing WebSubmit elements
user_msg.append("""'%s' Element Added to WebSubmit""" % (elname,))
if elname in CFG_RESERVED_SUBMISSION_FILENAMES:
user_msg.append("""WARNING: '%s' is a reserved name. Check WebSubmit admin guide to be aware of possible side-effects.""" % elname)
title = "Available WebSubmit Elements"
all_elements = get_elename_allelements()
body = websubmitadmin_templates.tmpl_display_allelements(all_elements, user_msg=user_msg)
else:
## Could not commit element to WebSubmit DB: redisplay form with completed details and error message
## TODO : Warning Message
user_msg.append("""Could Not Add '%s' Element to WebSubmit""" % (elname,))
body = websubmitadmin_templates.tmpl_display_addelementform(elname=elname,
elmarccode=elmarccode,
eltype=eltype,
elsize=str(elsize),
elrows=str(elrows),
elcols=str(elcols),
elmaxlength=str(elmaxlength),
elval=elval,
elfidesc=elfidesc,
elmodifytext=elmodifytext,
user_msg=user_msg,
)
else:
## Display Web form for new element details:
body = websubmitadmin_templates.tmpl_display_addelementform()
return (title, body)
def perform_request_edit_element(elname, elmarccode=None, eltype=None, elsize=None, \
elrows=None, elcols=None, elmaxlength=None, elval=None, \
elfidesc=None, elmodifytext=None, elcommit=""):
"""An interface for the editing and updating the details of a WebSubmit ELEMENT.
@param elname: element name.
@param elmarccode: element's MARC code.
@param eltype: element type.
@param elsize: element size.
@param elrows: number of rows in element.
@param elcols: number of columns in element.
@param elmaxlength: maximum length of element
@param elval: default value of element
@param elfidesc: description of element
@param elmodifytext: modification text of element
@param elcommit: If this value is not empty, attempt to commit element details to WebSubmit DB
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Edit WebSubmit Element"
commit_error=0
## wash args:
if elname is not None:
try:
elname = wash_single_urlarg(urlarg=elname, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=elname) == 0:
elname = ""
except ValueError, e:
elname = ""
else:
elname = ""
if elmarccode is not None:
try:
elmarccode = wash_single_urlarg(urlarg=elmarccode, argreqdtype=str, argdefault="")
except ValueError, e:
elmarccode = ""
else:
elmarccode = ""
if eltype is not None:
try:
eltype = wash_single_urlarg(urlarg=eltype, argreqdtype=str, argdefault="", maxstrlen=1, minstrlen=1)
except ValueError, e:
eltype = ""
else:
eltype = ""
if elsize is not None:
try:
elsize = wash_single_urlarg(urlarg=elsize, argreqdtype=int, argdefault="")
except ValueError, e:
elsize = ""
else:
elsize = ""
if elrows is not None:
try:
elrows = wash_single_urlarg(urlarg=elrows, argreqdtype=int, argdefault="")
except ValueError, e:
elrows = ""
else:
elrows = ""
if elcols is not None:
try:
elcols = wash_single_urlarg(urlarg=elcols, argreqdtype=int, argdefault="")
except ValueError, e:
elcols = ""
else:
elcols = ""
if elmaxlength is not None:
try:
elmaxlength = wash_single_urlarg(urlarg=elmaxlength, argreqdtype=int, argdefault="")
except ValueError, e:
elmaxlength = ""
else:
elmaxlength = ""
if elval is not None:
try:
elval = wash_single_urlarg(urlarg=elval, argreqdtype=str, argdefault="")
except ValueError, e:
elval = ""
else:
elval = ""
if elfidesc is not None:
try:
elfidesc = wash_single_urlarg(urlarg=elfidesc, argreqdtype=str, argdefault="")
except ValueError, e:
elfidesc = ""
else:
elfidesc = ""
if elmodifytext is not None:
try:
elmodifytext = wash_single_urlarg(urlarg=elmodifytext, argreqdtype=str, argdefault="")
except ValueError, e:
elmodifytext = ""
else:
elmodifytext = ""
## process request:
if elcommit != "" and elcommit is not None:
if elname == "":
elname = ""
user_msg.append("""Invalid Element Name!""")
commit_error = 1
if eltype == "" or eltype not in ("D", "F", "H", "I", "R", "S", "T"):
eltype = ""
user_msg.append("""Invalid Element Type!""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
all_elements = get_elename_allelements()
user_msg.append("""Could Not Update Element""")
title = "Available WebSubmit Elements"
body = websubmitadmin_templates.tmpl_display_allelements(all_elements, user_msg=user_msg)
return (title, body)
## Commit updated element description to WebSubmit DB:
err_code = update_element_details(elname=elname, elmarccode=elmarccode, eltype=eltype, \
elsize=elsize, elrows=elrows, elcols=elcols, \
elmaxlength=elmaxlength, elval=elval, elfidesc=elfidesc, \
elmodifytext=elmodifytext)
if err_code == 0:
## Element Updated: Show All Element Details Again
user_msg.append("""'%s' Element Updated""" % (elname,))
## Get submission page usage of element:
el_use = get_doctype_action_pagenb_for_submissions_using_element(elname)
element_dets = get_element_details(elname)
element_dets = stringify_listvars(element_dets)
## Take elements from results tuple:
(elmarccode, eltype, elsize, elrows, elcols, elmaxlength, \
elval, elfidesc, elcd, elmd, elmodifytext) = \
(element_dets[0][0], element_dets[0][1], element_dets[0][2], element_dets[0][3], \
element_dets[0][4], element_dets[0][5], element_dets[0][6], element_dets[0][7], \
element_dets[0][8], element_dets[0][9], element_dets[0][10])
## Pass to template:
body = websubmitadmin_templates.tmpl_display_addelementform(elname=elname,
elmarccode=elmarccode,
eltype=eltype,
elsize=elsize,
elrows=elrows,
elcols=elcols,
elmaxlength=elmaxlength,
elval=elval,
elfidesc=elfidesc,
elcd=elcd,
elmd=elmd,
elmodifytext=elmodifytext,
perform_act="elementedit",
user_msg=user_msg,
el_use_tuple=el_use
)
else:
## Could Not Update Element: Maybe Key Violation, or Invalid elname? Redisplay all elements.
## TODO : LOGGING
all_elements = get_elename_allelements()
user_msg.append("""Could Not Update Element '%s'""" % (elname,))
title = "Available WebSubmit Elements"
body = websubmitadmin_templates.tmpl_display_allelements(all_elements, user_msg=user_msg)
else:
## Display Web form containing existing details of element:
element_dets = get_element_details(elname)
## Get submission page usage of element:
el_use = get_doctype_action_pagenb_for_submissions_using_element(elname)
num_rows_ret = len(element_dets)
element_dets = stringify_listvars(element_dets)
if num_rows_ret == 1:
## Display Element details
## Take elements from results tuple:
(elmarccode, eltype, elsize, elrows, elcols, elmaxlength, \
elval, elfidesc, elcd, elmd, elmodifytext) = \
(element_dets[0][0], element_dets[0][1], element_dets[0][2], element_dets[0][3], \
element_dets[0][4], element_dets[0][5], element_dets[0][6], element_dets[0][7], \
element_dets[0][8], element_dets[0][9], element_dets[0][10])
## Pass to template:
body = websubmitadmin_templates.tmpl_display_addelementform(elname=elname,
elmarccode=elmarccode,
eltype=eltype,
elsize=elsize,
elrows=elrows,
elcols=elcols,
elmaxlength=elmaxlength,
elval=elval,
elfidesc=elfidesc,
elcd=elcd,
elmd=elmd,
elmodifytext=elmodifytext,
perform_act="elementedit",
el_use_tuple=el_use
)
else:
## Either no rows, or more than one row for ELEMENT: log error, and display all Elements
## TODO : LOGGING
title = "Available WebSubmit Elements"
all_elements = get_elename_allelements()
if num_rows_ret > 1:
## Key Error - duplicated elname
user_msg.append("""Found Several Rows for Element with Name '%s' - Inform Administrator""" % (elname,))
## LOG MESSAGE
else:
## No rows for ELEMENT
user_msg.append("""Could Not Find Any Rows for Element with Name '%s'""" % (elname,))
## LOG MESSAGE
body = websubmitadmin_templates.tmpl_display_allelements(all_elements, user_msg=user_msg)
return (title, body)
def _display_edit_check_form(chname, user_msg=""):
title = "Edit WebSubmit Checking Function"
if user_msg == "":
user_msg = []
jscheck_dets = get_jscheck_details(chname)
num_rows_ret = len(jscheck_dets)
if num_rows_ret == 1:
## Display Check details
body = websubmitadmin_templates.tmpl_display_addjscheckform(chname=jscheck_dets[0][0],
chdesc=jscheck_dets[0][1],
perform_act="jscheckedit",
cd=jscheck_dets[0][2],
md=jscheck_dets[0][3],
user_msg=user_msg)
else:
## Either no rows, or more than one row for Check: log error, and display all Checks
## TODO : LOGGING
title = "Available WebSubmit Checking Functions"
all_jschecks = get_chname_alljschecks()
if num_rows_ret > 1:
## Key Error - duplicated chname
user_msg.append("""Found Several Rows for Checking Function with Name '%s' - Inform Administrator""" % (chname,))
## LOG MESSAGE
else:
## No rows for action
user_msg.append("""Could Not Find Any Rows for Checking Function with Name '%s'""" % (chname,))
## LOG MESSAGE
body = websubmitadmin_templates.tmpl_display_alljschecks(all_jschecks, user_msg=user_msg)
return (title, body)
def perform_request_edit_jscheck(chname, chdesc=None, chcommit=""):
"""Interface for editing and updating the details of a WebSubmit Check.
If only "chname" provided, will display the details of a Check in a Web form.
If "chdesc" not empty, will assume that this is a call to commit update to Check details.
@param chname: unique id for Check
@param chdesc: modified value for WebSubmit Check description (code body) - (presence invokes update)
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Edit WebSubmit Checking Function"
commit_error=0
## wash args:
if chname is not None:
try:
chname = wash_single_urlarg(urlarg=chname, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if function_name_is_valid(fname=chname) == 0:
chname = ""
except ValueError, e:
chname = ""
else:
chname = ""
if chdesc is not None:
try:
chdesc = wash_single_urlarg(urlarg=chdesc, argreqdtype=str, argdefault="")
except ValueError, e:
chdesc = ""
else:
chdesc = ""
(chname, chdesc) = (str(chname), str(chdesc))
if chcommit != "" and chcommit is not None:
if chname in ("", None):
chname = ""
user_msg.append("""Check name is mandatory and must be a string with no more than 15 characters""")
user_msg.append("""It must contain only alpha-numeric and underscore characters, beginning with a """\
"""letter or underscore""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
all_jschecks = get_chname_alljschecks()
user_msg.append("""Could Not Update Checking Function""")
body = websubmitadmin_templates.tmpl_display_alljschecks(all_jschecks, user_msg=user_msg)
title = "Available WebSubmit Checking Functions"
return (title, body)
## Commit updated Check details to WebSubmit DB:
err_code = update_jscheck_details(chname, chdesc)
if err_code == 0:
## Check Updated: Show All Check Details Again
user_msg.append("""'%s' Check Updated""" % (chname,))
jscheck_dets = get_jscheck_details(chname)
body = websubmitadmin_templates.tmpl_display_addjscheckform(chname=jscheck_dets[0][0],
chdesc=jscheck_dets[0][1],
perform_act="jscheckedit",
cd=jscheck_dets[0][2],
md=jscheck_dets[0][3],
user_msg=user_msg
)
else:
## Could Not Update Check: Maybe Key Violation, or Invalid chname? Redisplay all Checks.
## TODO : LOGGING
all_jschecks = get_chname_alljschecks()
user_msg.append("""Could Not Update Checking Function '%s'""" % (chname,))
body = websubmitadmin_templates.tmpl_display_alljschecks(all_jschecks, user_msg=user_msg)
title = "Available WebSubmit Checking Functions"
else:
## Display Web form containing existing details of Check:
(title, body) = _display_edit_check_form(chname=chname)
return (title, body)
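The argument-washing pattern used throughout these handlers (coerce to the required type, fall back to a default, enforce string-length bounds) can be sketched in isolation. The following is a hypothetical minimal stand-in for the real `wash_single_urlarg` helper, which is defined elsewhere in Invenio; only the calling convention seen above is assumed:

```python
def wash_single_urlarg(urlarg, argreqdtype, argdefault,
                       maxstrlen=None, minstrlen=None):
    """Illustrative sketch only - NOT the real Invenio helper.
    Coerce the raw URL argument to the required type, fall back to
    the default when coercion fails, and raise ValueError when a
    washed string violates its length bounds."""
    try:
        value = argreqdtype(urlarg)
    except (TypeError, ValueError):
        return argdefault
    if argreqdtype is str:
        if maxstrlen is not None and len(value) > maxstrlen:
            raise ValueError("argument too long")
        if minstrlen is not None and len(value) < minstrlen:
            raise ValueError("argument too short")
    return value
```

Raising `ValueError` on a length violation is what lets the callers above fall back to an empty value (e.g. `chname = ""`) in their `except` branches.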
def _display_edit_action_form(actid, user_msg=""):
title = "Edit WebSubmit Action"
if user_msg == "":
user_msg = []
action_dets = get_action_details(actid)
num_rows_ret = len(action_dets)
if num_rows_ret == 1:
## Display action details
body = websubmitadmin_templates.tmpl_display_addactionform(actid=action_dets[0][0],
actname=action_dets[0][1],
working_dir=action_dets[0][2],
status_text=action_dets[0][3],
perform_act="actionedit",
cd=action_dets[0][4],
md=action_dets[0][5],
user_msg=user_msg)
else:
## Either no rows, or more than one row for action: log error, and display all actions
## TODO : LOGGING
title = "Available WebSubmit Actions"
all_actions = get_actid_actname_allactions()
if num_rows_ret > 1:
## Key Error - duplicated actid
user_msg.append("""Found Several Rows for Action with ID '%s' - Inform Administrator""" % (actid,))
## LOG MESSAGE
else:
## No rows for action
user_msg.append("""Could Not Find Any Rows for Action with ID '%s'""" % (actid,))
## LOG MESSAGE
body = websubmitadmin_templates.tmpl_display_allactions(all_actions, user_msg=user_msg)
return (title, body)
def perform_request_edit_action(actid, actname=None, working_dir=None, status_text=None, actcommit=""):
"""Interface for editing and updating the details of a WebSubmit action.
If only "actid" provided, will display the details of an action in a Web form.
If "actname" not empty, will assume that this is a call to commit update to action details.
@param actid: unique id for action
@param actname: modified value for WebSubmit action name/description (presence invokes update)
@param working_dir: modified value for WebSubmit action working_dir
@param status_text: modified value for WebSubmit action status text
@return: tuple containing "title" (title of page), body (page body).
"""
user_msg = []
body = ""
title = "Edit WebSubmit Action"
commit_error = 0
## wash args:
if actid is not None:
try:
actid = wash_single_urlarg(urlarg=actid, argreqdtype=str, argdefault="", maxstrlen=3, minstrlen=3)
if string_is_alphanumeric_including_underscore(txtstring=actid) == 0:
actid = ""
except ValueError, e:
actid = ""
actid = actid.upper()
else:
actid = ""
if actname is not None:
try:
actname = wash_single_urlarg(urlarg=actname, argreqdtype=str, argdefault="")
except ValueError, e:
actname = ""
else:
actname = ""
if working_dir is not None:
try:
working_dir = wash_single_urlarg(urlarg=working_dir, argreqdtype=str, argdefault="")
except ValueError, e:
working_dir = ""
else:
working_dir = ""
if status_text is not None:
try:
status_text = wash_single_urlarg(urlarg=status_text, argreqdtype=str, argdefault="")
except ValueError, e:
status_text = ""
else:
status_text = ""
## process request:
if actcommit != "" and actcommit is not None:
if actname in ("", None):
actname = ""
user_msg.append("""Action description is mandatory""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
(title, body) = _display_edit_action_form(actid=actid, user_msg=user_msg)
return (title, body)
## Commit updated action details to WebSubmit DB:
err_code = update_action_details(actid, actname, working_dir, status_text)
if err_code == 0:
## Action Updated: Show Action Details Again
user_msg.append("""'%s' Action Updated""" % (actid,))
action_dets = get_action_details(actid)
body = websubmitadmin_templates.tmpl_display_addactionform(actid=action_dets[0][0],
actname=action_dets[0][1],
working_dir=action_dets[0][2],
status_text=action_dets[0][3],
perform_act="actionedit",
cd=action_dets[0][4],
md=action_dets[0][5],
user_msg=user_msg
)
else:
## Could Not Update Action: Maybe Key Violation, or Invalid actid? Redisplay all actions.
## TODO : LOGGING
all_actions = get_actid_actname_allactions()
user_msg.append("""Could Not Update Action '%s'""" % (actid,))
body = websubmitadmin_templates.tmpl_display_allactions(all_actions, user_msg=user_msg)
title = "Available WebSubmit Actions"
else:
## Display Web form containing existing details of action:
(title, body) = _display_edit_action_form(actid=actid)
return (title, body)
def _functionedit_display_function_details(funcname, user_msg=""):
"""Display the details of a function, along with any message to the user that may have been provided.
@param funcname: unique name of function to be updated
@param user_msg: Any message to the user that is to be displayed on the page.
@return: tuple containing (page title, HTML page body).
"""
if user_msg == "":
user_msg = []
title = "Edit WebSubmit Function"
func_descr_res = get_function_description(function=funcname)
num_rows_ret = len(func_descr_res)
if num_rows_ret == 1:
## Display action details
funcdescr = func_descr_res[0][0]
if funcdescr is None:
funcdescr = ""
## get parameters for this function:
this_function_parameters = get_function_parameters(function=funcname)
## get all function parameters in WebSubmit:
all_function_parameters = get_distinct_paramname_all_websubmit_function_parameters()
## get the docstring of the function. Remove leading empty
## lines and remove unnecessary leading whitespaces
docstring = None
try:
websubmit_function = __import__('invenio.legacy.websubmit.functions.%s' % funcname,
globals(), locals(), [funcname])
if hasattr(websubmit_function, funcname) and getattr(websubmit_function, funcname).__doc__:
docstring = getattr(websubmit_function, funcname).__doc__
except Exception, e:
docstring = '''<span style="color:#f00;font-weight:700">Function documentation could
not be loaded</span>.<br/>Please check function definition. Error was:<br/>%s''' % str(e)
if docstring:
docstring = '<pre style="max-height:500px;overflow: auto;">' + _format_function_docstring(docstring) + '</pre>'
body = websubmitadmin_templates.tmpl_display_addfunctionform(funcname=funcname,
funcdescr=funcdescr,
func_parameters=this_function_parameters,
all_websubmit_func_parameters=all_function_parameters,
perform_act="functionedit",
user_msg=user_msg,
func_docstring = docstring
)
else:
## Either no rows, or more than one row for function: log error, and display all functions
## TODO : LOGGING
title = "Available WebSubmit Functions"
all_functions = get_funcname_funcdesc_allfunctions()
if num_rows_ret > 1:
## Key Error - duplicated function name
user_msg.append("""Found Several Rows for Function with Name '%s' - Inform Administrator""" % (funcname,))
## LOG MESSAGE
else:
## No rows for function
user_msg.append("""Could Not Find Any Rows for Function with Name '%s'""" % (funcname,))
## LOG MESSAGE
body = websubmitadmin_templates.tmpl_display_allfunctions(all_functions, user_msg=user_msg)
return (title, body)
def _format_function_docstring(docstring):
"""
Remove unnecessary leading and trailing empty lines, as well as
    meaningless leading and trailing whitespaces on every line
@param docstring: the input docstring to format
@type docstring: string
@return: a formatted docstring
@rtype: string
"""
def count_leading_whitespaces(line):
"Count enumber of leading whitespaces"
line_length = len(line)
pos = 0
while pos < line_length and line[pos] == " ":
pos += 1
return pos
new_docstring_list = []
min_nb_leading_whitespace = len(docstring) # this is really the max possible
# First count min number of leading whitespaces of all lines. Also
# remove leading empty lines.
docstring_has_started_p = False
for line in docstring.splitlines():
if docstring_has_started_p or line.strip():
            # A non-empty line has been found, or an empty line after
# the beginning of some text was found
docstring_has_started_p = True
new_docstring_list.append(line)
if line.strip():
# If line has some meaningful char, count leading whitespaces
line_nb_spaces = count_leading_whitespaces(line)
if line_nb_spaces < min_nb_leading_whitespace:
min_nb_leading_whitespace = line_nb_spaces
return '\n'.join([line[min_nb_leading_whitespace:] for line in new_docstring_list]).rstrip()
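The normalisation performed by `_format_function_docstring` (drop leading blank lines, strip the common leading indentation shared by all non-blank lines, trim the tail) can be exercised standalone; this sketch reimplements the same algorithm outside Invenio, for illustration only:

```python
def dedent_docstring(docstring):
    """Sketch of the behaviour of _format_function_docstring above:
    skip leading blank lines, compute the minimum leading-space count
    over non-blank lines, strip that prefix from every kept line."""
    kept = []
    started = False
    min_indent = len(docstring)  # upper bound, as in the original
    for line in docstring.splitlines():
        if started or line.strip():
            started = True
            kept.append(line)
            if line.strip():
                indent = len(line) - len(line.lstrip(" "))
                min_indent = min(min_indent, indent)
    return "\n".join(line[min_indent:] for line in kept).rstrip()

# Example: a conventionally indented docstring
dedent_docstring("\n    First line\n      indented more\n")
```

The standard library's `textwrap.dedent` is close, but it does not drop the leading blank lines, which is presumably why the module rolls its own.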
def _functionedit_update_description(funcname, funcdescr):
"""Perform an update of the description for a given function.
@param funcname: unique name of function to be updated
@param funcdescr: description to be updated for funcname
@return: a tuple containing (page title, HTML body content)
"""
user_msg = []
err_code = update_function_description(funcname, funcdescr)
if err_code == 0:
## Function updated - redisplay
user_msg.append("""'%s' Function Description Updated""" % (funcname,))
else:
## Could not update function description
## TODO : ERROR LIBS
user_msg.append("""Could Not Update Description for Function '%s'""" % (funcname,))
## Display function details
(title, body) = _functionedit_display_function_details(funcname=funcname, user_msg=user_msg)
return (title, body)
def _functionedit_delete_parameter(funcname, deleteparam):
"""Delete a parameter from a given function.
Important: if any document types have been using the function from which this parameter will be deleted,
and therefore have values for this parameter, these values will not be deleted from the WebSubmit DB.
The deleted parameter therefore may continue to exist in the WebSubmit DB, but will be disassociated
from this function.
@param funcname: unique name of the function from which the parameter is to be deleted.
@param deleteparam: the name of the parameter to be deleted from the function.
@return: tuple containing (title, HTML body content)
"""
user_msg = []
err_code = delete_function_parameter(function=funcname, parameter_name=deleteparam)
if err_code == 0:
## Parameter deleted - redisplay function details
user_msg.append("""'%s' Parameter Deleted from '%s' Function""" % (deleteparam, funcname))
else:
## could not delete param - it does not exist for this function
## TODO : ERROR LIBS
user_msg.append("""'%s' Parameter Does not Seem to Exist for Function '%s' - Could not Delete""" \
% (deleteparam, funcname))
## Display function details
(title, body) = _functionedit_display_function_details(funcname=funcname, user_msg=user_msg)
return (title, body)
def _functionedit_add_parameter(funcname, funceditaddparam="", funceditaddparamfree=""):
"""Add (connect) a parameter to a given WebSubmit function.
@param funcname: unique name of the function to which the parameter is to be added.
@param funceditaddparam: the value of a HTML select list: if present, will contain the name of the
parameter to be added to the function. May also be empty - the user may have used the free-text field
(funceditaddparamfree) to manually enter the name of a parameter. The important thing is that one
    must be present for the parameter to be added successfully.
@param funceditaddparamfree: The name of the parameter to be added to the function, as taken from a free-
text HTML input field. May also be empty - the user may have used the HTML select-list (funceditaddparam)
field to choose the parameter. The important thing is that one must be present for the parameter to be
added sucessfully. The value "funceditaddparamfree" value will take priority over the "funceditaddparam"
list value.
@return: tuple containing (title, HTML body content)
"""
user_msg = []
if funceditaddparam in ("", None, "NO_VALUE") and funceditaddparamfree in ("", None):
## no parameter chosen
## TODO : ERROR LIBS
user_msg.append("""Unable to Find the Parameter to be Added to Function '%s' - Could not Add""" % (funcname,))
else:
add_parameter = ""
if funceditaddparam not in ("", None) and funceditaddparamfree not in ("", None):
## both select box and free-text values provided for parameter - prefer free-text
add_parameter = funceditaddparamfree
elif funceditaddparam not in ("", None):
## take add select-box chosen parameter
add_parameter = funceditaddparam
else:
## take add free-text chosen parameter
add_parameter = funceditaddparamfree
## attempt to commit parameter:
err_code = add_function_parameter(function=funcname, parameter_name=add_parameter)
if err_code == 0:
## Parameter added - redisplay function details
user_msg.append("""'%s' Parameter Added to '%s' Function""" % (add_parameter, funcname))
else:
## could not add param - perhaps it already exists for this function
## TODO : ERROR LIBS
user_msg.append("""Could not Add '%s' Parameter to Function '%s' - It Already Exists for this Function""" \
% (add_parameter, funcname))
## Display function details
(title, body) = _functionedit_display_function_details(funcname=funcname, user_msg=user_msg)
return (title, body)
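The precedence rule documented above (the free-text field beats the select list, and `"NO_VALUE"` counts as no selection) can be distilled into a small helper. This is an illustrative sketch, not part of the module:

```python
def pick_parameter_name(select_value, free_value):
    """Return the parameter name to add, preferring the free-text
    field over the select list, mirroring the branching in
    _functionedit_add_parameter above; None means no usable value
    was supplied at all (sketch for illustration only)."""
    if free_value not in ("", None):
        # free-text always wins when present
        return free_value
    if select_value not in ("", None, "NO_VALUE"):
        # fall back to the select-list choice
        return select_value
    return None
```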
def perform_request_edit_function(funcname, funcdescr=None, funceditaddparam=None, funceditaddparamfree=None,
funceditdelparam=None, funcdescreditcommit="", funcparamdelcommit="",
funcparamaddcommit=""):
"""Edit a WebSubmit function. 3 possibilities: edit the function description; delete a parameter from the
function; add a new parameter to the function.
@param funcname: the name of the function to be modified
@param funcdescr: the new function description
@param funceditaddparam: the name of the parameter to be added to the function (taken from HTML SELECT-list)
@param funceditaddparamfree: the name of the parameter to be added to the function (taken from free-text input)
@param funceditdelparam: the name of the parameter to be deleted from the function
@param funcdescreditcommit: a flag to indicate that this request is to update the description of a function
@param funcparamdelcommit: a flag to indicate that this request is to delete a parameter from a function
@param funcparamaddcommit: a flag to indicate that this request is to add a new parameter to a function
@return: tuple containing (page title, HTML page body)
"""
body = ""
title = "Edit WebSubmit Function"
commit_error = 0
## wash args:
if funcname is not None:
try:
funcname = wash_single_urlarg(urlarg=funcname, argreqdtype=str, argdefault="")
if string_is_alphanumeric_including_underscore(txtstring=funcname) == 0:
funcname = ""
except ValueError, e:
funcname = ""
else:
funcname = ""
if funcdescr is not None:
try:
funcdescr = wash_single_urlarg(urlarg=funcdescr, argreqdtype=str, argdefault="")
except ValueError, e:
funcdescr = ""
else:
funcdescr = ""
if funceditaddparam is not None:
try:
funceditaddparam = wash_single_urlarg(urlarg=funceditaddparam, argreqdtype=str, argdefault="")
if string_is_alphanumeric_including_underscore(txtstring=funceditaddparam) == 0:
funceditaddparam = ""
except ValueError, e:
funceditaddparam = ""
else:
funceditaddparam = ""
if funceditaddparamfree is not None:
try:
funceditaddparamfree = wash_single_urlarg(urlarg=funceditaddparamfree, argreqdtype=str, argdefault="")
if string_is_alphanumeric_including_underscore(txtstring=funceditaddparamfree) == 0:
funceditaddparamfree = ""
except ValueError, e:
funceditaddparamfree = ""
else:
funceditaddparamfree = ""
if funceditdelparam is not None:
try:
funceditdelparam = wash_single_urlarg(urlarg=funceditdelparam, argreqdtype=str, argdefault="")
except ValueError, e:
funceditdelparam = ""
else:
funceditdelparam = ""
if funcname == "":
(title, body) = _functionedit_display_function_details(funcname=funcname)
return (title, body)
if funcdescreditcommit != "" and funcdescreditcommit is not None:
## Update the definition of a function:
(title, body) = _functionedit_update_description(funcname=funcname, funcdescr=funcdescr)
elif funcparamaddcommit != "" and funcparamaddcommit is not None:
## Request to add a new parameter to a function
(title, body) = _functionedit_add_parameter(funcname=funcname,
funceditaddparam=funceditaddparam, funceditaddparamfree=funceditaddparamfree)
elif funcparamdelcommit != "" and funcparamdelcommit is not None:
## Request to delete a parameter from a function
(title, body) = _functionedit_delete_parameter(funcname=funcname, deleteparam=funceditdelparam)
else:
## Display Web form for new function addition:
(title, body) = _functionedit_display_function_details(funcname=funcname)
return (title, body)
def perform_request_function_usage(funcname):
"""Display a page containing the usage details of a given function.
@param funcname: the function name
@return: page body
"""
func_usage = get_function_usage_details(function=funcname)
func_usage = stringify_listvars(func_usage)
body = websubmitadmin_templates.tmpl_display_function_usage(funcname, func_usage)
return body
def perform_request_list_actions():
"""Display a list of all WebSubmit actions.
@return: body where body is a string of HTML, which is a page body.
"""
body = ""
all_actions = get_actid_actname_allactions()
body = websubmitadmin_templates.tmpl_display_allactions(all_actions)
return body
def perform_request_list_doctypes():
"""Display a list of all WebSubmit document types.
@return: body where body is a string of HTML, which is a page body.
"""
body = ""
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(all_doctypes)
return body
def perform_request_list_jschecks():
"""Display a list of all WebSubmit JavaScript element checking functions.
@return: body, where body is a string of HTML, which is a page body.
"""
body = ""
all_jschecks = get_chname_alljschecks()
body = websubmitadmin_templates.tmpl_display_alljschecks(all_jschecks)
return body
def perform_request_list_functions():
"""Display a list of all WebSubmit FUNCTIONS.
@return: body where body is a string of HTML, which is a page body.
"""
body = ""
all_functions = get_funcname_funcdesc_allfunctions()
body = websubmitadmin_templates.tmpl_display_allfunctions(all_functions)
return body
def perform_request_list_elements():
"""Display a list of all WebSubmit ELEMENTS.
@return: body where body is a string of HTML, which is a page body.
"""
body = ""
all_elements = get_elename_allelements()
body = websubmitadmin_templates.tmpl_display_allelements(all_elements)
return body
def _remove_doctype(doctype):
"""Process removal of a document type.
@param doctype: the document type to be removed.
    @return: a tuple containing (page title, HTML page body)
"""
title = ""
body = ""
user_msg = []
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype == 1:
## Doctype is unique and can therefore be deleted:
## Delete any function parameters for this document type:
error_code = delete_all_parameters_doctype(doctype=doctype)
if error_code != 0:
## problem deleting some or all parameters - inform user and log error
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete some or all function parameter values for document type "%s".""" % (doctype,))
## delete all functions called by this doctype's actions
error_code = delete_all_functions_doctype(doctype=doctype)
if error_code != 0:
## problem deleting some or all functions - inform user and log error
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete some or all functions for document type "%s".""" % (doctype,))
## delete all categories of this doctype
error_code = delete_all_categories_doctype(doctype=doctype)
if error_code != 0:
## problem deleting some or all categories - inform user and log error
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete some or all parameters for document type "%s".""" % (doctype,))
## delete all submission interface fields for this doctype
error_code = delete_all_submissionfields_doctype(doctype=doctype)
if error_code != 0:
## problem deleting some or all submission fields - inform user and log error
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete some or all submission fields for document type "%s".""" % (doctype,))
## delete all submissions for this doctype
error_code = delete_all_submissions_doctype(doctype)
if error_code != 0:
## problem deleting some or all submissions - inform user and log error
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete some or all submissions for document type "%s".""" % (doctype,))
## delete entry for this doctype in the collection-doctypes table
error_code = delete_collection_doctype_entry_doctype(doctype)
if error_code != 0:
## problem deleting this doctype from the collection-doctypes table
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete document type "%s" from the collection-doctypes table.""" % (doctype,))
## delete the doctype itself
error_code = delete_doctype(doctype)
if error_code != 0:
## problem deleting this doctype from the doctypes table
## TODO : ERROR LOGGING
user_msg.append("""Unable to delete document type "%s" from the document types table.""" % (doctype,))
user_msg.append("""The "%s" document type should now have been deleted, but you should not ignore any warnings.""" % (doctype,))
title = """Available WebSubmit Document Types"""
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
else:
## doctype is not unique and cannot be deleted
if numrows_doctype > 1:
## doctype is duplicated - cannot delete - needs admin intervention
## TODO : LOG ERROR
user_msg.append("""%s WebSubmit document types have been identified for doctype id "%s" - unable to delete.""" \
""" Please inform administrator.""" % (numrows_doctype, doctype))
else:
## no document types found for this doctype id
## TODO : LOG ERROR
user_msg.append("""Unable to find any document types in the WebSubmit database for doctype id "%s" - unable to delete""" \
% (doctype,))
## get a list of all document types, and once more display the delete form, with the message
alldoctypes = get_docid_docname_and_docid_alldoctypes()
title = "Remove WebSubmit Doctument Type"
body = websubmitadmin_templates.tmpl_display_delete_doctype_form(doctype="", alldoctypes=alldoctypes, user_msg=user_msg)
return (title, body)
def perform_request_remove_doctype(doctype="", doctypedelete="", doctypedeleteconfirm=""):
"""Remove a document type from WebSubmit.
@param doctype: the document type to be removed
    @param doctypedelete: flag to signal that a confirmation for deletion should be displayed
    @param doctypedeleteconfirm: flag to signal that confirmation for deletion has been received and
     the doctype should be removed
@return: a tuple (title, body)
"""
body = ""
title = "Remove WebSubmit Document Type"
if doctypedeleteconfirm not in ("", None):
## Delete the document type:
(title, body) = _remove_doctype(doctype=doctype)
else:
## Display "doctype delete form"
if doctypedelete not in ("", None) and doctype not in ("", None):
## don't bother to get list of doctypes - user will be prompted to confirm the deletion of "doctype"
alldoctypes = None
else:
## get list of all doctypes to pass to template so that it can prompt the user to choose a doctype to delete
## alldoctypes = get_docid_docname_alldoctypes()
alldoctypes = get_docid_docname_and_docid_alldoctypes()
body = websubmitadmin_templates.tmpl_display_delete_doctype_form(doctype=doctype, alldoctypes=alldoctypes)
return (title, body)
def _create_add_doctype_form(doctype="", doctypename="", doctypedescr="", clonefrom="", user_msg=""):
"""Perform the steps necessary to create the "add a new doctype" form.
@param doctype: The unique ID that is to be used for the new doctype.
@param doctypename: the name that is to be given to a doctype.
@param doctypedescr: the description to be allocated to the new doctype.
@param user_msg: any message to be displayed to the user.
@return: a tuple containing page title and HTML body of page: (title, body)
"""
title = """Add New WebSubmit Document Type"""
alldoctypes = get_docid_docname_and_docid_alldoctypes()
body = websubmitadmin_templates.tmpl_display_doctypedetails_form(doctype=doctype,
doctypename=doctypename,
doctypedescr=doctypedescr,
clonefrom=clonefrom,
alldoctypes=alldoctypes,
user_msg=user_msg
)
return (title, body)
def _clone_categories_doctype(user_msg, fromdoctype, todoctype):
"""Clone the categories of one document type, to another document type.
@param user_msg: any message to be displayed to the user (this is a list)
@param fromdoctype: the doctype from which categories are to be cloned
@param todoctype: the doctype into which categories are to be cloned
    @return: integer value (0/1/2) - if the doctype's existing categories could not be deleted, return 1 (cloning
     failed); if only some categories could be cloned, return 2 (cloning partially successful); if all categories
     were cloned, return 0 (cloning successful).
"""
error_code = clone_categories_fromdoctype_todoctype(fromdoctype=fromdoctype, todoctype=todoctype)
if error_code == 1:
## doctype had existing categories and they could not be deleted
## TODO : LOG ERRORS
user_msg.append("""Categories already existed for the document type "%s" but could not be deleted. Unable to clone""" \
""" categories of doctype "%s".""" % (todoctype, fromdoctype))
return 1 ## cloning failed
elif error_code == 2:
## could not clone all categories for new doctype
## TODO : LOG ERRORS
user_msg.append("""Unable to clone all categories from doctype "%s", for doctype "%s".""" % (fromdoctype, todoctype))
return 2 ## cloning at least partially successful
else:
return 0 ## cloning successful
def _clone_functions_foraction_doctype(user_msg, fromdoctype, todoctype, action):
"""Clone the functions of a given action of one document type, to the same action on another document type.
@param user_msg: any message to be displayed to the user (this is a list)
@param fromdoctype: the doctype from which functions are to be cloned
@param todoctype: the doctype into which functions are to be cloned
@param action: the action for which functions are to be cloned
    @return: an integer value (0/1/2). In the case that todoctype had existing functions for the given action and
     they could not be deleted, return 1, signalling that this is a serious problem; in the case that only some
     functions were cloned, return 2; in the case that all functions were cloned, return 0.
"""
error_code = clone_functions_foraction_fromdoctype_todoctype(fromdoctype=fromdoctype, todoctype=todoctype, action=action)
if error_code == 1:
## doctype had existing functions for the given action and they could not be deleted
## TODO : LOG ERRORS
user_msg.append("""Functions already existed for the "%s" action of the document type "%s" but they could not be """ \
"""deleted. Unable to clone the functions of Document Type "%s" for action "%s".""" \
% (action, todoctype, fromdoctype, action))
## critical - return 1 to signal this
return 1
elif error_code == 2:
## could not clone all functions for given action for new doctype
## TODO : LOG ERRORS
user_msg.append("""Unable to clone all functions for the "%s" action from doctype "%s", for doctype "%s".""" \
% (action, fromdoctype, todoctype))
return 2 ## not critical
else:
return 0 ## total success
def _clone_functionparameters_foraction_fromdoctype_todoctype(user_msg, fromdoctype, todoctype, action):
"""Clone the parameters/values of a given action of one document type, to the same action on another document type.
@param user_msg: any message to be displayed to the user (this is a list)
@param fromdoctype: the doctype from which parameters are to be cloned
@param todoctype: the doctype into which parameters are to be cloned
@param action: the action for which parameters are to be cloned
    @return: 2 if it was not possible to clone all parameters/values; 0 if all parameters/values were cloned successfully.
"""
error_code = clone_functionparameters_foraction_fromdoctype_todoctype(fromdoctype=fromdoctype, \
todoctype=todoctype, action=action)
if error_code in (1, 2):
## something went wrong and it was not possible to clone all parameters/values of "action"/"fromdoctype" for "action"/"todoctype"
## TODO : LOG ERRORS
user_msg.append("""It was not possible to clone all parameter values from the action "%(act)s" of the document type""" \
""" "%(fromdt)s" for the action "%(act)s" of the document type "%(todt)s".""" \
% { 'act' : action, 'fromdt' : fromdoctype, 'todt' : todoctype }
)
return 2 ## to signal that addition wasn't 100% successful
else:
return 0 ## all parameters were cloned
def _add_doctype(doctype, doctypename, doctypedescr, clonefrom):
title = ""
body = ""
user_msg = []
commit_error = 0
if doctype == "":
user_msg.append("""The Document Type ID is mandatory and must be a string with no more than 10 alpha-numeric characters""")
commit_error = 1
if commit_error != 0:
## don't commit - just re-display page with message to user
(title, body) = _create_add_doctype_form(doctypename=doctypename, doctypedescr=doctypedescr, clonefrom=clonefrom, user_msg=user_msg)
return (title, body)
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 0:
## this document type already exists - do not add
## TODO : LOG ERROR
user_msg.append("""A document type identified by "%s" already seems to exist and there cannot be added. Choose another ID.""" \
% (doctype,))
(title, body) = _create_add_doctype_form(doctypename=doctypename, doctypedescr=doctypedescr, clonefrom=clonefrom, user_msg=user_msg)
else:
## proceed with addition
## add the document type details:
error_code = insert_doctype_details(doctype=doctype, doctypename=doctypename, doctypedescr=doctypedescr)
if error_code == 0:
## added successfully
if clonefrom not in ("", "None", None):
## document type should be cloned from "clonefrom"
## first, clone the categories from another doctype:
error_code = _clone_categories_doctype(user_msg=user_msg,
fromdoctype=clonefrom,
todoctype=doctype)
## get details of clonefrom's submissions
all_actnames_submissions_clonefrom = get_actname_all_submissions_doctype(doctype=clonefrom)
if len(all_actnames_submissions_clonefrom) > 0:
## begin cloning
for doc_submission_actname in all_actnames_submissions_clonefrom:
## clone submission details:
action_name = doc_submission_actname[0]
_clone_submission_fromdoctype_todoctype(user_msg=user_msg,
todoctype=doctype, action=action_name, clonefrom=clonefrom)
user_msg.append("""The "%s" document type has been added.""" % (doctype,))
title = """Available WebSubmit Document Types"""
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
else:
## could not add document type details - take no further action
## TODO : LOG ERROR!
user_msg.append("""Unable to add details for document type "%s".""" % (doctype,))
(title, body) = _create_add_doctype_form(user_msg=user_msg)
return (title, body)
def perform_request_add_doctype(doctype=None, doctypename=None, doctypedescr=None, clonefrom=None, doctypedetailscommit=""):
"""Add a new document type to WebSubmit, or display the form via which one can be added.
@param doctype: the unique ID of the new document type
@param doctypename: the name of the new document type
@param doctypedescr: the description of the new document type
@param clonefrom: the unique ID of a document type from which the new one is to be cloned
@param doctypedetailscommit: non-empty when the new document type's details are to be committed
@return: a tuple containing 2 strings: (page title, page body)
"""
body = ""
## wash args:
if doctype is not None:
try:
doctype = wash_single_urlarg(urlarg=doctype, argreqdtype=str, argdefault="", maxstrlen=10, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=doctype) == 0:
doctype = ""
except ValueError, e:
doctype = ""
else:
doctype = ""
if doctypename is not None:
try:
doctypename = wash_single_urlarg(urlarg=doctypename, argreqdtype=str, argdefault="")
except ValueError, e:
doctypename = ""
else:
doctypename = ""
if doctypedescr is not None:
try:
doctypedescr = wash_single_urlarg(urlarg=doctypedescr, argreqdtype=str, argdefault="")
except ValueError, e:
doctypedescr = ""
else:
doctypedescr = ""
if clonefrom is not None:
try:
clonefrom = wash_single_urlarg(urlarg=clonefrom, argreqdtype=str, argdefault="None")
except ValueError, e:
clonefrom = "None"
else:
clonefrom = "None"
if doctypedetailscommit not in ("", None):
(title, body) = _add_doctype(doctype=doctype,
doctypename=doctypename, doctypedescr=doctypedescr, clonefrom=clonefrom)
else:
(title, body) = _create_add_doctype_form()
return (title, body)
def _delete_referee_doctype(doctype, categid, refereeid):
"""Delete a referee from a given category of a document type.
@param doctype: the document type from whose category the referee is to be removed
@param categid: the name/ID of the category from which the referee is to be removed
@param refereeid: the id of the referee to be removed from the given category
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
role_name = """referee_%s_%s""" % (doctype, categid)
error_code = acc_delete_user_role(id_user=refereeid, name_role=role_name)
if error_code > 0:
## referee was deleted from category
user_msg.append("""The referee has been removed from the "%s" category of the "%s" document type.""" % (categid, doctype))
def _create_list_referees_doctype(doctype):
"""Build a dictionary of the referees of each category of a given document type.
@param doctype: the unique ID of the document type whose referees are to be listed
@return: a dictionary mapping each referee role name to a tuple of the form
(category ID, category name, tuple of referee user details)
"""
referees = {}
referees_details = {}
## get all Invenio roles:
all_roles = acc_get_all_roles()
for role in all_roles:
(roleid, rolename) = (role[0], role[1])
if re.match("^referee_%s_" % (doctype,), rolename):
## this is a "referee" role - get users of this role:
role_users = acc_get_role_users(roleid)
if role_users is not None and (type(role_users) in (tuple, list) and len(role_users) > 0):
## this role has users, record them in dictionary:
referees[rolename] = role_users
## for each "group" of referees:
for ref_role in referees.keys():
## get category ID for this referee-role:
try:
categid = re.match("^referee_%s_(.*)" % (doctype,), ref_role).group(1)
## from WebSubmit DB, get categ name for "categid":
if categid != "*":
categ_details = get_all_categories_sname_lname_for_doctype_categsname(doctype=doctype, categsname=categid)
if len(categ_details) > 0:
## if possible to receive details of this category, record them in a tuple in the format:
## ("categ name", (tuple of users details)):
referees_details[ref_role] = (categid, categ_details[0][1], referees[ref_role])
else:
## general referee entry:
referees_details[ref_role] = (categid, "General Referee(s)", referees[ref_role])
except AttributeError:
## there is no category for this role - it is broken, so skip it
pass
return referees_details
def _create_edit_doctype_details_form(doctype, doctypename="", doctypedescr="", doctypedetailscommit="", user_msg=""):
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
elif type(user_msg) in (str, unicode):
user_msg = [user_msg]
title = "Edit Document Type Details"
doctype_details = get_doctype_docname_descr_cd_md_fordoctype(doctype)
if len(doctype_details) == 1:
docname = doctype_details[0][1]
docdescr = doctype_details[0][2]
(cd, md) = (doctype_details[0][3], doctype_details[0][4])
if doctypedetailscommit != "":
## could not commit details
docname = doctypename
docdescr = doctypedescr
body = websubmitadmin_templates.tmpl_display_doctypedetails_form(doctype=doctype,
doctypename=docname,
doctypedescr=docdescr,
cd=cd,
md=md,
user_msg=user_msg,
perform_act="doctypeconfigure")
else:
## problem retrieving details of doctype:
user_msg.append("""Unable to retrieve details of doctype '%s' - cannot edit.""" % (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
def _create_add_submission_choose_clonefrom_form(doctype, action, user_msg=""):
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
elif type(user_msg) in (str, unicode):
user_msg = [user_msg]
if action in ("", None):
user_msg.append("""Unknown Submission""")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## does this doctype already have this action?
numrows_doctype_action = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_doctype_action < 1:
## action not present for this doctype - can be added
## get list of all doctypes implementing this action (for possible cloning purposes)
doctypes_implementing_action = get_doctypeid_doctypes_implementing_action(action=action)
## create form to display document types to clone from
title = "Add Submission '%s' to Document Type '%s'" % (action, doctype)
body = websubmitadmin_templates.tmpl_display_submission_clone_form(doctype=doctype,
action=action,
clonefrom_list=doctypes_implementing_action,
user_msg=user_msg
)
else:
## warn user that the action already exists for the doctype and cannot be added, then display all
## details of the doctype again
user_msg.append("The Document Type '%s' already implements the Submission '%s' - cannot add it again" \
% (doctype, action))
## TODO : LOG WARNING
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _create_add_submission_form(doctype, action, displayed="", buttonorder="", statustext="",
level="", score="", stpage="", endtxt="", user_msg=""):
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
elif type(user_msg) in (str, unicode):
user_msg = [user_msg]
if action in ("", None):
user_msg.append("""Unknown Submission""")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
title = "Add Submission '%s' to Document Type '%s'" % (action, doctype)
body = websubmitadmin_templates.tmpl_display_submissiondetails_form(doctype=doctype,
action=action,
displayed=displayed,
buttonorder=buttonorder,
statustext=statustext,
level=level,
score=score,
stpage=stpage,
endtxt=endtxt,
user_msg=user_msg,
saveaction="add"
)
return (title, body)
def _create_delete_submission_form(doctype, action):
user_msg = []
title = """Delete Submission "%s" from Document Type "%s" """ % (action, doctype)
numrows_doctypesubmission = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_doctypesubmission > 0:
## submission exists: create form to delete it:
body = websubmitadmin_templates.tmpl_display_delete_doctypesubmission_form(doctype=doctype, action=action)
else:
## submission doesn't seem to exist. Display details of doctype only:
user_msg.append("""The Submission "%s" doesn't seem to exist for the Document Type "%s" - unable to delete it""" % (action, doctype))
## TODO : LOG ERRORS
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _create_edit_submission_form(doctype, action, user_msg=""):
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
elif type(user_msg) in (str, unicode):
user_msg = [user_msg]
submission_details = get_submissiondetails_doctype_action(doctype=doctype, action=action)
numrows_submission_details = len(submission_details)
if numrows_submission_details == 1:
## correctly retrieved details of submission - display:
submission_details = stringify_listvars(submission_details)
displayed = submission_details[0][3]
buttonorder = submission_details[0][7]
statustext = submission_details[0][8]
level = submission_details[0][9]
score = submission_details[0][10]
stpage = submission_details[0][11]
endtxt = submission_details[0][12]
cd = submission_details[0][5]
md = submission_details[0][6]
title = "Edit Details of '%s' Submission of '%s' Document Type" % (action, doctype)
body = websubmitadmin_templates.tmpl_display_submissiondetails_form(doctype=doctype,
action=action,
displayed=displayed,
buttonorder=buttonorder,
statustext=statustext,
level=level,
score=score,
stpage=stpage,
endtxt=endtxt,
cd=cd,
md=md,
user_msg=user_msg
)
else:
if numrows_submission_details > 1:
## multiple rows for this submission - this is a key violation
user_msg.append("Found multiple rows for the Submission '%s' of the Document Type '%s'" \
% (action, doctype))
else:
## submission does not exist
user_msg.append("The Submission '%s' of the Document Type '%s' doesn't seem to exist." \
% (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _create_edit_category_form(doctype, categid):
title = "Edit Category Description"
categ_details = get_all_categories_sname_lname_for_doctype_categsname(doctype=doctype, categsname=categid)
if len(categ_details) == 1:
## display details
retrieved_categid=categ_details[0][0]
retrieved_categdescr=categ_details[0][1]
body = websubmitadmin_templates.tmpl_display_edit_category_form(doctype=doctype,
categid=retrieved_categid,
categdescr=retrieved_categdescr
)
else:
## problem retrieving details of categ
user_msg = ["""Unable to retrieve details of category '%s'""" % (categid,)]
## TODO : LOG ERRORS
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _create_configure_doctype_form(doctype, jumpcategout="", user_msg=""):
"""Create the main "configure document type" page, displaying the details, categories,
submissions and referees of a given document type.
@param doctype: the unique ID of the document type to be configured
@param jumpcategout: the score of a category that has been selected to be moved
@param user_msg: a message (or list of messages) to be displayed to the user
@return: a tuple containing 2 strings: (page title, page body)
"""
title = "Configure Document Type"
body = ""
if user_msg == "" or type(user_msg) not in (list, tuple, str, unicode):
user_msg = []
elif type(user_msg) in (str, unicode):
user_msg = [user_msg]
## get details of doctype:
doctype_details = get_doctype_docname_descr_cd_md_fordoctype(doctype)
docname = doctype_details[0][1]
docdescr = doctype_details[0][2]
(cd, md) = (doctype_details[0][3], doctype_details[0][4])
## get categories for doctype:
doctype_categs = get_all_category_details_for_doctype(doctype=doctype)
## get submissions for doctype:
doctype_submissions = get_submissiondetails_all_submissions_doctype(doctype=doctype)
## get list of actions that this doctype doesn't have:
unlinked_actions = get_actions_sname_lname_not_linked_to_doctype(doctype=doctype)
## get referees for doctype:
referees_dets = _create_list_referees_doctype(doctype=doctype)
body = websubmitadmin_templates.tmpl_configure_doctype_overview(doctype=doctype, doctypename=docname,
doctypedescr=docdescr, doctype_cdate=cd,
doctype_mdate=md, doctype_categories=doctype_categs,
jumpcategout=jumpcategout,
doctype_submissions=doctype_submissions,
doctype_referees=referees_dets,
add_actions_list=unlinked_actions,
user_msg=user_msg)
return (title, body)
def _clone_submission_fromdoctype_todoctype(user_msg, todoctype, action, clonefrom):
"""Clone a submission (its details, functions, function parameters and submission fields)
from one document type to another.
@param user_msg: a list to which any warning/error messages will be appended
@param todoctype: the unique ID of the document type to which the submission is to be cloned
@param action: the action ID of the submission to be cloned
@param clonefrom: the unique ID of the document type from which the submission is to be cloned
@return: None
"""
## first, delete the submission from todoctype (if it exists):
error_code = delete_submissiondetails_doctype(doctype=todoctype, action=action)
if error_code == 0:
## could be deleted - now clone it
error_code = insert_submission_details_clonefrom_submission(addtodoctype=todoctype, action=action, clonefromdoctype=clonefrom)
if error_code == 0:
## submission inserted
## now clone functions:
error_code = _clone_functions_foraction_doctype(user_msg=user_msg, \
fromdoctype=clonefrom, todoctype=todoctype, action=action)
if error_code in (0, 2):
## no serious error - clone parameters:
error_code = _clone_functionparameters_foraction_fromdoctype_todoctype(user_msg=user_msg,
fromdoctype=clonefrom,
todoctype=todoctype,
action=action)
## now clone pages/elements
error_code = clone_submissionfields_from_doctypesubmission_to_doctypesubmission(fromsub="%s%s" % (action, clonefrom),
tosub="%s%s" % (action, todoctype))
if error_code == 1:
## could not delete all existing submission fields and therefore could not clone submission fields at all
## TODO : LOG ERROR
user_msg.append("""Unable to delete existing submission fields for Submission "%s" of Document Type "%s" - """ \
"""cannot clone submission fields!""" % (action, todoctype))
elif error_code == 2:
## could not clone all fields
## TODO : LOG ERROR
user_msg.append("""Unable to clone all submission fields for submission "%s" on Document Type "%s" from Document""" \
""" Type "%s" """ % (action, todoctype, clonefrom))
else:
## could not insert submission details!
user_msg.append("""Unable to successfully insert details of submission "%s" into Document Type "%s" - cannot clone from "%s" """ \
% (action, todoctype, clonefrom))
## TODO : LOG ERROR
else:
## could not delete details of existing submission (action) from 'todoctype' - cannot clone it as new
user_msg.append("""Unable to delete details of existing Submission "%s" from Document Type "%s" - cannot clone it from "%s" """ \
% (action, todoctype, clonefrom))
## TODO : LOG ERROR
def _add_submission_to_doctype_clone(doctype, action, clonefrom):
user_msg = []
if action in ("", None) or clonefrom in ("", None):
user_msg.append("Unknown action or document type to clone from - cannot add submission")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## does action exist?
numrows_action = get_number_actions_with_actid(actid=action)
if numrows_action > 0:
## The action exists, but is it already implemented as a submission by doctype?
numrows_submission_doctype = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_submission_doctype > 0:
## this submission already exists for this document type - unable to add it again
user_msg.append("""The Submission "%s" already exists for Document Type "%s" - cannot add it again""" \
%(action, doctype))
## TODO : LOG ERROR
else:
## clone the submission
_clone_submission_fromdoctype_todoctype(user_msg=user_msg,
todoctype=doctype, action=action, clonefrom=clonefrom)
user_msg.append("""Cloning of Submission "%s" from Document Type "%s" has been carried out - please take""" \
""" note of any warnings displayed above.""" % (action, clonefrom))
## TODO : LOG WARNING OF NEW SUBMISSION CREATION BY CLONING
else:
## this action doesn't exist! cannot add a submission based upon it!
user_msg.append("The Action '%s' does not seem to exist in WebSubmit. Cannot add it as a Submission!" \
% (action))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _add_submission_to_doctype(doctype, action, displayed, buttonorder,
statustext, level, score, stpage, endtxt):
user_msg = []
## does "action" exist?
numrows_action = get_number_actions_with_actid(actid=action)
if numrows_action < 1:
## this action does not exist! Can't add a submission based upon it!
user_msg.append("'%s' does not exist in WebSubmit as an Action! Unable to add this submission."\
% (action,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## Insert the new submission
error_code = insert_submission_details(doctype=doctype, action=action, displayed=displayed,
nbpg="0", buttonorder=buttonorder, statustext=statustext,
level=level, score=score, stpage=stpage, endtext=endtxt)
if error_code == 0:
## successful insert
user_msg.append("""'%s' Submission Successfully Added to Document Type '%s'""" % (action, doctype))
else:
## could not insert submission into doctype
user_msg.append("""Unable to Add '%s' Submission to '%s' Document Type""" % (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _delete_submission_from_doctype(doctype, action):
"""Delete a submission (action) from the document type identified by "doctype".
@param doctype: the unique ID of the document type from which the submission is to be deleted
@param action: the action ID of the submission to be deleted from doctype
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if action in ("", None):
user_msg.append("Unknown action - cannot delete submission")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## delete fields for this submission:
error_code = delete_all_submissionfields_submission("""%s%s""" % (action, doctype) )
if error_code != 0:
## could not successfully delete all fields - report error
user_msg.append("""When deleting Submission "%s" from Document Type "%s", it wasn't possible to delete all Submission Fields""" \
% (action, doctype))
## TODO : LOG ERROR
## delete parameters for this submission:
error_code = delete_functionparameters_doctype_submission(doctype=doctype, action=action)
if error_code != 0:
## could not successfully delete all functions - report error
user_msg.append("""When deleting Submission "%s" from Document Type "%s", it wasn't possible to delete all Function Parameters""" \
% (action, doctype))
## TODO : LOG ERROR
## delete functions for this submission:
error_code = delete_all_functions_foraction_doctype(doctype=doctype, action=action)
if error_code != 0:
## could not successfully delete all functions - report error
user_msg.append("""When deleting Submission "%s" from Document Type "%s", it wasn't possible to delete all Functions""" \
% (action, doctype))
## TODO : LOG ERROR
## delete this submission itself:
error_code = delete_submissiondetails_doctype(doctype=doctype, action=action)
if error_code == 0:
## successful delete
user_msg.append("""The "%s" Submission has been deleted from the "%s" Document Type""" % (action, doctype))
else:
## could not delete submission
user_msg.append("""Unable to successfully delete the "%s" Submission from the "%s" Document Type""" % (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _edit_submission_for_doctype(doctype, action, displayed, buttonorder,
statustext, level, score, stpage, endtxt):
"""Update the details of a given submission belonging to the document type identified by "doctype".
@param doctype: the unique ID of the document type for which the submission is to be updated
@param action: action name of the submission to be updated
@param displayed: displayed on main submission page? (Y/N)
@param buttonorder: button order
@param statustext: statustext
@param level: level
@param score: score
@param stpage: stpage
@param endtxt: endtxt
@return: a tuple of 2 strings: (page title, page body)
"""
user_msg = []
commit_error = 0
if action in ("", None):
user_msg.append("Unknown Action - cannot update submission")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
error_code = update_submissiondetails_doctype_action(doctype=doctype, action=action, displayed=displayed,
buttonorder=buttonorder, statustext=statustext, level=level,
score=score, stpage=stpage, endtxt=endtxt)
if error_code == 0:
## successful update
user_msg.append("'%s' Submission of '%s' Document Type updated." % (action, doctype) )
else:
## could not update
user_msg.append("Unable to update '%s' Submission of '%s' Document Type." % (action, doctype) )
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _edit_doctype_details(doctype, doctypename, doctypedescr):
"""Update the details (name and/or description) of a document type (identified by doctype).
@param doctype: the unique ID of the document type to be updated
@param doctypename: the new/updated name for the doctype
@param doctypedescr: the new/updated description for the doctype
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
error_code = update_doctype_details(doctype=doctype, doctypename=doctypename, doctypedescr=doctypedescr)
if error_code == 0:
## successful update
user_msg.append("""'%s' Document Type Updated""" % (doctype,))
else:
## could not update
user_msg.append("""Unable to Update Doctype '%s'""" % (doctype,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _edit_category_for_doctype(doctype, categid, categdescr):
"""Edit the description of a given category (identified by categid), belonging to
the document type identified by doctype.
@param doctype: the unique ID of the document type for which the category is to be modified
@param categid: the unique category ID of the category to be modified
@param categdescr: the new description for the category
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if categid in ("", None) or categdescr in ("", None):
## cannot edit unknown category!
user_msg.append("Category ID and Description are both mandatory")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
error_code = update_category_description_doctype_categ(doctype=doctype, categ=categid, categdescr=categdescr)
if error_code == 0:
## successful update
user_msg.append("""'%s' Category Description Successfully Updated""" % (categid,))
else:
## could not update category description
user_msg.append("""Unable to Update Description for Category '%s'""" % (categid,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _add_category_to_doctype(doctype, categid, categdescr):
"""Add a new category to the document type identified by "doctype".
Category ID, and category description are both mandatory.
@param doctype: the unique ID of the document type to which the category is to be added
@param categid: the unique category ID of the category to be added to doctype
@param categdescr: the description of the category to be added
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if categid in ("", None) or categdescr in ("", None):
## cannot add unknown category!
user_msg.append("Category ID and Description are both mandatory")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
error_code = insert_category_into_doctype(doctype=doctype, categ=categid, categdescr=categdescr)
if error_code == 0:
## successful insert
user_msg.append("""'%s' Category Successfully Added""" % (categid,))
else:
## could not insert category into doctype
user_msg.append("""Unable to Add '%s' Category""" % (categid,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _delete_category_from_doctype(doctype, categid):
"""Delete a category (categid) from the document type identified by "doctype".
@param doctype: the unique ID of the document type from which the category is to be deleted
@param categid: the unique category ID of the category to be deleted from doctype
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if categid in ("", None):
## cannot delete unknown category!
user_msg.append("Category ID is mandatory")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
error_code = delete_category_doctype(doctype=doctype, categ=categid)
if error_code == 0:
## successful delete
user_msg.append("""'%s' Category Successfully Deleted""" % (categid,))
else:
## could not delete category
user_msg.append("""Unable to Delete '%s' Category""" % (categid,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _jump_category_to_new_score(doctype, jumpcategout, jumpcategin):
"""Move a category of a document type from one position (score) in the category list to another.
@param doctype: the unique ID of the document type whose category is to be moved
@param jumpcategout: the score of the category that is to be moved
@param jumpcategin: the score of the position to which the category is to be moved
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if jumpcategout in ("", None) or jumpcategin in ("", None):
## need both jumpcategout and jumpcategin to move a category:
user_msg.append("Unable to move category - unknown source and/or destination score(s)")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## FIXME TODO:
error_code = move_category_to_new_score(doctype, jumpcategout, jumpcategin)
if error_code == 0:
## successful jump of category
user_msg.append("""Successfully Moved [%s] Category""" % (jumpcategout,))
else:
## could not move category
user_msg.append("""Unable to Move [%s] Category""" % (jumpcategout,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _move_category(doctype, categid, movecategup=""):
"""Move a category of a document type up or down by one place in the score order.
@param doctype: the unique ID of the document type whose category is to be moved
@param categid: the ID of the category that is to be moved
@param movecategup: non-empty when the category is to be moved up; otherwise it is moved down
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
if categid in ("", None):
## cannot move unknown category!
user_msg.append("Cannot move an unknown category - category ID is mandatory")
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
if movecategup not in ("", None):
## move the category up in score:
error_code = move_category_by_one_place_in_score(doctype=doctype,
categsname=categid,
direction="up")
else:
## move the category down in score:
error_code = move_category_by_one_place_in_score(doctype=doctype,
categsname=categid,
direction="down")
if error_code == 0:
## successful move of category
user_msg.append("""[%s] Category Successfully Moved""" % (categid,))
else:
## could not move category
user_msg.append("""Unable to Move [%s] Category""" % (categid,))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def perform_request_configure_doctype(doctype,
doctypename=None,
doctypedescr=None,
doctypedetailsedit="",
doctypedetailscommit="",
doctypecategoryadd="",
doctypecategoryedit="",
doctypecategoryeditcommit="",
doctypecategorydelete="",
doctypesubmissionadd="",
doctypesubmissiondelete="",
doctypesubmissiondeleteconfirm="",
doctypesubmissionedit="",
doctypesubmissionaddclonechosen="",
doctypesubmissionadddetailscommit="",
doctypesubmissioneditdetailscommit="",
categid=None,
categdescr=None,
movecategup=None,
movecategdown=None,
jumpcategout=None,
jumpcategin=None,
action=None,
doctype_cloneactionfrom=None,
displayed=None,
buttonorder=None,
statustext=None,
level=None,
score=None,
stpage=None,
endtxt=None
):
"""Process a request to configure a document type: wash all incoming arguments, then
dispatch to the relevant add/edit/delete/move handler depending upon which form button was pressed.
@param doctype: the unique ID of the document type that is to be configured
@return: a tuple containing 2 strings: (page title, page body)
"""
user_msg = []
body = ""
if doctype is not None:
try:
doctype = wash_single_urlarg(urlarg=doctype, argreqdtype=str, argdefault="", maxstrlen=10, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=doctype) == 0:
doctype = ""
except ValueError, e:
doctype = ""
else:
doctype = ""
if action is not None:
try:
action = wash_single_urlarg(urlarg=action, argreqdtype=str, argdefault="", maxstrlen=3, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=action) == 0:
action = ""
except ValueError, e:
action = ""
else:
action = ""
if doctypename is not None:
try:
doctypename = wash_single_urlarg(urlarg=doctypename, argreqdtype=str, argdefault="")
except ValueError, e:
doctypename = ""
else:
doctypename = ""
if doctypedescr is not None:
try:
doctypedescr = wash_single_urlarg(urlarg=doctypedescr, argreqdtype=str, argdefault="")
except ValueError, e:
doctypedescr = ""
else:
doctypedescr = ""
if categid is not None:
try:
categid = wash_single_urlarg(urlarg=categid, argreqdtype=str, argdefault="")
except ValueError, e:
categid = ""
else:
categid = ""
if categdescr is not None:
try:
categdescr = wash_single_urlarg(urlarg=categdescr, argreqdtype=str, argdefault="")
except ValueError, e:
categdescr = ""
else:
categdescr = ""
if doctype_cloneactionfrom is not None:
try:
doctype_cloneactionfrom = wash_single_urlarg(urlarg=doctype_cloneactionfrom, argreqdtype=str, argdefault="", maxstrlen=10, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=doctype_cloneactionfrom) == 0:
doctype_cloneactionfrom = ""
except ValueError, e:
doctype_cloneactionfrom = ""
else:
doctype_cloneactionfrom = ""
if displayed is not None:
try:
displayed = wash_single_urlarg(urlarg=displayed, argreqdtype=str, argdefault="Y", maxstrlen=1, minstrlen=1)
except ValueError, e:
displayed = "Y"
else:
displayed = "Y"
if buttonorder is not None:
try:
buttonorder = wash_single_urlarg(urlarg=buttonorder, argreqdtype=int, argdefault="")
except ValueError, e:
buttonorder = ""
else:
buttonorder = ""
if level is not None:
try:
level = wash_single_urlarg(urlarg=level, argreqdtype=str, argdefault="", maxstrlen=1, minstrlen=1)
except ValueError, e:
level = ""
else:
level = ""
if score is not None:
try:
score = wash_single_urlarg(urlarg=score, argreqdtype=int, argdefault="")
except ValueError, e:
score = ""
else:
score = ""
if stpage is not None:
try:
stpage = wash_single_urlarg(urlarg=stpage, argreqdtype=int, argdefault="")
except ValueError, e:
stpage = ""
else:
stpage = ""
if statustext is not None:
try:
statustext = wash_single_urlarg(urlarg=statustext, argreqdtype=str, argdefault="")
except ValueError, e:
statustext = ""
else:
statustext = ""
if endtxt is not None:
try:
endtxt = wash_single_urlarg(urlarg=endtxt, argreqdtype=str, argdefault="")
except ValueError, e:
endtxt = ""
else:
endtxt = ""
## ensure that there is only one doctype for this doctype ID - simply display all doctypes with warning if not
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 1:
## there are multiple doctypes with this doctype ID:
## TODO : LOG ERROR
user_msg.append("""Multiple document types identified by "%s" exist - cannot configure at this time.""" \
% (doctype,))
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
elif numrows_doctype == 0:
## this doctype does not seem to exist:
user_msg.append("""The document type identified by "%s" doesn't exist - cannot configure at this time.""" \
% (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
## since doctype ID is OK, process doctype configuration request:
if doctypedetailsedit not in ("", None):
(title, body) = _create_edit_doctype_details_form(doctype=doctype)
elif doctypedetailscommit not in ("", None):
## commit updated document type details
(title, body) = _edit_doctype_details(doctype=doctype,
doctypename=doctypename, doctypedescr=doctypedescr)
elif doctypecategoryadd not in ("", None):
## add new category:
(title, body) = _add_category_to_doctype(doctype=doctype, categid=categid, categdescr=categdescr)
elif doctypecategoryedit not in ("", None):
## create form to update category description:
(title, body) = _create_edit_category_form(doctype=doctype,
categid=categid)
elif doctypecategoryeditcommit not in ("", None):
## commit updated category description:
(title, body) = _edit_category_for_doctype(doctype=doctype, categid=categid, categdescr=categdescr)
elif doctypecategorydelete not in ("", None):
## delete a category
(title, body) = _delete_category_from_doctype(doctype=doctype, categid=categid)
elif movecategup not in ("", None) or movecategdown not in ("", None):
## move a category up or down in score:
(title, body) = _move_category(doctype=doctype, categid=categid,
movecategup=movecategup)
elif jumpcategout not in ("", None) and jumpcategin not in ("", None):
## jump a category from one score to another:
(title, body) = _jump_category_to_new_score(doctype=doctype, jumpcategout=jumpcategout,
jumpcategin=jumpcategin)
elif doctypesubmissionadd not in ("", None):
## form displaying option of adding doctype:
(title, body) = _create_add_submission_choose_clonefrom_form(doctype=doctype, action=action)
elif doctypesubmissionaddclonechosen not in ("", None):
## add a submission. if there is a document type to be cloned from, then process clone;
## otherwise, present form with details of doctype
if doctype_cloneactionfrom in ("", None, "None"):
## no clone - present form into which details of new submission should be entered
(title, body) = _create_add_submission_form(doctype=doctype, action=action)
else:
## new submission should be cloned from doctype_cloneactionfrom
(title, body) = _add_submission_to_doctype_clone(doctype=doctype, action=action, clonefrom=doctype_cloneactionfrom)
elif doctypesubmissiondelete not in ("", None):
## create form to prompt for confirmation of deletion of a submission:
(title, body) = _create_delete_submission_form(doctype=doctype, action=action)
elif doctypesubmissiondeleteconfirm not in ("", None):
## process the deletion of a submission from the doctype concerned:
(title, body) = _delete_submission_from_doctype(doctype=doctype, action=action)
elif doctypesubmissionedit not in ("", None):
## create form to update details of a submission
(title, body) = _create_edit_submission_form(doctype=doctype, action=action)
elif doctypesubmissioneditdetailscommit not in ("", None):
## commit updated submission details:
(title, body) = _edit_submission_for_doctype(doctype=doctype, action=action,
displayed=displayed, buttonorder=buttonorder, statustext=statustext,
level=level, score=score, stpage=stpage, endtxt=endtxt)
elif doctypesubmissionadddetailscommit not in ("", None):
## commit new submission to doctype (not by cloning)
(title, body) = _add_submission_to_doctype(doctype=doctype, action=action,
displayed=displayed, buttonorder=buttonorder, statustext=statustext,
level=level, score=score, stpage=stpage, endtxt=endtxt)
else:
## default - display root of edit doctype
(title, body) = _create_configure_doctype_form(doctype=doctype, jumpcategout=jumpcategout)
return (title, body)
def _create_configure_doctype_submission_functions_form(doctype,
action,
movefromfunctionname="",
movefromfunctionstep="",
movefromfunctionscore="",
user_msg=""):
title = """Functions of the "%s" Submission of the "%s" Document Type:""" % (action, doctype)
submission_functions = get_functionname_step_score_allfunctions_doctypesubmission(doctype=doctype, action=action)
body = websubmitadmin_templates.tmpl_configuredoctype_display_submissionfunctions(doctype=doctype,
action=action,
movefromfunctionname=movefromfunctionname,
movefromfunctionstep=movefromfunctionstep,
movefromfunctionscore=movefromfunctionscore,
submissionfunctions=submission_functions,
user_msg=user_msg)
return (title, body)
def _create_configure_doctype_submission_functions_add_function_form(doctype, action, addfunctionname="",
addfunctionstep="", addfunctionscore="", user_msg=""):
"""Create a form that allows a user to add a function a submission.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param addfunctionname: (string) the name of the function to be added to the submission (passed in case of page refresh)
@param addfunctionstep: (integer) the step of the submission into which the function is to be added (passed in case of
page refresh)
@param addfunctionscore: (integer) the score at which the function is to be added (passed in case of page refresh)
@param user_msg: (string or list of strings) any message(s) to be displayed to the user
@return: (tuple) containing 2 strings - (page-title, HTML page-body)
"""
title = """Add a function to the [%s] submission of the [%s] document type""" % (action, doctype)
submission_functions = get_functionname_step_score_allfunctions_doctypesubmission(doctype=doctype, action=action)
## get names of all WebSubmit functions:
all_websubmit_functions = get_names_of_all_functions()
## put names into a list of single-element tuples, so that template can make HTML select list with them:
all_websubmit_functions = map(lambda x: (str(x),), all_websubmit_functions)
## create page body:
body = websubmitadmin_templates.tmpl_configuredoctype_add_submissionfunction(doctype=doctype,
action=action,
cursubmissionfunctions=submission_functions,
allWSfunctions=all_websubmit_functions,
addfunctionname=addfunctionname,
addfunctionstep=addfunctionstep,
addfunctionscore=addfunctionscore,
user_msg=user_msg)
return (title, body)
def _create_configure_doctype_submission_functions_list_parameters_form(doctype,
action,
functionname,
user_msg=""):
title = """Parameters of the %s function, as used in the %s document type"""\
% (functionname, doctype)
funcparams = get_function_parameters(function=functionname)
if len(funcparams) > 0:
## get the values
paramslist = map(lambda x: str(x[0]), funcparams)
params = get_function_parameter_vals_doctype(doctype=doctype, paramlist=paramslist)
else:
params = ()
## params = get_parameters_name_and_value_for_function_of_doctype(doctype=doctype, function=functionname)
body = websubmitadmin_templates.tmpl_configuredoctype_list_functionparameters(doctype=doctype,
action=action,
function=functionname,
params=params,
user_msg=user_msg)
return (title, body)
def _update_submission_function_parameter_file(doctype, action, functionname,
paramname, paramfilecontent):
user_msg = []
## get the filename:
paramval_res = get_value_of_parameter_for_doctype(doctype=doctype, parameter=paramname)
if paramval_res is None:
## this parameter doesn't exist for this doctype!
user_msg.append("The parameter [%s] doesn't exist for the document type [%s]!" % (paramname, doctype))
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname,
user_msg=user_msg)
return (title, body)
paramval = str(paramval_res)
filename = basename(paramval)
if filename == "":
## invalid filename
user_msg.append("[%s] is an invalid filename - cannot save details" % (paramval,))
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname,
user_msg=user_msg)
return (title, body)
## save file:
try:
save_update_to_file(filepath="%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename), filecontent=paramfilecontent)
except InvenioWebSubmitAdminWarningIOError, e:
## could not correctly update the file!
user_msg.append(str(e))
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname,
user_msg=user_msg)
return (title, body)
## redisplay form
user_msg.append("""[%s] file updated""" % (filename,))
(title, body) = _create_configure_doctype_submission_functions_edit_parameter_file_form(doctype=doctype,
action=action,
functionname=functionname,
paramname=paramname,
user_msg=user_msg)
return (title, body)
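The `basename(paramval)` check above is what keeps a parameter value from escaping the configuration directory when the file is saved or read back. A minimal, self-contained sketch of the same sanitisation idea (`safe_config_path` is a hypothetical helper, not part of WebSubmit):

```python
import os.path

def safe_config_path(configdir, paramval):
    """Reduce a parameter value to its base filename and join it onto the
    configuration directory; return None when no usable filename remains
    (e.g. the value was empty or ended in a path separator)."""
    filename = os.path.basename(paramval)
    if filename == "":
        return None
    return os.path.join(configdir, filename)
```

Because only the base name survives, a value such as `../../etc/passwd` resolves to a file inside `configdir` rather than a path outside it.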
def _create_configure_doctype_submission_functions_edit_parameter_file_form(doctype,
action,
functionname,
paramname,
user_msg=""):
if type(user_msg) is not list:
## wrap a single message in a list rather than silently discarding it:
if user_msg not in ("", None):
user_msg = [user_msg]
else:
user_msg = []
paramval_res = get_value_of_parameter_for_doctype(doctype=doctype, parameter=paramname)
if paramval_res is None:
## this parameter doesn't exist for this doctype!
user_msg.append("The parameter [%s] doesn't exist for the document type [%s]!" % (paramname, doctype))
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname)
return (title, body)
paramval = str(paramval_res)
title = "Edit the [%s] file for the [%s] document type" % (paramval, doctype)
## get basename of file:
filecontent = ""
filename = basename(paramval)
if filename == "":
## invalid filename
user_msg.append("[%s] is an invalid filename" % (paramval,))
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname,
user_msg=user_msg)
return (title, body)
## try to read file contents:
if access("%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename), F_OK):
## file exists
if access("%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename), R_OK) and \
isfile("%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename)):
## file is a regular file and is readable - read its contents, then close the handle:
fd = open("%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename), "r")
filecontent = fd.read()
fd.close()
else:
if not isfile("%s/%s" % (CFG_WEBSUBMIT_BIBCONVERTCONFIGDIR, filename)):
## file is not a regular file
user_msg.append("The parameter file [%s] is not regular file - unable to read" % (filename,))
else:
## file is not readable - error message
user_msg.append("The parameter file [%s] could not be read - check permissions" % (filename,))
## display page listing the parameters of this function:
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname,
user_msg=user_msg)
return (title, body)
else:
## file does not exist:
user_msg.append("The parameter file [%s] does not exist - it will be created" % (filename,))
## make page body:
body = websubmitadmin_templates.tmpl_configuredoctype_edit_functionparameter_file(doctype=doctype,
action=action,
function=functionname,
paramname=paramname,
paramfilename=filename,
paramfilecontent=filecontent,
user_msg=user_msg)
return (title, body)
def _create_configure_doctype_submission_functions_edit_parameter_value_form(doctype,
action,
functionname,
paramname,
paramval="",
user_msg=""):
title = """Edit the value of the [%s] Parameter""" % (paramname,)
## get the parameter's value from the DB:
paramval_res = get_value_of_parameter_for_doctype(doctype=doctype, parameter=paramname)
if paramval_res is None:
## this parameter doesn't exist for this doctype!
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname)
return (title, body)
if paramval == "":
## use whatever retrieved paramval_res contains:
paramval = str(paramval_res)
body = websubmitadmin_templates.tmpl_configuredoctype_edit_functionparameter_value(doctype=doctype,
action=action,
function=functionname,
paramname=paramname,
paramval=paramval)
return (title, body)
def _update_submissionfunction_parameter_value(doctype, action, functionname, paramname, paramval):
user_msg = []
try:
update_value_of_function_parameter_for_doctype(doctype=doctype, paramname=paramname, paramval=paramval)
user_msg.append("""The value of the parameter [%s] was updated for document type [%s]""" % (paramname, doctype))
except InvenioWebSubmitAdminWarningTooManyRows, e:
## multiple rows found for param - update not carried out
user_msg.append(str(e))
except InvenioWebSubmitAdminWarningNoRowsFound, e:
## no rows found - parameter does not exist for doctype, therefore no update
user_msg.append(str(e))
(title, body) = \
_create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype, action=action,
functionname=functionname, user_msg=user_msg)
return (title, body)
def perform_request_configure_doctype_submissionfunctions_parameters(doctype,
action,
functionname,
functionstep,
functionscore,
paramname="",
paramval="",
editfunctionparametervalue="",
editfunctionparametervaluecommit="",
editfunctionparameterfile="",
editfunctionparameterfilecommit="",
paramfilename="",
paramfilecontent=""):
body = ""
user_msg = []
## ensure that there is only one doctype for this doctype ID - simply display all doctypes with warning if not
if doctype in ("", None):
user_msg.append("""Unknown Document Type""")
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 1:
## there are multiple doctypes with this doctype ID:
## TODO : LOG ERROR
user_msg.append("""Multiple document types identified by "%s" exist - cannot configure at this time.""" \
% (doctype,))
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
elif numrows_doctype == 0:
## this doctype does not seem to exist:
user_msg.append("""The document type identified by "%s" doesn't exist - cannot configure at this time.""" \
% (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
## ensure that this submission exists for this doctype:
numrows_submission = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_submission > 1:
## there are multiple submissions for this doctype/action ID:
## TODO : LOG ERROR
user_msg.append("""The Submission "%s" seems to exist multiple times for the Document Type "%s" - cannot configure at this time.""" \
% (action, doctype))
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
elif numrows_submission == 0:
## this submission does not seem to exist for this doctype:
user_msg.append("""The Submission "%s" doesn't exist for the "%s" Document Type - cannot configure at this time.""" \
% (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
if editfunctionparametervaluecommit not in ("", None):
## commit an update to a function parameter:
(title, body) = _update_submissionfunction_parameter_value(doctype=doctype, action=action, functionname=functionname,
paramname=paramname, paramval=paramval)
elif editfunctionparametervalue not in ("", None):
## display a form for editing the value of a parameter:
(title, body) = _create_configure_doctype_submission_functions_edit_parameter_value_form(doctype=doctype,
action=action,
functionname=functionname,
paramname=paramname,
paramval=paramval)
elif editfunctionparameterfile not in ("", None):
## display a form for editing the contents of a file, named by the parameter's value:
(title, body) = _create_configure_doctype_submission_functions_edit_parameter_file_form(doctype=doctype,
action=action,
functionname=functionname,
paramname=paramname)
elif editfunctionparameterfilecommit not in ("", None):
(title, body) = _update_submission_function_parameter_file(doctype=doctype, action=action, functionname=functionname,
paramname=paramname, paramfilecontent=paramfilecontent)
else:
## default - display list of parameters for function:
(title, body) = _create_configure_doctype_submission_functions_list_parameters_form(doctype=doctype,
action=action,
functionname=functionname)
return (title, body)
def perform_request_configure_doctype_submissionfunctions(doctype,
action,
moveupfunctionname="",
moveupfunctionstep="",
moveupfunctionscore="",
movedownfunctionname="",
movedownfunctionstep="",
movedownfunctionscore="",
movefromfunctionname="",
movefromfunctionstep="",
movefromfunctionscore="",
movetofunctionname="",
movetofunctionstep="",
movetofunctionscore="",
deletefunctionname="",
deletefunctionstep="",
deletefunctionscore="",
configuresubmissionaddfunction="",
configuresubmissionaddfunctioncommit="",
addfunctionname="",
addfunctionstep="",
addfunctionscore=""):
body = ""
user_msg = []
if addfunctionstep != "":
try:
addfunctionstep = str(wash_single_urlarg(urlarg=addfunctionstep, argreqdtype=int, argdefault=""))
except ValueError, e:
addfunctionstep = ""
if addfunctionscore != "":
try:
addfunctionscore = str(wash_single_urlarg(urlarg=addfunctionscore, argreqdtype=int, argdefault=""))
except ValueError, e:
addfunctionscore = ""
if deletefunctionstep != "":
try:
deletefunctionstep = str(wash_single_urlarg(urlarg=deletefunctionstep, argreqdtype=int, argdefault=""))
except ValueError, e:
deletefunctionstep = ""
if deletefunctionscore != "":
try:
deletefunctionscore = str(wash_single_urlarg(urlarg=deletefunctionscore, argreqdtype=int, argdefault=""))
except ValueError, e:
deletefunctionscore = ""
if movetofunctionstep != "":
try:
movetofunctionstep = str(wash_single_urlarg(urlarg=movetofunctionstep, argreqdtype=int, argdefault=""))
except ValueError, e:
movetofunctionstep = ""
if movetofunctionscore != "":
try:
movetofunctionscore = str(wash_single_urlarg(urlarg=movetofunctionscore, argreqdtype=int, argdefault=""))
except ValueError, e:
movetofunctionscore = ""
if moveupfunctionstep != "":
try:
moveupfunctionstep = str(wash_single_urlarg(urlarg=moveupfunctionstep, argreqdtype=int, argdefault=""))
except ValueError, e:
moveupfunctionstep = ""
if moveupfunctionscore != "":
try:
moveupfunctionscore = str(wash_single_urlarg(urlarg=moveupfunctionscore, argreqdtype=int, argdefault=""))
except ValueError, e:
moveupfunctionscore = ""
if movedownfunctionstep != "":
try:
movedownfunctionstep = str(wash_single_urlarg(urlarg=movedownfunctionstep, argreqdtype=int, argdefault=""))
except ValueError, e:
movedownfunctionstep = ""
if movedownfunctionscore != "":
try:
movedownfunctionscore = str(wash_single_urlarg(urlarg=movedownfunctionscore, argreqdtype=int, argdefault=""))
except ValueError, e:
movedownfunctionscore = ""
## ensure that there is only one doctype for this doctype ID - simply display all doctypes with warning if not
if doctype in ("", None):
user_msg.append("""Unknown Document Type""")
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 1:
## there are multiple doctypes with this doctype ID:
## TODO : LOG ERROR
user_msg.append("""Multiple document types identified by "%s" exist - cannot configure at this time.""" \
% (doctype,))
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
elif numrows_doctype == 0:
## this doctype does not seem to exist:
user_msg.append("""The document type identified by "%s" doesn't exist - cannot configure at this time.""" \
% (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
## ensure that this submission exists for this doctype:
numrows_submission = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_submission > 1:
## there are multiple submissions for this doctype/action ID:
## TODO : LOG ERROR
user_msg.append("""The Submission "%s" seems to exist multiple times for the Document Type "%s" - cannot configure at this time.""" \
% (action, doctype))
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
elif numrows_submission == 0:
## this submission does not seem to exist for this doctype:
user_msg.append("""The Submission "%s" doesn't exist for the "%s" Document Type - cannot configure at this time.""" \
% (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## submission valid
if movefromfunctionname != "" and movefromfunctionstep != "" and movefromfunctionscore != "" and \
movetofunctionname != "" and movetofunctionstep != "" and movetofunctionscore != "":
## process moving the function by jumping it to another position
try:
move_submission_function_from_one_position_to_another_position(doctype=doctype, action=action,
movefuncname=movefromfunctionname,
movefuncfromstep=movefromfunctionstep,
movefuncfromscore=movefromfunctionscore,
movefunctostep=movetofunctionstep,
movefunctoscore=movetofunctionscore)
user_msg.append("""The function [%s] at step [%s], score [%s] was successfully moved."""\
% (movefromfunctionname, movefromfunctionstep, movefromfunctionscore))
except Exception, e:
## there was a problem
user_msg.append(str(e))
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype,
action=action,
user_msg=user_msg)
elif moveupfunctionname != "" and moveupfunctionstep != "" and moveupfunctionscore != "":
## process moving the function up one position
error_code = move_position_submissionfunction_up(doctype=doctype,
action=action,
function=moveupfunctionname,
funccurstep=moveupfunctionstep,
funccurscore=moveupfunctionscore)
if error_code == 0:
## success
user_msg.append("""The Function "%s" that was located at step %s, score %s, has been moved upwards""" \
% (moveupfunctionname, moveupfunctionstep, moveupfunctionscore))
else:
## could not move it
user_msg.append("""Unable to move the Function "%s" that is located at step %s, score %s""" \
% (moveupfunctionname, moveupfunctionstep, moveupfunctionscore))
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype,
action=action,
user_msg=user_msg)
elif movedownfunctionname != "" and movedownfunctionstep != "" and movedownfunctionscore != "":
## process moving the function down one position
error_code = move_position_submissionfunction_down(doctype=doctype,
action=action,
function=movedownfunctionname,
funccurstep=movedownfunctionstep,
funccurscore=movedownfunctionscore)
if error_code == 0:
## success
user_msg.append("""The Function "%s" that was located at step %s, score %s, has been moved downwards""" \
% (movedownfunctionname, movedownfunctionstep, movedownfunctionscore))
else:
## could not move it
user_msg.append("""Unable to move the Function "%s" that is located at step %s, score %s""" \
% (movedownfunctionname, movedownfunctionstep, movedownfunctionscore))
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype,
action=action,
user_msg=user_msg)
elif deletefunctionname != "" and deletefunctionstep != "" and deletefunctionscore != "":
## process deletion of function from the given position
(title, body) = _delete_submission_function(doctype=doctype, action=action, deletefunctionname=deletefunctionname,
deletefunctionstep=deletefunctionstep, deletefunctionscore=deletefunctionscore)
elif configuresubmissionaddfunction != "":
## display a form that allows the addition of a new WebSubmit function
(title, body) = _create_configure_doctype_submission_functions_add_function_form(doctype=doctype,
action=action)
elif configuresubmissionaddfunctioncommit != "":
## process the addition of the new WebSubmit function to the submission:
(title, body) = _add_function_to_submission(doctype=doctype, action=action, addfunctionname=addfunctionname,
addfunctionstep=addfunctionstep, addfunctionscore=addfunctionscore)
else:
## default - display functions for this submission
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype,
action=action,
movefromfunctionname=movefromfunctionname,
movefromfunctionstep=movefromfunctionstep,
movefromfunctionscore=movefromfunctionscore
)
return (title, body)
def _add_function_to_submission(doctype, action, addfunctionname, addfunctionstep, addfunctionscore):
"""Process the addition of a function to a submission.
The user can decide in which step and at which score to insert the function.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param addfunctionname: (string) the name of the function to be added to the submission
@param addfunctionstep: (integer) the step at which the function is to be added
@param addfunctionscore: (integer) the score at which the function is to be added
@return: a tuple containing 2 strings: (page-title, page-body)
"""
user_msg = []
if addfunctionname == "" or addfunctionstep == "" or addfunctionscore == "":
## invalid details!
user_msg.append("""Invalid function coordinates supplied!""")
(title, body) = _create_configure_doctype_submission_functions_add_function_form(doctype=doctype,
action=action,
user_msg=user_msg)
return (title, body)
try:
if int(addfunctionstep) < 1 or int(addfunctionscore) < 1:
## invalid details!
user_msg.append("""Invalid function step and/or score!""")
(title, body) = _create_configure_doctype_submission_functions_add_function_form(doctype=doctype,
action=action,
user_msg=user_msg)
return (title, body)
except ValueError:
user_msg.append("""Invalid function step and/or score!""")
(title, body) = _create_configure_doctype_submission_functions_add_function_form(doctype=doctype,
action=action,
user_msg=user_msg)
return (title, body)
try:
insert_function_into_submission_at_step_and_score_then_regulate_scores_of_functions_in_step(doctype=doctype,
action=action,
function=addfunctionname,
step=addfunctionstep,
score=addfunctionscore)
except InvenioWebSubmitAdminWarningReferentialIntegrityViolation, e:
## Function didn't exist in WebSubmit! Not added to submission.
user_msg.append(str(e))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_submission_functions_add_function_form(doctype=doctype,
action=action,
addfunctionstep=addfunctionstep,
addfunctionscore=addfunctionscore,
user_msg=user_msg)
return (title, body)
except InvenioWebSubmitAdminWarningInsertFailed, e:
## insert failed - some functions within the step may have been corrupted!
user_msg.append(str(e))
## TODO : LOG ERROR
(title, body) = \
_create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## when regulating the scores of functions within the step, could not delete some or all of the functions
## within the step that the function was added to. Some functions may have been lost!
user_msg.append(str(e))
## TODO : LOG ERROR
(title, body) = \
_create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
## Successfully added
user_msg.append("""The function [%s] has been added to submission [%s] at step [%s], score [%s]."""\
% (addfunctionname, "%s%s" % (action, doctype), addfunctionstep, addfunctionscore))
(title, body) = \
_create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
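The insert-then-regulate behaviour relied on above (via `insert_function_into_submission_at_step_and_score_then_regulate_scores_of_functions_in_step`) can be sketched in isolation. This is a hypothetical, database-free model using plain lists of `(name, score)` tuples, not the actual WebSubmit implementation:

```python
def insert_and_regulate(functions, newfunc, score):
    """Model of inserting a function at a given score within a step, then
    re-scoring every function in the step to ascending multiples of 10.
    'functions' is a list of (name, score) tuples sorted by score."""
    # place the new function before the first entry with an equal or higher score
    inserted = False
    names = []
    for name, oldscore in functions:
        if not inserted and oldscore >= score:
            names.append(newfunc)
            inserted = True
        names.append(name)
    if not inserted:
        names.append(newfunc)
    # regulate: re-score the whole step to 10, 20, 30, ...
    return [(name, (idx + 1) * 10) for idx, name in enumerate(names)]
```

Re-scoring to multiples of 10 leaves gaps between neighbours, so a later insertion at an intermediate score (say 15) always has room before the regulation pass runs again.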
def _delete_submission_function(doctype, action, deletefunctionname, deletefunctionstep, deletefunctionscore):
"""Delete a submission function from a given submission. Re-order all functions below it (within the same step)
to fill the gap left by the deleted function.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param deletefunctionname: (string) the name of the function to be deleted from the submission
@param deletefunctionstep: (string) the step of the function to be deleted from the submission
@param deletefunctionscore: (string) the score of the function to be deleted from the submission
@return: tuple containing 2 strings: (page-title, page-body)
"""
user_msg = []
## first, delete the function:
try:
delete_function_at_step_and_score_from_submission(doctype=doctype, action=action,
function=deletefunctionname, step=deletefunctionstep,
score=deletefunctionscore)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## unable to delete function - error message and return
user_msg.append("""Unable to delete function [%s] at step [%s], score [%s] from submission [%s]""" \
% (deletefunctionname, deletefunctionstep, deletefunctionscore, "%s%s" % (action, doctype)))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
## now, correct the scores of all functions in the step from which a function was just deleted:
try:
regulate_score_of_all_functions_in_step_to_ascending_multiples_of_10_for_submission(doctype=doctype,
action=action,
step=deletefunctionstep)
except InvenioWebSubmitAdminWarningDeleteFailed, e:
## could not delete the functions before re-inserting them with regulated scores
user_msg.append("""Deleted function [%s] at step [%s], score [%s] from submission [%s], but could not re-order""" \
""" scores of remaining functions within step [%s]""" \
% (deletefunctionname, deletefunctionstep, deletefunctionscore,
"%s%s" % (action, doctype), deletefunctionstep))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
## update submission "last-modification" date:
update_modification_date_for_submission(doctype=doctype, action=action)
## success message:
user_msg.append("""Successfully deleted function [%s] at step [%s], score [%s] from submission [%s]""" \
% (deletefunctionname, deletefunctionstep, deletefunctionscore, "%s%s" % (action, doctype)))
## TODO : LOG function Deletion
(title, body) = _create_configure_doctype_submission_functions_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
def perform_request_configure_doctype_submissionpage_preview(doctype, action, pagenum):
"""Display a preview of a Submission Page and its fields.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the submission page to be previewed
@return: a tuple of 2 strings: (page-title, page-body)
"""
body = ""
user_msg = []
try:
pagenum = str(pagenum)
except ValueError:
pagenum = ""
if pagenum != "":
try:
pagenum = str(wash_single_urlarg(urlarg=pagenum, argreqdtype=int, argdefault=""))
except ValueError, e:
pagenum = ""
## ensure that the page number for this submission is valid:
num_pages_submission = get_numbersubmissionpages_doctype_action(doctype=doctype, action=action)
try:
if not (int(pagenum) > 0 and int(pagenum) <= num_pages_submission):
user_msg.append("Invalid page number - out of range")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
except ValueError:
## invalid page number
user_msg.append("Invalid page number - must be an integer value!")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
## get details of all fields on submission page:
fields = get_details_and_description_of_all_fields_on_submissionpage(doctype=doctype, action=action, pagenum=pagenum)
## ensure all values for each field are strings:
string_fields = []
for field in fields:
string_fields.append(stringify_list_elements(field))
title = """A preview of Page %s of the %s Submission""" % (pagenum, "%s%s" % (action, doctype))
body = websubmitadmin_templates.tmpl_configuredoctype_display_submissionpage_preview(doctype=doctype,
action=action,
pagenum=pagenum,
fields=string_fields)
return (title, body)
def perform_request_configure_doctype_submissionpage_elements(doctype, action, pagenum, movefieldfromposn="",
movefieldtoposn="", deletefieldposn="", editfieldposn="",
editfieldposncommit="", fieldname="", fieldtext="", fieldlevel="",
fieldshortdesc="", fieldcheck="", addfield="", addfieldcommit=""):
"""Process requests relating to the elements of a particular submission page"""
body = ""
user_msg = []
try:
pagenum = str(pagenum)
except ValueError:
pagenum = ""
if pagenum != "":
try:
pagenum = str(wash_single_urlarg(urlarg=pagenum, argreqdtype=int, argdefault=""))
except ValueError, e:
pagenum = ""
if fieldname != "":
try:
fieldname = wash_single_urlarg(urlarg=fieldname, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=fieldname) == 0:
fieldname = ""
except ValueError, e:
fieldname = ""
if fieldtext != "":
try:
fieldtext = wash_single_urlarg(urlarg=fieldtext, argreqdtype=str, argdefault="")
except ValueError, e:
fieldtext = ""
if fieldlevel != "":
try:
fieldlevel = wash_single_urlarg(urlarg=fieldlevel, argreqdtype=str, argdefault="O", maxstrlen=1, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=fieldlevel) == 0:
fieldlevel = "O"
if fieldlevel not in ("m", "M", "o", "O"):
fieldlevel = "O"
fieldlevel = fieldlevel.upper()
except ValueError, e:
fieldlevel = "O"
if fieldshortdesc != "":
try:
fieldshortdesc = wash_single_urlarg(urlarg=fieldshortdesc, argreqdtype=str, argdefault="")
except ValueError, e:
fieldshortdesc = ""
if fieldcheck != "":
try:
fieldcheck = wash_single_urlarg(urlarg=fieldcheck, argreqdtype=str, argdefault="", maxstrlen=15, minstrlen=1)
if string_is_alphanumeric_including_underscore(txtstring=fieldcheck) == 0:
fieldcheck = ""
except ValueError, e:
fieldcheck = ""
if editfieldposn != "":
try:
editfieldposn = str(wash_single_urlarg(urlarg=editfieldposn, argreqdtype=int, argdefault=""))
except ValueError, e:
editfieldposn = ""
if deletefieldposn != "":
try:
deletefieldposn = str(wash_single_urlarg(urlarg=deletefieldposn, argreqdtype=int, argdefault=""))
except ValueError, e:
deletefieldposn = ""
if movefieldfromposn != "":
try:
movefieldfromposn = str(wash_single_urlarg(urlarg=movefieldfromposn, argreqdtype=int, argdefault=""))
except ValueError, e:
movefieldfromposn = ""
if movefieldtoposn != "":
try:
movefieldtoposn = str(wash_single_urlarg(urlarg=movefieldtoposn, argreqdtype=int, argdefault=""))
except ValueError, e:
movefieldtoposn = ""
## ensure that there is only one doctype for this doctype ID - simply display all doctypes with warning if not
if doctype in ("", None):
user_msg.append("""Unknown Document Type""")
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 1:
## there are multiple doctypes with this doctype ID:
## TODO : LOG ERROR
user_msg.append("""Multiple document types identified by "%s" exist - cannot configure at this time.""" \
% (doctype,))
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
elif numrows_doctype == 0:
## this doctype does not seem to exist:
user_msg.append("""The document type identified by "%s" doesn't exist - cannot configure at this time.""" \
% (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
## ensure that this submission exists for this doctype:
numrows_submission = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_submission > 1:
## there are multiple submissions for this doctype/action ID:
## TODO : LOG ERROR
user_msg.append("""The Submission "%s" seems to exist multiple times for the Document Type "%s" - cannot configure at this time.""" \
% (action, doctype))
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
elif numrows_submission == 0:
## this submission does not seem to exist for this doctype:
user_msg.append("""The Submission "%s" doesn't exist for the "%s" Document Type - cannot configure at this time.""" \
% (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## ensure that the page number for this submission is valid:
num_pages_submission = get_numbersubmissionpages_doctype_action(doctype=doctype, action=action)
try:
if not (int(pagenum) > 0 and int(pagenum) <= num_pages_submission):
user_msg.append("Invalid page number - out of range")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
except ValueError:
## invalid page number
user_msg.append("Invalid page number - must be an integer value!")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
## submission valid
if editfieldposn != "" and editfieldposncommit == "":
## display form for editing field
(title, body) = _configure_doctype_edit_field_on_submissionpage_display_field_details(doctype=doctype, action=action,
pagenum=pagenum, fieldposn=editfieldposn)
elif editfieldposn != "" and editfieldposncommit != "":
## commit changes to element
(title, body) = _configure_doctype_edit_field_on_submissionpage(doctype=doctype, action=action,
pagenum=pagenum, fieldposn=editfieldposn, fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc, fieldcheck=fieldcheck)
elif movefieldfromposn != "" and movefieldtoposn != "":
## move a field
(title, body) = _configure_doctype_move_field_on_submissionpage(doctype=doctype,
action=action, pagenum=pagenum, movefieldfromposn=movefieldfromposn,
movefieldtoposn=movefieldtoposn)
elif addfield != "":
## request to add a new field to a page - display form
(title, body) = _configure_doctype_add_field_to_submissionpage_display_form(doctype=doctype, action=action, pagenum=pagenum)
elif addfieldcommit != "":
## commit a new field to the page
(title, body) = _configure_doctype_add_field_to_submissionpage(doctype=doctype, action=action,
pagenum=pagenum, fieldname=fieldname, fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc, fieldcheck=fieldcheck)
elif deletefieldposn != "":
## user wishes to delete a field from the page:
(title, body) = _configure_doctype_delete_field_from_submissionpage(doctype=doctype,
action=action, pagenum=pagenum, fieldnum=deletefieldposn)
else:
## default visit to page - list its elements:
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action,
pagenum=pagenum, movefieldfromposn=movefieldfromposn)
return (title, body)
def stringify_list_elements(elementslist):
    """Cast each element of a list (or tuple) to a string and return the new list.
    @param elementslist: (list) the elements to be cast to strings
    @return: (list) of strings - the string representations of the elements
    """
    o = []
    for el in elementslist:
        o.append(str(el))
    return o
def _configure_doctype_edit_field_on_submissionpage(doctype, action, pagenum, fieldposn,
fieldtext, fieldlevel, fieldshortdesc, fieldcheck):
"""Perform an update to the details of a field on a submission page.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the page on which the element to be updated is found
    @param fieldposn: (integer) the numeric position of the field to be edited
@param fieldtext: (string) the text label displayed with a field
@param fieldlevel: (char) M or O (whether the field is mandatory or optional)
@param fieldshortdesc: (string) the short description of a field
@param fieldcheck: (string) the name of a JavaScript check to be applied to a field
@return: a tuple containing 2 strings - (page-title, page-body)
"""
user_msg = []
if fieldcheck not in ("", None):
## ensure check exists:
checkres = get_number_jschecks_with_chname(chname=fieldcheck)
if checkres < 1:
user_msg.append("The Check '%s' does not exist in WebSubmit - changes to field not saved" % (fieldcheck,))
(title, body) = _configure_doctype_edit_field_on_submissionpage_display_field_details(doctype=doctype, action=action,
pagenum=pagenum, fieldposn=fieldposn,
fieldtext=fieldtext, fieldlevel=fieldlevel,
fieldshortdesc=fieldshortdesc, user_msg=user_msg)
return (title, body)
try:
update_details_of_a_field_on_a_submissionpage(doctype=doctype, action=action, pagenum=pagenum, fieldposn=fieldposn,
fieldtext=fieldtext, fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck)
user_msg.append("The details of the field at position %s have been updated successfully" % (fieldposn,))
update_modification_date_for_submission(doctype=doctype, action=action)
except InvenioWebSubmitAdminWarningTooManyRows, e:
## multiple rows found at page position - not safe to edit:
user_msg.append("Unable to update details of field at position %s on submission page %s - multiple fields found at this position" \
% (fieldposn, pagenum))
## TODO : LOG WARNING
except InvenioWebSubmitAdminWarningNoRowsFound, e:
## field not found - cannot edit
user_msg.append("Unable to update details of field at position %s on submission page %s - field doesn't seem to exist there!" \
% (fieldposn, pagenum))
## TODO : LOG WARNING
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action, pagenum=pagenum, user_msg=user_msg)
return (title, body)
def _configure_doctype_edit_field_on_submissionpage_display_field_details(doctype, action, pagenum, fieldposn,
fieldtext=None, fieldlevel=None, fieldshortdesc=None,
fieldcheck=None, user_msg=""):
"""Display a form used to edit a field on a submission page.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the page on which the element to be updated is found
    @param fieldposn: (integer) the numeric position of the field to be edited
@param fieldtext: (string) the text label displayed with a field
@param fieldlevel: (char) M or O (whether the field is mandatory or optional)
@param fieldshortdesc: (string) the short description of a field
@param fieldcheck: (string) the name of a JavaScript check to be applied to a field
@param user_msg: (list of strings, or string) any warning/error message to be displayed to the user
@return: a tuple containing 2 strings (page-title, page-body)
"""
if type(user_msg) not in (list, tuple) or user_msg == "":
user_msg = []
## get a list of all check names:
checks_res = get_all_jscheck_names()
allchecks=[]
for check in checks_res:
allchecks.append((check,))
## get the details for the field to be edited:
fielddets = get_details_of_field_at_positionx_on_submissionpage(doctype=doctype, action=action, pagenum=pagenum, fieldposition=fieldposn)
if len(fielddets) < 1:
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action, pagenum=pagenum)
return (title, body)
fieldname = str(fielddets[2])
if fieldtext is not None:
fieldtext = str(fieldtext)
else:
fieldtext = str(fielddets[3])
if fieldlevel is not None:
fieldlevel = str(fieldlevel)
else:
fieldlevel = str(fielddets[4])
if fieldshortdesc is not None:
fieldshortdesc = str(fieldshortdesc)
else:
fieldshortdesc = str(fielddets[5])
if fieldcheck is not None:
fieldcheck = str(fieldcheck)
else:
fieldcheck = str(fielddets[6])
cd = str(fielddets[7])
md = str(fielddets[8])
title = """Edit the %(fieldname)s field as it appears at position %(fieldnum)s on Page %(pagenum)s of the %(submission)s Submission""" \
% { 'fieldname' : fieldname, 'fieldnum' : fieldposn, 'pagenum' : pagenum, 'submission' : "%s%s" % (action, doctype) }
body = websubmitadmin_templates.tmpl_configuredoctype_edit_submissionfield(doctype=doctype,
action=action,
pagenum=pagenum,
fieldnum=fieldposn,
fieldname=fieldname,
fieldtext=fieldtext,
fieldlevel=fieldlevel,
fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck,
cd=cd,
md=md,
allchecks=allchecks,
user_msg=user_msg)
return (title, body)
def _configure_doctype_add_field_to_submissionpage(doctype, action, pagenum, fieldname="",
fieldtext="", fieldlevel="", fieldshortdesc="", fieldcheck=""):
"""Add a field to a submission page.
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (integer) the number of the page on which the element to be updated is found
@param fieldname: (string) the name of the field to be added to the page
@param fieldtext: (string) the text label displayed with a field
@param fieldlevel: (char) M or O (whether the field is mandatory or optional)
@param fieldshortdesc: (string) the short description of a field
@param fieldcheck: (string) the name of a JavaScript check to be applied to a field
@return: a tuple containing 2 strings - (page-title, page-body)
"""
user_msg = []
## ensure that a field "fieldname" actually exists:
    if fieldname == "":
        ## no field was chosen to be added to the page - cannot continue
        user_msg.append("""No field was chosen for addition to the page - cannot add""")
(title, body) = _configure_doctype_add_field_to_submissionpage_display_form(doctype, action, pagenum,
fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck, user_msg=user_msg)
return (title, body)
numelements_elname = get_number_elements_with_elname(elname=fieldname)
if numelements_elname < 1:
## the field to be added has no element description in the WebSubmit DB - cannot add
user_msg.append("""The field that you have chosen to add (%s) does not seem to exist in WebSubmit - cannot add""" % (fieldname,))
(title, body) = _configure_doctype_add_field_to_submissionpage_display_form(doctype, action, pagenum,
fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck, user_msg=user_msg)
return (title, body)
## if fieldcheck has been provided, ensure that it is a valid check name:
if fieldcheck not in ("", None):
## ensure check exists:
checkres = get_number_jschecks_with_chname(chname=fieldcheck)
if checkres < 1:
user_msg.append("The Check '%s' does not exist in WebSubmit - new field not saved to page" % (fieldcheck,))
(title, body) = _configure_doctype_add_field_to_submissionpage_display_form(doctype, action, pagenum,
fieldname=fieldname, fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc,
user_msg=user_msg)
return (title, body)
## now add the new field to the page:
try:
insert_field_onto_submissionpage(doctype=doctype, action=action, pagenum=pagenum, fieldname=fieldname, fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc, fieldcheck=fieldcheck)
user_msg.append("""Successfully added the field "%s" to the last position on page %s of submission %s""" \
% (fieldname, pagenum, "%s%s" % (action, doctype)))
update_modification_date_for_submission(doctype=doctype, action=action)
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action, pagenum=pagenum, user_msg=user_msg)
except InvenioWebSubmitAdminWarningInsertFailed, e:
## the insert of the new field failed for some reason
## TODO : LOG ERROR
user_msg.append("""Couldn't add the field "%s" to page %s of submission %s - please try again""" \
% (fieldname, pagenum, "%s%s" % (action, doctype)))
(title, body) = _configure_doctype_add_field_to_submissionpage_display_form(doctype, action, pagenum,
fieldname=fieldname, fieldtext=fieldtext,
fieldlevel=fieldlevel, fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck, user_msg=user_msg)
return (title, body)
def _configure_doctype_add_field_to_submissionpage_display_form(doctype, action, pagenum, fieldname="", fieldtext="",
                                                                fieldlevel="", fieldshortdesc="", fieldcheck="", user_msg=""):
    """Display a form used to add a new field to a submission page.
    @param doctype: (string) the unique ID of a document type
    @param action: (string) the unique ID of an action
    @param pagenum: (integer) the number of the page to which the field is to be added
    @param fieldname: (string) the name of the field to be added
    @param fieldtext: (string) the text label displayed with the field
    @param fieldlevel: (char) M or O (whether the field is mandatory or optional)
    @param fieldshortdesc: (string) the short description of the field
    @param fieldcheck: (string) the name of a JavaScript check to be applied to the field
    @param user_msg: (list of strings, or string) any warning/error message to be displayed to the user
    @return: a tuple containing 2 strings - (page-title, page-body)
    """
title = """Add a Field to Page %(pagenum)s of the %(submission)s Submission""" \
% { 'pagenum' : pagenum, 'submission' : "%s%s" % (action, doctype) }
## sanity checking:
if type(user_msg) not in (list, tuple) or user_msg == "":
user_msg = []
## get a list of all check names:
checks_res = get_all_jscheck_names()
allchecks=[]
for check in checks_res:
allchecks.append((check,))
## get a list of all WebSubmit element names:
elements_res = get_all_element_names()
allelements = []
for el in elements_res:
allelements.append((el,))
## get form:
body = websubmitadmin_templates.tmpl_configuredoctype_add_submissionfield(doctype=doctype,
action=action,
pagenum=pagenum,
fieldname=fieldname,
fieldtext=fieldtext,
fieldlevel=fieldlevel,
fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck,
allchecks=allchecks,
allelements=allelements,
user_msg=user_msg)
return (title, body)
def _configure_doctype_move_field_on_submissionpage(doctype, action, pagenum, movefieldfromposn, movefieldtoposn):
    """Move a field from one position to another on a given submission page.
    @param doctype: (string) the unique ID of a document type
    @param action: (string) the unique ID of an action
    @param pagenum: (integer) the number of the page on which the field is to be moved
    @param movefieldfromposn: (integer) the current position of the field on the page
    @param movefieldtoposn: (integer) the position on the page to which the field is to be moved
    @return: a tuple containing 2 strings - (page-title, page-body)
    """
user_msg = []
_ = gettext_set_language(CFG_SITE_LANG)
movefield_res = move_field_on_submissionpage_from_positionx_to_positiony(doctype=doctype, action=action, pagenum=pagenum,
movefieldfrom=movefieldfromposn, movefieldto=movefieldtoposn)
if movefield_res == 1:
## invalid field numbers
try:
raise InvenioWebSubmitWarning(_('Unable to move field at position %s to position %s on page %s of submission \'%s%s\' - Invalid Field Position Numbers') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_INVALID_FIELD_NUMBERS_SUPPLIED_WHEN_TRYING_TO_MOVE_FIELD_ON_SUBMISSION_PAGE', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype)))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s - field position numbers invalid""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
elif movefield_res == 2:
## failed to swap 2 fields - couldn't move field1 to temp position
try:
            raise InvenioWebSubmitWarning(_('Unable to swap field at position %s with field at position %s on page %s of submission \'%s%s\' - could not move field at position %s to temporary field location') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype, movefieldfromposn))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_TEMP_POSITION', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype)))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
elif movefield_res == 3:
## failed to swap 2 fields on submission page - couldn't move field2 to field1 position
try:
            raise InvenioWebSubmitWarning(_('Unable to swap field at position %s with field at position %s on page %s of submission \'%s%s\' - could not move field at position %s to position %s. Please ask Admin to check that a field was not stranded in a temporary position') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype, movefieldtoposn, movefieldfromposn))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD2_TO_FIELD1_POSITION', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype), movefieldtoposn, movefieldfromposn))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s - See Admin if field order is broken""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
elif movefield_res == 4:
        ## failed to swap 2 fields on submission page - couldn't swap field at temp position to field2 position
        try:
            raise InvenioWebSubmitWarning(_('Unable to swap field at position %s with field at position %s on page %s of submission \'%s%s\' - could not move field that was located at position %s to position %s from temporary position. Field is now stranded in temporary position and must be corrected manually by an Admin') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype, movefieldfromposn, movefieldtoposn))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_SWAP_TWO_FIELDS_ON_SUBMISSION_PAGE_COULDNT_MOVE_FIELD1_TO_POSITION_FIELD2_FROM_TEMPORARY_POSITION', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype), movefieldfromposn, movefieldtoposn))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s - Field-order is now broken and must be corrected by Admin""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
elif movefield_res == 5:
## failed to decrement the position of all fields below the field that was moved to a temp position
try:
            raise InvenioWebSubmitWarning(_('Unable to move field at position %s to position %s on page %s of submission \'%s%s\' - could not decrement the position of the fields below position %s. Tried to recover - please check that field ordering is not broken') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype, movefieldfromposn))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_MOVE_FIELD_TO_NEW_POSITION_ON_SUBMISSION_PAGE_COULDNT_DECREMENT_POSITION_OF_FIELDS_BELOW_FIELD1', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype), movefieldfromposn))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s - See Admin if field-order is broken""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
elif movefield_res == 6:
## failed to increment position of fields in and below position into which 'movefromfieldposn' is to be inserted
try:
raise InvenioWebSubmitWarning(_('Unable to move field at position %s to position %s on page %s of submission %s%s - could not increment the position of the fields at and below position %s. The field that was at position %s is now stranded in a temporary position.') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype, movefieldtoposn, movefieldfromposn))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_MOVE_FIELD_TO_NEW_POSITION_ON_SUBMISSION_PAGE_COULDNT_INCREMENT_POSITION_OF_FIELDS_AT_AND_BELOW_FIELD2', \
#movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype), movefieldtoposn, movefieldfromposn))
user_msg.append("""Unable to move field from position %s to position %s on page %s of submission %s%s - Field-order is now broken and must be corrected by Admin""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
else:
## successful update:
try:
raise InvenioWebSubmitWarning(_('Moved field from position %s to position %s on page %s of submission \'%s%s\'.') % (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_MOVED_FIELD_ON_SUBMISSION_PAGE', movefieldfromposn, movefieldtoposn, pagenum, "%s%s" % (action, doctype)))
user_msg.append("""Successfully moved field from position %s to position %s on page %s of submission %s%s""" \
% (movefieldfromposn, movefieldtoposn, pagenum, action, doctype))
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action, pagenum=pagenum, user_msg=user_msg)
return (title, body)
def _configure_doctype_delete_field_from_submissionpage(doctype, action, pagenum, fieldnum):
"""Delete a field from a submission page"""
_ = gettext_set_language(CFG_SITE_LANG)
user_msg = []
del_res = delete_a_field_from_submissionpage_then_reorder_fields_below_to_fill_vacant_position(doctype=doctype,
action=action,
pagenum=pagenum,
fieldposn=fieldnum)
if del_res == 1:
try:
            raise InvenioWebSubmitWarning(_('Unable to delete field at position %s from page %s of submission \'%s%s\'') % (fieldnum, pagenum, action, doctype))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_UNABLE_TO_DELETE_FIELD_FROM_SUBMISSION_PAGE', fieldnum, pagenum, "%s%s" % (action, doctype)))
user_msg.append("Unable to delete field at position %s from page number %s of submission %s%s" % (fieldnum, pagenum, action, doctype))
else:
## deletion was OK
user_msg.append("Field deleted")
try:
            raise InvenioWebSubmitWarning(_('Deleted field at position %s from page %s of submission \'%s%s\'') % (fieldnum, pagenum, action, doctype))
except InvenioWebSubmitWarning, exc:
register_exception(stream='warning')
#warnings.append(exc.message)
#warnings.append(('WRN_WEBSUBMITADMIN_DELETED_FIELD_FROM_SUBMISSION_PAGE', fieldnum, pagenum, "%s%s" % (action, doctype)))
(title, body) = _create_configure_doctype_submission_page_elements_form(doctype=doctype, action=action, pagenum=pagenum, user_msg=user_msg)
return (title, body)
def _create_configure_doctype_submission_page_elements_form(doctype, action, pagenum, movefieldfromposn="", user_msg=""):
    """Create a form listing the fields found on a given submission page.
    @param doctype: (string) the unique ID of a document type
    @param action: (string) the unique ID of an action
    @param pagenum: (integer) the number of the page whose fields are to be listed
    @param movefieldfromposn: (integer) the position of a field currently being moved, if any
    @param user_msg: (list of strings, or string) any warning/error message to be displayed to the user
    @return: a tuple containing 2 strings - (page-title, page-body)
    """
    ## get the list of fields on the page:
title = """Submission Elements found on Page %s of the "%s" Submission of the "%s" Document Type:"""\
% (pagenum, action, doctype)
body = ""
raw_page_elements = get_details_allsubmissionfields_on_submission_page(doctype=doctype, action=action, pagenum=pagenum)
## correctly stringify page elements for the template:
page_elements = []
for element in raw_page_elements:
page_elements.append(stringify_list_elements(element))
body = websubmitadmin_templates.tmpl_configuredoctype_list_submissionelements(doctype=doctype,
action=action,
pagenum=pagenum,
page_elements=page_elements,
movefieldfromposn=movefieldfromposn,
user_msg=user_msg)
return (title, body)
def perform_request_configure_doctype_submissionpages(doctype,
action,
pagenum="",
movepage="",
movepagedirection="",
deletepage="",
deletepageconfirm="",
addpage=""):
"""Process requests relating to the submission pages of a doctype/submission"""
body = ""
user_msg = []
try:
pagenum = int(pagenum)
except ValueError:
pagenum = ""
## ensure that there is only one doctype for this doctype ID - simply display all doctypes with warning if not
if doctype in ("", None):
user_msg.append("""Unknown Document Type""")
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
numrows_doctype = get_number_doctypes_docid(docid=doctype)
if numrows_doctype > 1:
## there are multiple doctypes with this doctype ID:
## TODO : LOG ERROR
user_msg.append("""Multiple document types identified by "%s" exist - cannot configure at this time.""" \
% (doctype,))
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
elif numrows_doctype == 0:
## this doctype does not seem to exist:
user_msg.append("""The document type identified by "%s" doesn't exist - cannot configure at this time.""" \
% (doctype,))
## TODO : LOG ERROR
all_doctypes = get_docid_docname_alldoctypes()
body = websubmitadmin_templates.tmpl_display_alldoctypes(doctypes=all_doctypes, user_msg=user_msg)
title = "Available WebSubmit Document Types"
return (title, body)
## ensure that this submission exists for this doctype:
numrows_submission = get_number_submissions_doctype_action(doctype=doctype, action=action)
if numrows_submission > 1:
## there are multiple submissions for this doctype/action ID:
## TODO : LOG ERROR
user_msg.append("""The Submission "%s" seems to exist multiple times for the Document Type "%s" - cannot configure at this time.""" \
% (action, doctype))
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
elif numrows_submission == 0:
## this submission does not seem to exist for this doctype:
user_msg.append("""The Submission "%s" doesn't exist for the "%s" Document Type - cannot configure at this time.""" \
% (action, doctype))
## TODO : LOG ERROR
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
## submission valid
if addpage != "":
## add a new page to a submission:
error_code = add_submission_page_doctype_action(doctype=doctype, action=action)
if error_code == 0:
## success
user_msg.append("""A new Submission Page has been added into the last position""")
else:
## could not move it
user_msg.append("""Unable to add a new Submission Page""")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
elif movepage != "":
## user wants to move a page upwards in the order
(title, body) = _configure_doctype_move_submission_page(doctype=doctype,
action=action, pagenum=pagenum, direction=movepagedirection)
elif deletepage != "":
## user wants to delete a page:
if deletepageconfirm != "":
## confirmation of deletion has been provided - proceed
(title, body) = _configure_doctype_delete_submission_page(doctype=doctype,
action=action, pagenum=pagenum)
else:
## user has not yet confirmed the deletion of a page - prompt for confirmation
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
deletepagenum=pagenum)
else:
## default - display details of submission pages for this submission:
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action)
return (title, body)
def _configure_doctype_move_submission_page(doctype, action, pagenum, direction):
    """Move a submission page up or down in the page order of a submission.
    @param doctype: (string) the unique ID of a document type
    @param action: (string) the unique ID of an action
    @param pagenum: (integer) the number of the page to be moved
    @param direction: (string) the direction in which to move the page ("up" or "down")
    @return: a tuple containing 2 strings - (page-title, page-body)
    """
user_msg = []
## Sanity checking:
if direction.lower() not in ("up", "down"):
## invalid direction:
        user_msg.append("""Invalid page-movement direction - no action was taken""")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
return (title, body)
## swap the pages:
if direction.lower() == "up":
error_code = swap_elements_adjacent_pages_doctype_action(doctype=doctype, action=action,
page1=pagenum, page2=pagenum-1)
else:
error_code = swap_elements_adjacent_pages_doctype_action(doctype=doctype, action=action,
page1=pagenum, page2=pagenum+1)
if error_code == 0:
## pages swapped successfully:
## TODO : LOG PAGE SWAP
        user_msg.append("""Page %s was successfully moved %swards""" % (pagenum, direction.lower()))
elif error_code == 1:
## pages are not adjacent:
user_msg.append("""Unable to move page - only adjacent pages can be swapped around""")
    elif error_code == 2:
        ## at least one page was out of the legal range (e.g. an attempt was made to move a page
        ## to a position higher or lower than the number of pages):
        user_msg.append("""Unable to move page - the requested position was out of range""")
elif error_code in (3, 4):
## Some sort of problem moving fields around!
## TODO : LOG ERROR
user_msg.append("""Error: there was a problem swapping the submission elements to their new pages.""")
user_msg.append("""An attempt was made to return the elements to their original pages - you """\
"""should verify that this was successful, or ask your administrator"""\
""" to fix the problem manually""")
elif error_code == 5:
## the elements from the first page were left stranded in the temporary page!
## TODO : LOG ERROR
user_msg.append("""Error: there was a problem swapping the submission elements to their new pages.""")
user_msg.append("""Some elements were left stranded on a temporary page. Please ask your administrator to"""\
""" fix this problem manually""")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype, action=action, user_msg=user_msg)
return (title, body)
def _configure_doctype_delete_submission_page(doctype, action, pagenum):
    """Delete a page (and all of its fields) from a submission, renumbering the pages above it.
    @param doctype: (string) the unique ID of a document type
    @param action: (string) the unique ID of an action
    @param pagenum: (integer) the number of the page to be deleted
    @return: a tuple containing 2 strings - (page-title, page-body)
    """
user_msg = []
num_pages = get_numbersubmissionpages_doctype_action(doctype=doctype, action=action)
if num_pages > 0:
## proceed with deletion
error_code = delete_allfields_submissionpage_doctype_action(doctype=doctype, action=action, pagenum=pagenum)
if error_code == 0:
## everything OK
## move elements from pages above the deleted page down by one page:
decrement_by_one_pagenumber_submissionelements_abovepage(doctype=doctype, action=action, frompage=pagenum)
## now decrement the number of pages associated with the submission:
error_code = decrement_by_one_number_submissionpages_doctype_action(doctype=doctype, action=action)
if error_code == 0:
## successfully deleted submission page
## TODO : LOG DELETION
user_msg.append("""Page number %s of Submission %s was successfully deleted."""\
% (pagenum, action))
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
else:
## error - either submission didn't exist, or multiple instances found
## TODO : LOG ERROR
user_msg.append("""The Submission elements were deleted from Page %s of the Submission "%s"."""\
""" However, it was not possible to delete the page itself."""\
% (pagenum, action))
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
else:
## unable to delete some or all fields from the page
## TODO : LOG ERROR
user_msg.append("""Error: Unable to delete some field elements from Page %s of Submission %s%s - """\
"""Page not deleted!""" % (pagenum, action, doctype))
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
elif num_pages == 0:
## no pages to delete for this submission
user_msg.append("""This Submission has no Pages - Cannot delete a Page!""")
(title, body) = _create_configure_doctype_submission_pages_form(doctype=doctype,
action=action,
user_msg=user_msg)
else:
## error - couldn't determine the number of pages for submission
## TODO : LOG ERROR
user_msg.append("""Unable to determine number of Submission Pages for Submission "%s" - """\
"""Cannot delete page %s"""\
% (action, pagenum))
(title, body) = _create_configure_doctype_form(doctype=doctype, user_msg=user_msg)
return (title, body)
def _create_configure_doctype_submission_pages_form(doctype,
action,
deletepagenum="",
user_msg=""):
"""Perform the necessary steps in order to display a list of the pages belonging to a given
submission of a given document type.
@param doctype: (string) the unique ID of the document type.
@param action: (string) the unique name/ID of the action.
@param user_msg: (string, or list) any message(s) to be displayed to the user.
@return: a tuple containing 2 strings - the page title and the page body.
"""
title = """Details of the Pages of the "%s" Submission of the "%s" Document Type:""" % (action, doctype)
submission_dets = get_cd_md_numbersubmissionpages_doctype_action(doctype=doctype, action=action)
if len(submission_dets) > 0:
cd = str(submission_dets[0][0])
md = str(submission_dets[0][1])
num_pages = submission_dets[0][2]
else:
(cd, md, num_pages) = ("", "", "0")
body = websubmitadmin_templates.tmpl_configuredoctype_list_submissionpages(doctype=doctype,
action=action,
number_pages=num_pages,
cd=cd,
md=md,
deletepagenum=deletepagenum,
user_msg=user_msg)
return (title, body)
diff --git a/invenio/legacy/websubmit/admin_templates.py b/invenio/legacy/websubmit/admin_templates.py
index 73d0c3c3f..ba4fd2ae4 100644
--- a/invenio/legacy/websubmit/admin_templates.py
+++ b/invenio/legacy/websubmit/admin_templates.py
@@ -1,3450 +1,3450 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import cgi
from invenio.config import CFG_SITE_URL, CFG_SITE_LANG
-from invenio.websubmitadmin_config import WEBSUBMITADMINURL, FUNCTIONS_WITH_FILE_PARAMS, WEBSUBMITADMINURL_OLD
+from invenio.legacy.websubmit.admin_config import WEBSUBMITADMINURL, FUNCTIONS_WITH_FILE_PARAMS, WEBSUBMITADMINURL_OLD
def create_html_table_from_tuple(tableheader=None, tablebody=None, start="", end=""):
"""Create a table from a tuple or a list.
@param tableheader: optional header for the columns (MUST be a list of header titles)
@param tablebody: table body (rows - tuple of tuples)
@param start: text to be added in the beginning, most likely beginning of a form
@param end: text to be added in the end, most likely end of a form.
"""
if tableheader is None:
tableheader = ()
if tablebody is None:
tablebody = ()
## determine table cells alignment based upon first row alignment
align = []
try:
if type(tablebody[0]) in [int, long]:
align = ['admintdright']
elif type(tablebody[0]) in [str, dict]:
align = ['admintdleft']
else:
for item in tablebody[0]:
if type(item) is int:
align.append('admintdright')
else:
align.append('admintdleft')
except IndexError:
## Empty tablebody
pass
## table header row:
tblstr = ""
for hdr in tableheader:
tblstr += """ <th class="adminheader">%s</th>\n""" % (hdr,)
if tblstr != "":
tblstr = """ <tr>\n%s</tr>\n""" % (tblstr, )
tblstr = start + """<table class="admin_wvar_nomargin">\n""" + tblstr
## table body
if len(tablebody) > 0:
j = 1
for row in tablebody:
j += 1
tblstr += """ <tr class="admin_row_highlight %s">\n""" % \
((j % 2) and 'admin_row_color' or '')
if type(row) not in [int, long, str, dict]:
for i in range(len(row)):
tblstr += """<td class="%s">%s</td>\n""" % (align[i], row[i])
else:
tblstr += """ <td class="%s">%s</td>\n""" % (align[0], row)
tblstr += """ </tr>\n"""
else:
# Empty tuple of table data - display message:
tblstr += """<tr>
<td class="admintdleft" colspan="%s"><span class="info">None</span></td>
</tr>
""" % (len(tableheader),)
tblstr += """</table>\n"""
tblstr += end
return tblstr
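The function above builds its table by right-aligning numeric cells and left-aligning the rest, escaping nothing itself. A stripped-down, Python 3 sketch of the same pattern (a hypothetical helper, not the Invenio function — it adds escaping and drops the CSS-class and form-wrapping details):

```python
import html

def simple_html_table(header, rows):
    """Render a header list and row tuples as an HTML table,
    right-aligning integer cells as the legacy code does."""
    out = ["<table>"]
    out.append("<tr>" + "".join("<th>%s</th>" % html.escape(str(h))
                                for h in header) + "</tr>")
    for row in rows:
        cells = []
        for cell in row:
            # numeric cells get a right-align class, everything else left
            cls = "right" if isinstance(cell, int) else "left"
            cells.append('<td class="%s">%s</td>' % (cls, html.escape(str(cell))))
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)
```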
def create_html_select_list(select_name, option_list, selected_values="", default_opt="", multiple="", list_size="", css_style="", css_class=""):
"""Make a HTML "select" element from the parameters passed.
@param select_name: Name given to the HTML "select" element
@param option_list: a tuple of tuples containing the options (their values, followed by their
display text). Thus: ( (opt_val, opt_txt), (opt_val, opt_txt) )
It is also possible to provide a tuple of single-element tuples in the case when it is not desirable
to have different option text to the value, thus: ( (opt_val,), (opt_val,) ).
@param selected_values: can be a list/tuple of strings, or a single string/unicode string. Treated as
the "selected" values for the select list's options. E.g. if a value in the "option_list" param was
"test", and the "selected_values" parameter contained "test", then the test option would appear as
follows: '<option value="test" selected>'.
@param default_opt: The default option (value and displayed text) for the select list. If left blank, there
will be no default option, and the first "natural" option will appear first in the list.
If the value of "default_opt" is a string, then this string will be used both as the value and the displayed
text of the default option. If the value of "default_opt" is a tuple/list, then the first option will be
used as the option "value", and the second will be used as the option "displayed text". In the case that
the list/tuple length is only 1, the value will be used for both option "value" and "displayed text".
@param multiple: shall this be a multiple select box? If present, select box will be marked as "multiple".
@param list_size: the size for a multiple select list. If multiple is present, then this optional size can
be provided. If not provided, the list size attribute will automatically be given the value of the list
length, up to 30.
@param css_style: A string: any additional CSS style information to be placed as the select element's "style"
attribute value.
@param css_class: A string: any class value for CSS.
@return: a string containing the completed HTML Select element
"""
## sanity checking:
if type(css_style) not in (str, unicode):
css_style = ""
if type(option_list) not in (list, tuple):
option_list = ()
txt = """\n <select name="%s"%s""" % ( cgi.escape(select_name, 1),
(multiple != "" and " multiple") or ("")
)
if multiple != "":
## Size attribute for multiple-select list
if (type(list_size) is str and list_size.isdigit()) or type(list_size) is int:
txt += """ size="%s\"""" % (list_size,)
else:
txt += """ size="%s\"""" % ( (len(option_list) <= 30 and str(len(option_list))) or ("30"),)
if css_style != "":
txt += """ style="%s\"""" % (cgi.escape(css_style, 1),)
txt += """>\n"""
if default_opt != "":
if type(default_opt) in (str, unicode):
## default_opt is a string - use its value as both option value and displayed text
txt += """ <option value="%(deflt_opt)s">%(deflt_opt)s</option>\n""" % {'deflt_opt' : cgi.escape(default_opt, 1)}
elif type(default_opt) in (list, tuple):
try:
txt += """ <option value="%(deflt_opt)s">""" % {'deflt_opt' : cgi.escape(default_opt[0], 1) }
try:
txt += """%(deflt_opt)s""" % {'deflt_opt' : cgi.escape(default_opt[1], 1) }
except IndexError:
txt += """%(deflt_opt)s""" % {'deflt_opt' : cgi.escape(default_opt[0], 1) }
txt += """</option>\n"""
except IndexError:
## seems to be an empty list - there will be no default opt
pass
for option in option_list:
try:
txt += """ <option value="%(option_val)s\"""" % { 'option_val' : cgi.escape(option[0], 1) }
if type(selected_values) in (list, tuple):
txt += """%(option_selected)s""" % \
{ 'option_selected' : (option[0] in selected_values and " selected") or ("") }
elif type(selected_values) in (str, unicode) and selected_values != "":
txt += """%(option_selected)s""" % \
{ 'option_selected' : (option[0] == selected_values and " selected") or ("") }
try:
txt += """>%(option_txt)s</option>\n""" % { 'option_txt' : cgi.escape(option[1], 1) }
except IndexError:
txt += """>%(option_txt)s</option>\n""" % { 'option_txt' : cgi.escape(option[0], 1) }
except IndexError:
## empty option tuple - skip
pass
txt += """ </select>\n"""
return txt
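The option-tuple convention documented above — `(value, text)` pairs, with single-element tuples falling back to the value as the display text, and `selected_values` marking options — can be sketched as a minimal Python 3 helper (illustrative only; the real function also handles multiple-select sizing, CSS, and a default option):

```python
import html

def simple_select(name, options, selected=()):
    """Build an HTML <select> from (value,) or (value, text) tuples."""
    parts = ['<select name="%s">' % html.escape(name, quote=True)]
    for opt in options:
        value = opt[0]
        # single-element tuple: reuse the value as the displayed text
        text = opt[1] if len(opt) > 1 else opt[0]
        sel = " selected" if value in selected else ""
        parts.append('<option value="%s"%s>%s</option>'
                     % (html.escape(value, quote=True), sel, html.escape(text)))
    parts.append("</select>")
    return "\n".join(parts)
```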
class Template:
"""Invenio Template class for creating Web interface"""
def tmpl_navtrail(self, ln=CFG_SITE_LANG):
"""display the navtrail, e.g.:
Home > Admin Area > WebSubmit Administration > Available WebSubmit Actions
@param ln: language
@return: html formatted navtrail
"""
return '<a class="navtrail" href="%s/help/admin">Admin Area</a> ' % (CFG_SITE_URL,)
def _create_adminbox(self, header="", datalist=[], cls="admin_wvar"):
"""Create an adminbox table around the main data on a page - row based.
@param header: the header for the "adminbox".
@param datalist: contents of the "body" to be encapsulated by the "adminbox".
@param cls: css-class to format the look of the table.
@return: the "adminbox" and its contents.
"""
if len(datalist) == 1:
per = "100"
else:
per = "75"
output = """
<table class="%s" width="95%%">
""" % (cls,)
output += """
<thead>
<tr>
<th class="adminheaderleft" colspan="%s">
%s
</th>
</tr>
</thead>
<tbody>""" % (len(datalist), header)
output += """
<tr>
<td style="vertical-align: top; margin-top: 5px; width: %s;">
%s
</td>
""" % (per+'%', datalist[0])
if len(datalist) > 1:
output += """
<td style="vertical-align: top; margin-top: 5px; width: %s;">
%s
</td>""" % ('25%', datalist[1])
output += """
</tr>
</tbody>
</table>
"""
return output
def _create_user_message_string(self, user_msg):
"""Create and return a string containing any message(s) to be shown to the user.
In particular, these messages are generally info/warning messages.
@param user_msg: The message to be shown to the user. This parameter can have either a
string value (in the case where one message is to be shown to the user), or a list/tuple
value, where each value in the list is a string containing the message to be shown to the
user.
@return: EITHER: a string containing a HTML "DIV" section, which contains the message(s) to be
displayed to the user. In the case where there were multiple messages, each message will be
placed on its own line, by means of a "<br />" tag.
OR: an empty string - in the case that the parameter "user_msg" was an empty string.
"""
user_msg_str = ""
user_msg_str_end = ""
if type(user_msg) in (str, unicode):
if user_msg == "":
user_msg = ()
else:
user_msg = (user_msg,)
if len(user_msg) > 0:
user_msg_str += """<div align="center">\n"""
user_msg_str_end = """</div><br />\n"""
for msg in user_msg:
user_msg_str += """<span class="info">%s</span><br />\n""" % (cgi.escape(msg, 1),)
user_msg_str += user_msg_str_end
return user_msg_str
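The normalisation step in the method above — a plain string becomes a one-element tuple, an empty string becomes no messages, and a list/tuple passes through — is worth isolating. A hypothetical standalone version of just that step:

```python
def normalise_messages(user_msg):
    """Coerce a user-message argument (str or sequence of str)
    into a tuple of messages; "" means no messages at all."""
    if isinstance(user_msg, str):
        return () if user_msg == "" else (user_msg,)
    return tuple(user_msg)
```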
def _create_websubmitadmin_main_menu_header(self):
"""Create the main menu to be displayed on WebSubmit Admin pages."""
menu_body = """
<div>
<table>
<tr>
<td>0.&nbsp;<small><a href="%(adminurl)s/showall">Show all</a></small></td>
<td>&nbsp;1.&nbsp;<small><a href="%(adminurl)s/doctypelist">Available Document Types</a></small></td>
<td>&nbsp;2.&nbsp;<small><a href="%(adminurl)s/doctypeadd">Add New Document Type</a></small></td>
<td>&nbsp;3.&nbsp;<small><a href="%(adminurl)s/doctyperemove">Remove Document Type</a></small></td>
<td>&nbsp;4.&nbsp;<small><a href="%(adminurl)s/actionlist">Available Actions</a></small></td>
<td>&nbsp;5.&nbsp;<small><a href="%(adminurl)s/jschecklist">Available Checks</a></small></td>
</tr>
<tr>
<td>6.&nbsp;<small><a href="%(adminurl)s/elementlist">Available Elements</a></small></td>
<td>&nbsp;7.&nbsp;<small><a href="%(adminurl)s/functionlist">Available Functions</a></small></td>
<td>&nbsp;8.&nbsp;<small><a href="%(adminurl)s/organisesubmissionpage">Organise Main Page</a></small></td>
<td colspan=2>&nbsp;9.&nbsp;<small><a href="%(siteurl)s/help/admin/websubmit-admin-guide">Guide</a></small></td>
</tr>
</table>
</div>
<br />
""" % { 'adminurl' : WEBSUBMITADMINURL, 'siteurl': CFG_SITE_URL }
return self._create_adminbox(header="Main Menu", datalist=[menu_body])
def _element_display_preview_get_element(self,
elname="",
eltype="",
elsize="",
elrows="",
elcols="",
elval="",
elfidesc="",
ellabel=""):
"""Return the raw display-code for an individual element.
"""
preview = "%s" % (ellabel,)
try:
preview += {"D" : """&nbsp;&nbsp;%s&nbsp;&nbsp;""" % (elfidesc,),
"F" : """<input type="file" %sname="dummyfile">""" % \
( (elsize != "" and """size="%s" """ % (cgi.escape(elsize, 1),) ) or (""),),
"H" : """<span class="info">Hidden Input. Contains Following Value: %s</span>""" % (cgi.escape(elval, 1),),
"I" : """<input type="text" %sname="dummyinput" value="%s">""" % \
( (elsize != "" and """size="%s" """ % (cgi.escape(elsize, 1),) ) or (""), cgi.escape(elval, 1)),
"R" : """<span class="info">Cannot Display Response Element - See Element Description</span>""",
"S" : """&nbsp;%s&nbsp;""" % (elfidesc,),
"T" : """<textarea name="dummytextarea" %s%s></textarea>""" % \
( (elrows != "" and """rows="%s" """ % (cgi.escape(elrows, 1),) ) or (""),
(elcols != "" and """cols="%s" """ % (cgi.escape(elcols, 1),) ) or (""),)
}[eltype]
except KeyError:
## Unknown element type - display warning:
preview += """<span class="info">Element Type not Recognised - Cannot Display</span>"""
return preview
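The method above dispatches on the element-type code through a dict literal and treats `KeyError` as the "unknown type" branch. A cut-down sketch of that dispatch pattern (hypothetical, covering only a few type codes):

```python
def preview_for(eltype):
    """Map a WebSubmit element-type code to preview HTML,
    falling back to a warning for unrecognised codes."""
    try:
        return {
            "H": '<span class="info">Hidden Input</span>',
            "I": '<input type="text" name="dummyinput">',
            "T": '<textarea name="dummytextarea"></textarea>',
        }[eltype]
    except KeyError:
        # unknown element type - same fallback idea as the real code
        return '<span class="info">Element Type not Recognised</span>'
```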
def _element_display_preview(self,
elname="",
eltype="",
elsize="",
elrows="",
elcols="",
elval="",
elfidesc=""
):
"""Return a form containing a preview of an element, based on the values of the parameters provided
@param elname: element name
@param eltype: element type (e.g. text, user-defined, etc)
@param elsize: element size (e.g. for text input element)
@param elrows: number of rows (e.g. for textarea element)
@param elcols: number of columns (e.g. for textarea element)
@param elval: value of element (e.g. for text input element)
@param elfidesc: description for element (e.g. for user-defined element)
@return: string of HTML making up a preview of the element in a table
"""
## Open a dummy form and table in which to display a preview of the element
body = """<div><br />
<form name="dummyeldisplay" action="%(adminurl)s/elementlist">
<table class="admin_wvar" align="center">
<thead>
<tr>
<th class="adminheaderleft" colspan="1">
Element Preview:
</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<br />&nbsp;&nbsp;
""" % {'adminurl' : WEBSUBMITADMINURL}
## Based on element type, display a preview of element:
body += self._element_display_preview_get_element(eltype=eltype, elsize=elsize, elrows=elrows, elcols=elcols,
elval=elval, elfidesc=elfidesc)
## Close dummy form and preview table:
body += """&nbsp;&nbsp;<br />
</td>
</tr>
</tbody>
</table>
</form>
</div>"""
return body
def tmpl_display_addelementform(self,
elname="",
elmarccode="",
eltype="",
elsize="",
elrows="",
elcols="",
elmaxlength="",
elval="",
elfidesc="",
elmodifytext="",
elcd="",
elmd="",
perform_act="elementadd",
user_msg="",
el_use_tuple=""
):
"""Display Web form used to add a new element to the database
@param elname: element name
@param elmarccode: marc code of element
@param eltype: element type (e.g. text, user-defined, etc)
@param elsize: element size (e.g. for text input element)
@param elrows: number of rows (e.g. for textarea element)
@param elcols: number of columns (e.g. for textarea element)
@param elmaxlength: maximum length of a text input field
@param elval: value of element (e.g. for text input element)
@param elfidesc: description for element (e.g. for user-defined element)
@param elmodifytext: element's modification text
@param elcd: creation date of element
@param elmd: last modification date of element
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@param el_use_tuple:
@return: HTML page body.
"""
## First, get a rough preview of the element:
output = ""
etypes = {"D" : "User Defined Input", "F" : "File Input", "H" : "Hidden Input", "I" : "Text Input", \
"R" : "Response", "S" : "Select Box", "T" : "Text Area Element"}
etypeids = etypes.keys()
etypeids.sort()
body_content = ""
output += self._create_user_message_string(user_msg)
if perform_act != "elementadd":
body_content += self._element_display_preview(elname=elname, eltype=eltype, elsize=elsize, \
elrows=elrows, elcols=elcols, elval=elval, elfidesc=elfidesc)
else:
body_content += "<br />"
body_content += """<form method="post" action="%(adminurl)s/%(perform_action)s">""" \
% {'adminurl': WEBSUBMITADMINURL, 'perform_action': perform_act}
body_content += """
<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Enter Element Details:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Element Name:</span></td>
<td width="80%%">"""
if perform_act == "elementadd":
body_content += """
<input type="text" size="30" name="elname" value="%(el_name)s" />""" % {'el_name' : cgi.escape(elname, 1)}
else:
body_content += """<span class="info">%(el_name)s</span><input type="hidden" name="elname" value="%(el_name)s" />""" \
% {'el_name' : cgi.escape(elname, 1)}
body_content += """</td>
</tr>"""
if elcd != "" and elcd is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(elcd), 1),)
if elmd != "" and elmd is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(elmd), 1),)
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Modification Text:</span></td>
<td width="80%%"><input type="text" size="90" name="elmodifytext" value="%(el_modifytext)s" /></td>
</tr>""" % {'el_modifytext' : cgi.escape(elmodifytext, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Element Type:</span></td>
<td width="80%%">
<select name="eltype">
<option value="NONE_SELECTED">Select:</option>\n"""
for itm in etypeids:
body_content += """ <option value="%s"%s>%s</option>\n""" % \
( itm, (eltype == itm and " selected" ) or (""), cgi.escape(etypes[itm], 1) )
body_content += """ </select>
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Marc Code:</span></td>
<td width="80%%"><input type="text" size="15" name="elmarccode" value="%(el_marccode)s" /></td>
</tr>
""" % {'el_marccode' : cgi.escape(elmarccode, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Size <i><small>(text elements)</small></i>:</span></td>
<td width="80%%"><input type="text" size="10" name="elsize" value="%(el_size)s" /></td>
</tr>
""" % {'el_size' : cgi.escape(elsize, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">No. Rows <i><small>(textarea elements)</small></i>:</span></td>
<td width="80%%"><input type="text" size="6" name="elrows" value="%(el_rows)s" /></td>
</tr>
""" % {'el_rows' : cgi.escape(elrows, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">No. Columns <i><small>(textarea elements)</small></i>:</span></td>
<td width="80%%"><input type="text" size="6" name="elcols" value="%(el_cols)s" /></td>
</tr>
""" % {'el_cols' : cgi.escape(elcols, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Maximum Length <i><small>(text elements)</small></i>:</span></td>
<td width="80%%"><input type="text" size="6" name="elmaxlength" value="%(el_maxlength)s" /></td>
</tr>
""" % {'el_maxlength' : cgi.escape(elmaxlength, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Value <i><small>(text/hidden elements)</small></i>:</span></td>
<td width="80%%"><input type="text" size="90" name="elval" value="%(el_val)s" /></td>
</tr>
""" % {'el_val' : cgi.escape(elval, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Element Description <i><small>(e.g. user-defined elements)</small></i>:</span></td>
<td width="80%%"><textarea cols="100" rows="30" name="elfidesc" wrap="nowarp">%(el_fidesc)s</textarea></td>
</tr>
""" % {'el_fidesc' : cgi.escape(elfidesc, 1)}
body_content += """
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%"><input name="elcommit" class="adminbutton" type="submit" value="Save Details" /></td>
</tr>
</tbody>
</table>
</form>
"""
## If there is information about which submission pages use this element, display it:
if type(el_use_tuple) is tuple and len(el_use_tuple) > 0:
body_content += """<br /><br />
<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Element Usage:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">"""
for usecase in el_use_tuple:
try:
body_content += """<small><a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s"""\
"""&action=%(action)s&pagenum=%(pageno)s">&nbsp;%(subname)s: Page %(pageno)s</a></small><br />\n"""\
% { 'adminurl' : WEBSUBMITADMINURL,
'doctype' : usecase[0],
'action' : usecase[1],
'subname' : "%s%s" % (usecase[1], usecase[0]),
'pageno' : usecase[2]
}
except KeyError, e:
pass
body_content += """&nbsp;</td>
</tr>
</tbody>
</table>
"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Element Details:", datalist=[body_content])
return output
def tmpl_display_submission_page_organisation(self,
submission_collection_tree,
submission_collections,
doctypes,
user_msg=""):
def _build_collection_tree_display(branch, level=0):
outstr = ""
try:
level = int(level)
except TypeError:
level = 0
## open a table in which collection and doctype children will be displayed:
outstr += """<table border ="0" cellspacing="0" cellpadding="0">\n<tr>"""
## Display details of this collection:
if level != 0:
## Button to allow deletion of collection from tree:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&deletesbmcollection=1"><img border="0" src="%(siteurl)s/img/iconcross.gif" """ \
"""title="Remove submission collection from tree"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
}
## does this collection have a collection brother above it?
if branch['has_brother_above'] == 1:
## Yes it does - add 'up' arrow:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&movesbmcollectionup=1"><img border="0" src="%(siteurl)s/img/smallup.gif" """\
"""title="Move submission collection up"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
}
else:
## No it doesn't - no 'up' arrow:
outstr += """<td><img border="0" src="%(siteurl)s/img/white_field.gif"></td>"""\
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1), }
## does this collection have a collection brother below it?
if branch['has_brother_below'] == 1:
## Yes it does - add 'down' arrow:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&movesbmcollectiondown=1"><img border="0" src="%(siteurl)s/img/smalldown.gif" """\
"""title="Move submission collection down"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
}
else:
## No it doesn't - no 'down' arrow:
outstr += """<td><img border="0" src="%(siteurl)s/img/white_field.gif"></td>"""\
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1), }
## Display the collection name:
outstr += """<td>&nbsp;<span style="color: green; font-weight: bold;">%s</span></td>""" \
% branch['collection_name']
else:
outstr += "<td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td>"
outstr += "</tr>\n"
## If there are doctype children attached to this collection, display them:
num_doctype_children = len(branch['doctype_children'])
if num_doctype_children > 0:
outstr += """<tr><td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td><td>""" \
"""<table border ="0" cellspacing="0" cellpadding="0">\n"""
for child_num in xrange(0, num_doctype_children):
outstr += """<tr>\n"""
## Button to allow doctype to be detached from tree:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&doctype=%(doctype)s&catscore=%(catalogueorder)s&deletedoctypefromsbmcollection=1"><img border="0" """\
"""src="%(siteurl)s/img/iconcross.gif" title="Remove doctype from branch"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
'doctype' : cgi.escape(branch['doctype_children'][child_num]['doctype_id']),
'catalogueorder' : cgi.escape(str(branch['doctype_children'][child_num]['catalogue_order']), 1),
}
## Does this doctype have a brother above it?
if child_num > 0:
## Yes it does - add an 'up' arrow:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&doctype=%(doctype)s&catscore=%(catalogueorder)s&movedoctypeupinsbmcollection=1"><img border="0" """ \
"""src="%(siteurl)s/img/smallup.gif" title="Move doctype up"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
'doctype' : cgi.escape(branch['doctype_children'][child_num]['doctype_id']),
'catalogueorder' : cgi.escape(str(branch['doctype_children'][child_num]['catalogue_order']), 1),
}
else:
## No it doesn't - no 'up' arrow:
outstr += """<td><img border="0" src="%(siteurl)s/img/white_field.gif"></td>"""\
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1), }
## Does this doctype have a brother below it?
if child_num < num_doctype_children - 1:
## Yes it does - add a 'down' arrow:
outstr += """<td><a href="%(adminurl)s/organisesubmissionpage?sbmcolid=%(collection_id)s""" \
"""&doctype=%(doctype)s&catscore=%(catalogueorder)s&movedoctypedowninsbmcollection=1"><img border="0" """ \
"""src="%(siteurl)s/img/smalldown.gif" title="Move doctype down"></a></td>""" \
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'collection_id' : cgi.escape(str(branch['collection_id']), 1),
'doctype' : cgi.escape(branch['doctype_children'][child_num]['doctype_id']),
'catalogueorder' : cgi.escape(str(branch['doctype_children'][child_num]['catalogue_order']), 1),
}
else:
## No it doesn't - no 'down' arrow:
outstr += """<td><img border="0" src="%(siteurl)s/img/white_field.gif"></td>"""\
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1), }
## Display the document type details:
outstr += """<td>&nbsp;<small><a href="%(adminurl)s/doctypeconfigure?doctype=%(doctype)s">"""\
"""%(doctype_name)s [%(doctype)s]</a></small></td>""" \
% { 'adminurl' : WEBSUBMITADMINURL,
'doctype' : cgi.escape(branch['doctype_children'][child_num]['doctype_id'], 1),
'doctype_name' : cgi.escape(branch['doctype_children'][child_num]['doctype_lname'], 1),
}
outstr += "</tr>\n"
## If there were doctype children attached to this collection, they have been displayed,
## so close up the row:
if num_doctype_children > 0:
outstr += "</table>\n</td></tr>"
## Display Lower branches of tree:
for lower_branch in branch['collection_children']:
outstr += "<tr><td>&nbsp;</td><td>&nbsp;</td><td>&nbsp;</td><td>"
outstr += _build_collection_tree_display(branch=lower_branch, level=level+1)
outstr += "</td></tr>\n"
outstr += "</table>"
return outstr
## begin display:
output = ""
body_content = """<br />
<table class="admin_wvar" width="100%%">
<thead>
<tr>
<th class="adminheaderleft">
Submission Page Organisational Hierarchy:
</th>
</tr>
</thead>
<tbody>
<tr>
<td><br />"""
body_content += _build_collection_tree_display(submission_collection_tree)
body_content += """</td>
</tr>"""
body_content += """
<tr>
<td><br /></td>
</tr>
<tr>
<td><br />"""
## Form to allow user to add a new submission-collection:
body_content += """
<form method="post" action="%(adminurl)s/organisesubmissionpage">
<span class="adminlabel">You can add a new Submission-Collection:</span><br />
<small style="color: navy;">Name:</small>&nbsp;&nbsp;
<input type="text" name="addsbmcollection" style="margin: 5px 10px 5px 10px;" />
&nbsp;&nbsp;<small style="color: navy;">Attached to:</small>&nbsp;&nbsp;""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1), }
if len(submission_collections) > 0:
body_content += """
%(submission_collections)s""" \
% { 'submission_collections' : \
create_html_select_list(select_name="addtosbmcollection",
option_list=submission_collections,
css_style="margin: 5px 10px 5px 10px;")
}
else:
body_content += """<input type="hidden" name="addtosbcollection" value="0" />
<span style="color: green;">Top Level</span>"""
body_content += """<input name="sbmcollectionadd" class="adminbutton" type="submit" """ \
"""value="Add" />
</form>"""
body_content += """</td>
</tr>
<tr>
<td><br /><br /></td>
</tr>"""
## if there are doctypes in the system, provide a form to enable the user to
## connect a document type to the submission-collection tree:
if len(submission_collections) > 1 and len(doctypes) > 0:
body_content += """<tr><td>
<form method="post" action="%(adminurl)s/organisesubmissionpage">
<span class="adminlabel">You can attach a Document Type to a Submission-Collection:</span><br />
<small style="color: navy;">Document Type Name:</small><br />
%(doctypes)s
<br /><small style="color: navy;">Attached to:</small>&nbsp;&nbsp;""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctypes' : create_html_select_list(select_name="adddoctypes",
option_list=doctypes,
css_style="margin: 5px 10px 5px 10px;",
multiple=1,
list_size=5)
}
body_content += """
%(submission_collections)s""" \
% { 'submission_collections' : \
create_html_select_list(select_name="addtosbmcollection",
option_list=submission_collections[1:],
css_style="margin: 5px 10px 5px 10px;")
}
body_content += """<input name="submissioncollectionadd" class="adminbutton" type="submit" """ \
"""value="Add" />
</form></td>
</tr>"""
body_content += """</tbody>
</table>"""
output += self._create_user_message_string(user_msg)
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Submission-Collections of Submission Page:", datalist=[body_content])
return output
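The template methods above interpolate every user-supplied value through `cgi.escape(value, 1)`; the second argument turns on quote escaping, without which a value containing `"` could break out of a double-quoted HTML attribute. A minimal illustration of the same behaviour using `html.escape` (the modern stand-in; `cgi.escape` was removed in Python 3.8 - the helper name `attr_escape` is ours, not WebSubmit's):

```python
import html

def attr_escape(value):
    # Equivalent of the cgi.escape(value, 1) calls used above: escape
    # &, <, > and also quotes, so the result stays safely inside a
    # double-quoted HTML attribute.
    return html.escape(value, quote=True)

# A malicious doctype ID cannot close the value="..." attribute:
field = '<input type="hidden" name="doctype" value="%s" />' % attr_escape('X"><script>')
```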
def tmpl_display_addactionform(self,
actid="",
actname="",
working_dir="",
status_text="",
perform_act = "actionadd",
cd="",
md="",
user_msg=""):
"""Display web form used to add a new action to WebSubmit.
@param actid: Value of the "sactname" (action id) parameter of the WebSubmit action.
@param actname: Value of the "lactname" (long action name) parameter of the WebSubmit action.
@param working_dir: Value of the "dir" (action working/archive directory) parameter of the WebSubmit action.
@param status_text: Value of the "statustext" (action status text) parameter of the WebSubmit action.
@param perform_act: action for form (minus websubmitadmin base url)
@param cd: Creation date of action.
@param md: Modification date of action.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<form method="post" action="%(adminurl)s/%(perform_action)s">""" \
% {'adminurl': WEBSUBMITADMINURL, 'perform_action': perform_act}
body_content += """
<table width="90%">
<tr>
<td width="20%"><span class="adminlabel">Action Code:</span></td>
<td width="80%">"""
if perform_act == "actionadd":
body_content += """
<input type="text" size="6" name="actid" value="%(ac_id)s" />""" % {'ac_id' : cgi.escape(actid, 1)}
else:
body_content += """<span class="info">%(ac_id)s</span><input type="hidden" name="actid" value="%(ac_id)s" />""" \
% {'ac_id' : cgi.escape(actid, 1)}
body_content += """</td>
</tr>"""
if "" not in (cd, md):
if cd is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
if md is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1), )
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Action Description:</span></td>
<td width="80%%"><input type="text" size="60" name="actname" value="%(ac_name)s" /></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Action dir:</span></td>
<td width="80%%"><input type="text" size="40" name="working_dir" value="%(w_dir)s" /></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Action Status Text:</span></td>
<td width="80%%"><input type="text" size="60" name="status_text" value="%(s_txt)s" /></td>
</tr>""" % {'ac_name' : cgi.escape(actname, 1), 'w_dir' : cgi.escape(working_dir, 1), \
's_txt' : cgi.escape(status_text, 1)}
body_content += """
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="actcommit" class="adminbutton" type="submit" value="Save Details" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/actionlist">
<input name="actcommitcancel" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</table>
""" % { 'adminurl' : WEBSUBMITADMINURL }
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Enter Action Details:", datalist=[body_content])
return output
def tmpl_display_addjscheckform(self,
chname="",
chdesc="",
perform_act = "jscheckadd",
cd="",
md="",
user_msg=""):
"""Display web form used to add a new Check to WebSubmit.
@param chname: Value of the "chname" (check ID/name) parameter of the WebSubmit Check.
@param chdesc: Value of the "chdesc" (check description - i.e. JS code) parameter of the
WebSubmit Check.
@param perform_act: action for form (minus websubmitadmin base url)
@param cd: Creation date of check.
@param md: Modification date of check.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<form method="post" action="%(adminurl)s/%(perform_action)s">""" \
% {'adminurl': WEBSUBMITADMINURL, 'perform_action': perform_act}
body_content += """
<table width="90%">
<tr>
<td width="20%"><span class="adminlabel">Check Name:</span></td>
<td width="80%">"""
if perform_act == "jscheckadd":
body_content += """
<input type="text" size="15" name="chname" value="%(ch_name)s" />""" % {'ch_name' : cgi.escape(chname, 1)}
else:
body_content += """<span class="info">%(ch_name)s</span><input type="hidden" name="chname" value="%(ch_name)s" />""" \
% {'ch_name' : cgi.escape(chname, 1)}
body_content += """</td>
</tr>"""
if "" not in (cd, md):
if cd is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
if md is not None:
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1),)
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Check Description:</span></td>
<td width="80%%">
<textarea cols="90" rows="22" name="chdesc">%(ch_descr)s</textarea>
</td>
</tr>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%"><input name="chcommit" class="adminbutton" type="submit" value="Save Details" /></td>
</tr>
</table>
</form>
""" % {'ch_descr' : cgi.escape(chdesc, 1)}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Enter Check Details:", datalist=[body_content])
return output
def tmpl_display_addfunctionform(self,
funcname="",
funcdescr="",
func_parameters=None,
all_websubmit_func_parameters=None,
perform_act="functionadd",
user_msg="",
func_docstring=None):
"""Display web form used to add a new function to WebSubmit.
@param funcname: Value of the "function" (unique function name) parameter
@param funcdescr: Value of the "description" (function textual description) parameter
@param func_parameters: (list|tuple) the parameters already attached to the function
@param all_websubmit_func_parameters: (list|tuple) all parameters known to WebSubmit, offered
for attachment to the function
@param perform_act: action for form (minus websubmitadmin base url)
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@param func_docstring: the docstring of the displayed function (or an error message if the
function could not be loaded); None if no docstring
@return: HTML page body.
"""
if type(func_parameters) not in (list, tuple):
## bad parameters list - reset
func_parameters = ()
if type(all_websubmit_func_parameters) not in (list, tuple):
## bad list of function parameters - reset
all_websubmit_func_parameters = ()
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<form method="post" action="%(adminurl)s/%(perform_action)s">""" \
% {'adminurl' : WEBSUBMITADMINURL, 'perform_action': perform_act}
## Function Name and description:
body_content += """<br />
<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
%sFunction Details:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Function Name:</span></td>
<td width="80%%">""" % ((funcname != "" and cgi.escape(funcname, 1) + " ") or (""), )
if perform_act == "functionadd" and funcname == "":
body_content += """
<input type="text" size="30" name="funcname" />"""
else:
body_content += """<span class="info">%(func_name)s</span><input type="hidden" name="funcname" value="%(func_name)s" />""" \
% {'func_name' : cgi.escape(funcname, 1)}
body_content += """</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Function Description:</span></td>
<td width="80%%"><input type="text" size="90" name="funcdescr" value="%(func_descr)s" /></td>
</tr>
""" % {'func_descr' : cgi.escape(funcdescr, 1)}
if func_docstring:
body_content += """
<tr>
<td width="20%%" valign="top"><span class="adminlabel">Function Documentation:</span></td>
<td width="80%%">%(func_docstring)s</td>
</tr>
""" % {'func_docstring': func_docstring}
body_content += """
<tr>
<td width="20%%" colspan="2">&nbsp;</td>
</tr>
<tr>
<td width="20%%" colspan="2"><input name="%s" class="adminbutton" type="submit" value="Save Details" /></td>
</tr>
</tbody>
</table>""" % ( ((perform_act == "functionadd" and funcname == "") and "funcaddcommit") or ("funcdescreditcommit"), )
if funcname not in ("", None):
body_content += """<br />
<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft">
Parameters for Function %(func_name)s:
</th>
</tr>
</thead>
<tbody>
<tr>
<td><br />""" % {'func_name' : cgi.escape(funcname, 1)}
params_tableheader = ["Parameter", "&nbsp;"]
params_tablebody = []
for parameter in func_parameters:
params_tablebody.append( ("<small>%s</small>" % (cgi.escape(parameter[0], 1),),
"""<small><a href="%(adminurl)s/functionedit?funcparamdelcommit=funcparamdelcommit""" \
"""&amp;funcname=%(func_name)s&amp;funceditdelparam=%(delparam_name)s">delete</a></small>""" \
% { 'adminurl' : WEBSUBMITADMINURL,
'func_name' : cgi.escape(funcname, 1),
'delparam_name' : cgi.escape(parameter[0], 1)
}
) )
body_content += create_html_table_from_tuple(tableheader=params_tableheader, tablebody=params_tablebody)
body_content += """</td>
</tr>
</tbody>
</table>
<br />"""
## Add a parameter?
body_content += """<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Add Parameter to Function %(func_name)s:
</th>
</tr>
</thead>
<tbody>""" % {'func_name' : cgi.escape(funcname, 1)}
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Add Parameter:</span></td>
<td width="80%%"><small>Select a parameter to add to function:</small>&nbsp;%s&nbsp;&nbsp;""" \
% (create_html_select_list(select_name="funceditaddparam", option_list=all_websubmit_func_parameters),)
body_content += """<small>-Or-</small>&nbsp;&nbsp;<small>Enter a new parameter:</small>&nbsp;&nbsp;<input type="text" """ \
+ """name="funceditaddparamfree" size="15" /><input name="funcparamaddcommit" class="adminbutton" """ \
+ """type="submit" value="Add" /></td>
</tr>
</tbody>
</table>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Enter Function Details:", datalist=[body_content])
return output
def tmpl_display_function_usage(self, funcname, func_usage, user_msg=""):
"""Display a table detailing a function's usage within the actions of the various document types.
For each usage, the document type and action are shown, along with the score and step at
which the function is called within that action.
@param funcname: (string) function name.
@param func_usage: (tuple) A tuple of tuples, each containing details of one usage of the
function: (doctype id, doctype name, action id, action name, score, step).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: (string) HTML page body.
"""
output = ""
body_content = ""
header = ["Doctype", "&nbsp;", "Action", "&nbsp;", "Score", "Step", "Show Details"]
tbody = []
output += self._create_user_message_string(user_msg)
body_content += "<br />"
for usage in func_usage:
tbody.append( ("<small>%s</small>" % (cgi.escape(usage[0], 1),),
"<small>%s</small>" % (cgi.escape(usage[1], 1),),
"<small>%s</small>" % (cgi.escape(usage[2], 1),),
"<small>%s</small>" % (cgi.escape(usage[3], 1),),
"<small>%s</small>" % (cgi.escape(usage[4], 1),),
"<small>%s</small>" % (cgi.escape(usage[5], 1),),
"""<small><a href="%s/doctypeconfiguresubmissionfunctions?doctype=%s&amp;action=%s"""\
"""&amp;viewSubmissionFunctions=true">Show</a></small>"""\
% (WEBSUBMITADMINURL, cgi.escape(usage[0], 1), cgi.escape(usage[2], 1))
)
)
body_content += create_html_table_from_tuple(tableheader=header, tablebody=tbody)
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="""Usage of the "%s" Function:""" % (cgi.escape(funcname, 1),), datalist=[body_content])
return output
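The tables above are all assembled through `create_html_table_from_tuple(tableheader=..., tablebody=..., end=...)`, whose real implementation lives elsewhere in WebSubmit. A hypothetical minimal sketch consistent with how it is called here (header cells and row tuples are pre-escaped HTML fragments; `end` is raw HTML appended after the table; the `class`/`width` attributes are assumptions):

```python
def create_html_table_from_tuple(tableheader, tablebody, end=""):
    # Hypothetical sketch only: tableheader is a sequence of header
    # cells, tablebody a sequence of row tuples, both already holding
    # escaped HTML fragments; `end` is raw HTML appended after the
    # table (e.g. an "Add New Function" button form).
    out = '<table class="admin_wvar" width="100%">\n<thead>\n<tr>'
    for cell in tableheader:
        out += '<th class="adminheaderleft">%s</th>' % cell
    out += '</tr>\n</thead>\n<tbody>\n'
    for row in tablebody:
        out += '<tr>' + ''.join('<td>%s</td>' % cell for cell in row) + '</tr>\n'
    return out + '</tbody>\n</table>\n' + end
```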
def tmpl_display_allactions(self,
actions,
user_msg=""):
"""Create the page body used for displaying all WebSubmit actions.
@param actions: A tuple of tuples containing the action id, and the action name (actid, actname).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div>
<table>
"""
for action in actions:
body_content += """<tr>
<td align="left">&nbsp;&nbsp;<a href="%s/actionedit?actid=%s">%s: %s</a></td>
</tr>
""" % (WEBSUBMITADMINURL, cgi.escape(action[0], 1), cgi.escape(action[0], 1), cgi.escape(action[1], 1))
body_content += """</table>"""
## Button to create new action:
body_content += """<br /><form action="%s/actionadd" method="post"><input class="adminbutton" type="submit" value="Add Action" /></form>""" \
% (WEBSUBMITADMINURL,)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Select an Action:", datalist=[body_content])
return output
def tmpl_display_alldoctypes(self,
doctypes,
user_msg = ""):
"""Create the page body used for displaying all WebSubmit document types.
@param doctypes: A tuple of tuples containing the doctype id, and the doctype name (docid, docname).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div>
<table>
"""
for doctype in doctypes:
body_content += """<tr>
<td align="left">&nbsp;&nbsp;<a href="%s/doctypeconfigure?doctype=%s">%s&nbsp;&nbsp;[%s]</a></td>
</tr>
""" % (WEBSUBMITADMINURL, cgi.escape(doctype[0], 1), cgi.escape(doctype[1], 1), cgi.escape(doctype[0], 1))
body_content += """</table>"""
## Button to create a new document type:
body_content += """<br /><form action="%s/doctypeadd" method="post"><input class="adminbutton" type="submit" value="Add New Doctype" /></form>""" \
% (WEBSUBMITADMINURL,)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Select a Document Type:", datalist=[body_content])
return output
def tmpl_display_alljschecks(self,
jschecks,
user_msg = ""):
"""Create the page body used for displaying all WebSubmit JavaScript Checks.
@param jschecks: A tuple of tuples, each containing the check name (chname), which is
unique for each check.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div>
<table>
"""
for jscheck in jschecks:
body_content += """<tr>
<td align="left">&nbsp;&nbsp;<a href="%s/jscheckedit?chname=%s">%s</a></td>
</tr>
""" % (WEBSUBMITADMINURL, cgi.escape(jscheck[0], 1), cgi.escape(jscheck[0], 1))
body_content += """</table>"""
## Button to create a new check:
body_content += """<br /><form action="%s/jscheckadd" method="post"><input class="adminbutton" type="submit" value="Add Check" /></form>""" \
% (WEBSUBMITADMINURL,)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Select a Checking Function:", datalist=[body_content])
return output
def tmpl_display_allfunctions(self,
functions,
user_msg = ""):
"""Create the page body used for displaying all WebSubmit functions.
@param functions: A tuple of tuples containing the function name, and the function
description (function, description).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
header = ["Function Name", "View Usage", "Edit Details"]
output += self._create_user_message_string(user_msg)
body_content = """<div><br />\n"""
tbody = []
for function in functions:
tbody.append(("&nbsp;&nbsp;%s" % (cgi.escape(function[0], 1),),
"""<small><a href="%s/functionusage?funcname=%s">View Usage</a></small>""" % \
(WEBSUBMITADMINURL, cgi.escape(function[0], 1)),
"""<small><a href="%s/functionedit?funcname=%s">Edit Details</a></small>""" % \
(WEBSUBMITADMINURL, cgi.escape(function[0], 1))
))
button_newfunc = """<form action="%s/functionadd" method="post">
<input class="adminbutton" type="submit" value="Add New Function" />
</form>""" % (WEBSUBMITADMINURL,)
body_content += create_html_table_from_tuple(tableheader=header, tablebody=tbody, end=button_newfunc)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="WebSubmit Functions:", datalist=[body_content])
return output
def tmpl_display_allelements(self,
elements,
user_msg = ""):
"""Create the page body used for displaying all WebSubmit elements.
@param elements: A tuple of tuples, each containing the element name (name).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div>
<table>
"""
for element in elements:
body_content += """<tr>
<td align="left">&nbsp;&nbsp;<a href="%s/elementedit?elname=%s">%s</a></td>
</tr>
""" % (WEBSUBMITADMINURL, cgi.escape(element[0], 1), cgi.escape(element[0], 1))
body_content += """</table>"""
## Button to create a new element:
body_content += """<br /><form action="%s/elementadd" method="post"><input class="adminbutton" type="submit" value="Add New Element" /></form>""" \
% (WEBSUBMITADMINURL,)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Select an Element:", datalist=[body_content])
return output
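The select boxes in these forms come from `create_html_select_list`, whose signature can be inferred from the call sites above (`select_name`, `option_list`, `selected_values`, `default_opt`, `css_style`, `multiple`, `list_size`). A hypothetical minimal sketch under the assumption that `option_list` holds `(value, label)` pairs and `default_opt` is an optional extra pair shown first (the real helper lives elsewhere in WebSubmit):

```python
import html

def create_html_select_list(select_name, option_list, selected_values=(),
                            default_opt=None, css_style="", multiple=0,
                            list_size=1):
    # Hypothetical sketch matching the call sites above; option_list is
    # assumed to hold (value, label) pairs.
    if isinstance(selected_values, str):
        selected_values = (selected_values,)
    options = ([default_opt] if default_opt is not None else []) + list(option_list)
    out = '<select name="%s"' % html.escape(select_name, quote=True)
    if css_style:
        out += ' style="%s"' % html.escape(css_style, quote=True)
    if multiple:
        out += ' multiple="multiple" size="%d"' % list_size
    out += '>'
    for value, label in options:
        selected = ' selected="selected"' if value in selected_values else ''
        out += '<option value="%s"%s>%s</option>' % (
            html.escape(str(value), quote=True), selected,
            html.escape(str(label)))
    return out + '</select>'
```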
def tmpl_display_delete_doctype_form(self, doctype="", alldoctypes="", user_msg=""):
"""Create the page body used to delete a document type: when a doctype has been chosen,
display a confirmation message; otherwise, display a list of all document types from
which one can be selected for removal.
@param doctype: (string) the ID of the document type to be deleted; when empty or None,
the selection list is displayed instead of the confirmation message.
@param alldoctypes: (list|tuple) all document types in the system, used to build the selection list.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = "<div>"
if doctype not in ("", None) and type(doctype) in (str, unicode):
## Display the confirmation message:
body_content += """<form method="get" action="%(adminurl)s/doctyperemove">""" \
"""<input type="hidden" name="doctype" value="%(doc_type)s" />\n""" \
% { 'adminurl' : WEBSUBMITADMINURL, 'doc_type' : cgi.escape(doctype, 1) }
body_content += """<div><span class="info"><i>Really</i> remove document type "%s" and all of its configuration details?</span> <input name="doctypedeleteconfirm" class="adminbutton" """\
"""type="submit" value="Confirm" /></div>\n</form>\n""" % (cgi.escape(doctype, 1),)
else:
## just display the list of document types to delete:
if type(alldoctypes) not in (list, tuple):
## bad list of document types - reset
alldoctypes = ()
body_content += """<form method="get" action="%(adminurl)s/doctyperemove">""" \
% { 'adminurl' : WEBSUBMITADMINURL }
body_content += """
<table width="100%%" class="admin_wvar">
<thead>
<tr>
<th class="adminheaderleft">
Select a Document Type to Remove:
</th>
</tr>
</thead>
<tbody>
<tr>
<td>&nbsp;&nbsp;%s&nbsp;&nbsp;<input name="doctypedelete" class="adminbutton" type="submit" value="Remove" /></td>
</tr>
</tbody>
</table>
</form>""" \
% (create_html_select_list(select_name="doctype", option_list=alldoctypes),)
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Remove a Document Type:", datalist=[body_content])
return output
## DOCTYPE CONFIGURE
def tmpl_display_submission_clone_form(self,
doctype,
action,
clonefrom_list,
user_msg=""
):
if type(clonefrom_list) not in (list, tuple):
clonefrom_list = ()
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<form method="get" action="%(adminurl)s/%(formaction)s">""" \
% { 'adminurl' : WEBSUBMITADMINURL , 'formaction' : cgi.escape("doctypeconfigure", 1) }
body_content += """
<input type="hidden" name="doctype" value="%(doctype)s" />
<input type="hidden" name="action" value="%(action)s" />
<table width="90%%">
<tr>
<td width="20%%"><span class="adminlabel">Clone from Document Type:</span></td>
<td width="80%%">
%(clonefrom)s
</td>
</tr>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">
<input name="doctypesubmissionaddclonechosen" class="adminbutton" type="submit" value="Continue" />&nbsp;
<input name="doctypesubmissionaddclonechosencancel" class="adminbutton" type="submit" value="Cancel" />&nbsp;
</td>
</tr>
</table>
</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'clonefrom' : create_html_select_list(select_name="doctype_cloneactionfrom",
option_list=clonefrom_list,
default_opt=("None", "Do not clone from another Document Type/Submission"),
css_style="margin: 5px 10px 5px 10px;"
)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Add Submission '%s' to Document Type '%s':" % (action, doctype),
datalist=[body_content])
return output
def tmpl_display_delete_doctypesubmission_form(self, doctype="", action="", user_msg=""):
"""Create the page body used to confirm the deletion of a submission (action) and all
of its related details from a document type.
@param doctype: (string) the ID of the document type from which the submission is to be deleted.
@param action: (string) the ID of the submission (action) to be deleted.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@return: HTML page body.
"""
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div>"""
## Display the confirmation message:
body_content += """<form method="get" action="%(adminurl)s/%(formaction)s">""" \
"""<input type="hidden" name="doctype" value="%(doctype)s" />\n""" \
"""<input type="hidden" name="action" value="%(action)s" />\n""" \
% { 'adminurl' : WEBSUBMITADMINURL,
'formaction' : "doctypeconfigure",
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1)
}
body_content += """<div><span class="info"><i>Really</i> remove the Submission "%s" and all related details from Document Type "%s"?</span> <input name="doctypesubmissiondeleteconfirm" class="adminbutton" """ \
"""type="submit" value="Confirm" /></div>\n</form>\n""" % (cgi.escape(action, 1), cgi.escape(doctype, 1) )
body_content += """</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="""Delete Submission "%s" from Document Type "%s":""" % (action, doctype), datalist=[body_content])
return output
def tmpl_display_submissiondetails_form(self,
doctype,
action,
displayed="",
buttonorder="",
statustext="",
level="",
score="",
stpage="",
endtxt="",
cd="",
md="",
user_msg="",
perform_act="doctypeconfigure",
saveaction="edit"
):
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<form method="get" action="%(adminurl)s/%(action)s">""" \
% { 'adminurl' : WEBSUBMITADMINURL , 'action' : cgi.escape(perform_act, 1) }
body_content += """
<table width="90%">"""
if cd not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
if md not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1),)
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Submission Displayed on Start Page:</span></td>
<td width="80%%">
<select name="displayed">
<option value="Y"%s>Yes</option>
<option value="N"%s>No</option>
</select>
</td>
</tr>""" % ( (displayed == "Y" and " selected") or (""),
(displayed == "N" and " selected") or ("")
)
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Button Order:</span></td>
<td width="80%%">
<input type="text" size="4" name="buttonorder" value="%(buttonorder)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Status Text:</span></td>
<td width="80%%">
<input type="text" size="35" name="statustext" value="%(statustext)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Level:</span></td>
<td width="80%%">
<input type="text" size="4" name="level" value="%(level)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Score:</span></td>
<td width="80%%">
<input type="text" size="4" name="score" value="%(score)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Stpage:</span></td>
<td width="80%%">
<input type="text" size="4" name="stpage" value="%(stpage)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">End Text:</span></td>
<td width="80%%">
<input type="text" size="35" name="endtxt" value="%(endtxt)s" />
</td>
</tr>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">
<input type="hidden" name="doctype" value="%(doctype)s" />
<input type="hidden" name="action" value="%(action)s" />
<input name="%(savebutton)s" class="adminbutton" type="submit" value="Save Details" />
&nbsp;
<input name="doctypesubmissiondetailscancel" class="adminbutton" type="submit" value="Cancel" />
</td>
</tr>
</table>
</form>
""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'displayed' : cgi.escape(displayed, 1),
'buttonorder' : cgi.escape(buttonorder, 1),
'statustext' : cgi.escape(statustext, 1),
'level' : cgi.escape(level, 1),
'score' : cgi.escape(score, 1),
'stpage' : cgi.escape(stpage, 1),
'endtxt' : cgi.escape(endtxt, 1),
'cd' : cgi.escape(cd, 1),
'md' : cgi.escape(md, 1),
'savebutton' : ((saveaction == "edit" and "doctypesubmissioneditdetailscommit") or ("doctypesubmissionadddetailscommit"))
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Enter Details of '%s' Submission of '%s' Document Type:" % (action, doctype),
datalist=[body_content])
return output
def tmpl_display_doctypedetails_form(self, doctype="", doctypename="", doctypedescr="", cd="", md="", clonefrom="", \
alldoctypes="", user_msg="", perform_act="doctypeadd"):
"""Create the page body containing the form used either to add a new document type
or to edit the details of an existing one.
@param doctype: (string) the document type ID.
@param doctypename: (string) the document type name.
@param doctypedescr: (string) the document type description.
@param cd: Creation date of the document type.
@param md: Modification date of the document type.
@param clonefrom: (string) the ID of a document type from which the new document type's
details are to be cloned, if any.
@param alldoctypes: (list|tuple) all document types in the system, offered for cloning.
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
@param perform_act: action for form (minus websubmitadmin base url)
@return: HTML page body.
"""
output = ""
body_content = ""
if perform_act == "doctypeadd":
formheader = "Add a new Document Type:"
else:
formheader = "Edit Document Type Details:"
output += self._create_user_message_string(user_msg)
if type(alldoctypes) not in (list, tuple):
## bad list of document types - reset
alldoctypes = ()
body_content += """<form method="post" action="%(adminurl)s/%(action)s">""" \
% { 'adminurl' : WEBSUBMITADMINURL , 'action' : cgi.escape(perform_act, 1) }
body_content += """
<table width="90%">
<tr>
<td width="20%"><span class="adminlabel">Document Type ID:</span></td>
<td width="80%">"""
if perform_act == "doctypeadd":
body_content += """<input type="text" size="15" name="doctype" value="%(doctype_id)s" />""" \
% {'doctype_id' : cgi.escape(doctype, 1)}
else:
body_content += """<span class="info">%(doctype_id)s</span><input type="hidden" name="doctype" value="%(doctype_id)s" />""" \
% {'doctype_id' : cgi.escape(doctype, 1)}
body_content += """</td>
</tr>"""
if perform_act != "doctypeadd":
if cd not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
if md not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1), )
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Document Type Name:</span></td>
<td width="80%%"><input type="text" size="60" name="doctypename" value="%(doctype_name)s" /></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Document Type Description:</span></td>
<td width="80%%"><textarea name="doctypedescr" cols="60" rows="8">%(doctype_description)s</textarea></td>
</tr>""" % { 'doctype_name' : cgi.escape(doctypename, 1),
'doctype_description' : "%s" % ((doctypedescr is not None and cgi.escape(str(doctypedescr), 1)) or ("")),
}
if perform_act == "doctypeadd":
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Document Type to Clone:</span></td>
<td width="80%%">%(doctype_select_list)s</td>
</tr>""" % { 'doctype_select_list' :
create_html_select_list(select_name="clonefrom",
option_list=alldoctypes,
selected_values=clonefrom,
default_opt=('None', 'Select:')
)
}
body_content += """
<tr>
<td width="20%">&nbsp;</td>
<td width="80%">
<input name="doctypedetailscommit" class="adminbutton" type="submit" value="Save Details" />"""
if perform_act != "doctypeadd":
## add a cancel button if this is not a call to add a new document type:
body_content += """
&nbsp;
<input name="doctypedetailscommitcancel" class="adminbutton" type="submit" value="Cancel" />"""
body_content += """
</td>
</tr>
</table>
</form>\n"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header=formheader, datalist=[body_content])
return output
def _tmpl_configire_doctype_overview_create_doctype_details(self, doctype="", doctypename="", doctypedescr="",
doctype_cdate="", doctype_mdate="", perform_act="doctypeconfigure"
):
"""Display the details of a document type"""
txt = """
<table class="admin_wvar" rules="rows" width="100%%">
<thead>
<tr style="border-bottom: hidden">
<th class="adminheaderleft" colspan="2">
%(doctype_id)s Document Type Details:
</th>
</tr>
</thead>
<tbody>
<tr style="border-top: hidden; border-bottom: hidden">
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;</td>
</tr>
<tr>
<td width="20%%" style="border-bottom: hidden"><span class="adminlabel">Document Type ID:</span></td>
<td width="80%%"><span class="info">%(doctype_id)s</span></td>
</tr>
<tr>
<td width="20%%" style="border-bottom: hidden"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%(doctype_cdate)s</span></td>
</tr>
<tr>
<td width="20%%" style="border-bottom: hidden"><span class="adminlabel">Modification Date:</span></td>
<td width="80%%"><span class="info">%(doctype_mdate)s</span></td>
</tr>
<tr>
<td width="20%%" style="border-top: hidden; border-bottom: hidden"><span class="adminlabel">Document Type Name:</span></td>
<td width="80%%"><span>%(doctype_name)s</span></td>
</tr>
<tr style="border-bottom: hidden">
<td width="20%%" style="border-top: hidden"><span class="adminlabel">Document Type Description:</span></td>
<td width="80%%"><span>%(doctype_descr)s</span></td>
</tr>
<tr style="border-top: hidden">
<td colspan="2">
<form method="post" action="%(adminurl)s/%(performaction)s">
<input name="doctype" type="hidden" value="%(doctype_id)s" />
<input name="doctypedetailsedit" class="adminbutton" type="submit" value="Edit Details" />
</form>
</td>
</tr>
</tbody>
</table>\n""" % { 'doctype_id' : cgi.escape(doctype, 1),
'doctype_cdate' : "%s" % ((doctype_cdate not in ("", None) and cgi.escape(str(doctype_cdate), 1)) or (""),),
'doctype_mdate' : "%s" % ((doctype_mdate not in ("", None) and cgi.escape(str(doctype_mdate), 1)) or (""),),
'doctype_name' : cgi.escape(doctypename, 1),
'doctype_descr' : doctypedescr,
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
}
return txt
def _tmpl_configure_doctype_overview_create_categories_view(self,
doctype="",
doctype_categories="",
jumpcategout="",
perform_act="doctypeconfigure"
):
"""Display the details of the categories for a given document type"""
## sanity checking for categories list:
if type(doctype_categories) not in (list, tuple):
doctype_categories = ()
txt = """
<table class="admin_wvar" width="100%%">
<thead>
<tr>
<th class="adminheaderleft">
Categories of Document Type %(doctype_id)s:
</th>
</tr>
</thead>
<tbody>
<tr>
<td><br />""" % { 'doctype_id' : cgi.escape(doctype, 1) }
modify_categ_txt = ""
try:
categs_tableheader = ["Categ ID", "Description", "&nbsp;", "&nbsp;", "&nbsp;", "&nbsp;", "&nbsp;", "&nbsp;"]
categs_tablebody = []
num_categs = len(doctype_categories)
for i in range(0, num_categs):
this_categname = doctype_categories[i][0]
this_categdescr = doctype_categories[i][1]
this_categscore = doctype_categories[i][2]
t_row = ["""&nbsp;&nbsp;%s""" % cgi.escape(this_categname, 1),
"""&nbsp;&nbsp;%s""" % cgi.escape(this_categdescr, 1)]
## up arrow:
if i != 0:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&categid=%(categid)s&"""\
"""movecategup=1">"""\
"""<img border="0" src="%(siteurl)s/img/smallup.gif" title="Move Category Up" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'categid' : cgi.escape(str(this_categname), 1),
}
]
else:
## this is the first category - don't provide an arrow to move it up
t_row += ["&nbsp;"]
## down arrow:
if i != num_categs - 1:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&categid=%(categid)s&"""\
"""movecategdown=1">"""\
"""<img border="0" src="%(siteurl)s/img/smalldown.gif" title="Move Category Down" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'categid' : cgi.escape(str(this_categname), 1),
}
]
else:
## this is the last category - don't provide an arrow to move it down
t_row += ["&nbsp;"]
## 'jump-out' arrow:
if jumpcategout in ("", None):
## provide "move from" arrows for all categories:
if num_categs > 1:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&jumpcategout=%(categid)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Move category [%(categid)s] """\
"""from score %(categscore)s" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'categid' : cgi.escape(str(this_categname), 1),
'categscore' : cgi.escape(str(this_categscore), 1),
}
]
else:
t_row += ["&nbsp;"]
else:
## there is a value for "jumpcategout", so a "moveto" button must be provided
if num_categs > 1:
## is this the categ that will be moved?
if jumpcategout == this_categname:
## yes it is - no "move-to" arrow here
t_row += ["&nbsp;"]
else:
## no it isn't - "move-to" arrow here
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s"""\
"""&jumpcategout=%(jumpcategout)s&jumpcategin=%(categid)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_to.gif" title="Move category"""\
""" [%(jumpcategout)s] to this location" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'categid' : cgi.escape(str(this_categname), 1),
'jumpcategout' : cgi.escape(str(jumpcategout), 1),
}
]
else:
## there is only 1 category - cannot perform a "move"
t_row += ["&nbsp;"]
## 'edit' button:
t_row += ["""<form class="hyperlinkform" method="post" action="%(adminurl)s/%(performaction)s">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="categid" value="%(category)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="doctypecategoryedit" value="edit" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'category' : cgi.escape(str(this_categname), 1),
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
}
]
## 'delete' button:
t_row += ["""<form class="hyperlinkform" method="post" action="%(adminurl)s/%(performaction)s">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="categid" value="%(category)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="doctypecategorydelete" value="delete" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'category' : cgi.escape(str(this_categname), 1),
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
}
]
## 'jumping-out from' arrow:
if jumpcategout not in ("", None):
if jumpcategout == this_categname and num_categs > 1:
t_row += ["""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Moving category """\
"""[%(categid)s] from this location (score %(categscore)s)" />"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'categid' : cgi.escape(str(this_categname), 1),
'categscore' : cgi.escape(str(this_categscore), 1),
}
]
else:
t_row += ["&nbsp;"]
else:
t_row += ["&nbsp;"]
## finally, append the newly created row to the tbody list:
categs_tablebody.append(t_row)
txt += create_html_table_from_tuple(tableheader=categs_tableheader, tablebody=categs_tablebody)
except IndexError:
## categs tuple was not in expected format ((categid, categdescr, categscore), (categid, categdescr, categscore)[, ...])
txt += """<span class="info">Unable to correctly display categories</span>"""
txt += """</td>
</tr>
<tr>
<td><br />
</td>
</tr>"""
## form to add a new category:
txt += """
<tr>
<td>
<span class="adminlabel">Add a new Category:</span><br />
<form method="post" action="%(adminurl)s/%(formaction)s">
<input name="doctype" type="hidden" value="%(doctype)s" />
<small style="color: navy;">ID:&nbsp;</small>
<input style="margin: 5px 10px 5px 10px;" name="categid" type="text" size="10" />&nbsp;
<small style="color: navy;">Description:&nbsp;</small>
<input style="margin: 5px 10px 5px 10px;" name="categdescr" type="text" size="25" />&nbsp;
<input name="doctypecategoryadd" class="adminbutton" type="submit" value="Add Category" />
</form>
</td>
</tr>
</tbody>
</table>""" % { 'formaction' : cgi.escape(perform_act, 1),
'doctype' : cgi.escape(doctype, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
}
return txt
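The categories view above collects a header list and a list of row tuples and hands them to `create_html_table_from_tuple`, which is defined elsewhere in WebSubmit. A minimal, hypothetical stand-in (`make_html_table` is not the real implementation) showing the (tableheader, tablebody) contract the callers in this file rely on:

```python
def make_html_table(tableheader, tablebody):
    """Hypothetical sketch of create_html_table_from_tuple's contract:
    one <th> per header cell, then one <tr> of <td>s per body row."""
    rows = ["<tr>%s</tr>" % "".join("<th>%s</th>" % h for h in tableheader)]
    for row in tablebody:
        rows.append("<tr>%s</tr>" % "".join("<td>%s</td>" % cell for cell in row))
    return "<table>%s</table>" % "".join(rows)


html = make_html_table(["Categ ID", "Description"],
                       [("ARTICLE", "Journal articles"),
                        ("PREPRINT", "Preprints")])
```

Because the cells are pre-rendered HTML snippets (arrows, forms, escaped text), the table builder itself performs no escaping; each caller is responsible for escaping its own cell contents, as seen throughout this file.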
def _tmpl_configure_doctype_overview_create_submissions_view(self,
doctype="",
doctype_submissions="",
add_actions_list=None,
perform_act="doctypeconfigure"
):
"""Display the details of the submissions for a given document type"""
## sanity checking for submissions list:
if type(doctype_submissions) not in (list, tuple):
doctype_submissions = ()
if type(add_actions_list) not in (list, tuple):
add_actions_list = ()
txt = """
<table class="admin_wvar" width="100%%">
<thead>
<tr>
<th class="adminheaderleft">
Submissions of Document Type %(doctype_id)s:
</th>
</tr>
</thead>
<tbody>
<tr>
<td><br />""" % { 'doctype_id' : cgi.escape(doctype, 1) }
try:
submissions_tableheader = ["Action", "Creation<br />Date", "Modification<br />Date", "Displayed?", "No.<br />Pages", \
"Button<br />Order", "Status<br />Text", "Level", "Score", "Stpage", "End<br />Text", \
"View Submission<br />Interface", "View Submission<br />Functions", \
"Edit Submission<br />Details", "Delete<br />Submission"]
submissions_tablebody = []
for subm in doctype_submissions:
submissions_tablebody.append( ("%s" % (cgi.escape(str(subm[2]), 1),),
"%s" % (cgi.escape(str(subm[5]), 1),),
"%s" % (cgi.escape(str(subm[6]), 1),),
"%s" % (cgi.escape(str(subm[3]), 1),),
"%s" % (cgi.escape(str(subm[4]), 1),),
"%s" % (cgi.escape(str(subm[7]), 1),),
"%s" % (cgi.escape(str(subm[8]), 1),),
"%s" % (cgi.escape(str(subm[9]), 1),),
"%s" % (cgi.escape(str(subm[10]), 1),),
"%s" % (cgi.escape(str(subm[11]), 1),),
"%s" % (cgi.escape(str(subm[12]), 1),),
"""<form class="hyperlinkform" method="get" action="%(adminurl)s/doctypeconfiguresubmissionpages">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="viewSubmissionInterface" value="view interface" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(str(subm[2]), 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
},
"""<form class="hyperlinkform" method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctions">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="viewSubmissionFunctions" value="view functions" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(str(subm[2]), 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
},
"""<form class="hyperlinkform" method="get" action="%(adminurl)s/%(formaction)s">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="doctypesubmissionedit" value="edit submission" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(str(subm[2]), 1),
'formaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
},
"""<form class="hyperlinkform" method="get" action="%(adminurl)s/%(formaction)s">""" \
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type""" \
"""="hidden" />""" \
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type""" \
"""="hidden" />""" \
"""<input type="submit" name="doctypesubmissiondelete" value="delete submission" """\
"""class="hyperlinkformSubmitButton" />""" \
"""</form>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(str(subm[2]), 1),
'formaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
}
) )
txt += create_html_table_from_tuple(tableheader=submissions_tableheader, tablebody=submissions_tablebody)
except IndexError:
## submissions tuple was not in expected format
txt += """<span class="info">Unable to correctly display details of submissions</span>"""
txt += """</td>
</tr>"""
## now, display a list of actions that can be added
txt += """
<tr>
<td>
<span class="adminlabel">Add a new Submission:</span><br />"""
if len(add_actions_list) > 0:
txt += """
<form method="get" action="%(adminurl)s/%(performaction)s">
<input type="hidden" name="doctype" value="%(doctype)s" />
%(submissions_list)s
<input name="doctypesubmissionadd" class="adminbutton" type="submit" value="Add Submission" />
</form>""" \
% { 'doctype' : cgi.escape(doctype, 1),
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'submissions_list' : create_html_select_list(select_name="action", option_list=add_actions_list,
css_style="margin: 5px 10px 5px 10px;")
}
else:
txt += """
<br />
<span class="info">No Available Actions to Add</span>"""
txt += """
</td>
</tr>
</tbody>
</table>"""
return txt
def _tmpl_configure_doctype_overview_create_referees_view(self,
doctype="",
doctype_referees="",
perform_act="doctypeconfigure"
):
"""Display the details of the referees of the various categories of a given document type"""
## sanity checking for doctype_referees:
if type(doctype_referees) is not dict:
doctype_referees = {}
txt = """
<table class="admin_wvar" width="100%%">
<thead>
<tr>
<th class="adminheaderleft">
Manage Referees for Document Type %(doctype_id)s:
</th>
</tr>
</thead>
<tbody>
<tr>
<td><br />""" % { 'doctype_id' : cgi.escape(doctype, 1) }
try:
referees_tableheader = ["Referee"]
referees_tablebody = []
referee_roles = sorted(doctype_referees.keys())
for role in referee_roles:
if doctype_referees[role][0] == "*":
referees_tablebody.append( ("""<span style="color: navy;">%s</span>""" % (cgi.escape(doctype_referees[role][1], 1)),
"&nbsp;") )
else:
referees_tablebody.append( ("""<span style="color: navy;">%s (%s)</span>""" % (cgi.escape(doctype_referees[role][0], 1), \
cgi.escape(doctype_referees[role][1], 1)),
"&nbsp;") )
for referee in doctype_referees[role][2]:
referees_tablebody.append( ("""<small>%s</small>""" % (cgi.escape(referee[1], 1),),))
txt += create_html_table_from_tuple(tableheader=referees_tableheader, tablebody=referees_tablebody)
except IndexError:
## referees dictionary was not in expected format
txt += """<span class="info">Unable to correctly display details of referees</span>"""
txt += """
</td>
</tr>
<tr>
<td>
<form method="post" action="%(adminurl)s/referees.py">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input name="managerefereesdoctype" class="adminbutton" type="submit" value="Manage Referees" />
</form>
</td>
</tr>
</tbody>
</table>""" % { 'doctype_id' : cgi.escape(doctype, 1),
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL_OLD, 1)
}
return txt
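The referees table above consumes a dict keyed by role name, where each value is assumed to be of the form (categid, category_name, referees) and a categid of "*" denotes a role covering all categories. A hypothetical sketch of the flattening loop (the helper name and sample data are illustrative only):

```python
def build_referees_rows(doctype_referees):
    """Flatten a {role: (categid, categname, referees)} dict into display
    labels, mirroring the loop in
    _tmpl_configure_doctype_overview_create_referees_view."""
    rows = []
    for role in sorted(doctype_referees):
        categid, categname, referees = doctype_referees[role]
        # "*" means the role is not tied to one category:
        label = categname if categid == "*" else "%s (%s)" % (categid, categname)
        rows.append(label)
        for _uid, nickname in referees:
            rows.append("  %s" % nickname)
    return rows


rows = build_referees_rows({
    "referee_DEMO_*": ("*", "All categories", [(4, "jdoe")]),
    "referee_DEMO_ART": ("ART", "Articles", [(7, "asmith")]),
})
```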
def tmpl_configure_doctype_overview(self, doctype="", doctypename="", doctypedescr="", doctype_cdate="", doctype_mdate="", \
doctype_categories="", jumpcategout="", doctype_submissions="", \
doctype_referees="", user_msg="", add_actions_list=None, perform_act="doctypeconfigure"):
"""TODO : DOCSTRING"""
## sanity checking:
if type(doctype_categories) not in (list, tuple):
doctype_categories = ()
if type(doctype_submissions) not in (list, tuple):
doctype_submissions = ()
if type(add_actions_list) not in (list, tuple):
add_actions_list = ()
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
## table containing document type details:
body_content += """<br />%s""" % (self._tmpl_configire_doctype_overview_create_doctype_details(doctype=doctype,
doctypename=doctypename,
doctypedescr=doctypedescr,
doctype_cdate=doctype_cdate,
doctype_mdate=doctype_mdate,
perform_act=perform_act
)
)
body_content += """<hr style="width: 80%%;" />"""
## this document type's submissions:
body_content += """<br />%s""" % (self._tmpl_configure_doctype_overview_create_submissions_view(doctype=doctype,
doctype_submissions=doctype_submissions,
add_actions_list=add_actions_list,
perform_act=perform_act
)
)
body_content += """<hr style="width: 80%%;" />"""
## table containing document type's categories:
body_content += """<br />%s""" % (self._tmpl_configure_doctype_overview_create_categories_view(doctype=doctype,
doctype_categories=doctype_categories,
jumpcategout=jumpcategout,
perform_act=perform_act
)
)
body_content += """<hr style="width: 80%%;" />"""
## button for allocation of referees:
body_content += """<br />%s""" % (self._tmpl_configure_doctype_overview_create_referees_view(doctype=doctype,
doctype_referees=doctype_referees,
perform_act=perform_act
)
)
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Configure Document Type:", datalist=[body_content])
return output
def tmpl_display_edit_category_form(self, doctype, categid, categdescr, user_msg="", perform_act="doctypeconfigure"):
output = ""
body_content = "<div>"
output += self._create_user_message_string(user_msg)
body_content += """
<form method="get" action="%(adminurl)s/%(performaction)s">
<table width="90%%">
<tr>
<td width="20%%"><span class="adminlabel">Category Name:</span></td>
<td width="80%%"><span class="info">%(categid)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Category Description:</span></td>
<td width="80%%"><input type="text" size="60" name="categdescr" value="%(categdescr)s" /></td>
</tr>
<tr>
<td width="20%%">&nbsp;&nbsp;</td>
<td width="80%%">
<input type="hidden" name="doctype" value="%(doctype)s" />
<input type="hidden" name="categid" value="%(categid)s" />
<input name="doctypecategoryeditcommit" class="adminbutton" type="submit" value="Save Details" />
&nbsp;
<input name="doctypecategoryeditcancel" class="adminbutton" type="submit" value="Cancel" />
</td>
</tr>
</table>
</form>
""" % {
'categid' : cgi.escape(categid, 1),
'doctype' : cgi.escape(doctype, 1),
'categdescr' : cgi.escape(categdescr, 1),
'performaction' : cgi.escape(perform_act, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Edit Details of '%(categid)s' Category of '%(doctype)s' Document Type:" \
% { 'doctype' : cgi.escape(doctype, 1), 'categid' : cgi.escape(categid, 1) },
datalist=[body_content])
return output
def tmpl_configuredoctype_add_submissionfunction(self,
doctype,
action,
cursubmissionfunctions,
allWSfunctions,
addfunctionname="",
addfunctionstep="",
addfunctionscore="",
perform_act="doctypeconfiguresubmissionfunctions",
user_msg=""):
## sanity checking:
if type(cursubmissionfunctions) not in (list, tuple):
cursubmissionfunctions = ()
if type(allWSfunctions) not in (list, tuple):
allWSfunctions = ()
output = ""
output += self._create_user_message_string(user_msg)
## display a form to add a function to the submission:
body_content = """
<br />
<table class="admin_wvar" width="55%%">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Add function:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;
<form method="get" action="%(adminurl)s/%(performaction)s">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
</td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Function Name:</span></td>
<td width="80%%"><span class="info">%(allWSfunctions)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Step:</span></td>
<td width="80%%"><span class="info"><input name="addfunctionstep" type="text" value="%(step)s" size="5" /></span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Score:</span></td>
<td width="80%%"><span class="info"><input name="addfunctionscore" type="text" value="%(score)s" size="5" /></span></td>
</tr>
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="configuresubmissionaddfunctioncommit" class="adminbutton" type="submit" value="Save Details" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/%(performaction)s">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="configuresubmissionaddfunctioncancel" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</table>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'step' : cgi.escape(addfunctionstep, 1),
'score' : cgi.escape(addfunctionscore, 1),
'allWSfunctions' : create_html_select_list(select_name="addfunctionname",
option_list=allWSfunctions,
selected_values=addfunctionname,
default_opt=("", "Select function to add:"))
}
## build a table of the functions currently associated with the submission:
body_content += """<hr />\n"""
header = ["Function Name", "Step", "Score"]
tbody = []
for functn in cursubmissionfunctions:
thisfunctionname = functn[0]
thisfunctionstep = str(functn[1])
thisfunctionscore = str(functn[2])
## function name:
t_row = ["""&nbsp;&nbsp;%s""" % (cgi.escape(thisfunctionname, 1),)]
## function step:
t_row += ["""%s""" % (cgi.escape(thisfunctionstep, 1),) ]
## function score:
t_row += ["""%s""" % (cgi.escape(thisfunctionscore, 1),) ]
## finally, append the newly created row to the tbody list:
tbody.append(t_row)
body_content += """
<table class="admin_wvar" width="55%%">
<thead>
<tr>
<th class="adminheaderleft">
Current submission functions configuration:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="100%%">&nbsp;</td>
</tr>
<tr>
<td width="100%%">"""
body_content += create_html_table_from_tuple(tableheader=header, tablebody=tbody)
body_content += """
</td>
</tr>
</table>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="""Add a function to the [%s] submission of the [%s] document type""" \
% (cgi.escape(action, 1), cgi.escape(doctype, 1)), datalist=[body_content])
return output
def tmpl_configuredoctype_display_submissionfunctions(self,
doctype,
action,
submissionfunctions,
movefromfunctionname="",
movefromfunctionstep="",
movefromfunctionscore="",
perform_act="doctypeconfiguresubmissionfunctions",
user_msg=""):
"""Create the page body used for displaying all Websubmit functions.
@param functions: A tuple of tuples containing the function name, and the function
description (function, description).
@param user_msg: Any message to be displayed on screen, such as a status report for the last task, etc.
return: HTML page body.
"""
## sanity checking:
if type(submissionfunctions) not in (list, tuple):
submissionfunctions = ()
output = ""
output += self._create_user_message_string(user_msg)
body_content = """<div><br />\n"""
header = ["Function Name", "&nbsp;", "&nbsp;", "&nbsp;", "Step", "Score", "View Parameters", "Delete", "&nbsp;"]
tbody = []
num_functions = len(submissionfunctions)
for i in range(0, num_functions):
thisfunctionname = submissionfunctions[i][0]
thisfunctionstep = str(submissionfunctions[i][1])
thisfunctionscore = str(submissionfunctions[i][2])
t_row = ["""&nbsp;&nbsp;%s""" % (cgi.escape(thisfunctionname, 1),)]
## up arrow:
if i != 0:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&action=%(action)s&"""\
"""moveupfunctionname=%(func)s&moveupfunctionstep=%(step)s&moveupfunctionscore=%(score)s">"""\
"""<img border="0" src="%(siteurl)s/img/smallup.gif" title="Move Function Up" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'func' : cgi.escape(thisfunctionname, 1),
'step' : cgi.escape(thisfunctionstep, 1),
'score' : cgi.escape(thisfunctionscore, 1)
}
]
else:
## this is the first function - don't provide an arrow to move it up
t_row += ["&nbsp;"]
## down arrow:
if num_functions > 1 and i < num_functions - 1:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&action=%(action)s&"""\
"""movedownfunctionname=%(func)s&movedownfunctionstep=%(step)s&movedownfunctionscore=%(score)s">"""\
"""<img border="0" src="%(siteurl)s/img/smalldown.gif" title="Move Function Down" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'func' : cgi.escape(thisfunctionname, 1),
'step' : cgi.escape(thisfunctionstep, 1),
'score' : cgi.escape(thisfunctionscore, 1)
}
]
else:
t_row += ["&nbsp;"]
if movefromfunctionname in ("", None):
## provide "move from" arrows for all functions
if num_functions > 1:
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&action=%(action)s&"""\
"""movefromfunctionname=%(func)s&movefromfunctionstep=%(step)s&movefromfunctionscore=%(score)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Move %(func)s (step %(step)s, score %(score)s)"""\
""" from this location" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'func' : cgi.escape(thisfunctionname, 1),
'step' : cgi.escape(thisfunctionstep, 1),
'score' : cgi.escape(thisfunctionscore, 1)
}
]
else:
t_row += ["&nbsp;"]
else:
## there is a value for "movefromfunctionname", so a "moveto" button must be provided
if num_functions > 1:
## is this the function that will be moved?
if movefromfunctionname == thisfunctionname and \
movefromfunctionstep == thisfunctionstep and \
movefromfunctionscore == thisfunctionscore:
## yes it is - no "move-to" arrow here
t_row += ["&nbsp;"]
else:
## no it isn't - "move-to" arrow here
t_row += ["""<a href="%(adminurl)s/%(performaction)s?doctype=%(doctype)s&action=%(action)s&"""\
"""movefromfunctionname=%(fromfunc)s&movefromfunctionstep=%(fromstep)s&movefromfunctionscore=%(fromscore)s&"""\
"""movetofunctionname=%(tofunc)s&movetofunctionstep=%(tostep)s&movetofunctionscore=%(toscore)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_to.gif" title="Move %(fromfunc)s (step %(fromstep)s, score %(fromscore)s)"""\
""" to this location (step %(tostep)s, score %(toscore)s)" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'fromfunc' : cgi.escape(movefromfunctionname, 1),
'fromstep' : cgi.escape(movefromfunctionstep, 1),
'fromscore' : cgi.escape(movefromfunctionscore, 1),
'tofunc' : cgi.escape(thisfunctionname, 1),
'tostep' : cgi.escape(thisfunctionstep, 1),
'toscore' : cgi.escape(thisfunctionscore, 1)
}
]
else:
## there is only 1 function - cannot perform a "move"!
t_row += ["&nbsp;"]
## function step:
t_row += ["""%s""" % (cgi.escape(thisfunctionstep, 1),) ]
## function score:
t_row += ["""%s""" % (cgi.escape(thisfunctionscore, 1),) ]
## "view parameters" link:
t_row += ["""<form class="hyperlinkform" method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters">"""\
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="functionname" value="%(thisfunctionname)s" type="hidden" />"""\
"""<input type="submit" name="viewfunctionparameters" value="view parameters" class="hyperlinkformSubmitButton" />"""\
"""</form>\n"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'thisfunctionname' : cgi.escape(thisfunctionname, 1)
} ]
## "delete function" link:
t_row += ["""<form class="hyperlinkform" method="get" action="%(adminurl)s/%(performaction)s">"""\
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="deletefunctionname" value="%(thisfunctionname)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="deletefunctionstep" value="%(step)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="deletefunctionscore" value="%(score)s" type="hidden" />"""\
"""<input type="submit" name="deletefunction" value="delete" class="hyperlinkformSubmitButton" />"""\
"""</form>\n"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'thisfunctionname' : cgi.escape(thisfunctionname, 1),
'step' : cgi.escape(thisfunctionstep, 1),
'score' : cgi.escape(thisfunctionscore, 1)
} ]
## final column containing "jumping-out from" image when moving a function:
if movefromfunctionname not in ("", None):
if movefromfunctionname == thisfunctionname and \
movefromfunctionstep == thisfunctionstep and \
movefromfunctionscore == thisfunctionscore and \
num_functions > 1:
t_row += ["""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Moving %(fromfunc)s (step %(fromstep)s, """\
"""score %(fromscore)s) from this location" />"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'fromfunc' : cgi.escape(movefromfunctionname, 1),
'fromstep' : cgi.escape(movefromfunctionstep, 1),
'fromscore' : cgi.escape(movefromfunctionscore, 1)
}
]
else:
t_row += ["&nbsp;"]
else:
t_row += ["&nbsp;"]
## finally, append the newly created row to the tbody list:
tbody.append(t_row)
body_content += create_html_table_from_tuple(tableheader=header, tablebody=tbody)
body_content += """</div>"""
## buttons for "add a function" and "finished":
body_content += """
<table>
<tr>
<td>
<br />
<form method="post" action="%(adminurl)s/doctypeconfiguresubmissionfunctions">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="configuresubmissionaddfunction" class="adminbutton" type="submit" value="Add a Function" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/doctypeconfigure">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="funishedviewsubmissionfunctions" class="adminbutton" type="submit" value="Finished" />
</form>
</td>
</tr>
</table>""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="""Functions of the "%s" Submission of the "%s" Document Type:""" \
% (cgi.escape(action, 1), cgi.escape(doctype, 1)), datalist=[body_content])
return output
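The up/down arrows rendered above only encode the (name, step, score) of the function to move into the link's query string; the actual reordering is performed server-side by a handler that is not part of this file. A hypothetical sketch of the swap such a handler might perform on the ordered function list (function names in the sample data are illustrative):

```python
def move_function_up(functions, name, step, score):
    """Swap the identified (name, step, score) entry with its predecessor.

    `functions` is an ordered list of (name, step, score) tuples, as passed
    to tmpl_configuredoctype_display_submissionfunctions.  Returns a new
    list; the first entry (and any unmatched entry) is left untouched,
    matching the template, which draws no "up" arrow for row 0.
    """
    funcs = list(functions)
    for i, (fname, fstep, fscore) in enumerate(funcs):
        if (fname, str(fstep), str(fscore)) == (name, str(step), str(score)) and i > 0:
            funcs[i - 1], funcs[i] = funcs[i], funcs[i - 1]
            break
    return funcs


reordered = move_function_up([("Create_Recid", 1, 10),
                              ("Report_Number_Generation", 1, 20),
                              ("Mail_Submitter", 1, 30)],
                             "Report_Number_Generation", 1, 20)
```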
def _tmpl_configuredoctype_submissionfield_display_changeable_fields(self,
fieldtext="",
fieldlevel="",
fieldshortdesc="",
fieldcheck="",
allchecks=""):
"""Used when displaying the details of a submission field that is to be edited or inserted onto a
submission page.
This function creates the form elements for the values that can be edited by the user, such as the field's
label, short description, etc.  (Examples of details of the submission field that cannot be edited by the
user, and are therefore not included in this function, are the creation-date/modification-date of the
field, etc.)
@param fieldtext: (string) the label used for a field
@param fieldlevel: (char) 'M' or 'O' - whether a field is Mandatory or Optional
@param fieldshortdesc: (string) the short description of a field
@param fieldcheck: (string) the JavaScript checking function applied to a field
@param allchecks: (tuple of strings) the names of all WebSubmit JavaScript checks
@return: (string) a section of a form
"""
## sanity checking
if type(allchecks) not in (tuple, list):
allchecks = []
## make form-section
txt = """
<tr>
<td width="20%%"><span class="adminlabel">Field Label:</span></td>
<td width="80%%"><br /><textarea name="fieldtext" rows="5" cols="50">%(fieldtext)s</textarea><br /><br /></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Field Level:</span></td>
<td width="80%%"><span>%(fieldlevel)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Field Short Description:</span></td>
<td width="80%%"><br /><input type="text" size="35" name="fieldshortdesc" value="%(fieldshortdesc)s" /><br /><br /></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">JavaScript Check:</span></td>
<td width="80%%"><span>%(fieldcheck)s</span></td>
</tr>""" % { 'fieldtext' : cgi.escape(fieldtext, 1),
'fieldlevel' : create_html_select_list(select_name="fieldlevel",
option_list=(("M", "Mandatory"), ("O", "Optional")),
selected_values=fieldlevel
),
'fieldshortdesc' : cgi.escape(fieldshortdesc, 1),
'fieldcheck' : create_html_select_list(select_name="fieldcheck",
option_list=allchecks,
selected_values=fieldcheck,
default_opt=("", "--NO CHECK--")
)
}
return txt
def tmpl_configuredoctype_add_submissionfield(self,
doctype="",
action="",
pagenum="",
fieldname="",
fieldtext="",
fieldlevel="",
fieldshortdesc="",
fieldcheck="",
allchecks="",
allelements="",
user_msg="",
perform_act="doctypeconfiguresubmissionpageelements"):
## sanity checking
if type(allelements) not in (tuple, list):
allelements = []
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
body_content += """
<table class="admin_wvar" width="95%%">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Add a field to page %(pagenum)s of submission %(submission)s
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;<form method="get" action="%(adminurl)s/%(performaction)s"></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Page Number:</span></td>
<td width="80%%"><span class="info">%(pagenum)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Field Name:</span></td>
<td width="80%%">%(fieldname)s</td>
</tr>""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'fieldname' : create_html_select_list(select_name="fieldname",
option_list=allelements,
selected_values=fieldname,
default_opt=("", "Select a Field:")
),
'pagenum' : cgi.escape(pagenum, 1),
'submission' : cgi.escape("%s%s" % (action, doctype), 1),
'performaction' : cgi.escape(perform_act, 1)
}
body_content += self._tmpl_configuredoctype_submissionfield_display_changeable_fields(fieldtext=fieldtext,
fieldlevel=fieldlevel,
fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck,
allchecks=allchecks)
body_content += """
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="pagenum" type="hidden" value="%(pagenum)s" />
<input name="addfieldcommit" class="adminbutton" type="submit" value="Add Field" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/%(performaction)s">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="pagenum" type="hidden" value="%(pagenum)s" />
<input name="canceladdsubmissionfield" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</tbody>
</table>\n""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Field Details:", datalist=[body_content])
return output
def tmpl_configuredoctype_edit_submissionfield(self,
doctype="",
action="",
pagenum="",
fieldnum="",
fieldname="",
fieldtext="",
fieldlevel="",
fieldshortdesc="",
fieldcheck="",
cd="",
md="",
allchecks="",
user_msg="",
perform_act="doctypeconfiguresubmissionpageelements"):
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
body_content += """
<table class="admin_wvar" width="95%%">
<thead>
<tr>
<th class="adminheaderleft" colspan="2">
Details of the %(fieldname)s field as it appears at position %(fieldnum)s on Page %(pagenum)s of the %(submission)s Submission:
</th>
</tr>
</thead>
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;<form method="get" action="%(adminurl)s/%(performaction)s"></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Page Number:</span></td>
<td width="80%%"><span class="info">%(pagenum)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Field Number:</span></td>
<td width="80%%"><span class="info">%(fieldnum)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Field Name:</span></td>
<td width="80%%"><span class="info">%(fieldname)s</span></td>
</tr>""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(fieldnum, 1),
'fieldname' : cgi.escape(fieldname, 1),
'submission' : cgi.escape("%s%s" % (action, doctype), 1),
'performaction' : cgi.escape(perform_act, 1)
}
## field creation date:
if cd not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
## field last-modified date:
if md not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1), )
body_content += self._tmpl_configuredoctype_submissionfield_display_changeable_fields(fieldtext=fieldtext,
fieldlevel=fieldlevel,
fieldshortdesc=fieldshortdesc,
fieldcheck=fieldcheck,
allchecks=allchecks)
body_content += """
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="pagenum" type="hidden" value="%(pagenum)s" />
<input name="editfieldposn" type="hidden" value="%(fieldnum)s" />
<input name="editfieldposncommit" class="adminbutton" type="submit" value="Save Changes" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/%(performaction)s">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="pagenum" type="hidden" value="%(pagenum)s" />
<input name="canceleditsubmissionfield" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</tbody>
</table>\n""" % { 'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(fieldnum, 1),
'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'performaction' : cgi.escape(perform_act, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Field Details:", datalist=[body_content])
return output
def tmpl_configuredoctype_display_submissionpage_preview(self, doctype, action, pagenum, fields, user_msg=""):
"""Create a page displaying a simple preview of a submission page
@param doctype: (string) the unique ID of a document type
@param action: (string) the unique ID of an action
@param pagenum: (string) the number of the page that is to be previewed
@param fields: a tuple of tuples, whereby each tuple contains the details of a field on the submission page
(among them its name, label, type, size, rows, cols, description and default value)
@param user_msg: a tuple or string, containing any message(s) to be displayed to the user
@return: a string, which makes up the page body
"""
## Sanity Checking of elements:
if type(fields) not in (list, tuple):
fields = ()
try:
if type(fields[0]) not in (tuple, list):
fields = ()
except IndexError:
pass
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
## hyperlink back to page details:
body_content += """
<div style="text-align: center;">
<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&pagenum=%(pagenum)s">
Return to details of page [%(pagenum)s] of submission [%(submission)s]</a>
</div>
<hr />""" % { 'adminurl' : WEBSUBMITADMINURL,
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'submission' : cgi.escape("%s%s" % (action, doctype), 1)
}
body_content += """<div><br />
<form name="dummyeldisplay" action="%(adminurl)s">
<table class="admin_wvar" align="center">
<thead>
<tr>
<th class="adminheaderleft" colspan="1">
Page Preview:
</th>
</tr>
</thead>
<tbody>
<tr bgcolor="#f1f1f1">
<td>
<br />&nbsp;&nbsp;
""" % {'adminurl' : WEBSUBMITADMINURL}
for field in fields:
body_content += self._element_display_preview_get_element(elname=field[0], eltype=field[3], elsize=field[4],
elrows=field[5], elcols=field[6], elval=field[8],
elfidesc=field[7], ellabel=field[1])
body_content += "\n"
body_content += """&nbsp;&nbsp;<br />
</td>
</tr>
</tbody>
</table>
</form>
</div>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Preview of Page %s of Submission %s:" \
% (pagenum, "%s%s" % (action, doctype)), datalist=[body_content])
return output
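## Note: the templates above call cgi.escape(value, 1) (quote=1) before
## interpolating user-supplied values into HTML, so that double quotes are
## escaped and cannot break out of attribute values.  A minimal sketch of the
## behaviour relied upon, using html.escape (the Python 3 replacement for the
## removed cgi.escape; the sample value is hypothetical):

```python
from html import escape

fieldtext = '<b>"Title"</b> & notes'
# quote=True mirrors cgi.escape(s, 1): quotes must be escaped because the
# value is placed inside value="..." attributes in the generated HTML.
print(escape(fieldtext, quote=True))
# -> &lt;b&gt;&quot;Title&quot;&lt;/b&gt; &amp; notes
```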
def tmpl_configuredoctype_list_submissionelements(self,
doctype,
action,
pagenum,
page_elements,
movefieldfromposn="",
user_msg=""):
## Sanity Checking of elements:
if type(page_elements) not in (list, tuple):
page_elements = ()
try:
if type(page_elements[0]) not in (tuple, list):
page_elements = ()
except IndexError:
pass
try:
int(movefieldfromposn)
except ValueError:
movefieldfromposn = ""
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
number_elements = len(page_elements)
if number_elements > 0:
body_content += """
<table width="100%%" class="admin_wvar">
<tbody>
<tr>
<td style="text-align: center;">
<br />
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpagespreview">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input type="hidden" name="pagenum" value="%(pagenum)s" />
<input name="viewsubmissionpagepreview" class="adminbutton" type="submit" value="View Page Preview" />
</form>
</td>
</tr>
</table>""" % { 'adminurl' : WEBSUBMITADMINURL,
'doctype_id' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1)
}
t_header = ["&nbsp;", "&nbsp;", "&nbsp;", "&nbsp;", "Name", "Element Label",
"Level", "Short Descr.", "Check", "Creation Date", "Modification Date", "&nbsp;",
"&nbsp;", "&nbsp;", "&nbsp;"]
t_body = []
for i in range(0, number_elements):
## Field number:
t_row = ["""%s""" % (cgi.escape(page_elements[i][1], 1),) ]
## Move a field from posn - to posn arrows:
if movefieldfromposn in ("", None):
## provide a "move from" arrow for every element
if number_elements > 1:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movefieldfromposn=%(fieldnum)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Move field at position %(fieldnum)s"""\
""" from this location" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1)
}
]
else:
t_row += ["&nbsp;"]
else:
## there is a value for "movefieldfromposn", so a "move-to" arrow must be provided
if number_elements > 1:
## is this the field that will be moved?
if movefieldfromposn == page_elements[i][1]:
## yes it is - no "move-to" arrow here
t_row += ["&nbsp;"]
else:
## no it isn't - "move-to" arrow here
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movefieldfromposn=%(movefieldfromposn)s&movefieldtoposn=%(fieldnum)s">"""\
"""<img border="0" src="%(siteurl)s/img/move_to.gif" title="Move field at position %(movefieldfromposn)s"""\
""" to this location at position %(fieldnum)s" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1),
'movefieldfromposn' : cgi.escape(movefieldfromposn, 1)
}
]
else:
## there is only 1 field - cannot perform a "move"!
t_row += ["&nbsp;"]
## up arrow:
if i != 0:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movefieldfromposn=%(fieldnum)s&movefieldtoposn=%(previousfield)s">"""\
"""<img border="0" src="%(siteurl)s/img/smallup.gif" title="Move Element Up" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1),
'previousfield' : cgi.escape(str(int(page_elements[i][1])-1), 1)
}
]
else:
## first element - don't provide up arrow:
t_row += ["&nbsp;"]
## down arrow:
if number_elements > 1 and i < number_elements - 1:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movefieldfromposn=%(fieldnum)s&movefieldtoposn=%(nextfield)s">"""\
"""<img border="0" src="%(siteurl)s/img/smalldown.gif" title="Move Element Down" /></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1),
'nextfield' : cgi.escape(str(int(page_elements[i][1])+1), 1)
}
]
else:
t_row += ["&nbsp;"]
## Element Name:
t_row += ["""<span class="info">%s</span>""" % (cgi.escape(str(page_elements[i][2]), 1),) ]
## Element Label:
t_row += ["""%s""" % (cgi.escape(str(page_elements[i][3]), 1),) ]
## Level:
t_row += ["""%s""" % (cgi.escape(str(page_elements[i][4]), 1),) ]
## Short Descr:
t_row += ["""%s""" % (cgi.escape(str(page_elements[i][5]), 1),) ]
## Check:
t_row += ["""%s""" % (cgi.escape(str(page_elements[i][6]), 1),) ]
## Creation Date:
if page_elements[i][7] not in ("", None):
t_row += ["%s" % (cgi.escape(str(page_elements[i][7]), 1),)]
else:
t_row += ["&nbsp;"]
## Modification Date:
if page_elements[i][8] not in ("", None):
t_row += ["%s" % (cgi.escape(str(page_elements[i][8]), 1),)]
else:
t_row += ["&nbsp;"]
## View/Edit field:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&editfieldposn=%(fieldnum)s"><small>edit</small></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1)
}
]
## Delete Element from page:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&deletefieldposn=%(fieldnum)s"><small>delete</small></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1)
}
]
## View/Edit Element Definition:
t_row += ["""<a href="%(adminurl)s/elementedit?elname=%(elementname)s&doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s"><small>element</small></a>"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1),
'elementname' : cgi.escape(page_elements[i][2], 1)
}
]
## Jump element out-from:
t_row += ["&nbsp;"]
## final column containing "jumping-out from" image when moving a field:
if movefieldfromposn not in ("", None):
if movefieldfromposn == page_elements[i][1] and number_elements > 1:
t_row += ["""<img border="0" src="%(siteurl)s/img/move_from.gif" title="Move field at position %(fieldnum)s"""\
""" from this location" />"""\
% { 'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'fieldnum' : cgi.escape(page_elements[i][1], 1)
}
]
else:
t_row += ["&nbsp;"]
else:
t_row += ["&nbsp;"]
## finally, append the newly created row to the tbody list:
t_body.append(t_row)
## now create the table and include it into the page body:
body_content += """
<table width="100%%">
<tr>
<td colspan="2"><br />"""
body_content += create_html_table_from_tuple(tableheader=t_header, tablebody=t_body)
body_content += """
<br />
</td>
</tr>"""
body_content += """
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">
<table>
<tr>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpageelements">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input type="hidden" name="pagenum" value="%(pagenum)s" />
<input name="addfield" class="adminbutton" type="submit" value="Add a Field" />
</form>
</td>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpages">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input name="finishedviewfields" class="adminbutton" type="submit" value="Finished" />
</form>
</td>
</tr>
</table>
</td>
</tr>""" % { 'adminurl' : WEBSUBMITADMINURL,
'doctype_id' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(pagenum, 1)
}
body_content += """
</table>"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Submission Page Details:", datalist=[body_content])
return output
def tmpl_configuredoctype_edit_functionparameter_file(self, doctype, action, function, paramfilename,
paramfilecontent, paramname="", user_msg=""):
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
body_content += """
<table class="admin_wvar" width="95%%">
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;<form method="post" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters"></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Parameter Value:</span></td>
<td width="80%%"><textarea cols="115" rows="22" name="paramfilecontent">%(paramfilecontent)s</textarea></td>
</tr>
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="functionname" type="hidden" value="%(function)s" />
<input name="paramname" type="hidden" value="%(paramname)s" />
<input name="paramfilename" type="hidden" value="%(paramfilename)s" />
<input name="editfunctionparameterfilecommit" class="adminbutton" type="submit" value="Save Changes" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="functionname" type="hidden" value="%(function)s" />
<input name="editfunctionparameterfilecancel" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</tbody>
</table>\n""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'function' : cgi.escape(function, 1),
'paramname' : cgi.escape(paramname, 1),
'paramfilename' : cgi.escape(paramfilename, 1),
'paramfilecontent' : cgi.escape(paramfilecontent, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Edit the [%s] parameter file:" % (paramfilename,), datalist=[body_content])
return output
def tmpl_configuredoctype_edit_functionparameter_value(self,
doctype,
action,
function,
paramname,
paramval,
user_msg=""):
## begin template:
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
body_content += """
<table class="admin_wvar" width="95%%">
<tbody>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">&nbsp;<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters"></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Parameter Value:</span></td>
<td width="80%%"><input type="text" size="35" name="paramval" value="%(paramval)s" /></td>
</tr>
<tr>
<td colspan="2">
<table>
<tr>
<td>
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="functionname" type="hidden" value="%(function)s" />
<input name="paramname" type="hidden" value="%(paramname)s" />
<input name="editfunctionparametervaluecommit" class="adminbutton" type="submit" value="Save Changes" />
</form>
</td>
<td>
<br />
<form method="post" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters">
<input name="doctype" type="hidden" value="%(doctype)s" />
<input name="action" type="hidden" value="%(action)s" />
<input name="functionname" type="hidden" value="%(function)s" />
<input name="editfunctionparametervaluecancel" class="adminbutton" type="submit" value="Cancel" />
</form>
</td>
</tr>
</table>
</td>
</tr>
</tbody>
</table>\n""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'function' : cgi.escape(function, 1),
'paramname' : cgi.escape(paramname, 1),
'paramval' : cgi.escape(paramval, 1)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Edit the value of the %s Parameter:" % (paramname,), datalist=[body_content])
return output
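## The templates above are built with mapping-based "%"-string interpolation;
## literal percent signs in the HTML (e.g. width="95%") must therefore be
## doubled as "%%".  A miniature sketch of the pattern (the template string
## and values are hypothetical, not taken from the code above):

```python
# "%(key)s" pulls values from the mapping; "%%" yields a literal "%".
template = '<table width="95%%"><td>%(paramname)s = %(paramval)s</td></table>'
html = template % {'paramname': 'authorfile', 'paramval': 'author.dat'}
print(html)
# -> <table width="95%"><td>authorfile = author.dat</td></table>
```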
def tmpl_configuredoctype_list_functionparameters(self,
doctype,
action,
function,
params,
user_msg=""):
"""Display the parameters and their values for a given function as applied to a given document type
"""
linktoparamfile = 0
## sanity checking:
if type(params) not in (list, tuple):
params = ()
## make table of function parameters:
if function in FUNCTIONS_WITH_FILE_PARAMS:
linktoparamfile = 1
t_header = ["Parameter Name", "Parameter Value", "Edit Parameter", "%s" \
% ((linktoparamfile == 1 and "Edit File") or ("&nbsp;"),)]
t_body = []
num_params = len(params)
for i in range(0, num_params):
thisparamname = params[i][0]
thisparamval = params[i][1]
## parameter name:
t_row = ["""&nbsp;&nbsp;%s""" % (cgi.escape(thisparamname, 1),)]
## parameter value:
t_row += ["""&nbsp;&nbsp;<span class="info">%s</span>""" % (cgi.escape(thisparamval, 1),)]
## button to edit parameter value:
t_row += ["""<form class="hyperlinkform" method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters">"""\
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="functionname" value="%(function)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="paramname" value="%(thisparamname)s" type="hidden" />"""\
"""<input type="submit" name="editfunctionparametervalue" value="edit value" class="hyperlinkformSubmitButton" />"""\
"""</form>\n"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'function' : cgi.escape(function, 1),
'thisparamname' : cgi.escape(thisparamname, 1)
} ]
## button to edit the value of a parameter's file:
editstr = """<form class="hyperlinkform" method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctionsparameters">"""\
"""<input class="hyperlinkformHiddenInput" name="doctype" value="%(doctype)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="action" value="%(action)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="functionname" value="%(function)s" type="hidden" />"""\
"""<input class="hyperlinkformHiddenInput" name="paramname" value="%(thisparamname)s" type="hidden" />"""\
"""<input type="submit" name="editfunctionparameterfile" value="edit file" class="hyperlinkformSubmitButton" />"""\
"""</form>\n"""\
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'function' : cgi.escape(function, 1),
'thisparamname' : cgi.escape(thisparamname, 1)
}
t_row += ["%s" % ((linktoparamfile == 1 and editstr) or ("&nbsp;"),)]
## finally, append the newly created row to the tbody list:
t_body.append(t_row)
## create display of page
output = ""
output += self._create_user_message_string(user_msg)
body_content = """
<table class="admin_wvar" width="100%%">
<tbody>
<tr>
<td>
<br />
%(paramstable)s
<br />
</td>
</tr>
<tr>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionfunctions">
<input type="hidden" name="doctype" value="%(doctype)s" />
<input type="hidden" name="action" value="%(action)s" />
<input name="finishedviewfields" class="adminbutton" type="submit" value="Finished" />
</form>
</td>
</tr>
</table>""" % { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'paramstable' : create_html_table_from_tuple(tableheader=t_header, tablebody=t_body)
}
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="""Parameters of the %(function)s Function, belonging to the %(doctype)s Document Type:"""\
% { 'function' : cgi.escape(function, 1), 'doctype' : cgi.escape(doctype, 1) },
datalist=[body_content])
return output
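## Expressions such as ((linktoparamfile == 1 and editstr) or ("&nbsp;"))
## above use the pre-Python-2.5 and/or conditional idiom.  A short sketch of
## how it behaves, next to the modern conditional expression it is equivalent
## to as long as the "true" branch is never falsy (sample values hypothetical):

```python
linktoparamfile = 1
editstr = '<form>...</form>'

old_style = (linktoparamfile == 1 and editstr) or "&nbsp;"
new_style = editstr if linktoparamfile == 1 else "&nbsp;"
# identical here because editstr is a non-empty (truthy) string
assert old_style == new_style

# when the condition is false, both fall through to the alternative:
linktoparamfile = 0
assert ((linktoparamfile == 1 and editstr) or "&nbsp;") == "&nbsp;"
```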
def tmpl_configuredoctype_list_submissionpages(self,
doctype,
action,
number_pages,
cd="",
md="",
deletepagenum="",
user_msg=""):
## sanity checking:
try:
number_pages = int(number_pages)
except ValueError:
number_pages = 0
deletepagenum = str(deletepagenum)
output = ""
body_content = ""
output += self._create_user_message_string(user_msg)
body_content += """
<table width="90%%">
<tr>
<td width="20%%"><span class="adminlabel">Document Type ID:</span></td>
<td width="80%%"><span class="info">%(doctype_id)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Submission ID:</span></td>
<td width="80%%"><span class="info">%(action)s</span></td>
</tr>
<tr>
<td width="20%%"><span class="adminlabel">Number of Pages:</span></td>
<td width="80%%"><span class="info">%(num_pages)s</span></td>
</tr>""" % { 'doctype_id' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'num_pages' : cgi.escape(str(number_pages), 1)
}
if cd not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Creation Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(cd), 1),)
if md not in ("", None):
body_content += """
<tr>
<td width="20%%"><span class="adminlabel">Last Modification Date:</span></td>
<td width="80%%"><span class="info">%s</span></td>
</tr>""" % (cgi.escape(str(md), 1), )
## EITHER: Make a table of links to each page -OR-
## prompt for confirmation of deletion of a page:
if deletepagenum == "":
## This is a normal visit to display details of a submission's pages
## make a table of links to each page:
t_header = ["Page", "&nbsp;", "&nbsp;", "View Page", "Delete"]
t_body = []
for i in range(1, number_pages + 1):
t_row = ["""Page %d""" % (i,)]
## up arrow:
if i != 1:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpages?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movepage=true&movepagedirection=up">"""\
"""<img border="0" src="%(siteurl)s/img/smallup.gif" title="Move Page Up" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(str(i), 1)
}
]
else:
## this is the first page - don't provide an arrow to move it up
t_row += ["&nbsp;"]
## down arrow:
if number_pages > 1 and i < number_pages:
t_row += ["""<a href="%(adminurl)s/doctypeconfiguresubmissionpages?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&movepage=true&movepagedirection=down">"""\
"""<img border="0" src="%(siteurl)s/img/smalldown.gif" title="Move Page Down" /></a>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'siteurl' : cgi.escape(CFG_SITE_URL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(str(i), 1)
}
]
else:
t_row += ["&nbsp;"]
## "view page" link:
t_row += ["""<small><a href="%(adminurl)s/doctypeconfiguresubmissionpageelements?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s">view page</a></small>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(str(i), 1)
}
]
## "delete page" link:
t_row += ["""<small><a href="%(adminurl)s/doctypeconfiguresubmissionpages?doctype=%(doctype)s&action=%(action)s&"""\
"""pagenum=%(pagenum)s&deletepage=true">delete page</a></small>""" \
% { 'adminurl' : cgi.escape(WEBSUBMITADMINURL, 1),
'doctype' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(str(i), 1)
}
]
## finally, append the newly created row to the tbody list:
t_body.append(t_row)
## now create the table and include it into the page body:
body_content += """
<tr>
<td colspan="2"><br />"""
body_content += create_html_table_from_tuple(tableheader=t_header, tablebody=t_body)
body_content += """
<br />
</td>
</tr>"""
body_content += """
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">
<table>
<tr>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpages">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input name="addpage" class="adminbutton" type="submit" value="Add a Page" />
</form>
</td>
<td>
<form method="get" action="%(adminurl)s/doctypeconfigure">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input name="finishedviewpages" class="adminbutton" type="submit" value="Finished" />
</form>
</td>
</tr>
</table>
</td>
</tr>""" % { 'adminurl' : WEBSUBMITADMINURL,
'doctype_id' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1)
}
else:
## user has requested the deletion of a page from the current submission, and this visit should
## simply prompt them for confirmation:
body_content += """
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%"><br /><span class="info">REALLY delete page %(pagenum)s and all of its associated interface elements from """\
"""this submission? You CANNOT undo this!</span></td>
</tr>
<tr>
<td width="20%%">&nbsp;</td>
<td width="80%%">
<table>
<tr>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpages">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input type="hidden" name="deletepage" value="true" />
<input type="hidden" name="pagenum" value="%(pagenum)s" />
<input name="deletepageconfirm" class="adminbutton" type="submit" value="Confirm" />
</form>
</td>
<td>
<form method="get" action="%(adminurl)s/doctypeconfiguresubmissionpages">
<input type="hidden" name="doctype" value="%(doctype_id)s" />
<input type="hidden" name="action" value="%(action)s" />
<input name="cancelpagedelete" class="adminbutton" type="submit" value="No! Stop!" />
</form>
</td>
</tr>
</table>
</td>
</tr>""" % { 'adminurl' : WEBSUBMITADMINURL,
'doctype_id' : cgi.escape(doctype, 1),
'action' : cgi.escape(action, 1),
'pagenum' : cgi.escape(deletepagenum, 1)
}
body_content += """
</table>
"""
output += self._create_websubmitadmin_main_menu_header()
output += self._create_adminbox(header="Submission Page Details:", datalist=[body_content])
return output
diff --git a/invenio/legacy/websubmit/engine.py b/invenio/legacy/websubmit/engine.py
index eb3d83a6c..1855fbde1 100644
--- a/invenio/legacy/websubmit/engine.py
+++ b/invenio/legacy/websubmit/engine.py
@@ -1,1858 +1,1858 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSubmit: the mechanism for the submission of new records into Invenio
via a Web interface.
"""
__revision__ = "$Id$"
## import interesting modules:
import string
import os
import sys
import time
import types
import re
import pprint
from urllib import quote_plus
from cgi import escape
from invenio.config import \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_SITE_URL, \
CFG_WEBSUBMIT_STORAGEDIR, \
CFG_DEVEL_SITE, \
CFG_SITE_SECURE_URL, \
CFG_WEBSUBMIT_USE_MATHJAX
from invenio.legacy.dbquery import Error
from invenio.modules.access.engine import acc_authorize_action
from invenio.legacy.webpage import page, error_page, warning_page
from invenio.legacy.webuser import getUid, get_email, collect_user_info, isGuestUser, \
page_not_authorized
from invenio.legacy.websubmit.config import CFG_RESERVED_SUBMISSION_FILENAMES, \
InvenioWebSubmitFunctionError, InvenioWebSubmitFunctionStop, \
InvenioWebSubmitFunctionWarning
from invenio.base.i18n import gettext_set_language, wash_language
-from invenio.webstat import register_customevent
+from invenio.legacy.webstat.api import register_customevent
from invenio.ext.logging import register_exception
from invenio.utils.url import make_canonical_urlargd, redirect_to_url
from invenio.websubmitadmin_engine import string_is_alphanumeric_including_underscore
from invenio.utils.html import get_mathjax_header
from invenio.websubmit_dblayer import \
get_storage_directory_of_action, \
get_longname_of_doctype, \
get_longname_of_action, \
get_num_pages_of_submission, \
get_parameter_value_for_doctype, \
submission_exists_in_log, \
log_new_pending_submission, \
log_new_completed_submission, \
update_submission_modified_date_in_log, \
update_submission_reference_in_log, \
update_submission_reference_and_status_in_log, \
get_form_fields_on_submission_page, \
get_element_description, \
get_element_check_description, \
get_form_fields_not_on_submission_page, \
function_step_is_last, \
get_collection_children_of_submission_collection, \
get_submission_collection_name, \
get_doctype_children_of_submission_collection, \
get_categories_of_doctype, \
get_doctype_details, \
get_actions_on_submission_page_for_doctype, \
get_action_details, \
get_parameters_of_function, \
get_details_of_submission, \
get_functions_for_submission_step, \
get_submissions_at_level_X_with_score_above_N, \
submission_is_finished
import invenio.legacy.template
websubmit_templates = invenio.legacy.template.load('websubmit')
def interface(req,
c=CFG_SITE_NAME,
ln=CFG_SITE_LANG,
doctype="",
act="",
startPg=1,
access="",
mainmenu="",
fromdir="",
nextPg="",
nbPg="",
curpage=1):
"""This function is called after a user has visited a document type's
"homepage" and selected the type of "action" to perform. Having
clicked an action-button (e.g. "Submit a New Record"), this function
will be called. It performs the task of initialising a new submission
session (retrieving information about the submission, creating a
working submission-directory, etc), and "drawing" a submission page
containing the WebSubmit form that the user uses to input the metadata
to be submitted.
When a user moves between pages in the submission interface, this
function is recalled so that it can save the metadata entered into the
previous page by the user, and draw the current submission-page.
Note: During a submission, for each page refresh, this function will be
called while the variable "step" (a form variable, seen by
websubmit_webinterface, which calls this function) is 0 (ZERO).
In other words, this function handles the FRONT-END phase of a
submission, BEFORE the WebSubmit functions are called.
@param req: (apache request object) *** NOTE: Added into this object, is
a variable called "form" (req.form). This is added into the object in
the index function of websubmit_webinterface. It contains a
"mod_python.util.FieldStorage" instance, that contains the form-fields
found on the previous submission page.
@param c: (string), defaulted to CFG_SITE_NAME. The name of the Invenio
installation.
@param ln: (string), defaulted to CFG_SITE_LANG. The language in which to
display the pages.
@param doctype: (string) - the doctype ID of the doctype for which the
submission is being made.
@param act: (string) - The ID of the action being performed (e.g.
submission of bibliographic information; modification of bibliographic
information, etc).
@param startPg: (integer) - Starting page for the submission? Defaults
to 1.
@param indir: (string) - the directory used to store all submissions
of the given "type" of this submission. For example, if the submission
is of the type "modify bibliographic information", this variable would
contain "modify".
@param access: (string) - the "access" number for the submission
(e.g. 1174062451_7010). This number is also used as the name for the
current working submission directory.
@param mainmenu: (string) - contains the URL (minus the Invenio
home stem) for the submission's home-page. (E.g. If this submission
is "PICT", the "mainmenu" file would contain "/submit?doctype=PICT".
@param fromdir: (integer)
@param nextPg: (string)
@param nbPg: (string)
@param curpage: (integer) - the current submission page number. Defaults
to 1.
"""
ln = wash_language(ln)
# load the right message language
_ = gettext_set_language(ln)
sys.stdout = req
# get user ID:
user_info = collect_user_info(req)
uid = user_info['uid']
uid_email = user_info['email']
# variable initialisation
t = ""
field = []
fieldhtml = []
level = []
fullDesc = []
text = ''
check = []
select = []
radio = []
upload = []
txt = []
noPage = []
# Preliminary tasks
if not access:
# In some cases we want to take the users directly to the submit-form.
# This fix makes this possible - as it generates the required access
# parameter if it is not present.
pid = os.getpid()
now = time.time()
access = "%i_%s" % (now, pid)
# check we have minimum fields
if not doctype or not act or not access:
## We don't have all the necessary information to go ahead
## with this submission:
return warning_page(_("Not enough information to go ahead with the submission."), req, ln)
try:
assert(not access or re.match('\d+_\d+', access))
except AssertionError:
register_exception(req=req, prefix='doctype="%s", access="%s"' % (doctype, access))
return warning_page(_("Invalid parameters"), req, ln)
if doctype and act:
## Let's clean the input
details = get_details_of_submission(doctype, act)
if not details:
return warning_page(_("Invalid doctype and act parameters"), req, ln)
doctype = details[0]
act = details[1]
## Before continuing to display the submission form interface,
## verify that this submission has not already been completed:
if submission_is_finished(doctype, act, access, uid_email):
## This submission has already been completed.
## This situation can arise when, having completed a submission,
## the user uses the browser's back-button to go back to the form
## stage of the submission and then tries to submit once more.
## This is unsafe and should not be allowed. Instead of re-displaying
## the submission forms, display an error message to the user:
wrnmsg = """<b>This submission has been completed. Please go to the""" \
""" <a href="/submit?doctype=%(doctype)s&amp;ln=%(ln)s">""" \
"""main menu</a> to start a new submission.</b>""" \
% { 'doctype' : quote_plus(doctype), 'ln' : ln }
return warning_page(wrnmsg, req, ln)
## retrieve the action and doctype data:
## Concatenate action ID and doctype ID to make the submission ID:
subname = "%s%s" % (act, doctype)
## Get the submission storage directory from the DB:
submission_dir = get_storage_directory_of_action(act)
if submission_dir:
indir = submission_dir
else:
## Unable to determine the submission-directory:
return warning_page(_("Unable to find the submission directory for the action: %s") % escape(str(act)), req, ln)
## get the document type's long-name:
doctype_lname = get_longname_of_doctype(doctype)
if doctype_lname is not None:
## Got the doctype long-name: replace spaces with HTML chars:
docname = doctype_lname.replace(" ", "&nbsp;")
else:
## Unknown document type:
return warning_page(_("Unknown document type"), req, ln)
## get the action's long-name:
actname = get_longname_of_action(act)
if actname is None:
## Unknown action:
return warning_page(_("Unknown action"), req, ln)
## Get the number of pages for this submission:
num_submission_pages = get_num_pages_of_submission(subname)
if num_submission_pages is not None:
nbpages = num_submission_pages
else:
## Unable to determine the number of pages for this submission:
return warning_page(_("Unable to determine the number of submission pages."), req, ln)
## If unknown, get the current page of submission:
if startPg != "" and curpage in ("", 0):
curpage = startPg
## retrieve the name of the file in which the reference of
## the submitted document will be stored
rn_filename = get_parameter_value_for_doctype(doctype, "edsrn")
if rn_filename is not None:
edsrn = rn_filename
else:
## Unknown value for edsrn - set it to an empty string:
edsrn = ""
## This defines the path to the directory containing the action data
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, indir, doctype, access)
try:
assert(curdir == os.path.abspath(curdir))
except AssertionError:
register_exception(req=req, prefix='indir="%s", doctype="%s", access="%s"' % (indir, doctype, access))
return warning_page(_("Invalid parameters"), req, ln)
## if this submission comes from another one (fromdir is then set)
## We retrieve the previous submission directory and put it in the proper one
if fromdir != "":
olddir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, fromdir, doctype, access)
try:
assert(olddir == os.path.abspath(olddir))
except AssertionError:
register_exception(req=req, prefix='fromdir="%s", doctype="%s", access="%s"' % (fromdir, doctype, access))
return warning_page(_("Invalid parameters"), req, ln)
if os.path.exists(olddir):
os.rename(olddir, curdir)
## If the submission directory still does not exist, we create it
if not os.path.exists(curdir):
try:
os.makedirs(curdir)
except Exception, e:
register_exception(req=req, alert_admin=True)
return warning_page(_("Unable to create a directory for this submission. The administrator has been alerted."), req, ln)
## Retrieve the previous page, as submitted to curdir (before we
## overwrite it with our curpage as declared from the incoming
## form)
try:
fp = open(os.path.join(curdir, "curpage"))
previous_page_from_disk = fp.read()
fp.close()
except:
previous_page_from_disk = "1"
# retrieve the original main menu url and save it in the "mainmenu" file
if mainmenu != "":
fp = open(os.path.join(curdir, "mainmenu"), "w")
fp.write(mainmenu)
fp.close()
# and if the file containing the URL to the main menu exists
# we retrieve it and store it in the $mainmenu variable
if os.path.exists(os.path.join(curdir, "mainmenu")):
fp = open(os.path.join(curdir, "mainmenu"), "r");
mainmenu = fp.read()
fp.close()
else:
mainmenu = "%s/submit" % (CFG_SITE_URL,)
# various authentication related tasks...
if uid_email != "guest" and uid_email != "":
#First save the username (email address) in the SuE file. This way bibconvert will be able to use it if needed
fp = open(os.path.join(curdir, "SuE"), "w")
fp.write(uid_email)
fp.close()
if os.path.exists(os.path.join(curdir, "combo%s" % doctype)):
fp = open(os.path.join(curdir, "combo%s" % doctype), "r");
categ = fp.read()
fp.close()
else:
categ = req.form.get('combo%s' % doctype, '*')
# is user authorized to perform this action?
(auth_code, auth_message) = acc_authorize_action(req, 'submit', \
authorized_if_no_roles=not isGuestUser(uid), \
verbose=0, \
doctype=doctype, \
act=act, \
categ=categ)
if not auth_code == 0:
return warning_page("""<center><font color="red">%s</font></center>""" % auth_message, req, ln)
## update the "journal of submission":
## Does the submission already exist in the log?
submission_exists = \
submission_exists_in_log(doctype, act, access, uid_email)
if submission_exists == 1:
## update the modification-date of this submission in the log:
update_submission_modified_date_in_log(doctype, act, access, uid_email)
else:
## Submission doesn't exist in log - create it:
log_new_pending_submission(doctype, act, access, uid_email)
## Let's write in curdir file under curdir the curdir value
## in case e.g. it is needed in FFT.
fp = open(os.path.join(curdir, "curdir"), "w")
fp.write(curdir)
fp.close()
## Let's write in ln file the current language
fp = open(os.path.join(curdir, "ln"), "w")
fp.write(ln)
fp.close()
# Save the form fields entered in the previous submission page
# If the form was sent with the GET method
form = dict(req.form)
value = ""
# we parse all the form variables
for key, formfields in form.items():
filename = key.replace("[]", "")
file_to_open = os.path.join(curdir, filename)
try:
assert(file_to_open == os.path.abspath(file_to_open))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", filename="%s"' % (curdir, filename))
return warning_page(_("Invalid parameters"), req, ln)
# Do not write reserved filenames to disk
if filename in CFG_RESERVED_SUBMISSION_FILENAMES:
# Unless there is really an element with that name on this
# page or previous one (either visited, or declared to be
# visited), which means that admin authorized it.
if not ((str(curpage).isdigit() and \
filename in [submission_field[3] for submission_field in \
get_form_fields_on_submission_page(subname, curpage)]) or \
(str(curpage).isdigit() and int(curpage) > 1 and \
filename in [submission_field[3] for submission_field in \
get_form_fields_on_submission_page(subname, int(curpage) - 1)]) or \
(previous_page_from_disk.isdigit() and \
filename in [submission_field[3] for submission_field in \
get_form_fields_on_submission_page(subname, int(previous_page_from_disk))])):
# Still this will filter out reserved field names that
# might have been called by functions such as
# Create_Modify_Interface function in MBI step, or
# dynamic fields in response elements, but that is
# unlikely to be a problem.
continue
# Skip variables containing characters that are not allowed in
# WebSubmit elements
if not string_is_alphanumeric_including_underscore(filename):
continue
# the field is an array
if isinstance(formfields, types.ListType):
fp = open(file_to_open, "w")
for formfield in formfields:
#stripslashes(value)
value = specialchars(formfield)
fp.write(value+"\n")
fp.close()
# the field is a normal string
elif isinstance(formfields, types.StringTypes) and formfields != "":
value = formfields
fp = open(file_to_open, "w")
fp.write(specialchars(value))
fp.close()
# the field is a file
elif hasattr(formfields,"filename") and formfields.filename:
dir_to_open = os.path.join(curdir, 'files', key)
try:
assert(dir_to_open == os.path.abspath(dir_to_open))
assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key))
return warning_page(_("Invalid parameters"), req, ln)
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except:
register_exception(req=req, alert_admin=True)
return warning_page(_("Cannot create submission directory. The administrator has been alerted."), req, ln)
filename = formfields.filename
## Before saving the file to disc, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
filename = filename.strip()
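The wash above first drops any Windows-style path prefix (`split('\\')[-1]`), then any UNIX directory part (`os.path.basename`), then surrounding whitespace, so only a bare file name ever reaches the storage directory. The same wash in isolation (the helper name is illustrative):

```python
import os

def wash_uploaded_filename(filename):
    # Keep only the bare file name: strip Windows (backslash) and
    # UNIX (slash) path components, then surrounding whitespace.
    return os.path.basename(filename.split('\\')[-1]).strip()

print(wash_uploaded_filename('C:\\Documents\\thesis.pdf'))  # thesis.pdf
print(wash_uploaded_filename('/home/user/ thesis.pdf '))    # thesis.pdf
```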
if filename != "":
fp = open(os.path.join(dir_to_open, filename), "w")
while True:
buf = formfields.read(10240)
if buf:
fp.write(buf)
else:
break
fp.close()
fp = open(os.path.join(curdir, "lastuploadedfile"), "w")
fp.write(filename)
fp.close()
fp = open(file_to_open, "w")
fp.write(filename)
fp.close()
else:
return warning_page(_("No file uploaded?"), req, ln)
## if the found field is the reference of the document,
## save this value in the "journal of submissions":
if uid_email != "" and uid_email != "guest":
if key == edsrn:
update_submission_reference_in_log(doctype, access, uid_email, value)
## create the interface:
subname = "%s%s" % (act, doctype)
## Get all of the form fields that appear on this page, ordered by fieldnum:
form_fields = get_form_fields_on_submission_page(subname, curpage)
full_fields = []
values = []
the_globals = {
'doctype' : doctype,
'action' : act,
'access' : access,
'ln' : ln,
'curdir' : curdir,
'uid' : uid,
'uid_email' : uid_email,
'form' : form,
'act' : act,
'action' : act, ## for backward compatibility
'req' : req,
'user_info' : user_info,
'InvenioWebSubmitFunctionError' : InvenioWebSubmitFunctionError,
'__websubmit_in_jail__' : True,
'__builtins__' : globals()['__builtins__']
}
for field_instance in form_fields:
full_field = {}
## Retrieve the field's description:
element_descr = get_element_description(field_instance[3])
try:
assert(element_descr is not None)
except AssertionError:
msg = _("Unknown form field found on submission page.")
register_exception(req=req, alert_admin=True, prefix=msg)
## The form field doesn't seem to exist - return with error message:
return warning_page(_("Unknown form field found on submission page."), req, ln)
if element_descr[8] is None:
val = ""
else:
val = element_descr[8]
## we also retrieve and add the javascript code of the checking function, if needed
## Set it to empty string to begin with:
full_field['javascript'] = ''
if field_instance[7] != '':
check_descr = get_element_check_description(field_instance[7])
if check_descr is not None:
## Retrieved the check description:
full_field['javascript'] = check_descr
full_field['type'] = element_descr[3]
full_field['name'] = field_instance[3]
full_field['rows'] = element_descr[5]
full_field['cols'] = element_descr[6]
full_field['val'] = val
full_field['size'] = element_descr[4]
full_field['maxlength'] = element_descr[7]
full_field['htmlcode'] = element_descr[9]
full_field['typename'] = field_instance[1] ## TODO: Investigate this, Not used?
## It also seems to refer to pagenum.
# The 'R' fields must be executed in the engine's environment,
# as the runtime functions access some global and local
# variables.
if full_field ['type'] == 'R':
try:
co = compile (full_field ['htmlcode'].replace("\r\n","\n"), "<string>", "exec")
the_globals['text'] = ''
exec co in the_globals
text = the_globals['text']
except:
register_exception(req=req, alert_admin=True, prefix="Error in evaluating response element %s with globals %s" % (pprint.pformat(full_field), pprint.pformat(the_globals)))
raise
else:
text = websubmit_templates.tmpl_submit_field (ln = ln, field = full_field)
# we now determine the exact type of the created field
if full_field['type'] not in [ 'D','R']:
field.append(full_field['name'])
level.append(field_instance[5])
fullDesc.append(field_instance[4])
txt.append(field_instance[6])
check.append(field_instance[7])
# If the field is not user-defined, we try to determine its type
# (select, radio, file upload...)
# check whether it is a select field or not
if re.search("SELECT", text, re.IGNORECASE) is not None:
select.append(1)
else:
select.append(0)
# checks whether it is a radio field or not
if re.search(r"TYPE=[\"']?radio", text, re.IGNORECASE) is not None:
radio.append(1)
else:
radio.append(0)
# checks whether it is a file upload or not
if re.search(r"TYPE=[\"']?file", text, re.IGNORECASE) is not None:
upload.append(1)
else:
upload.append(0)
# if the field description contains the "<COMBO>" string, replace
# it by the category selected on the document type's submission page
combofile = "combo%s" % doctype
if os.path.exists("%s/%s" % (curdir, combofile)):
f = open("%s/%s" % (curdir, combofile), "r")
combo = f.read()
f.close()
else:
combo = ""
text = text.replace("<COMBO>", combo)
# if there is a <YYYY> tag in it, replace it by the current year
year = time.strftime("%Y")
text = text.replace("<YYYY>", year)
# if there is a <TODAY> tag in it, replace it by the current date
today = time.strftime("%d/%m/%Y")
text = text.replace("<TODAY>", today)
fieldhtml.append(text)
else:
select.append(0)
radio.append(0)
upload.append(0)
# field.append(value) - initial version, not working with JS, taking a submitted value
field.append(field_instance[3])
level.append(field_instance[5])
txt.append(field_instance[6])
fullDesc.append(field_instance[4])
check.append(field_instance[7])
fieldhtml.append(text)
full_field['fullDesc'] = field_instance[4]
full_field['text'] = text
# If a file exists with the name of the field we extract the saved value
text = ''
if os.path.exists(os.path.join(curdir, full_field['name'])):
file = open(os.path.join(curdir, full_field['name']), "r");
text = file.read()
text = re.compile("[\n\r]*$").sub("", text)
text = re.compile("\n").sub("\\n", text)
text = re.compile("\r").sub("", text)
file.close()
values.append(text)
full_fields.append(full_field)
returnto = {}
if int(curpage) == int(nbpages):
subname = "%s%s" % (act, doctype)
other_form_fields = \
get_form_fields_not_on_submission_page(subname, curpage)
nbFields = 0
message = ""
fullcheck_select = []
fullcheck_radio = []
fullcheck_upload = []
fullcheck_field = []
fullcheck_level = []
fullcheck_txt = []
fullcheck_noPage = []
fullcheck_check = []
for field_instance in other_form_fields:
if field_instance[5] == "M":
## If this field is mandatory, get its description:
element_descr = get_element_description(field_instance[3])
try:
assert(element_descr is not None)
except AssertionError:
msg = _("Unknown form field found on submission page.")
register_exception(req=req, alert_admin=True, prefix=msg)
## The form field doesn't seem to exist - return with error message:
return warning_page(_("Unknown form field found on submission page."), req, ln)
if element_descr[3] in ['D', 'R']:
if element_descr[3] == "D":
text = element_descr[9]
else:
text = eval(element_descr[9])
formfields = text.split(">")
for formfield in formfields:
match = re.match("name=([^ <>]+)", formfield, re.IGNORECASE)
if match is not None:
names = match.groups()
for value in names:
if value != "":
value = re.compile("[\"']+").sub("", value)
fullcheck_field.append(value)
fullcheck_level.append(field_instance[5])
fullcheck_txt.append(field_instance[6])
fullcheck_noPage.append(field_instance[1])
fullcheck_check.append(field_instance[7])
nbFields = nbFields + 1
else:
fullcheck_noPage.append(field_instance[1])
fullcheck_field.append(field_instance[3])
fullcheck_level.append(field_instance[5])
fullcheck_txt.append(field_instance[6])
fullcheck_check.append(field_instance[7])
nbFields = nbFields+1
# tests each mandatory field
fld = 0
res = 1
for i in xrange(nbFields):
res = 1
if not os.path.exists(os.path.join(curdir, fullcheck_field[i])):
res = 0
else:
file = open(os.path.join(curdir, fullcheck_field[i]), "r")
text = file.read()
if text == '':
res = 0
else:
if text == "Select:":
res = 0
if res == 0:
fld = i
break
if not res:
returnto = {
'field' : fullcheck_txt[fld],
'page' : fullcheck_noPage[fld],
}
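The loop above treats a mandatory field as filled only when a file with the field's name exists in the submission directory and holds something other than the empty string or the `Select:` placeholder. The rule in isolation, against the one-text-file-per-field layout the engine writes (field names here are hypothetical):

```python
import os
import tempfile

def mandatory_field_is_filled(curdir, field_name):
    # A mandatory field counts as filled only when its value file exists
    # and holds something other than "" or the "Select:" placeholder.
    path = os.path.join(curdir, field_name)
    if not os.path.exists(path):
        return False
    with open(path) as fh:
        return fh.read() not in ("", "Select:")

# Hypothetical layout: one text file per form field, as the engine writes.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "DEMO_TITLE"), "w") as fh:
    fh.write("A thesis title")
print(mandatory_field_is_filled(tmp, "DEMO_TITLE"))     # True
print(mandatory_field_is_filled(tmp, "DEMO_ABSTRACT"))  # False
```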
t += websubmit_templates.tmpl_page_interface(
ln = ln,
docname = docname,
actname = actname,
curpage = curpage,
nbpages = nbpages,
nextPg = nextPg,
access = access,
nbPg = nbPg,
doctype = doctype,
act = act,
fields = full_fields,
javascript = websubmit_templates.tmpl_page_interface_js(
ln = ln,
upload = upload,
field = field,
fieldhtml = fieldhtml,
txt = txt,
check = check,
level = level,
curdir = curdir,
values = values,
select = select,
radio = radio,
curpage = curpage,
nbpages = nbpages,
returnto = returnto,
),
mainmenu = mainmenu,
)
t += websubmit_templates.tmpl_page_do_not_leave_submission_js(ln)
# start display:
req.content_type = "text/html"
req.send_http_header()
p_navtrail = """<a href="/submit?ln=%(ln)s" class="navtrail">%(submit)s</a>&nbsp;>&nbsp;<a href="/submit?doctype=%(doctype)s&amp;ln=%(ln)s" class="navtrail">%(docname)s</a>&nbsp;""" % {
'submit' : _("Submit"),
'doctype' : quote_plus(doctype),
'docname' : docname,
'ln' : ln
}
## add MathJax if wanted
if CFG_WEBSUBMIT_USE_MATHJAX:
metaheaderadd = get_mathjax_header(req.is_https())
metaheaderadd += websubmit_templates.tmpl_mathpreview_header(ln, req.is_https())
else:
metaheaderadd = ''
return page(title= actname,
body = t,
navtrail = p_navtrail,
description = "submit documents",
keywords = "submit",
uid = uid,
language = ln,
req = req,
navmenuid='submit',
metaheaderadd=metaheaderadd)
def endaction(req,
c=CFG_SITE_NAME,
ln=CFG_SITE_LANG,
doctype="",
act="",
startPg=1,
access="",
mainmenu="",
fromdir="",
nextPg="",
nbPg="",
curpage=1,
step=1,
mode="U"):
"""Having filled-in the WebSubmit form created for metadata by the interface
function, the user clicks a button to either "finish the submission" or
to "proceed" to the next stage of the submission. At this point, a
variable called "step" will be given a value of 1 or above, which means
that this function is called by websubmit_webinterface.
So, during all non-zero steps of the submission, this function is called.
In other words, this function is called during the BACK-END phase of a
submission, in which WebSubmit *functions* are being called.
The function first ensures that all of the WebSubmit form field values
have been saved in the current working submission directory, in text-
files with the same name as the field elements have. It then determines
the functions to be called for the given step of the submission, and
executes them.
Following this, if this is the last step of the submission, it logs the
submission as "finished" in the journal of submissions.
@param req: (apache request object) *** NOTE: Added into this object, is
a variable called "form" (req.form). This is added into the object in
the index function of websubmit_webinterface. It contains a
"mod_python.util.FieldStorage" instance, that contains the form-fields
found on the previous submission page.
@param c: (string), defaulted to CFG_SITE_NAME. The name of the Invenio
installation.
@param ln: (string), defaulted to CFG_SITE_LANG. The language in which to
display the pages.
@param doctype: (string) - the doctype ID of the doctype for which the
submission is being made.
@param act: (string) - The ID of the action being performed (e.g.
submission of bibliographic information; modification of bibliographic
information, etc).
@param startPg: (integer) - Starting page for the submission? Defaults
to 1.
@param indir: (string) - the directory used to store all submissions
of the given "type" of this submission. For example, if the submission
is of the type "modify bibliographic information", this variable would
contain "modify".
@param access: (string) - the "access" number for the submission
(e.g. 1174062451_7010). This number is also used as the name for the
current working submission directory.
@param mainmenu: (string) - contains the URL (minus the Invenio
home stem) for the submission's home-page. (E.g. If this submission
is "PICT", the "mainmenu" file would contain "/submit?doctype=PICT".
@param fromdir:
@param nextPg:
@param nbPg:
@param curpage: (integer) - the current submission page number. Defaults
to 1.
@param step: (integer) - the current step of the submission. Defaults to
1.
@param mode:
"""
# load the right message language
_ = gettext_set_language(ln)
dismode = mode
ln = wash_language(ln)
sys.stdout = req
rn = ""
t = ""
# get user ID:
uid = getUid(req)
uid_email = get_email(uid)
## Get the submission storage directory from the DB:
submission_dir = get_storage_directory_of_action(act)
if submission_dir:
indir = submission_dir
else:
## Unable to determine the submission-directory:
return warning_page(_("Unable to find the submission directory for the action: %s") % escape(str(act)), req, ln)
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, indir, doctype, access)
if os.path.exists(os.path.join(curdir, "combo%s" % doctype)):
fp = open(os.path.join(curdir, "combo%s" % doctype), "r");
categ = fp.read()
fp.close()
else:
categ = req.form.get('combo%s' % doctype, '*')
# is user authorized to perform this action?
(auth_code, auth_message) = acc_authorize_action(req, 'submit', \
authorized_if_no_roles=not isGuestUser(uid), \
verbose=0, \
doctype=doctype, \
act=act, \
categ=categ)
if not auth_code == 0:
return warning_page("""<center><font color="red">%s</font></center>""" % auth_message, req, ln)
# Preliminary tasks
## check we have minimum fields
if not doctype or not act or not access:
## We don't have all the necessary information to go ahead
## with this submission:
return warning_page(_("Not enough information to go ahead with the submission."), req, ln)
if doctype and act:
## Let's clean the input
details = get_details_of_submission(doctype, act)
if not details:
return warning_page(_("Invalid doctype and act parameters"), req, ln)
doctype = details[0]
act = details[1]
try:
assert(not access or re.match('\d+_\d+', access))
except AssertionError:
register_exception(req=req, prefix='doctype="%s", access="%s"' % (doctype, access))
return warning_page(_("Invalid parameters"), req, ln)
## Before continuing to process the submitted data, verify that
## this submission has not already been completed:
if submission_is_finished(doctype, act, access, uid_email):
## This submission has already been completed.
## This situation can arise when, having completed a submission,
## the user uses the browser's back-button to go back to the form
## stage of the submission and then tries to submit once more.
## This is unsafe and should not be allowed. Instead of re-processing
## the submitted data, display an error message to the user:
wrnmsg = """<b>This submission has been completed. Please go to the""" \
""" <a href="/submit?doctype=%(doctype)s&amp;ln=%(ln)s">""" \
"""main menu</a> to start a new submission.</b>""" \
% { 'doctype' : quote_plus(doctype), 'ln' : ln }
return warning_page(wrnmsg, req, ln)
## Get the number of pages for this submission:
subname = "%s%s" % (act, doctype)
## retrieve the action and doctype data
## Get the submission storage directory from the DB:
submission_dir = get_storage_directory_of_action(act)
if submission_dir:
indir = submission_dir
else:
## Unable to determine the submission-directory:
return warning_page(_("Unable to find the submission directory for the action: %s") % escape(str(act)), req, ln)
# The following words are reserved and should not be used as field names
reserved_words = ["stop", "file", "nextPg", "startPg", "access", "curpage", "nbPg", "act", \
"indir", "doctype", "mode", "step", "deleted", "file_path", "userfile_name"]
# This defines the path to the directory containing the action data
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, indir, doctype, access)
try:
assert(curdir == os.path.abspath(curdir))
except AssertionError:
register_exception(req=req, prefix='indir="%s", doctype=%s, access=%s' % (indir, doctype, access))
return warning_page(_("Invalid parameters"), req, ln)
## If the submission directory still does not exist, we create it
if not os.path.exists(curdir):
try:
os.makedirs(curdir)
except Exception, e:
register_exception(req=req, alert_admin=True)
return warning_page(_("Unable to create a directory for this submission. The administrator has been alerted."), req, ln)
# retrieve the original main menu url and save it in the "mainmenu" file
if mainmenu != "":
fp = open(os.path.join(curdir, "mainmenu"), "w")
fp.write(mainmenu)
fp.close()
# and if the file containing the URL to the main menu exists
# we retrieve it and store it in the $mainmenu variable
if os.path.exists(os.path.join(curdir, "mainmenu")):
fp = open(os.path.join(curdir, "mainmenu"), "r");
mainmenu = fp.read()
fp.close()
else:
mainmenu = "%s/submit" % (CFG_SITE_URL,)
num_submission_pages = get_num_pages_of_submission(subname)
if num_submission_pages is not None:
nbpages = num_submission_pages
else:
## Unable to determine the number of pages for this submission:
return warning_page(_("Unable to determine the number of submission pages."), \
req, ln)
## Retrieve the previous page, as submitted to curdir (before we
## overwrite it with our curpage as declared from the incoming
## form)
try:
fp = open(os.path.join(curdir, "curpage"))
previous_page_from_disk = fp.read()
fp.close()
except:
previous_page_from_disk = str(num_submission_pages)
## retrieve the name of the file in which the reference of
## the submitted document will be stored
rn_filename = get_parameter_value_for_doctype(doctype, "edsrn")
if rn_filename is not None:
edsrn = rn_filename
else:
## Unknown value for edsrn - set it to an empty string:
edsrn = ""
## Determine whether the action is finished
## (ie there are no other steps after the current one):
finished = function_step_is_last(doctype, act, step)
## Let's write in curdir file under curdir the curdir value
## in case e.g. it is needed in FFT.
fp = open(os.path.join(curdir, "curdir"), "w")
fp.write(curdir)
fp.close()
## Let's write in ln file the current language
fp = open(os.path.join(curdir, "ln"), "w")
fp.write(ln)
fp.close()
# Save the form fields entered in the previous submission page
# If the form was sent with the GET method
form = req.form
value = ""
# we parse all the form variables
for key in form.keys():
formfields = form[key]
filename = key.replace("[]", "")
file_to_open = os.path.join(curdir, filename)
try:
assert(file_to_open == os.path.abspath(file_to_open))
assert(file_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", filename="%s"' % (curdir, filename))
return warning_page(_("Invalid parameters"), req, ln)
# Do not write reserved filenames to disk
if filename in CFG_RESERVED_SUBMISSION_FILENAMES:
# Unless there is really an element with that name on this
# page, or on the previously visited one, which means that
# admin authorized it. Note that in endaction() curpage is
# equivalent to the "previous" page value
if not ((previous_page_from_disk.isdigit() and \
filename in [submission_field[3] for submission_field in \
get_form_fields_on_submission_page(subname, int(previous_page_from_disk))]) or \
(str(curpage).isdigit() and int(curpage) > 1 and \
filename in [submission_field[3] for submission_field in \
get_form_fields_on_submission_page(subname, int(curpage) - 1)])):
# might have been called by functions such as
# Create_Modify_Interface function in MBI step, or
# dynamic fields in response elements, but that is
# unlikely to be a problem.
continue
# Skip variables containing characters that are not allowed in
# WebSubmit elements
if not string_is_alphanumeric_including_underscore(filename):
continue
# the field is an array
if isinstance(formfields, types.ListType):
fp = open(file_to_open, "w")
for formfield in formfields:
#stripslashes(value)
value = specialchars(formfield)
fp.write(value+"\n")
fp.close()
# the field is a normal string
elif isinstance(formfields, types.StringTypes) and formfields != "":
value = formfields
fp = open(file_to_open, "w")
fp.write(specialchars(value))
fp.close()
# the field is a file
elif hasattr(formfields, "filename") and formfields.filename:
dir_to_open = os.path.join(curdir, 'files', key)
try:
assert(dir_to_open == os.path.abspath(dir_to_open))
assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key))
return warning_page(_("Invalid parameters"), req, ln)
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except:
register_exception(req=req, alert_admin=True)
return warning_page(_("Cannot create submission directory. The administrator has been alerted."), req, ln)
filename = formfields.filename
## Before saving the file to disc, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
filename = filename.strip()
if filename != "":
fp = open(os.path.join(dir_to_open, filename), "w")
while True:
buf = formfields.file.read(10240)
if buf:
fp.write(buf)
else:
break
fp.close()
fp = open(os.path.join(curdir, "lastuploadedfile"), "w")
fp.write(filename)
fp.close()
fp = open(file_to_open, "w")
fp.write(filename)
fp.close()
else:
return warning_page(_("No file uploaded?"), req, ln)
## if the found field is the reference of the document
## we save this value in the "journal of submissions"
if uid_email != "" and uid_email != "guest":
if key == edsrn:
update_submission_reference_in_log(doctype, access, uid_email, value)
## get the document type's long-name:
doctype_lname = get_longname_of_doctype(doctype)
if doctype_lname is not None:
## Got the doctype long-name: replace spaces with HTML chars:
docname = doctype_lname.replace(" ", "&nbsp;")
else:
## Unknown document type:
return warning_page(_("Unknown document type"), req, ln)
## get the action's long-name:
actname = get_longname_of_action(act)
if actname is None:
## Unknown action:
return warning_page(_("Unknown action"), req, ln)
## Determine whether the action is finished
## (ie there are no other steps after the current one):
last_step = function_step_is_last(doctype, act, step)
next_action = '' ## The next action to be proposed to the user
# Prints the action details, returning the mandatory score
action_score = action_details(doctype, act)
current_level = get_level(doctype, act)
# Calls all the function's actions
function_content = ''
try:
## Handle the execution of the functions for this
## submission/step:
start_time = time.time()
(function_content, last_step, action_score, rn) = \
print_function_calls(req=req,
doctype=doctype,
action=act,
step=step,
form=form,
start_time=start_time,
access=access,
curdir=curdir,
dismode=mode,
rn=rn,
last_step=last_step,
action_score=action_score,
ln=ln)
except InvenioWebSubmitFunctionError, e:
register_exception(req=req, alert_admin=True, prefix='doctype="%s", action="%s", step="%s", form="%s", start_time="%s"' % (doctype, act, step, form, start_time))
## There was a serious function-error. Execution ends.
if CFG_DEVEL_SITE:
raise
else:
return warning_page(_("A serious function-error has been encountered. Adminstrators have been alerted. <br /><em>Please not that this might be due to wrong characters inserted into the form</em> (e.g. by copy and pasting some text from a PDF file)."), req, ln)
except InvenioWebSubmitFunctionStop, e:
## For one reason or another, one of the functions has determined that
## the data-processing phase (i.e. the functions execution) should be
## halted and the user should be returned to the form interface once
## more. (NOTE: Redirecting the user to the Web-form interface is
## currently done using JavaScript. The "InvenioWebSubmitFunctionStop"
## exception contains a "value" string, which is effectively JavaScript
## - probably an alert box and a form that is submitted). **THIS WILL
## CHANGE IN THE FUTURE WHEN JavaScript IS REMOVED!**
if e.value is not None:
function_content = e.value
else:
function_content = e
else:
## No function exceptions (InvenioWebSubmitFunctionStop,
## InvenioWebSubmitFunctionError) were raised by the functions. Propose
## the next action (if applicable), and log the submission as finished:
## If the action was mandatory we propose the next
## mandatory action (if any)
if action_score != -1 and last_step == 1:
next_action = Propose_Next_Action(doctype, \
action_score, \
access, \
current_level, \
indir)
## If we are in the last step of an action, we can update
## the "journal of submissions"
if last_step == 1:
if uid_email != "" and uid_email != "guest":
## update the "journal of submission":
## Does the submission already exist in the log?
submission_exists = \
submission_exists_in_log(doctype, act, access, uid_email)
if submission_exists == 1:
## update the rn and status to finished for this submission
## in the log:
update_submission_reference_and_status_in_log(doctype, \
act, \
access, \
uid_email, \
rn, \
"finished")
else:
## Submission doesn't exist in log - create it:
log_new_completed_submission(doctype, \
act, \
access, \
uid_email, \
rn)
## Having executed the functions, create the page that will be displayed
## to the user:
t = websubmit_templates.tmpl_page_endaction(
ln = ln,
# these fields are necessary for the navigation
nextPg = nextPg,
startPg = startPg,
access = access,
curpage = curpage,
nbPg = nbPg,
nbpages = nbpages,
doctype = doctype,
act = act,
docname = docname,
actname = actname,
mainmenu = mainmenu,
finished = finished,
function_content = function_content,
next_action = next_action,
)
if finished:
# register event in webstat
try:
register_customevent("websubmissions", [get_longname_of_doctype(doctype)])
except:
register_exception(suffix="Do the webstat tables exists? Try with 'webstatadmin --load-config'")
else:
t += websubmit_templates.tmpl_page_do_not_leave_submission_js(ln)
# start display:
req.content_type = "text/html"
req.send_http_header()
p_navtrail = '<a href="/submit?ln='+ln+'" class="navtrail">' + _("Submit") +\
"""</a>&nbsp;>&nbsp;<a href="/submit?doctype=%(doctype)s&amp;ln=%(ln)s" class="navtrail">%(docname)s</a>""" % {
'doctype' : quote_plus(doctype),
'docname' : docname,
'ln' : ln,
}
## add MathJax if wanted
if CFG_WEBSUBMIT_USE_MATHJAX:
metaheaderadd = get_mathjax_header(req.is_https())
metaheaderadd += websubmit_templates.tmpl_mathpreview_header(ln, req.is_https())
else:
metaheaderadd = ''
return page(title= actname,
body = t,
navtrail = p_navtrail,
description="submit documents",
keywords="submit",
uid = uid,
language = ln,
req = req,
navmenuid='submit',
metaheaderadd=metaheaderadd)
def home(req, catalogues_text, c=CFG_SITE_NAME, ln=CFG_SITE_LANG):
"""This function generates the WebSubmit "home page".
Basically, this page contains a list of submission-collections
in WebSubmit, and gives links to the various document-type
submissions.
Document-types only appear on this page when they have been
connected to a submission-collection in WebSubmit.
@param req: (apache request object)
@param catalogues_text: (string) - the computed catalogues tree
@param c: (string) - defaults to CFG_SITE_NAME
@param ln: (string) - The Invenio interface language of choice.
Defaults to CFG_SITE_LANG (the default language of the installation).
@return: (string) - the Web page to be displayed.
"""
ln = wash_language(ln)
# get user ID:
try:
uid = getUid(req)
user_info = collect_user_info(req)
except Error, e:
return error_page(e, req, ln)
# load the right message language
_ = gettext_set_language(ln)
finaltext = websubmit_templates.tmpl_submit_home_page(
ln = ln,
catalogues = catalogues_text,
user_info = user_info,
)
return page(title=_("Submit"),
body=finaltext,
navtrail=[],
description="submit documents",
keywords="submit",
uid=uid,
language=ln,
req=req,
navmenuid='submit'
)
def makeCataloguesTable(req, ln=CFG_SITE_LANG):
"""Build the 'catalogues' (submission-collections) tree for
the WebSubmit home-page. This tree contains the links to
the various document types in WebSubmit.
@param req: (apache request object) - the user request object, used
in order to decide whether to display a submission.
@param ln: (string) - the language of the interface.
(defaults to 'CFG_SITE_LANG').
@return: (string, bool, bool) - the submission-collections tree;
True if there is at least one submission authorized for the user;
True if there is at least one submission.
"""
def is_at_least_one_submission_authorized(cats):
for cat in cats:
if cat['docs']:
return True
if is_at_least_one_submission_authorized(cat['sons']):
return True
return False
text = ""
catalogues = []
## Get the submission-collections attached at the top level
## of the submission-collection tree:
top_level_collctns = get_collection_children_of_submission_collection(0)
if len(top_level_collctns) != 0:
## There are submission-collections attached to the top level.
## retrieve their details for displaying:
for child_collctn in top_level_collctns:
catalogues.append(getCatalogueBranch(child_collctn[0], 1, req))
text = websubmit_templates.tmpl_submit_home_catalogs(
ln=ln,
catalogs=catalogues)
submissions_exist = True
at_least_one_submission_authorized = is_at_least_one_submission_authorized(catalogues)
else:
text = websubmit_templates.tmpl_submit_home_catalog_no_content(ln=ln)
submissions_exist = False
at_least_one_submission_authorized = False
return text, at_least_one_submission_authorized, submissions_exist
def getCatalogueBranch(id_father, level, req):
"""Build up a given branch of the submission-collection
tree. I.e. given a parent submission-collection ID,
build up the tree below it. This tree will include
doctype-children, as well as other submission-
collections and their children.
Finally, return the branch as a dictionary.
@param id_father: (integer) - the ID of the submission-collection
from which to begin building the branch.
@param level: (integer) - the level of the current submission-
collection branch.
@param req: (apache request object) - the user request object, used to decide
whether to display a submission.
@return: (dictionary) - the branch and its sub-branches.
"""
elem = {} ## The dictionary to contain this branch of the tree.
## First, get the submission-collection-details:
collctn_name = get_submission_collection_name(id_father)
if collctn_name is not None:
## Got the submission-collection's name:
elem['name'] = collctn_name
else:
## The submission-collection is unknown to the DB
## set its name as empty:
elem['name'] = ""
elem['id'] = id_father
elem['level'] = level
## Now get details of the doctype-children of this
## submission-collection:
elem['docs'] = [] ## List to hold the doctype-children
## of the submission-collection
doctype_children = \
get_doctype_children_of_submission_collection(id_father)
user_info = collect_user_info(req)
for child_doctype in doctype_children:
## Access to a submission pipeline for a logged-in user is granted
## if any authorization allows it. If none are defined for the action,
## then a logged-in user will get access.
## If the user is not logged in, a specific rule allowing the action is needed
if acc_authorize_action(req, 'submit', \
authorized_if_no_roles=not isGuestUser(user_info['uid']), \
doctype=child_doctype[0])[0] == 0:
elem['docs'].append(getDoctypeBranch(child_doctype[0]))
## Now, get the collection-children of this submission-collection:
elem['sons'] = []
collctn_children = \
get_collection_children_of_submission_collection(id_father)
for child_collctn in collctn_children:
elem['sons'].append(getCatalogueBranch(child_collctn[0], level + 1, req))
## Now return this branch of the built-up 'collection-tree':
return elem
def getDoctypeBranch(doctype):
"""Create a document-type 'leaf-node' for the submission-collections
tree. Basically, this leaf is a dictionary containing the name
and ID of the document-type submission to which it links.
@param doctype: (string) - the ID of the document type.
@return: (dictionary) - the document-type 'leaf node'. Contains
the following values:
+ id: (string) - the document-type ID.
+ name: (string) - the (long) name of the document-type.
"""
ldocname = get_longname_of_doctype(doctype)
if ldocname is None:
ldocname = "Unknown Document Type"
return { 'id' : doctype, 'name' : ldocname, }
def displayCatalogueBranch(id_father, level, catalogues):
text = ""
collctn_name = get_submission_collection_name(id_father)
if collctn_name is None:
## If this submission-collection wasn't known in the DB,
## give it the name "Unknown Submission-Collection" to
## avoid errors:
collctn_name = "Unknown Submission-Collection"
## Now, create the display for this submission-collection:
if level == 1:
text = "<LI><font size=\"+1\"><strong>%s</strong></font>\n" \
% collctn_name
else:
## TODO: levels 2 and deeper are displayed identically. Why?
text = "<LI>%s\n" % collctn_name
## Now display the children document-types that are attached
## to this submission-collection:
## First, get the children:
doctype_children = get_doctype_children_of_submission_collection(id_father)
collctn_children = get_collection_children_of_submission_collection(id_father)
if len(doctype_children) > 0 or len(collctn_children) > 0:
## There is something to display, so open a list:
text = text + "<UL>\n"
## First, add the doctype leaves of this branch:
for child_doctype in doctype_children:
## Add the doctype 'leaf-node':
text = text + displayDoctypeBranch(child_doctype[0], catalogues)
## Now add the submission-collection sub-branches:
for child_collctn in collctn_children:
catalogues.append(child_collctn[0])
text = text + displayCatalogueBranch(child_collctn[0], level+1, catalogues)
## Finally, close up the list if there were nodes to display
## at this branch:
if len(doctype_children) > 0 or len(collctn_children) > 0:
text = text + "</UL>\n"
return text
def displayDoctypeBranch(doctype, catalogues):
text = ""
ldocname = get_longname_of_doctype(doctype)
if ldocname is None:
ldocname = "Unknown Document Type"
text = "<LI><a href=\"\" onmouseover=\"javascript:" \
"popUpTextWindow('%s',true,event);\" onmouseout" \
"=\"javascript:popUpTextWindow('%s',false,event);\" " \
"onClick=\"document.forms[0].doctype.value='%s';" \
"document.forms[0].submit();return false;\">%s</a>\n" \
% (doctype, doctype, doctype, ldocname)
return text
def action(req, c=CFG_SITE_NAME, ln=CFG_SITE_LANG, doctype=""):
# load the right message language
_ = gettext_set_language(ln)
nbCateg = 0
snameCateg = []
lnameCateg = []
actionShortDesc = []
indir = []
actionbutton = []
statustext = []
t = ""
ln = wash_language(ln)
# get user ID:
try:
uid = getUid(req)
except Error, e:
return error_page(e, req, ln)
#parses database to get all data
## first, get the list of categories
doctype_categs = get_categories_of_doctype(doctype)
for doctype_categ in doctype_categs:
if not acc_authorize_action(req, 'submit', \
authorized_if_no_roles=not isGuestUser(uid), \
verbose=0, \
doctype=doctype, \
categ=doctype_categ[0])[0] == 0:
# This category is restricted for this user, move on to the next categories.
continue
nbCateg = nbCateg+1
snameCateg.append(doctype_categ[0])
lnameCateg.append(doctype_categ[1])
## Now get the details of the document type:
doctype_details = get_doctype_details(doctype)
if doctype_details is None:
## Doctype doesn't exist - raise error:
return warning_page(_("Unable to find document type: %s") % escape(str(doctype)), req, ln)
else:
docFullDesc = doctype_details[0]
# Also update the doctype as returned by the database, since
# it might have a different case (e.g. DemOJrN->demoJRN)
doctype = docShortDesc = doctype_details[1]
description = doctype_details[4]
## Get the details of the actions supported by this document-type:
doctype_actions = get_actions_on_submission_page_for_doctype(doctype)
for doctype_action in doctype_actions:
if not acc_authorize_action(req, 'submit', \
authorized_if_no_roles=not isGuestUser(uid), \
doctype=doctype, \
act=doctype_action[0])[0] == 0:
# This action is not authorized for this user, move on to the next actions.
continue
## Get the details of this action:
action_details = get_action_details(doctype_action[0])
if action_details is not None:
actionShortDesc.append(doctype_action[0])
indir.append(action_details[1])
actionbutton.append(action_details[4])
statustext.append(action_details[5])
if not snameCateg and not actionShortDesc:
if isGuestUser(uid):
# If user is guest and does not have access to any of the
# categories, offer to login.
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri, 'ln' : ln}, {})),
norobot=True)
else:
return page_not_authorized(req, "../submit",
uid=uid,
text=_("You are not authorized to access this submission interface."),
navmenuid='submit')
## Send the gathered information to the template so that the doctype's
## home-page can be displayed:
t = websubmit_templates.tmpl_action_page(
ln=ln,
uid=uid,
pid = os.getpid(),
now = time.time(),
doctype = doctype,
description = description,
docfulldesc = docFullDesc,
snameCateg = snameCateg,
lnameCateg = lnameCateg,
actionShortDesc = actionShortDesc,
indir = indir,
# actionbutton = actionbutton,
statustext = statustext,
)
p_navtrail = """<a href="/submit?ln=%(ln)s" class="navtrail">%(submit)s</a>""" % {'submit' : _("Submit"),
'ln' : ln}
return page(title = docFullDesc,
body=t,
navtrail=p_navtrail,
description="submit documents",
keywords="submit",
uid=uid,
language=ln,
req=req,
navmenuid='submit'
)
def Request_Print(m, txt):
"""The argumemts to this function are the display mode (m) and the text
to be displayed (txt).
"""
return txt
def Evaluate_Parameter (field, doctype):
# Returns the literal value of the parameter. Assumes that the value is
# uniquely determined by the doctype, i.e. doctype is the primary key in
# the table
# If the table name is not null, evaluate the parameter
## TODO: The above comment looks like nonsense? This
## function only seems to get the values of parameters
## from the db...
## Get the value for the parameter:
param_val = get_parameter_value_for_doctype(doctype, field)
if param_val is None:
## Couldn't find a value for this parameter for this doctype.
## Instead, try with the default doctype (DEF):
param_val = get_parameter_value_for_doctype("DEF", field)
if param_val is None:
## There was no value for the parameter for the default doctype.
## Nothing can be done about it - return an empty string:
return ""
else:
## There was some kind of value for the parameter; return it:
return param_val
def Get_Parameters (function, doctype):
"""For a given function of a given document type, a dictionary
of the parameter names and values are returned.
@param function: (string) - the name of the function for which the
parameters are to be retrieved.
@param doctype: (string) - the ID of the document type.
@return: (dictionary) - of the parameters of the function.
Keyed by the parameter name, values are of course the parameter
values.
"""
parray = {}
## Get the names of the parameters expected by this function:
func_params = get_parameters_of_function(function)
for func_param in func_params:
## For each of the parameters, get its value for this document-
## type and add it into the dictionary of parameters:
parameter = func_param[0]
parray[parameter] = Evaluate_Parameter (parameter, doctype)
return parray
def get_level(doctype, action):
"""Get the level of a given submission. If unknown, return 0
as the level.
@param doctype: (string) - the ID of the document type.
@param action: (string) - the ID of the action.
@return: (integer) - the level of the submission; 0 otherwise.
"""
subm_details = get_details_of_submission(doctype, action)
if subm_details is not None:
## Return the level of this action
subm_level = subm_details[9]
try:
int(subm_level)
except ValueError:
return 0
else:
return subm_level
else:
return 0
def action_details (doctype, action):
# Prints whether the action is mandatory or optional. The score of the
# action is returned (-1 if the action was optional)
subm_details = get_details_of_submission(doctype, action)
if subm_details is not None:
if subm_details[9] != "0":
## This action is mandatory; return the score:
return subm_details[10]
else:
return -1
else:
return -1
def print_function_calls(req, doctype, action, step, form, start_time,
access, curdir, dismode, rn, last_step, action_score,
ln=CFG_SITE_LANG):
""" Calls the functions required by an 'action'
action on a 'doctype' document In supervisor mode, a table of the
function calls is produced
@return: (function_output_string, last_step, action_score, rn)
"""
user_info = collect_user_info(req)
# load the right message language
_ = gettext_set_language(ln)
t = ""
## Here follows the global protect environment.
the_globals = {
'doctype' : doctype,
'action' : action,
'act' : action, ## for backward compatibility
'step' : step,
'access' : access,
'ln' : ln,
'curdir' : curdir,
'uid' : user_info['uid'],
'uid_email' : user_info['email'],
'rn' : rn,
'last_step' : last_step,
'action_score' : action_score,
'__websubmit_in_jail__' : True,
'form' : form,
'user_info' : user_info,
'__builtins__' : globals()['__builtins__'],
'Request_Print': Request_Print
}
## Get the list of functions to be called
funcs_to_call = get_functions_for_submission_step(doctype, action, step)
## If no functions are found at this step for this doctype,
## get the functions for the DEF(ault) doctype:
if len(funcs_to_call) == 0:
funcs_to_call = get_functions_for_submission_step("DEF", action, step)
if len(funcs_to_call) > 0:
# while there are functions left...
functions = []
for function in funcs_to_call:
try:
function_name = function[0]
function_score = function[1]
currfunction = {
'name' : function_name,
'score' : function_score,
'error' : 0,
'text' : '',
}
from invenio.legacy.websubmit import functions
function_path = os.path.join(functions.__path__[0],
function_name + '.py')
if os.path.exists(function_path):
# import the function itself
#function = getattr(invenio.legacy.websubmit.functions, function_name)
execfile(function_path, the_globals)
if function_name not in the_globals:
currfunction['error'] = 1
else:
the_globals['function'] = the_globals[function_name]
# Evaluate the parameters, and place them in an array
the_globals['parameters'] = Get_Parameters(function_name, doctype)
# Call function:
log_function(curdir, "Start %s" % function_name, start_time)
try:
try:
## Attempt to call the function with 4 arguments:
## ("parameters", "curdir" and "form" as usual),
## and "user_info" - the dictionary of user
## information:
##
## Note: The function should always be called with
## these keyword arguments because the "TypeError"
## except clause checks for a specific mention of
## the 'user_info' keyword argument when a legacy
## function (one that accepts only 'parameters',
## 'curdir' and 'form') has been called and if
## the error string doesn't contain this,
## the TypeError will be considered as a something
## that was incorrectly handled in the function and
## will be propagated as an
## InvenioWebSubmitFunctionError instead of the
## function being called again with the legacy 3
## arguments.
func_returnval = eval("function(parameters=parameters, curdir=curdir, form=form, user_info=user_info)", the_globals)
except TypeError, err:
## If the error contains the string "got an
## unexpected keyword argument", it means that the
## function doesn't accept the "user_info"
## argument. Test for this:
if "got an unexpected keyword argument 'user_info'" in \
str(err).lower():
## As expected, the function doesn't accept
## the user_info keyword argument. Call it
## again with the legacy 3 arguments
## (parameters, curdir, form):
func_returnval = eval("function(parameters=parameters, curdir=curdir, form=form)", the_globals)
else:
## An unexpected "TypeError" was caught.
## It looks as though the function itself didn't
## handle something correctly.
## Convert this error into an
## InvenioWebSubmitFunctionError and raise it:
msg = "Unhandled TypeError caught when " \
"calling [%s] WebSubmit function: " \
"[%s]" % (function_name, str(err))
raise InvenioWebSubmitFunctionError(msg)
except InvenioWebSubmitFunctionWarning, err:
## There was an unexpected behaviour during the
## execution. Log the message into function's log
## and go to next function
log_function(curdir, "***Warning*** from %s: %s" \
% (function_name, str(err)), start_time)
## Reset "func_returnval" to None:
func_returnval = None
register_exception(req=req, alert_admin=True, prefix="Warning in executing function %s with globals %s" % (pprint.pformat(currfunction), pprint.pformat(the_globals)))
log_function(curdir, "End %s" % function_name, start_time)
if func_returnval is not None:
## Append the returned value as a string:
currfunction['text'] = str(func_returnval)
else:
## The function returned None. Don't keep that value as
## the currfunction->text. Replace it with the empty
## string.
currfunction['text'] = ""
else:
currfunction['error'] = 1
functions.append(currfunction)
except InvenioWebSubmitFunctionStop, err:
## The submission asked to stop execution. This is
## ok. Do not alert admin, and raise exception further
log_function(curdir, "***Stop*** from %s: %s" \
% (function_name, str(err)), start_time)
raise
except:
register_exception(req=req, alert_admin=True, prefix="Error in executing function %s with globals %s" % (pprint.pformat(currfunction), pprint.pformat(the_globals)))
raise
t = websubmit_templates.tmpl_function_output(
ln = ln,
display_on = (dismode == 'S'),
action = action,
doctype = doctype,
step = step,
functions = functions,
)
else :
if dismode == 'S':
t = "<br /><br /><b>" + _("The chosen action is not supported by the document type.") + "</b>"
return (t, the_globals['last_step'], the_globals['action_score'], the_globals['rn'])
def Propose_Next_Action (doctype, action_score, access, currentlevel, indir, ln=CFG_SITE_LANG):
t = ""
next_submissions = \
get_submissions_at_level_X_with_score_above_N(doctype, currentlevel, action_score)
if len(next_submissions) > 0:
actions = []
first_score = next_submissions[0][10]
for action in next_submissions:
if action[10] == first_score:
## Get the submission directory of this action:
nextdir = get_storage_directory_of_action(action[1])
if nextdir is None:
nextdir = ""
curraction = {
'page' : action[11],
'action' : action[1],
'doctype' : doctype,
'nextdir' : nextdir,
'access' : access,
'indir' : indir,
'name' : action[12],
}
actions.append(curraction)
t = websubmit_templates.tmpl_next_action(
ln = ln,
actions = actions,
)
return t
def specialchars(text):
text = string.replace(text, "&#147;", "\042");
text = string.replace(text, "&#148;", "\042");
text = string.replace(text, "&#146;", "\047");
text = string.replace(text, "&#151;", "\055");
text = string.replace(text, "&#133;", "\056\056\056");
return text
def log_function(curdir, message, start_time, filename="function_log"):
"""Write into file the message and the difference of time
between starttime and current time
@param curdir:(string) path to the destination dir
@param message: (string) message to write into the file
@param starttime: (float) time to compute from
@param filname: (string) name of log file
"""
time_lap = "%.3f" % (time.time() - start_time)
if os.access(curdir, os.F_OK|os.W_OK):
fd = open("%s/%s" % (curdir, filename), "a+")
fd.write("""%s --- %s\n""" % (message, time_lap))
fd.close()
diff --git a/invenio/legacy/websubmit/file_converter.py b/invenio/legacy/websubmit/file_converter.py
index 19b281df4..8f209a9c5 100644
--- a/invenio/legacy/websubmit/file_converter.py
+++ b/invenio/legacy/websubmit/file_converter.py
@@ -1,1465 +1,1465 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This module implements fulltext conversion between many different file formats.
"""
import os
import stat
import re
import sys
import shutil
import tempfile
import HTMLParser
import time
import subprocess
import atexit
import signal
import threading
from logging import DEBUG, getLogger
from htmlentitydefs import entitydefs
from optparse import OptionParser
try:
from invenio.hocrlib import create_pdf, extract_hocr, CFG_PPM_RESOLUTION
try:
from PyPDF2 import PdfFileReader, PdfFileWriter
except ImportError:
from pyPdf import PdfFileReader, PdfFileWriter
CFG_CAN_DO_OCR = True
except ImportError:
CFG_CAN_DO_OCR = False
from invenio.utils.text import wrap_text_in_a_box
from invenio.utils.shell import run_process_with_timeout, run_shell_command
from invenio.config import CFG_TMPDIR, CFG_ETCDIR, CFG_PYLIBDIR, \
CFG_PATH_ANY2DJVU, \
CFG_PATH_PDFINFO, \
CFG_PATH_GS, \
CFG_PATH_PDFOPT, \
CFG_PATH_PDFTOPS, \
CFG_PATH_GZIP, \
CFG_PATH_GUNZIP, \
CFG_PATH_PDFTOTEXT, \
CFG_PATH_PDFTOPPM, \
CFG_PATH_OCROSCRIPT, \
CFG_PATH_DJVUPS, \
CFG_PATH_DJVUTXT, \
CFG_PATH_OPENOFFICE_PYTHON, \
CFG_PATH_PSTOTEXT, \
CFG_PATH_TIFF2PDF, \
CFG_PATH_PS2PDF, \
CFG_OPENOFFICE_SERVER_HOST, \
CFG_OPENOFFICE_SERVER_PORT, \
CFG_OPENOFFICE_USER, \
CFG_PATH_CONVERT, \
CFG_PATH_PAMFILE, \
CFG_BINDIR, \
CFG_LOGDIR, \
CFG_BIBSCHED_PROCESS_USER, \
CFG_BIBDOCFILE_BEST_FORMATS_TO_EXTRACT_TEXT_FROM, \
CFG_BIBDOCFILE_DESIRED_CONVERSIONS
from invenio.ext.logging import register_exception
def get_file_converter_logger():
return getLogger("InvenioWebSubmitFileConverterLogger")
CFG_TWO2THREE_LANG_CODES = {
'en': 'eng',
'nl': 'nld',
'es': 'spa',
'de': 'deu',
'it': 'ita',
'fr': 'fra',
}
CFG_OPENOFFICE_TMPDIR = os.path.join(CFG_TMPDIR, 'ooffice-tmp-files')
CFG_GS_MINIMAL_VERSION_FOR_PDFA = "8.65"
CFG_GS_MINIMAL_VERSION_FOR_PDFX = "8.52"
CFG_ICC_PATH = os.path.join(CFG_ETCDIR, 'websubmit', 'file_converter_templates', 'ISOCoatedsb.icc')
CFG_PDFA_DEF_PATH = os.path.join(CFG_ETCDIR, 'websubmit', 'file_converter_templates', 'PDFA_def.ps')
CFG_PDFX_DEF_PATH = os.path.join(CFG_ETCDIR, 'websubmit', 'file_converter_templates', 'PDFX_def.ps')
CFG_UNOCONV_LOG_PATH = os.path.join(CFG_LOGDIR, 'unoconv.log')
_RE_CLEAN_SPACES = re.compile(r'\s+')
class InvenioWebSubmitFileConverterError(Exception):
pass
def get_conversion_map():
"""Return a dictionary of the form:
'.pdf' : {'.ps.gz' : ('pdf2ps', {param1 : value1...})
"""
ret = {
'.csv': {},
'.djvu': {},
'.doc': {},
'.docx': {},
'.sxw': {},
'.htm': {},
'.html': {},
'.odp': {},
'.ods': {},
'.odt': {},
'.pdf': {},
'.ppt': {},
'.pptx': {},
'.sxi': {},
'.ps': {},
'.ps.gz': {},
'.rtf': {},
'.tif': {},
'.tiff': {},
'.txt': {},
'.xls': {},
'.xlsx': {},
'.sxc': {},
'.xml': {},
'.hocr': {},
'.pdf;pdfa': {},
'.asc': {},
}
if CFG_PATH_GZIP:
ret['.ps']['.ps.gz'] = (gzip, {})
if CFG_PATH_GUNZIP:
ret['.ps.gz']['.ps'] = (gunzip, {})
if CFG_PATH_ANY2DJVU:
ret['.pdf']['.djvu'] = (any2djvu, {})
ret['.ps']['.djvu'] = (any2djvu, {})
if CFG_PATH_DJVUPS:
ret['.djvu']['.ps'] = (djvu2ps, {'compress': False})
if CFG_PATH_GZIP:
ret['.djvu']['.ps.gz'] = (djvu2ps, {'compress': True})
if CFG_PATH_DJVUTXT:
ret['.djvu']['.txt'] = (djvu2text, {})
if CFG_PATH_PSTOTEXT:
ret['.ps']['.txt'] = (pstotext, {})
if CFG_PATH_GUNZIP:
ret['.ps.gz']['.txt'] = (pstotext, {})
if can_pdfa():
ret['.ps']['.pdf;pdfa'] = (ps2pdfa, {})
ret['.pdf']['.pdf;pdfa'] = (pdf2pdfa, {})
if CFG_PATH_GUNZIP:
ret['.ps.gz']['.pdf;pdfa'] = (ps2pdfa, {})
else:
if CFG_PATH_PS2PDF:
ret['.ps']['.pdf;pdfa'] = (ps2pdf, {})
if CFG_PATH_GUNZIP:
ret['.ps.gz']['.pdf'] = (ps2pdf, {})
if can_pdfx():
ret['.ps']['.pdf;pdfx'] = (ps2pdfx, {})
ret['.pdf']['.pdf;pdfx'] = (pdf2pdfx, {})
if CFG_PATH_GUNZIP:
ret['.ps.gz']['.pdf;pdfx'] = (ps2pdfx, {})
if CFG_PATH_PDFTOPS:
ret['.pdf']['.ps'] = (pdf2ps, {'compress': False})
ret['.pdf;pdfa']['.ps'] = (pdf2ps, {'compress': False})
if CFG_PATH_GZIP:
ret['.pdf']['.ps.gz'] = (pdf2ps, {'compress': True})
ret['.pdf;pdfa']['.ps.gz'] = (pdf2ps, {'compress': True})
if CFG_PATH_PDFTOTEXT:
ret['.pdf']['.txt'] = (pdf2text, {})
ret['.pdf;pdfa']['.txt'] = (pdf2text, {})
ret['.asc']['.txt'] = (txt2text, {})
ret['.txt']['.txt'] = (txt2text, {})
ret['.csv']['.txt'] = (txt2text, {})
ret['.html']['.txt'] = (html2text, {})
ret['.htm']['.txt'] = (html2text, {})
ret['.xml']['.txt'] = (html2text, {})
if CFG_PATH_TIFF2PDF:
ret['.tiff']['.pdf'] = (tiff2pdf, {})
ret['.tif']['.pdf'] = (tiff2pdf, {})
if CFG_PATH_OPENOFFICE_PYTHON and CFG_OPENOFFICE_SERVER_HOST:
ret['.rtf']['.odt'] = (unoconv, {'output_format': 'odt'})
ret['.rtf']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.rtf']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.rtf']['.docx'] = (unoconv, {'output_format': 'docx'})
ret['.doc']['.odt'] = (unoconv, {'output_format': 'odt'})
ret['.doc']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.doc']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.doc']['.docx'] = (unoconv, {'output_format': 'docx'})
ret['.docx']['.odt'] = (unoconv, {'output_format': 'odt'})
ret['.docx']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.docx']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.sxw']['.odt'] = (unoconv, {'output_format': 'odt'})
ret['.sxw']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.sxw']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.docx']['.docx'] = (unoconv, {'output_format': 'docx'})
ret['.odt']['.doc'] = (unoconv, {'output_format': 'doc'})
ret['.odt']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.odt']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.odt']['.docx'] = (unoconv, {'output_format': 'docx'})
ret['.ppt']['.odp'] = (unoconv, {'output_format': 'odp'})
ret['.ppt']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.ppt']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.ppt']['.pptx'] = (unoconv, {'output_format': 'pptx'})
ret['.pptx']['.odp'] = (unoconv, {'output_format': 'odp'})
ret['.pptx']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.pptx']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.sxi']['.odp'] = (unoconv, {'output_format': 'odp'})
ret['.sxi']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.sxi']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.sxi']['.pptx'] = (unoconv, {'output_format': 'pptx'})
ret['.odp']['.ppt'] = (unoconv, {'output_format': 'ppt'})
ret['.odp']['.pptx'] = (unoconv, {'output_format': 'pptx'})
ret['.odp']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.odp']['.txt'] = (unoconv, {'output_format': 'txt'})
ret['.xls']['.ods'] = (unoconv, {'output_format': 'ods'})
ret['.xls']['.xlsx'] = (unoconv, {'output_format': 'xlsx'})
ret['.xlsx']['.ods'] = (unoconv, {'output_format': 'ods'})
ret['.sxc']['.ods'] = (unoconv, {'output_format': 'ods'})
ret['.sxc']['.xlsx'] = (unoconv, {'output_format': 'xlsx'})
ret['.ods']['.xls'] = (unoconv, {'output_format': 'xls'})
ret['.ods']['.pdf;pdfa'] = (unoconv, {'output_format': 'pdf'})
ret['.ods']['.csv'] = (unoconv, {'output_format': 'csv'})
ret['.ods']['.xlsx'] = (unoconv, {'output_format': 'xlsx'})
ret['.csv']['.txt'] = (txt2text, {})
## Let's add all the existing output formats as potential input formats.
for value in ret.values():
for key in value.keys():
if key not in ret:
ret[key] = {}
return ret
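## The map returned above is a two-level dictionary: input extension ->
## {output extension -> (converter callable, default params)}. A minimal
## standalone sketch of the same shape (fake_pdf2ps and build_toy_map are
## hypothetical names used only for illustration, not part of this module):

```python
def fake_pdf2ps(input_file, output_file, compress=False):
    # Hypothetical stand-in for a real converter function.
    return output_file

def build_toy_map():
    # input extension -> {output extension -> (converter, default params)}
    ret = {'.pdf': {}, '.ps': {}}
    ret['.pdf']['.ps'] = (fake_pdf2ps, {'compress': False})
    ret['.pdf']['.ps.gz'] = (fake_pdf2ps, {'compress': True})
    # Register every output format as a potential input format too,
    # mirroring the loop at the end of get_conversion_map().
    for value in list(ret.values()):
        for key in value.keys():
            if key not in ret:
                ret[key] = {}
    return ret
```

## The second level carries per-edge default parameters, so one converter
## function (here fake_pdf2ps) can serve several output formats.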
def get_best_format_to_extract_text_from(filelist, best_formats=CFG_BIBDOCFILE_BEST_FORMATS_TO_EXTRACT_TEXT_FROM):
"""
Return, among the files in filelist, the one whose format is best suited
for extracting text.
"""
- from invenio.bibdocfile import decompose_file, normalize_format
+ from invenio.legacy.bibdocfile.api import decompose_file, normalize_format
best_formats = [normalize_format(aformat) for aformat in best_formats if can_convert(aformat, '.txt')]
for aformat in best_formats:
for filename in filelist:
if decompose_file(filename, skip_version=True)[2].endswith(aformat):
return filename
raise InvenioWebSubmitFileConverterError("It's not possible to extract valuable text from any of the proposed files.")
def get_missing_formats(filelist, desired_conversion=None):
"""Given a list of files it will return a dictionary of the form:
file1 : missing formats to generate from it...
"""
- from invenio.bibdocfile import normalize_format, decompose_file
+ from invenio.legacy.bibdocfile.api import normalize_format, decompose_file
def normalize_desired_conversion():
ret = {}
for key, value in desired_conversion.iteritems():
ret[normalize_format(key)] = [normalize_format(aformat) for aformat in value]
return ret
if desired_conversion is None:
desired_conversion = CFG_BIBDOCFILE_DESIRED_CONVERSIONS
available_formats = [decompose_file(filename, skip_version=True)[2] for filename in filelist]
missing_formats = []
desired_conversion = normalize_desired_conversion()
ret = {}
for filename in filelist:
aformat = decompose_file(filename, skip_version=True)[2]
if aformat in desired_conversion:
for desired_format in desired_conversion[aformat]:
if desired_format not in available_formats and desired_format not in missing_formats:
missing_formats.append(desired_format)
if filename not in ret:
ret[filename] = []
ret[filename].append(desired_format)
return ret
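## The logic above boils down to: for each available extension, collect the
## desired target formats not yet present in the file set. A standalone
## sketch over bare extensions (missing_formats is a hypothetical name,
## not part of this module):

```python
def missing_formats(available_exts, desired_map):
    # available_exts: extensions already present among the files.
    # desired_map: extension -> list of target extensions wanted from it.
    # Returns extension -> targets still missing, skipping targets that
    # exist already or were claimed by an earlier file (as above).
    missing, ret = [], {}
    for ext in available_exts:
        for target in desired_map.get(ext, []):
            if target not in available_exts and target not in missing:
                missing.append(target)
                ret.setdefault(ext, []).append(target)
    return ret
```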
def can_convert(input_format, output_format, max_intermediate_conversions=4):
"""Return the chain of conversion to transform input_format into output_format, if any."""
- from invenio.bibdocfile import normalize_format
+ from invenio.legacy.bibdocfile.api import normalize_format
if max_intermediate_conversions <= 0:
return []
input_format = normalize_format(input_format)
output_format = normalize_format(output_format)
if input_format in __CONVERSION_MAP:
if output_format in __CONVERSION_MAP[input_format]:
return [__CONVERSION_MAP[input_format][output_format]]
best_res = []
best_intermediate = ''
for intermediate_format in __CONVERSION_MAP[input_format]:
res = can_convert(intermediate_format, output_format, max_intermediate_conversions-1)
if res and (not best_res or len(res) < len(best_res)):
best_res = res
best_intermediate = intermediate_format
if best_res:
return [__CONVERSION_MAP[input_format][best_intermediate]] + best_res
return []
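## The recursion above can be sketched standalone over a toy map with the
## converter entries reduced to plain strings (find_chain is a hypothetical
## name used only for this sketch, not part of this module):

```python
def find_chain(conv_map, src, dst, max_steps=4):
    # Depth-limited search mirroring can_convert(): try a direct edge
    # first, otherwise pick the shortest chain through any intermediate
    # format, bounded by max_steps to avoid runaway recursion.
    if max_steps <= 0 or src not in conv_map:
        return []
    if dst in conv_map[src]:
        return [conv_map[src][dst]]
    best, best_mid = [], None
    for mid in conv_map[src]:
        res = find_chain(conv_map, mid, dst, max_steps - 1)
        if res and (not best or len(res) < len(best)):
            best, best_mid = res, mid
    if best:
        return [conv_map[src][best_mid]] + best
    return []
```

## With edges .doc -> .odt -> .pdf, the chain for .doc to .pdf has two
## steps, while the reverse direction has none.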
def can_pdfopt(verbose=False):
"""Return True if it's possible to optimize PDFs."""
if CFG_PATH_PDFOPT:
return True
elif verbose:
print >> sys.stderr, "PDF linearization is not supported because the pdfopt executable is not available"
return False
def can_pdfx(verbose=False):
"""Return True if it's possible to generate PDF/Xs."""
if not CFG_PATH_PDFTOPS:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because the pdftops executable is not available"
return False
if not CFG_PATH_GS:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because the gs executable is not available"
return False
else:
try:
output = run_shell_command("%s --version" % CFG_PATH_GS)[1].strip()
if not output:
raise ValueError("No version information returned")
if [int(number) for number in output.split('.')] < [int(number) for number in CFG_GS_MINIMAL_VERSION_FOR_PDFX.split('.')]:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because the minimal gs version for the executable %s is not met: it should be %s but %s has been found" % (CFG_PATH_GS, CFG_GS_MINIMAL_VERSION_FOR_PDFX, output)
return False
except Exception, err:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because it's not possible to retrieve the gs version using the executable %s: %s" % (CFG_PATH_GS, err)
return False
if not CFG_PATH_PDFINFO:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because the pdfinfo executable is not available"
return False
if not os.path.exists(CFG_ICC_PATH):
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/X is not possible because %s does not exist. Have you run make install-pdfa-helper-files?" % CFG_ICC_PATH
return False
return True
def can_pdfa(verbose=False):
"""Return True if it's possible to generate PDF/As."""
if not CFG_PATH_PDFTOPS:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because the pdftops executable is not available"
return False
if not CFG_PATH_GS:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because the gs executable is not available"
return False
else:
try:
output = run_shell_command("%s --version" % CFG_PATH_GS)[1].strip()
if not output:
raise ValueError("No version information returned")
if [int(number) for number in output.split('.')] < [int(number) for number in CFG_GS_MINIMAL_VERSION_FOR_PDFA.split('.')]:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because the minimal gs version for the executable %s is not met: it should be %s but %s has been found" % (CFG_PATH_GS, CFG_GS_MINIMAL_VERSION_FOR_PDFA, output)
return False
except Exception, err:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because it's not possible to retrieve the gs version using the executable %s: %s" % (CFG_PATH_GS, err)
return False
if not CFG_PATH_PDFINFO:
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because the pdfinfo executable is not available"
return False
if not os.path.exists(CFG_ICC_PATH):
if verbose:
print >> sys.stderr, "Conversion of PS or PDF to PDF/A is not possible because %s does not exist. Have you run make install-pdfa-helper-files?" % CFG_ICC_PATH
return False
return True
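## Both can_pdfx() and can_pdfa() compare Ghostscript versions by splitting
## the dotted string into lists of ints, so comparison is numeric per
## component rather than lexicographic. A standalone sketch of that check
## (version_tuple and meets_minimum are hypothetical names, not part of
## this module):

```python
def version_tuple(version):
    # Split a dotted version string into a list of ints so that each
    # component compares numerically, not as text.
    return [int(part) for part in version.split('.')]

def meets_minimum(found, required):
    # List comparison is elementwise, so [9, 5] >= [8, 65] holds even
    # though the string '9.05' < '8.65' would fail lexicographically.
    return version_tuple(found) >= version_tuple(required)
```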
def can_perform_ocr(verbose=False):
"""Return True if it's possible to perform OCR."""
if not CFG_CAN_DO_OCR:
if verbose:
print >> sys.stderr, "OCR is not supported because either the pyPdf or the ReportLab Python library is missing"
return False
if not CFG_PATH_OCROSCRIPT:
if verbose:
print >> sys.stderr, "OCR is not supported because the ocroscript executable is not available"
return False
if not CFG_PATH_PDFTOPPM:
if verbose:
print >> sys.stderr, "OCR is not supported because the pdftoppm executable is not available"
return False
return True
def guess_ocropus_produced_garbage(input_file, hocr_p):
"""Return True if the output produced by OCROpus in hocr format contains
only garbage instead of text. This is implemented via an heuristic:
if the most common length for sentences encoded in UTF-8 is 1 then
this is Garbage (tm).
"""
def _get_words_from_text():
ret = []
for row in open(input_file):
for word in row.strip().split(' '):
ret.append(word.strip())
return ret
def _get_words_from_hocr():
ret = []
hocr = extract_hocr(open(input_file).read())
for dummy, dummy, lines in hocr:
for dummy, line in lines:
for word in line.split():
ret.append(word.strip())
return ret
if hocr_p:
words = _get_words_from_hocr()
else:
words = _get_words_from_text()
goods = 0
bads = 0
for word in words:
for char in word.decode('utf-8'):
if (u'a' <= char <= u'z') or (u'A' <= char <= u'Z'):
goods += 1
else:
bads += 1
if bads > goods:
get_file_converter_logger().debug('OCROpus produced garbage')
return True
else:
return False
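## The heuristic above just counts ASCII letters against everything else.
## A standalone sketch (looks_like_garbage is a hypothetical name, not
## part of this module):

```python
def looks_like_garbage(words):
    # Mirrors guess_ocropus_produced_garbage(): if the non-alphabetic
    # characters outnumber the a-z/A-Z characters across all words, the
    # OCR output is presumed to be garbage.
    goods = bads = 0
    for word in words:
        for char in word:
            if ('a' <= char <= 'z') or ('A' <= char <= 'Z'):
                goods += 1
            else:
                bads += 1
    return bads > goods
```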
def guess_is_OCR_needed(input_file, ln='en'):
"""
Tries to see if enough text is retrievable from input_file.
Return True if OCR is needed, False if it's already
possible to retrieve information from the document.
"""
## FIXME: a way to understand if pdftotext has returned garbage
## should be found. E.g. 1.0*len(text)/len(zlib.compress(text)) < 2.1
## could be a good hint for garbage being found.
return True
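## The FIXME above suggests a compression-ratio test: natural-language text
## is redundant and compresses well, while OCR garbage compresses poorly.
## A standalone sketch of that hint (compression_ratio is a hypothetical
## name, not part of this module):

```python
import zlib

def compression_ratio(text):
    # Ratio of raw size to deflate-compressed size; per the FIXME, a
    # value below ~2.1 could hint that pdftotext returned garbage.
    data = text.encode('utf-8')
    return 1.0 * len(data) / len(zlib.compress(data))
```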
def convert_file(input_file, output_file=None, output_format=None, **params):
"""
Convert files from one format to another.
@param input_file [string] the path to an existing file
@param output_file [string] the path to the desired output. (if None a
temporary file is generated)
@param output_format [string] the desired format (if None it is taken from
output_file)
@param params other parameters to pass to the particular converter
@return [string] the final output_file
"""
- from invenio.bibdocfile import decompose_file, normalize_format
+ from invenio.legacy.bibdocfile.api import decompose_file, normalize_format
if output_format is None:
if output_file is None:
raise ValueError("At least output_file or format should be specified.")
else:
output_ext = decompose_file(output_file, skip_version=True)[2]
else:
output_ext = normalize_format(output_format)
input_ext = decompose_file(input_file, skip_version=True)[2]
conversion_chain = can_convert(input_ext, output_ext)
if conversion_chain:
get_file_converter_logger().debug("Conversion chain from %s to %s: %s" % (input_ext, output_ext, conversion_chain))
current_input = input_file
for i, (converter, final_params) in enumerate(conversion_chain):
current_output = None
if i == (len(conversion_chain) - 1):
current_output = output_file
final_params = dict(final_params)
final_params.update(params)
try:
get_file_converter_logger().debug("Converting from %s to %s using %s with params %s" % (current_input, current_output, converter, final_params))
current_output = converter(current_input, current_output, **final_params)
get_file_converter_logger().debug("... current_output %s" % (current_output, ))
except InvenioWebSubmitFileConverterError, err:
raise InvenioWebSubmitFileConverterError("Error when converting from %s to %s: %s" % (input_file, output_ext, err))
except Exception, err:
register_exception(alert_admin=True)
raise InvenioWebSubmitFileConverterError("Unexpected error when converting from %s to %s (%s): %s" % (input_file, output_ext, type(err), err))
if current_input != input_file:
os.remove(current_input)
current_input = current_output
return current_output
else:
raise InvenioWebSubmitFileConverterError("It's impossible to convert from %s to %s" % (input_ext, output_ext))
try:
_UNOCONV_DAEMON
except NameError:
_UNOCONV_DAEMON = None
_UNOCONV_DAEMON_LOCK = threading.Lock()
def _register_unoconv():
global _UNOCONV_DAEMON
if CFG_OPENOFFICE_SERVER_HOST != 'localhost':
return
_UNOCONV_DAEMON_LOCK.acquire()
try:
if not _UNOCONV_DAEMON:
output_log = open(CFG_UNOCONV_LOG_PATH, 'a')
_UNOCONV_DAEMON = subprocess.Popen(['sudo', '-S', '-u', CFG_OPENOFFICE_USER, os.path.join(CFG_BINDIR, 'inveniounoconv'), '-vvv', '-s', CFG_OPENOFFICE_SERVER_HOST, '-p', str(CFG_OPENOFFICE_SERVER_PORT), '-l'], stdin=open('/dev/null', 'r'), stdout=output_log, stderr=output_log)
time.sleep(3)
finally:
_UNOCONV_DAEMON_LOCK.release()
def _unregister_unoconv():
global _UNOCONV_DAEMON
if CFG_OPENOFFICE_SERVER_HOST != 'localhost':
return
_UNOCONV_DAEMON_LOCK.acquire()
try:
if _UNOCONV_DAEMON:
output_log = open(CFG_UNOCONV_LOG_PATH, 'a')
subprocess.call(['sudo', '-S', '-u', CFG_OPENOFFICE_USER, os.path.join(CFG_BINDIR, 'inveniounoconv'), '-k', '-vvv'], stdin=open('/dev/null', 'r'), stdout=output_log, stderr=output_log)
time.sleep(1)
if _UNOCONV_DAEMON.poll():
try:
os.kill(_UNOCONV_DAEMON.pid, signal.SIGTERM)
except OSError:
pass
if _UNOCONV_DAEMON.poll():
try:
os.kill(_UNOCONV_DAEMON.pid, signal.SIGKILL)
except OSError:
pass
finally:
_UNOCONV_DAEMON_LOCK.release()
## NOTE: in case we switch back keeping LibreOffice running, uncomment
## the following line.
#atexit.register(_unregister_unoconv)
def unoconv(input_file, output_file=None, output_format='txt', pdfopt=True, **dummy):
"""Use unconv to convert among OpenOffice understood documents."""
- from invenio.bibdocfile import normalize_format
+ from invenio.legacy.bibdocfile.api import normalize_format
## NOTE: in case we switch back keeping LibreOffice running, uncomment
## the following line.
#_register_unoconv()
input_file, output_file, dummy = prepare_io(input_file, output_file, output_format, need_working_dir=False)
if output_format == 'txt':
unoconv_format = 'text'
else:
unoconv_format = output_format
try:
try:
## We copy the input file and we make it available to OpenOffice
## with the user nobody
- from invenio.bibdocfile import decompose_file
+ from invenio.legacy.bibdocfile.api import decompose_file
input_format = decompose_file(input_file, skip_version=True)[2]
fd, tmpinputfile = tempfile.mkstemp(dir=CFG_TMPDIR, suffix=normalize_format(input_format))
os.close(fd)
shutil.copy(input_file, tmpinputfile)
get_file_converter_logger().debug("Prepared input file %s" % tmpinputfile)
os.chmod(tmpinputfile, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
tmpoutputfile = tempfile.mktemp(dir=CFG_OPENOFFICE_TMPDIR, suffix=normalize_format(output_format))
get_file_converter_logger().debug("Prepared output file %s" % tmpoutputfile)
try:
execute_command(os.path.join(CFG_BINDIR, 'inveniounoconv'), '-vvv', '-s', CFG_OPENOFFICE_SERVER_HOST, '-p', str(CFG_OPENOFFICE_SERVER_PORT), '--output', tmpoutputfile, '-f', unoconv_format, tmpinputfile, sudo=CFG_OPENOFFICE_USER)
except:
register_exception(alert_admin=True)
raise
except InvenioWebSubmitFileConverterError:
## OK, maybe OpenOffice hung. Better kill it and restart it!
if CFG_OPENOFFICE_SERVER_HOST != 'localhost':
## There's not that much that we can do. Let's bail out
if not os.path.exists(tmpoutputfile) or not os.path.getsize(tmpoutputfile):
raise
else:
## Sometimes OpenOffice crashes, but we don't care :-)
## it may still have created a good file.
pass
else:
execute_command(os.path.join(CFG_BINDIR, 'inveniounoconv'), '-vvv', '-k', sudo=CFG_OPENOFFICE_USER)
## NOTE: in case we switch back keeping LibreOffice running, uncomment
## the following lines.
#_unregister_unoconv()
#_register_unoconv()
time.sleep(5)
try:
execute_command(os.path.join(CFG_BINDIR, 'inveniounoconv'), '-vvv', '-s', CFG_OPENOFFICE_SERVER_HOST, '-p', str(CFG_OPENOFFICE_SERVER_PORT), '--output', tmpoutputfile, '-f', unoconv_format, tmpinputfile, sudo=CFG_OPENOFFICE_USER)
except InvenioWebSubmitFileConverterError:
execute_command(os.path.join(CFG_BINDIR, 'inveniounoconv'), '-vvv', '-k', sudo=CFG_OPENOFFICE_USER)
if not os.path.exists(tmpoutputfile) or not os.path.getsize(tmpoutputfile):
raise InvenioWebSubmitFileConverterError('No output was generated by OpenOffice')
else:
## Sometimes OpenOffice crashes, but we don't care :-)
## it may still have created a good file.
pass
except Exception, err:
raise InvenioWebSubmitFileConverterError(get_unoconv_installation_guideline(err))
output_format = normalize_format(output_format)
if output_format == '.pdf' and pdfopt:
pdf2pdfopt(tmpoutputfile, output_file)
else:
shutil.copy(tmpoutputfile, output_file)
execute_command(os.path.join(CFG_BINDIR, 'inveniounoconv'), '-r', tmpoutputfile, sudo=CFG_OPENOFFICE_USER)
os.remove(tmpinputfile)
return output_file
def get_unoconv_installation_guideline(err):
"""Return the Libre/OpenOffice installation guideline (embedding the
current error message).
"""
- from invenio.bibtask import guess_apache_process_user
+ from invenio.legacy.bibsched.bibtask import guess_apache_process_user
return wrap_text_in_a_box("""\
OpenOffice.org can't properly create files in the OpenOffice.org temporary
directory %(tmpdir)s, as the user %(nobody)s (as configured in the
CFG_OPENOFFICE_USER invenio(-local).conf variable): %(err)s.
In your /etc/sudoers file, you should authorize the %(apache)s user to run
%(unoconv)s as %(nobody)s user as in:
%(apache)s ALL=(%(nobody)s) NOPASSWD: %(unoconv)s
You should then run the following commands:
$ sudo mkdir -p %(tmpdir)s
$ sudo chown -R %(nobody)s %(tmpdir)s
$ sudo chmod -R 755 %(tmpdir)s""" % {
'tmpdir' : CFG_OPENOFFICE_TMPDIR,
'nobody' : CFG_OPENOFFICE_USER,
'err' : err,
'apache' : CFG_BIBSCHED_PROCESS_USER or guess_apache_process_user(),
'python' : CFG_PATH_OPENOFFICE_PYTHON,
'unoconv' : os.path.join(CFG_BINDIR, 'inveniounoconv')
})
def can_unoconv(verbose=False):
"""
If OpenOffice.org integration is enabled, checks whether the system is
properly configured.
"""
if CFG_PATH_OPENOFFICE_PYTHON and CFG_OPENOFFICE_SERVER_HOST:
try:
test = os.path.join(CFG_TMPDIR, 'test.txt')
open(test, 'w').write('test')
output = unoconv(test, output_format='pdf')
output2 = convert_file(output, output_format='.txt')
if 'test' not in open(output2).read():
raise Exception("Coulnd't produce a valid PDF with Libre/OpenOffice.org")
os.remove(output2)
os.remove(output)
os.remove(test)
return True
except Exception, err:
if verbose:
print >> sys.stderr, get_unoconv_installation_guideline(err)
return False
else:
if verbose:
print >> sys.stderr, "Libre/OpenOffice.org integration not enabled"
return False
def any2djvu(input_file, output_file=None, resolution=400, ocr=True, input_format=5, **dummy):
"""
Transform input_file into a .djvu file.
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param resolution [int] the resolution of the output_file
@param input_format [int] [1-9]:
1 - DjVu Document (for verification or OCR)
2 - PS/PS.GZ/PDF Document (default)
3 - Photo/Picture/Icon
4 - Scanned Document - B&W - <200 dpi
5 - Scanned Document - B&W - 200-400 dpi
6 - Scanned Document - B&W - >400 dpi
7 - Scanned Document - Color/Mixed - <200 dpi
8 - Scanned Document - Color/Mixed - 200-400 dpi
9 - Scanned Document - Color/Mixed - >400 dpi
@return [string] the output file name.
raise InvenioWebSubmitFileConverterError in case of errors.
Note: due to the bottleneck of using a centralized server, it is very
slow and is not suitable for interactive usage (e.g. WebSubmit functions)
"""
- from invenio.bibdocfile import decompose_file
+ from invenio.legacy.bibdocfile.api import decompose_file
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.djvu')
ocr = ocr and "1" or "0"
## Any2djvu expects to find the file in the current directory.
execute_command(CFG_PATH_ANY2DJVU, '-a', '-c', '-r', resolution, '-o', ocr, '-f', input_format, os.path.basename(input_file), cwd=working_dir)
## Any2djvu doesn't let you choose the output file name.
djvu_output = os.path.join(working_dir, decompose_file(input_file)[1] + '.djvu')
shutil.move(djvu_output, output_file)
clean_working_dir(working_dir)
return output_file
_RE_FIND_TITLE = re.compile(r'^Title:\s*(.*?)\s*$')
def pdf2pdfx(input_file, output_file=None, title=None, pdfopt=False, profile="pdf/x-3:2002", **dummy):
"""
Transform any PDF into a PDF/X (see: <http://en.wikipedia.org/wiki/PDF/X>)
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param title [string] the title of the document. None for autodiscovery.
@param pdfopt [bool] whether to linearize the pdf, too.
@param profile: [string] the PDFX profile to use. Supports: 'pdf/x-1a:2001', 'pdf/x-1a:2003', 'pdf/x-3:2002'
@return [string] the output file name
raise InvenioWebSubmitFileConverterError in case of errors.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
if title is None:
stdout = execute_command(CFG_PATH_PDFINFO, input_file)
for line in stdout.split('\n'):
g = _RE_FIND_TITLE.match(line)
if g:
title = g.group(1)
break
if not title:
title = 'No title'
get_file_converter_logger().debug("Extracted title is %s" % title)
if os.path.exists(CFG_ICC_PATH):
shutil.copy(CFG_ICC_PATH, working_dir)
else:
raise InvenioWebSubmitFileConverterError('ERROR: ISOCoatedsb.icc file missing. Have you run "make install-pdfa-helper-files" as part of your Invenio deployment?')
pdfx_header = open(CFG_PDFX_DEF_PATH).read()
pdfx_header = pdfx_header.replace('<<<<TITLEMARKER>>>>', title)
icc_iso_profile_def = ''
if profile == 'pdf/x-1a:2001':
pdfx_version = 'PDF/X-1a:2001'
pdfx_conformance = 'PDF/X-1a:2001'
elif profile == 'pdf/x-1a:2003':
pdfx_version = 'PDF/X-1a:2003'
pdfx_conformance = 'PDF/X-1a:2003'
elif profile == 'pdf/x-3:2002':
icc_iso_profile_def = '/ICCProfile (ISOCoatedsb.icc)'
pdfx_version = 'PDF/X-3:2002'
pdfx_conformance = 'PDF/X-3:2002'
pdfx_header = pdfx_header.replace('<<<<ICCPROFILEDEF>>>>', icc_iso_profile_def)
pdfx_header = pdfx_header.replace('<<<<GTS_PDFXVersion>>>>', pdfx_version)
pdfx_header = pdfx_header.replace('<<<<GTS_PDFXConformance>>>>', pdfx_conformance)
outputpdf = os.path.join(working_dir, 'output_file.pdf')
open(os.path.join(working_dir, 'PDFX_def.ps'), 'w').write(pdfx_header)
if profile in ['pdf/x-3:2002']:
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFX', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-dUseCIEColor', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFX_def.ps'), input_file, cwd=working_dir)
elif profile in ['pdf/x-1a:2001', 'pdf/x-1a:2003']:
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFX', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-sColorConversionStrategy=CMYK', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFX_def.ps'), input_file, cwd=working_dir)
if pdfopt:
execute_command(CFG_PATH_PDFOPT, outputpdf, output_file)
else:
shutil.move(outputpdf, output_file)
clean_working_dir(working_dir)
return output_file
def pdf2pdfa(input_file, output_file=None, title=None, pdfopt=True, **dummy):
"""
Transform any PDF into a PDF/A (see: <http://www.pdfa.org/>)
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param title [string] the title of the document. None for autodiscovery.
@param pdfopt [bool] whether to linearize the pdf, too.
@return [string] the output file name
raise InvenioWebSubmitFileConverterError in case of errors.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
if title is None:
stdout = execute_command(CFG_PATH_PDFINFO, input_file)
for line in stdout.split('\n'):
g = _RE_FIND_TITLE.match(line)
if g:
title = g.group(1)
break
if not title:
title = 'No title'
get_file_converter_logger().debug("Extracted title is %s" % title)
if os.path.exists(CFG_ICC_PATH):
shutil.copy(CFG_ICC_PATH, working_dir)
else:
raise InvenioWebSubmitFileConverterError('ERROR: ISOCoatedsb.icc file missing. Have you run "make install-pdfa-helper-files" as part of your Invenio deployment?')
pdfa_header = open(CFG_PDFA_DEF_PATH).read()
pdfa_header = pdfa_header.replace('<<<<TITLEMARKER>>>>', title)
inputps = os.path.join(working_dir, 'input.ps')
outputpdf = os.path.join(working_dir, 'output_file.pdf')
open(os.path.join(working_dir, 'PDFA_def.ps'), 'w').write(pdfa_header)
execute_command(CFG_PATH_PDFTOPS, '-level3', input_file, inputps)
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFA', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-dUseCIEColor', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFA_def.ps'), 'input.ps', cwd=working_dir)
if pdfopt:
execute_command(CFG_PATH_PDFOPT, outputpdf, output_file)
else:
shutil.move(outputpdf, output_file)
clean_working_dir(working_dir)
return output_file
def pdf2pdfopt(input_file, output_file=None, **dummy):
"""
Linearize the input PDF in order to improve the web-experience when
visualizing the document through the web.
@param input_file [string] the input input_file
@param output_file [string] the output_file file name, None for temporary generated
@return [string] the output file name
raise InvenioWebSubmitFileConverterError in case of errors.
"""
input_file, output_file, dummy = prepare_io(input_file, output_file, '.pdf', need_working_dir=False)
execute_command(CFG_PATH_PDFOPT, input_file, output_file)
return output_file
def pdf2ps(input_file, output_file=None, level=2, compress=True, **dummy):
"""
Convert from Pdf to Postscript.
"""
if compress:
suffix = '.ps.gz'
else:
suffix = '.ps'
input_file, output_file, working_dir = prepare_io(input_file, output_file, suffix)
execute_command(CFG_PATH_PDFTOPS, '-level%i' % level, input_file, os.path.join(working_dir, 'output.ps'))
if compress:
execute_command(CFG_PATH_GZIP, '-c', os.path.join(working_dir, 'output.ps'), filename_out=output_file)
else:
shutil.move(os.path.join(working_dir, 'output.ps'), output_file)
clean_working_dir(working_dir)
return output_file
def ps2pdfx(input_file, output_file=None, title=None, pdfopt=False, profile="pdf/x-3:2002", **dummy):
"""
Transform any PS into a PDF/X (see: <http://en.wikipedia.org/wiki/PDF/X>)
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param title [string] the title of the document. None for autodiscovery.
@param pdfopt [bool] whether to linearize the pdf, too.
@param profile: [string] the PDFX profile to use. Supports: 'pdf/x-1a:2001', 'pdf/x-1a:2003', 'pdf/x-3:2002'
@return [string] the output file name
raise InvenioWebSubmitFileConverterError in case of errors.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
if input_file.endswith('.gz'):
new_input_file = os.path.join(working_dir, 'input.ps')
execute_command(CFG_PATH_GUNZIP, '-c', input_file, filename_out=new_input_file)
input_file = new_input_file
if not title:
title = 'No title'
shutil.copy(CFG_ICC_PATH, working_dir)
pdfx_header = open(CFG_PDFX_DEF_PATH).read()
pdfx_header = pdfx_header.replace('<<<<TITLEMARKER>>>>', title)
icc_iso_profile_def = ''
if profile == 'pdf/x-1a:2001':
pdfx_version = 'PDF/X-1a:2001'
pdfx_conformance = 'PDF/X-1a:2001'
elif profile == 'pdf/x-1a:2003':
pdfx_version = 'PDF/X-1a:2003'
pdfx_conformance = 'PDF/X-1a:2003'
elif profile == 'pdf/x-3:2002':
icc_iso_profile_def = '/ICCProfile (ISOCoatedsb.icc)'
pdfx_version = 'PDF/X-3:2002'
pdfx_conformance = 'PDF/X-3:2002'
pdfx_header = pdfx_header.replace('<<<<ICCPROFILEDEF>>>>', icc_iso_profile_def)
pdfx_header = pdfx_header.replace('<<<<GTS_PDFXVersion>>>>', pdfx_version)
outputpdf = os.path.join(working_dir, 'output_file.pdf')
open(os.path.join(working_dir, 'PDFX_def.ps'), 'w').write(pdfx_header)
if profile in ['pdf/x-3:2002']:
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFX', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-dUseCIEColor', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFX_def.ps'), 'input.ps', cwd=working_dir)
elif profile in ['pdf/x-1a:2001', 'pdf/x-1a:2003']:
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFX', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-sColorConversionStrategy=CMYK', '-dAutoRotatePages=/None', '-sDEVICE=pdfwrite', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFX_def.ps'), 'input.ps', cwd=working_dir)
if pdfopt:
execute_command(CFG_PATH_PDFOPT, outputpdf, output_file)
else:
shutil.move(outputpdf, output_file)
clean_working_dir(working_dir)
return output_file
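The PDFX_def.ps header above is built by plain string substitution on `<<<<MARKER>>>>` placeholders. A minimal, self-contained sketch of that step, using a hypothetical template string in place of the real CFG_PDFX_DEF_PATH file:

```python
# Hypothetical stand-in for the contents of CFG_PDFX_DEF_PATH.
PDFX_TEMPLATE = (
    "/Title (<<<<TITLEMARKER>>>>)\n"
    "<<<<ICCPROFILEDEF>>>>\n"
    "/GTS_PDFXVersion (<<<<GTS_PDFXVersion>>>>)\n"
)

def fill_pdfx_header(template, title, icc_def, version):
    """Substitute the <<<<...>>>> placeholders, as ps2pdfx does."""
    header = template.replace('<<<<TITLEMARKER>>>>', title)
    header = header.replace('<<<<ICCPROFILEDEF>>>>', icc_def)
    header = header.replace('<<<<GTS_PDFXVersion>>>>', version)
    return header

header = fill_pdfx_header(PDFX_TEMPLATE, 'My Thesis',
                          '/ICCProfile (ISOCoatedsb.icc)', 'PDF/X-3:2002')
```

The filled-in header is what gets written to `PDFX_def.ps` before Ghostscript runs.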
def ps2pdfa(input_file, output_file=None, title=None, pdfopt=True, **dummy):
"""
Transform any PS into a PDF/A (see: <http://www.pdfa.org/>)
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param title [string] the title of the document. None for autodiscovery.
@param pdfopt [bool] whether to linearize the pdf, too.
@return [string] the output file name
@raise InvenioWebSubmitFileConverterError: in case of errors.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
if input_file.endswith('.gz'):
new_input_file = os.path.join(working_dir, 'input.ps')
execute_command(CFG_PATH_GUNZIP, '-c', input_file, filename_out=new_input_file)
input_file = new_input_file
if not title:
title = 'No title'
shutil.copy(CFG_ICC_PATH, working_dir)
pdfa_header = open(CFG_PDFA_DEF_PATH).read()
pdfa_header = pdfa_header.replace('<<<<TITLEMARKER>>>>', title)
outputpdf = os.path.join(working_dir, 'output_file.pdf')
open(os.path.join(working_dir, 'PDFA_def.ps'), 'w').write(pdfa_header)
execute_command(CFG_PATH_GS, '-sProcessColorModel=DeviceCMYK', '-dPDFA', '-dBATCH', '-dNOPAUSE', '-dNOOUTERSAVE', '-dUseCIEColor', '-sDEVICE=pdfwrite', '-dAutoRotatePages=/None', '-sOutputFile=output_file.pdf', os.path.join(working_dir, 'PDFA_def.ps'), input_file, cwd=working_dir)
if pdfopt:
execute_command(CFG_PATH_PDFOPT, outputpdf, output_file)
else:
shutil.move(outputpdf, output_file)
clean_working_dir(working_dir)
return output_file
def ps2pdf(input_file, output_file=None, pdfopt=True, **dummy):
"""
Transform any PS into a PDF
@param input_file [string] the input file name
@param output_file [string] the output_file file name, None for temporary generated
@param pdfopt [bool] whether to linearize the pdf, too.
@return [string] the output file name
@raise InvenioWebSubmitFileConverterError: in case of errors.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
if input_file.endswith('.gz'):
new_input_file = os.path.join(working_dir, 'input.ps')
execute_command(CFG_PATH_GUNZIP, '-c', input_file, filename_out=new_input_file)
input_file = new_input_file
outputpdf = os.path.join(working_dir, 'output_file.pdf')
execute_command(CFG_PATH_PS2PDF, input_file, outputpdf, cwd=working_dir)
if pdfopt:
execute_command(CFG_PATH_PDFOPT, outputpdf, output_file)
else:
shutil.move(outputpdf, output_file)
clean_working_dir(working_dir)
return output_file
def pdf2pdfhocr(input_pdf, text_hocr, output_pdf, rotations=None, font='Courier', draft=False):
"""
Adds the OCRed text to the original pdf.
@param rotations: a list of angles by which pages should be rotated
"""
def _get_page_rotation(i):
if len(rotations) > i:
return rotations[i]
return 0
if rotations is None:
rotations = []
input_pdf, hocr_pdf, dummy = prepare_io(input_pdf, output_ext='.pdf', need_working_dir=False)
create_pdf(extract_hocr(open(text_hocr).read()), hocr_pdf, font, draft)
input1 = PdfFileReader(file(input_pdf, "rb"))
input2 = PdfFileReader(file(hocr_pdf, "rb"))
output = PdfFileWriter()
info = input1.getDocumentInfo()
if info:
infoDict = output._info.getObject()
infoDict.update(info)
for i in range(0, input1.getNumPages()):
orig_page = input1.getPage(i)
text_page = input2.getPage(i)
angle = _get_page_rotation(i)
if angle != 0:
print >> sys.stderr, "Rotating page %d by %d degrees." % (i, angle)
text_page = text_page.rotateClockwise(angle)
if draft:
below, above = orig_page, text_page
else:
below, above = text_page, orig_page
below.mergePage(above)
if angle != 0 and not draft:
print >> sys.stderr, "Rotating back page %d by %d degrees." % (i, angle)
below.rotateCounterClockwise(angle)
output.addPage(below)
outputStream = file(output_pdf, "wb")
output.write(outputStream)
outputStream.close()
os.remove(hocr_pdf)
return output_pdf
def pdf2hocr2pdf(input_file, output_file=None, ln='en', return_working_dir=False, extract_only_text=False, pdfopt=True, font='Courier', draft=False, **dummy):
"""
Transform a scanned PDF into a PDF with an OCRed text layer (or into plain text when extract_only_text is set).
@param ln is a two letter language code to give the OCR tool a hint.
@param return_working_dir if set to True, will return output_file path and the working_dir path, instead of deleting the working_dir. This is useful in case you need the intermediate images to build again a PDF.
"""
def _perform_rotate(working_dir, imagefile, angle):
"""Rotate imagefile of the corresponding angle. Creates a new file
with rotated.ppm."""
get_file_converter_logger().debug('Performing rotate on %s by %s degrees' % (imagefile, angle))
if not angle:
#execute_command('%s %s %s', CFG_PATH_CONVERT, os.path.join(working_dir, imagefile), os.path.join(working_dir, 'rotated-%s' % imagefile))
shutil.copy(os.path.join(working_dir, imagefile), os.path.join(working_dir, 'rotated.ppm'))
else:
execute_command(CFG_PATH_CONVERT, os.path.join(working_dir, imagefile), '-rotate', str(angle), '-depth', str(8), os.path.join(working_dir, 'rotated.ppm'))
return True
def _perform_deskew(working_dir):
"""Perform ocroscript deskew. Expect to work on rotated-imagefile.
Creates deskewed.ppm.
Return True if deskewing was fine."""
get_file_converter_logger().debug('Performing deskew')
try:
dummy, stderr = execute_command_with_stderr(CFG_PATH_OCROSCRIPT, os.path.join(CFG_ETCDIR, 'websubmit', 'file_converter_templates', 'deskew.lua'), os.path.join(working_dir, 'rotated.ppm'), os.path.join(working_dir, 'deskewed.ppm'))
if stderr.strip():
get_file_converter_logger().debug('Errors found during deskewing')
return False
else:
return True
except InvenioWebSubmitFileConverterError, err:
get_file_converter_logger().debug('Deskewing error: %s' % err)
return False
def _perform_recognize(working_dir):
"""Perform ocroscript recognize. Expect to work on deskewed.ppm.
Creates recognized.out Return True if recognizing was fine."""
get_file_converter_logger().debug('Performing recognize')
if extract_only_text:
output_mode = 'text'
else:
output_mode = 'hocr'
try:
dummy, stderr = execute_command_with_stderr(CFG_PATH_OCROSCRIPT, 'recognize', '--tesslanguage=%s' % ln, '--output-mode=%s' % output_mode, os.path.join(working_dir, 'deskewed.ppm'), filename_out=os.path.join(working_dir, 'recognize.out'))
if stderr.strip():
## There was some output on stderr
get_file_converter_logger().debug('Errors found in recognize.err')
return False
return not guess_ocropus_produced_garbage(os.path.join(working_dir, 'recognize.out'), not extract_only_text)
except InvenioWebSubmitFileConverterError, err:
get_file_converter_logger().debug('Recognizer error: %s' % err)
return False
def _perform_dummy_recognize(working_dir):
"""Return an empty text or an empty hocr referencing the image."""
get_file_converter_logger().debug('Performing dummy recognize')
if extract_only_text:
out = ''
else:
out = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta content="ocr_line ocr_page" name="ocr-capabilities"/><meta content="en" name="ocr-langs"/><meta content="Latin" name="ocr-scripts"/><meta content="" name="ocr-microformats"/><title>OCR Output</title></head>
<body><div class="ocr_page" title="bbox 0 0 1 1; image deskewed.ppm">
</div></body></html>"""
open(os.path.join(working_dir, 'recognize.out'), 'w').write(out)
def _find_image_file(working_dir, imageprefix, page):
"""Return the pdftoppm image file for the given page, probing each
zero-padding width (1 to 6 digits) that pdftoppm may have used."""
for width in range(1, 7):
ret = '%s-%0*d.ppm' % (imageprefix, width, page)
if os.path.exists(os.path.join(working_dir, ret)):
return ret
## I guess we won't have documents with more than a million pages
return None
def _ocr(tmp_output_file):
"""
Append to tmp_output_file the partial results of OCROpus recognize.
Return a list of rotations.
"""
page = 0
rotations = []
while True:
page += 1
get_file_converter_logger().debug('Page %d.' % page)
execute_command(CFG_PATH_PDFTOPPM, '-f', str(page), '-l', str(page), '-r', str(CFG_PPM_RESOLUTION), '-aa', 'yes', '-freetype', 'yes', input_file, os.path.join(working_dir, 'image'))
imagefile = _find_image_file(working_dir, 'image', page)
if imagefile is None:
break
for angle in (0, 180, 90, 270):
get_file_converter_logger().debug('Trying %d degrees...' % angle)
if _perform_rotate(working_dir, imagefile, angle) and _perform_deskew(working_dir) and _perform_recognize(working_dir):
rotations.append(angle)
break
else:
get_file_converter_logger().debug('Dummy recognize')
rotations.append(0)
_perform_dummy_recognize(working_dir)
open(tmp_output_file, 'a').write(open(os.path.join(working_dir, 'recognize.out')).read())
# clean
os.remove(os.path.join(working_dir, imagefile))
return rotations
if CFG_PATH_OCROSCRIPT:
if len(ln) == 2:
ln = CFG_TWO2THREE_LANG_CODES.get(ln, 'eng')
if extract_only_text:
input_file, output_file, working_dir = prepare_io(input_file, output_file, output_ext='.txt')
_ocr(output_file)
else:
input_file, tmp_output_hocr, working_dir = prepare_io(input_file, output_ext='.hocr')
rotations = _ocr(tmp_output_hocr)
if pdfopt:
input_file, tmp_output_pdf, dummy = prepare_io(input_file, output_ext='.pdf', need_working_dir=False)
tmp_output_pdf, output_file, dummy = prepare_io(tmp_output_pdf, output_file, output_ext='.pdf', need_working_dir=False)
pdf2pdfhocr(input_file, tmp_output_hocr, tmp_output_pdf, rotations=rotations, font=font, draft=draft)
pdf2pdfopt(tmp_output_pdf, output_file)
os.remove(tmp_output_pdf)
else:
input_file, output_file, dummy = prepare_io(input_file, output_file, output_ext='.pdf', need_working_dir=False)
pdf2pdfhocr(input_file, tmp_output_hocr, output_file, rotations=rotations, font=font, draft=draft)
clean_working_dir(working_dir)
return output_file
else:
raise InvenioWebSubmitFileConverterError("It's impossible to generate HOCR output from PDF. OCROpus is not available.")
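pdftoppm zero-pads the page number in the image file name to an unpredictable width, which is why `_find_image_file` probes each width in turn. A self-contained sketch of that probing logic (the file name created here is invented for the demo):

```python
import os
import tempfile

def find_image_file(working_dir, imageprefix, page):
    # Probe widths 1..6: image-5.ppm, image-05.ppm, ... image-000005.ppm.
    for width in range(1, 7):
        candidate = '%s-%0*d.ppm' % (imageprefix, width, page)
        if os.path.exists(os.path.join(working_dir, candidate)):
            return candidate
    return None

# Demo: create a 3-digit-padded page image and locate it.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, 'image-003.ppm'), 'w').close()
found = find_image_file(tmp, 'image', 3)
```

The `%0*d` format takes the pad width from an argument, so one loop covers all six naming variants.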
def pdf2text(input_file, output_file=None, perform_ocr=True, ln='en', **dummy):
"""
Return the text content in input_file.
"""
input_file, output_file, dummy = prepare_io(input_file, output_file, '.txt', need_working_dir=False)
execute_command(CFG_PATH_PDFTOTEXT, '-enc', 'UTF-8', '-eol', 'unix', '-nopgbrk', input_file, output_file)
if perform_ocr and can_perform_ocr():
ocred_output = pdf2hocr2pdf(input_file, ln=ln, extract_only_text=True)
try:
output = open(output_file, 'a')
for row in open(ocred_output):
output.write(row)
output.close()
finally:
silent_remove(ocred_output)
return output_file
def txt2text(input_file, output_file=None, **dummy):
"""
Return the text content in input_file
"""
input_file, output_file, dummy = prepare_io(input_file, output_file, '.txt', need_working_dir=False)
shutil.copy(input_file, output_file)
return output_file
def html2text(input_file, output_file=None, **dummy):
"""
Return the text content of an HTML/XML file.
"""
class HTMLStripper(HTMLParser.HTMLParser):
def __init__(self, output_file):
HTMLParser.HTMLParser.__init__(self)
self.output_file = output_file
def handle_entityref(self, name):
if name in entitydefs:
self.output_file.write(entitydefs[name].decode('latin1').encode('utf8'))
def handle_data(self, data):
if data.strip():
self.output_file.write(_RE_CLEAN_SPACES.sub(' ', data))
def handle_charref(self, data):
try:
self.output_file.write(unichr(int(data)).encode('utf8'))
except (ValueError, OverflowError):
pass
def close(self):
self.output_file.close()
HTMLParser.HTMLParser.close(self)
input_file, output_file, dummy = prepare_io(input_file, output_file, '.txt', need_working_dir=False)
html_stripper = HTMLStripper(open(output_file, 'w'))
for line in open(input_file):
html_stripper.feed(line)
html_stripper.close()
return output_file
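The HTMLStripper above keeps only character data and resolved entities, dropping every tag. The same idea can be sketched with Python 3's `html.parser` module (the module name differs from the Python 2 `HTMLParser` used in this file):

```python
from html.parser import HTMLParser  # Python 3 name for the HTMLParser module

class TextExtractor(HTMLParser):
    """Collect only the character data, dropping all markup."""
    def __init__(self):
        # convert_charrefs=True resolves &amp;-style references for us.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

parser = TextExtractor()
parser.feed('<html><body><p>Hello <b>world</b>!</p></body></html>')
text = ' '.join(parser.chunks)
```

This is only a sketch of the stripping technique; the module's own version additionally normalizes whitespace and streams output to a file.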
def djvu2text(input_file, output_file=None, **dummy):
"""
Return the text content in input_file.
"""
input_file, output_file, dummy = prepare_io(input_file, output_file, '.txt', need_working_dir=False)
execute_command(CFG_PATH_DJVUTXT, input_file, output_file)
return output_file
def djvu2ps(input_file, output_file=None, level=2, compress=True, **dummy):
"""
Convert a djvu into a .ps[.gz]
"""
if compress:
input_file, output_file, working_dir = prepare_io(input_file, output_file, output_ext='.ps.gz')
try:
execute_command(CFG_PATH_DJVUPS, input_file, os.path.join(working_dir, 'output.ps'))
execute_command(CFG_PATH_GZIP, '-c', os.path.join(working_dir, 'output.ps'), filename_out=output_file)
finally:
clean_working_dir(working_dir)
else:
input_file, output_file, working_dir = prepare_io(input_file, output_file, output_ext='.ps')
try:
execute_command(CFG_PATH_DJVUPS, '-level=%i' % level, input_file, output_file)
finally:
clean_working_dir(working_dir)
return output_file
def tiff2pdf(input_file, output_file=None, pdfopt=True, pdfa=True, perform_ocr=True, **args):
"""
Convert a .tiff into a .pdf
"""
if pdfa or pdfopt or perform_ocr:
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.pdf')
try:
partial_output = os.path.join(working_dir, 'output.pdf')
execute_command(CFG_PATH_TIFF2PDF, '-o', partial_output, input_file)
if perform_ocr:
pdf2hocr2pdf(partial_output, output_file, pdfopt=pdfopt, **args)
elif pdfa:
pdf2pdfa(partial_output, output_file, pdfopt=pdfopt, **args)
else:
pdf2pdfopt(partial_output, output_file)
finally:
clean_working_dir(working_dir)
else:
input_file, output_file, dummy = prepare_io(input_file, output_file, '.pdf', need_working_dir=False)
execute_command(CFG_PATH_TIFF2PDF, '-o', output_file, input_file)
return output_file
def pstotext(input_file, output_file=None, **dummy):
"""
Convert a .ps[.gz] into text.
"""
input_file, output_file, working_dir = prepare_io(input_file, output_file, '.txt')
try:
if input_file.endswith('.gz'):
new_input_file = os.path.join(working_dir, 'input.ps')
execute_command(CFG_PATH_GUNZIP, '-c', input_file, filename_out=new_input_file)
input_file = new_input_file
execute_command(CFG_PATH_PSTOTEXT, '-output', output_file, input_file)
finally:
clean_working_dir(working_dir)
return output_file
def gzip(input_file, output_file=None, **dummy):
"""
Compress a file.
"""
input_file, output_file, dummy = prepare_io(input_file, output_file, '.gz', need_working_dir=False)
execute_command(CFG_PATH_GZIP, '-c', input_file, filename_out=output_file)
return output_file
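gzip() and gunzip() above shell out to the external gzip binary via CFG_PATH_GZIP / CFG_PATH_GUNZIP. The same roundtrip can be sketched with Python's standard-library gzip module instead (paths here are temporary demo files, not the converter's configured binaries):

```python
import gzip as gzip_mod
import os
import tempfile

def gzip_file(input_file, output_file):
    """Compress input_file into output_file using the stdlib, not CFG_PATH_GZIP."""
    with open(input_file, 'rb') as fin, gzip_mod.open(output_file, 'wb') as fout:
        fout.write(fin.read())
    return output_file

def gunzip_file(input_file, output_file):
    """Decompress a .gz file back to its original bytes."""
    with gzip_mod.open(input_file, 'rb') as fin, open(output_file, 'wb') as fout:
        fout.write(fin.read())
    return output_file

tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'input.txt')
with open(src, 'wb') as f:
    f.write(b'hello postscript')
gz = gzip_file(src, src + '.gz')
back = gunzip_file(gz, os.path.join(tmpdir, 'roundtrip.txt'))
data = open(back, 'rb').read()
```

A stdlib approach avoids a subprocess per file, at the cost of reading whole files into memory in this naive sketch.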
def gunzip(input_file, output_file=None, **dummy):
"""
Uncompress a file.
"""
- from invenio.bibdocfile import decompose_file
+ from invenio.legacy.bibdocfile.api import decompose_file
input_ext = decompose_file(input_file, skip_version=True)[2]
if input_ext.endswith('.gz'):
input_ext = input_ext[:-len('.gz')]
else:
input_ext = None
input_file, output_file, dummy = prepare_io(input_file, output_file, input_ext, need_working_dir=False)
execute_command(CFG_PATH_GUNZIP, '-c', input_file, filename_out=output_file)
return output_file
def prepare_io(input_file, output_file=None, output_ext=None, need_working_dir=True):
"""Prepare the input_file, the output_file and, if requested, an isolated working directory for a conversion."""
- from invenio.bibdocfile import decompose_file, normalize_format
+ from invenio.legacy.bibdocfile.api import decompose_file, normalize_format
output_ext = normalize_format(output_ext)
get_file_converter_logger().debug('Preparing IO for input=%s, output=%s, output_ext=%s' % (input_file, output_file, output_ext))
if output_ext is None:
if output_file is None:
output_ext = '.tmp'
else:
output_ext = decompose_file(output_file, skip_version=True)[2]
if output_file is None:
try:
(fd, output_file) = tempfile.mkstemp(suffix=output_ext, dir=CFG_TMPDIR)
os.close(fd)
except IOError, err:
raise InvenioWebSubmitFileConverterError("It's impossible to create a temporary file: %s" % err)
else:
output_file = os.path.abspath(output_file)
if os.path.exists(output_file):
os.remove(output_file)
if need_working_dir:
try:
working_dir = tempfile.mkdtemp(dir=CFG_TMPDIR, prefix='conversion')
except IOError, err:
raise InvenioWebSubmitFileConverterError("It's impossible to create a temporary directory: %s" % err)
input_ext = decompose_file(input_file, skip_version=True)[2]
new_input_file = os.path.join(working_dir, 'input' + input_ext)
shutil.copy(input_file, new_input_file)
input_file = new_input_file
else:
working_dir = None
input_file = os.path.abspath(input_file)
get_file_converter_logger().debug('IO prepared: input_file=%s, output_file=%s, working_dir=%s' % (input_file, output_file, working_dir))
return (input_file, output_file, working_dir)
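prepare_io builds a fresh output path with `tempfile.mkstemp` and an isolated working directory with `tempfile.mkdtemp`, copying the input inside under a fixed name. A minimal sketch of that pattern (the system temp directory stands in for CFG_TMPDIR, and error handling is omitted):

```python
import os
import shutil
import tempfile

def prepare_io_sketch(input_file, output_ext='.tmp'):
    # Temporary output file carrying the requested extension.
    fd, output_file = tempfile.mkstemp(suffix=output_ext)
    os.close(fd)
    # Isolated working directory; the input is copied in as 'input<ext>'.
    working_dir = tempfile.mkdtemp(prefix='conversion')
    new_input = os.path.join(working_dir,
                             'input' + os.path.splitext(input_file)[1])
    shutil.copy(input_file, new_input)
    return new_input, output_file, working_dir

# Demo: prepare IO for a fake .ps -> .pdf conversion.
fd, demo_in = tempfile.mkstemp(suffix='.ps')
os.close(fd)
new_input, output_file, working_dir = prepare_io_sketch(demo_in, '.pdf')
```

Working in a private directory means converters can use fixed file names ('input.ps', 'output.pdf') without clashing across concurrent conversions.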
def clean_working_dir(working_dir):
"""
Remove the working_dir.
"""
get_file_converter_logger().debug('Cleaning working_dir: %s' % working_dir)
shutil.rmtree(working_dir)
def execute_command(*args, **argd):
"""Wrapper to run_process_with_timeout."""
get_file_converter_logger().debug("Executing: %s" % (args, ))
args = [str(arg) for arg in args]
res, stdout, stderr = run_process_with_timeout(args, cwd=argd.get('cwd'), filename_out=argd.get('filename_out'), filename_err=argd.get('filename_err'), sudo=argd.get('sudo'))
get_file_converter_logger().debug('res: %s, stdout: %s, stderr: %s' % (res, stdout, stderr))
if res != 0:
message = "ERROR: Error in running %s\n stdout:\n%s\nstderr:\n%s\n" % (args, stdout, stderr)
get_file_converter_logger().error(message)
raise InvenioWebSubmitFileConverterError(message)
return stdout
def execute_command_with_stderr(*args, **argd):
"""Wrapper to run_process_with_timeout."""
get_file_converter_logger().debug("Executing: %s" % (args, ))
res, stdout, stderr = run_process_with_timeout(args, cwd=argd.get('cwd'), filename_out=argd.get('filename_out'), sudo=argd.get('sudo'))
if res != 0:
message = "ERROR: Error in running %s\n stdout:\n%s\nstderr:\n%s\n" % (args, stdout, stderr)
get_file_converter_logger().error(message)
raise InvenioWebSubmitFileConverterError(message)
return stdout, stderr
def silent_remove(path):
"""Remove without errors a path."""
if os.path.exists(path):
try:
os.remove(path)
except OSError:
pass
__CONVERSION_MAP = get_conversion_map()
def main_cli():
"""
main function when the library behaves as a normal CLI tool.
"""
- from invenio.bibdocfile import normalize_format
+ from invenio.legacy.bibdocfile.api import normalize_format
parser = OptionParser()
parser.add_option("-c", "--convert", dest="input_name",
help="convert the specified FILE", metavar="FILE")
parser.add_option("-d", "--debug", dest="debug", action="store_true", help="Enable debug information")
parser.add_option("--special-pdf2hocr2pdf", dest="ocrize", help="convert the given scanned PDF into a PDF with OCRed text", metavar="FILE")
parser.add_option("-f", "--format", dest="output_format", help="the desired output format", metavar="FORMAT")
parser.add_option("-o", "--output", dest="output_name", help="the desired output FILE (if not specified a new file will be generated with the desired output format)")
parser.add_option("--without-pdfa", action="store_false", dest="pdf_a", default=True, help="don't force creation of PDF/A PDFs")
parser.add_option("--without-pdfopt", action="store_false", dest="pdfopt", default=True, help="don't force optimization of PDFs files")
parser.add_option("--without-ocr", action="store_false", dest="ocr", default=True, help="don't force OCR")
parser.add_option("--can-convert", dest="can_convert", help="display all the possible format that is possible to generate from the given format", metavar="FORMAT")
parser.add_option("--is-ocr-needed", dest="check_ocr_is_needed", help="check if OCR is needed for the FILE specified", metavar="FILE")
parser.add_option("-t", "--title", dest="title", help="specify the title (used when creating PDFs)", metavar="TITLE")
parser.add_option("-l", "--language", dest="ln", help="specify the language (used when performing OCR, e.g. en, it, fr...)", metavar="LN", default='en')
(options, dummy) = parser.parse_args()
if options.debug:
from logging import basicConfig
basicConfig()
get_file_converter_logger().setLevel(DEBUG)
if options.can_convert:
input_format = normalize_format(options.can_convert)
if input_format == '.pdf':
if can_pdfopt(True):
print "PDF linearization supported"
else:
print "No PDF linearization support"
if can_pdfa(True):
print "PDF/A generation supported"
else:
print "No PDF/A generation support"
if can_perform_ocr(True):
print "OCR supported"
else:
print "OCR not supported"
print 'Can convert from "%s" to:' % input_format[1:],
for output_format in __CONVERSION_MAP:
if can_convert(input_format, output_format):
print '"%s"' % output_format[1:],
print
elif options.check_ocr_is_needed:
print "Checking if OCR is needed on %s..." % options.check_ocr_is_needed,
sys.stdout.flush()
if guess_is_OCR_needed(options.check_ocr_is_needed):
print "needed."
else:
print "not needed."
elif options.ocrize:
try:
output = pdf2hocr2pdf(options.ocrize, output_file=options.output_name, title=options.title, ln=options.ln)
print "Output stored in %s" % output
except InvenioWebSubmitFileConverterError, err:
print "ERROR: %s" % err
sys.exit(1)
else:
try:
if not options.output_name and not options.output_format:
parser.error("Either --format, --output should be specified")
if not options.input_name:
parser.error("An input should be specified!")
output = convert_file(options.input_name, output_file=options.output_name, output_format=options.output_format, pdfopt=options.pdfopt, pdfa=options.pdf_a, title=options.title, ln=options.ln)
print "Output stored in %s" % output
except InvenioWebSubmitFileConverterError, err:
print "ERROR: %s" % err
sys.exit(1)
if __name__ == "__main__":
main_cli()
diff --git a/invenio/legacy/websubmit/file_metadata.py b/invenio/legacy/websubmit/file_metadata.py
index 9c29350eb..c8c6b4acf 100644
--- a/invenio/legacy/websubmit/file_metadata.py
+++ b/invenio/legacy/websubmit/file_metadata.py
@@ -1,363 +1,363 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This is the metadata reader and writer module. Contains the proper
plugin containers in order to read/write metadata from images or other
files.
Public APIs:
- read_metadata()
- write_metadata()
"""
__required_plugin_API_version__ = "WebSubmit File Metadata Plugin API 1.0"
import sys
from optparse import OptionParser
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
from invenio.legacy.websubmit.config import InvenioWebSubmitFileMetadataRuntimeError
from invenio.utils.datastructures import LazyDict
from invenio.base.utils import import_submodules_from_packages
metadata_extractor_plugins = LazyDict(lambda: dict(filter(None, map(
plugin_builder_function,
import_submodules_from_packages('websubmit_file_metadata_plugins',
packages=['invenio'])))))
def read_metadata(inputfile, force=None, remote=False,
loginpw=None, verbose=0):
"""
Returns metadata extracted from given file as dictionary.
Availability depends on input file format and installed plugins
(raises C{TypeError} for unsupported file formats).
@param inputfile: path to a file
@type inputfile: string
@param verbose: verbosity
@type verbose: int
@param force: name of plugin to use, to skip plugin auto-discovery
@type force: string
@param remote: if the file is accessed remotely or not
@type remote: boolean
@param loginpw: credentials to access secure servers (username:password)
@type loginpw: string
@return: dictionary of metadata tags as keys, and (interpreted)
value as value
@rtype: dict
@raise TypeError: if file format is not supported.
@raise RuntimeError: if required library to process file is missing.
@raise InvenioWebSubmitFileMetadataRuntimeError: when metadata cannot be read.
"""
metadata = None
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
if verbose > 5:
print ext.lower(), 'extension to extract from'
# Loop through the plugins to find a good one for given file
for plugin_name, plugin in metadata_extractor_plugins.iteritems():
# Local file
if plugin.has_key('can_read_local') and \
plugin['can_read_local'](inputfile) and not remote and \
(not force or plugin_name == force):
if verbose > 5:
print 'Using ' + plugin_name
fetched_metadata = plugin['read_metadata_local'](inputfile,
verbose)
if not metadata:
metadata = fetched_metadata
else:
metadata.update(fetched_metadata)
# Remote file
elif remote and plugin.has_key('can_read_remote') and \
plugin['can_read_remote'](inputfile) and \
(not force or plugin_name == force):
if verbose > 5:
print 'Using ' + plugin_name
fetched_metadata = plugin['read_metadata_remote'](inputfile,
loginpw,
verbose)
if not metadata:
metadata = fetched_metadata
else:
metadata.update(fetched_metadata)
# Return in case we have something
if metadata is not None:
return metadata
# Case of no plugin found, raise
raise TypeError, 'Unsupported file type'
def write_metadata(inputfile, outputfile, metadata_dictionary,
force=None, verbose=0):
"""
Writes metadata to given file.
Availability depends on input file format and installed plugins
(raises C{TypeError} for unsupported file formats).
@param inputfile: path to a file
@type inputfile: string
@param outputfile: path to the resulting file.
@type outputfile: string
@param verbose: verbosity
@type verbose: int
@param metadata_dictionary: keys and values of metadata to update.
@type metadata_dictionary: dict
@param force: name of plugin to use, to skip plugin auto-discovery
@type force: string
@return: output of the plugin
@rtype: string
@raise TypeError: if file format is not supported.
@raise RuntimeError: if required library to process file is missing.
@raise InvenioWebSubmitFileMetadataRuntimeError: when metadata cannot be updated.
"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
if verbose > 5:
print ext.lower(), 'extension to write to'
# Loop through the plugins to find a good one to ext
for plugin_name, plugin in metadata_extractor_plugins.iteritems():
if plugin.has_key('can_write_local') and \
plugin['can_write_local'](inputfile) and \
(not force or plugin_name == force):
if verbose > 5:
print 'Using ' + plugin_name
return plugin['write_metadata_local'](inputfile,
outputfile,
metadata_dictionary,
verbose)
# Case of no plugin found, raise
raise TypeError, 'Unsupported file type'
def metadata_info(verbose=0):
"""Shows information about the available plugins"""
print 'Plugin API version: %s' % str(__required_plugin_API_version__)
# Plugins
print 'Available plugins:'
# Print each operation on each plugin
for plugin_name, plugin_funcs in metadata_extractor_plugins.iteritems():
if len(plugin_funcs) > 0:
print '-- Name: ' + plugin_name
print ' Supported operation%s: ' % \
(len(plugin_funcs) > 1 and 's' or '') + \
', '.join(plugin_funcs)
# Are there any unloaded plugins?
# broken_plugins = metadata_extractor_plugins.get_broken_plugins()
# if len(broken_plugins.keys()) > 0:
# print 'Could not load the following plugin%s:' % \
# (len(broken_plugins.keys()) > 1 and 's' or '')
# for broken_plugin_name, broken_plugin_trace_info in broken_plugins.iteritems():
# print '-- Name: ' + broken_plugin_name
# if verbose > 5:
# formatted_traceback = \
# traceback.format_exception(broken_plugin_trace_info[0],
# broken_plugin_trace_info[1],
# broken_plugin_trace_info[2])
# print ' ' + ''.join(formatted_traceback).replace('\n', '\n ')
# elif verbose > 0:
# print ' ' + str(broken_plugin_trace_info[1])
def print_metadata(metadata):
"""
Pretty-prints metadata returned by the plugins to standard output.
@param metadata: object returned by the plugins when reading metadata
@type metadata: dict
"""
if metadata:
max_key_length = max([len(key) for key in metadata.keys()])
for key, value in metadata.iteritems():
print key, "." * (max_key_length - len(key)), str(value)
else:
print '(No metadata)'
def plugin_builder_function(plugin):
"""
Internal function used to build the plugin container, so it behaves as a
dictionary.
@param plugin: the imported plugin module
@return: the plugin container
@rtype: dict
"""
name = plugin.__name__.split('.')[-1]
if not name.startswith('wsm_'):
return
## Let's check for API version.
api_version = getattr(plugin, '__plugin_version__', None)
if api_version != __required_plugin_API_version__:
raise Exception("Plugin version mismatch."
" Expected %s, found %s" % (__required_plugin_API_version__,
api_version))
ret = {}
for funct_name in ('can_read_local',
'can_read_remote',
'can_write_local',
'read_metadata_local',
'write_metadata_local',
'read_metadata_remote'):
funct = getattr(plugin, funct_name, None)
if funct is not None:
ret[funct_name] = funct
return name, ret
def main():
"""
Manages the arguments, in order to call the proper metadata
handling function
"""
def dictionary_callback(option, opt, value, parser, *args, **kwargs):
"""callback function used to get strings from command line
of the type tag=value and push it into a dictionary
@param parameters: optparse parameters"""
if '=' in value:
key, val = value.split('=', 1)
if getattr(parser.values, 'metadata', None) is None:
parser.values.metadata = {}
parser.values.metadata[key] = val
return
else:
raise ValueError("%s is not in the form key=value" % value)
# Parse arguments
parser = OptionParser(usage="websubmit_file_metadata {-e | -u | -i} " + \
"[-f arg2] [-v] [-d tag=value] [-r] [-l arg3] " + \
"/path/to/file")
parser.add_option("-e", "--extract", dest="extract", action='store_true',
help="extract metadata from file", default=False)
parser.add_option("-u", "--update", dest="update", action='store_true',
help="update file metadata", default=False)
parser.add_option("-o", "--output-file", dest="output_file",
help="Place to save updated file (when --update). Default is same as input file",
type="string", default=None)
parser.add_option("-f", "--force", dest="force_plugin",
help="Plugin we want to be used", type="string",
default=None)
parser.add_option('-v', '--verbose', type="int",
dest='verbose', help='shows detailed information',
default=1)
parser.add_option('-r', '--remote', action='store_true',
dest='remote', help='working with remote file',
default=False)
parser.add_option('-d', '--dictionary-entry',
action="callback",
callback=dictionary_callback, type="string",
help='metadata to update [-d tag=value]')
parser.add_option('-i', '--info', action='store_true',
dest='info', help='shows plugin information',
default=False)
parser.add_option("-l", "--loginpw", dest="loginpw",
help="Login and password to access remote server [login:pw]",
type="string", default=None)
(options, args) = parser.parse_args()
## Get the input file from the arguments list (it should be the
## first argument):
input_file = None
if len(args) > 0:
input_file = args[0]
# If there is no option -d, we avoid metadata option being undefined
if getattr(parser.values, 'metadata', None) is None:
parser.values.metadata = {}
# Is output file specified?
if options.update and not options.output_file:
if options.verbose > 5:
print "Option --output-file not specified. Updating input file."
options.output_file = input_file
elif options.extract and options.output_file:
print "Option --output-file cannot be used with --extract."
print parser.get_usage()
sys.exit(1)
# Make sure there is not extract / write / info at the same time
if (options.extract and options.update) or \
(options.extract and options.info) or \
(options.info and options.update):
print "Choose either --extract, --update or --info"
print parser.get_usage()
sys.exit(1)
elif (options.extract and not input_file) or \
(options.update and not input_file):
print "Input file is missing"
print parser.get_usage()
sys.exit(1)
# Function call based on args
if options.extract:
try:
metadata = read_metadata(input_file,
options.force_plugin,
options.remote,
options.loginpw,
options.verbose)
print_metadata(metadata)
except TypeError, err:
print err
return 1
except RuntimeError, err:
print err
return 1
except InvenioWebSubmitFileMetadataRuntimeError, err:
print err
return 1
elif options.update:
try:
write_metadata(input_file,
options.output_file,
options.metadata,
options.force_plugin,
options.verbose)
except TypeError, err:
print err
return 1
except RuntimeError, err:
print err
return 1
except InvenioWebSubmitFileMetadataRuntimeError, err:
print err
return 1
elif options.info:
try:
metadata_info(options.verbose)
except TypeError:
print 'Problem retrieving plugin information\n'
return 1
else:
parser.error("Incorrect number of arguments\n")
if __name__ == "__main__":
main()
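The `-d tag=value` handling in `dictionary_callback` above can be sketched standalone: split each string on the first `=` and accumulate the pairs in a dict, rejecting anything without an `=`. The function name here is hypothetical.

```python
def parse_metadata_options(pairs):
    """Turn ['Title=My thesis', ...] into {'Title': 'My thesis', ...}."""
    metadata = {}
    for pair in pairs:
        if '=' not in pair:
            raise ValueError("%s is not in the form key=value" % pair)
        # Split on the first '=' only, so values may themselves contain '='.
        key, value = pair.split('=', 1)
        metadata[key] = value
    return metadata

options = parse_metadata_options(['Title=My thesis', 'Author=J. Doe'])
```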
diff --git a/invenio/legacy/websubmit/file_metadata_plugins/extractor_plugin.py b/invenio/legacy/websubmit/file_metadata_plugins/extractor_plugin.py
index 7381b924a..e7234ea23 100644
--- a/invenio/legacy/websubmit/file_metadata_plugins/extractor_plugin.py
+++ b/invenio/legacy/websubmit/file_metadata_plugins/extractor_plugin.py
@@ -1,74 +1,74 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebSubmit Metadata Plugin - This is the generic metadata extraction
plugin. Contains methods to extract metadata from many kinds of files.
Dependencies: extractor
"""
__plugin_version__ = "WebSubmit File Metadata Plugin API 1.0"
import extractor
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
def can_read_local(inputfile):
"""
Checks if inputfile is among metadata-readable file types
@param inputfile: path to the file
@type inputfile: string
@rtype: boolean
@return: True if file can be processed
"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
return ext.lower() in ['.html', '.doc', '.ps', '.xls', '.ppt',
'.sxw', '.sdw', '.dvi', '.man', '.flac',
'.mp3', '.nsf', '.sid', '.ogg', '.wav', '.png',
'.deb', '.rpm', '.tar.gz', '.zip', '.elf',
'.s3m', '.xm', '.it', '.flv', '.real', '.avi',
'.mpeg', '.qt', '.asf']
def read_metadata_local(inputfile, verbose):
"""
Metadata extraction from many kinds of files
@param inputfile: path to the file
@type inputfile: string
@param verbose: verbosity
@type verbose: int
@rtype: dict
@return: dictionary with metadata
"""
# Initialization dict
meta_info = {}
# Extraction
xtract = extractor.Extractor()
# Get the keywords
keys = xtract.extract(inputfile)
# Loop to dump data to the dict
for keyword_type, keyword in keys:
meta_info[keyword_type.encode('iso-8859-1')] = \
keyword.encode('iso-8859-1')
# Return the dictionary
return meta_info
diff --git a/invenio/legacy/websubmit/file_metadata_plugins/pdftk_plugin.py b/invenio/legacy/websubmit/file_metadata_plugins/pdftk_plugin.py
index 0cd2e2400..8d2fba21a 100644
--- a/invenio/legacy/websubmit/file_metadata_plugins/pdftk_plugin.py
+++ b/invenio/legacy/websubmit/file_metadata_plugins/pdftk_plugin.py
@@ -1,208 +1,208 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebSubmit Metadata Plugin - This is the plugin to update metadata from
PDF files.
Dependencies: pdftk
"""
__plugin_version__ = "WebSubmit File Metadata Plugin API 1.0"
import os
import shutil
import tempfile
from invenio.utils.shell import run_shell_command
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
from invenio.config import CFG_PATH_PDFTK, CFG_TMPDIR
from invenio.legacy.websubmit.config import InvenioWebSubmitFileMetadataRuntimeError
if not CFG_PATH_PDFTK:
raise ImportError, "Path to PDFTK is not set in CFG_PATH_PDFTK"
def can_read_local(inputfile):
"""
Checks if inputfile is among metadata-readable file types
@param inputfile: path to the file
@type inputfile: string
@rtype: boolean
@return: True if file can be processed
"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
return ext.lower() in ['.pdf']
def can_write_local(inputfile):
"""
Checks if inputfile is among metadata-writable file types (pdf)
@param inputfile: path to the file
@type inputfile: string
@rtype: boolean
@return: True if file can be processed
"""
ext = os.path.splitext(inputfile)[1]
return ext.lower() in ['.pdf']
def read_metadata_local(inputfile, verbose):
"""
Metadata extraction from PDF files
@param inputfile: path to the PDF file
@type inputfile: string
@param verbose: verbosity
@type verbose: int
@rtype: dict
@return: dictionary with metadata
"""
cmd = CFG_PATH_PDFTK + ' %s dump_data'
(exit_status, output_std, output_err) = \
run_shell_command(cmd, args=(inputfile,))
metadata_dict = {}
key = None
value = None
for metadata_line in output_std.splitlines():
if metadata_line.strip().startswith("InfoKey"):
key = metadata_line.split(':', 1)[1].strip()
elif metadata_line.strip().startswith("InfoValue"):
value = metadata_line.split(':', 1)[1].strip()
if key in ["ModDate", "CreationDate"]:
# FIXME: Interpret these dates?
try:
pass
#value = datetime.strptime(value, "D:%Y%m%d%H%M%S%Z")
except:
pass
if key:
metadata_dict[key] = value
key = None
else:
try:
custom_key, custom_value = metadata_line.split(':', 1)
metadata_dict[custom_key.strip()] = custom_value.strip()
except:
# Most probably not relevant line
pass
return metadata_dict
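The `dump_data` parsing above pairs each `InfoKey` line with the following `InfoValue` line and keeps other `Key: Value` lines as-is. A minimal re-sketch of that state machine (function name is illustrative):

```python
def parse_pdftk_dump(output):
    """Parse pdftk dump_data output into a flat metadata dict."""
    metadata, key = {}, None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("InfoKey"):
            # Remember the key; its value arrives on the next InfoValue line.
            key = line.split(':', 1)[1].strip()
        elif line.startswith("InfoValue"):
            if key:
                metadata[key] = line.split(':', 1)[1].strip()
                key = None
        elif ':' in line:
            # Plain "Key: Value" lines (e.g. NumberOfPages) are kept directly.
            k, v = line.split(':', 1)
            metadata[k.strip()] = v.strip()
    return metadata

sample = "InfoKey: Title\nInfoValue: My thesis\nNumberOfPages: 12"
parsed = parse_pdftk_dump(sample)
```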
def write_metadata_local(inputfile, outputfile, metadata_dictionary, verbose):
"""
Metadata write method, takes the .pdf as input and creates a new
one with the new info.
@param inputfile: path to the pdf
@type inputfile: string
@param outputfile: path to the resulting pdf
@type outputfile: string
@param verbose: verbosity
@type verbose: int
@param metadata_dictionary: metadata information to update inputfile
@type metadata_dictionary: dict
"""
# Take the file name (0 base, 1 name, 2 ext)
filename = decompose_file(inputfile)[1]
# Print pdf metadata
if verbose > 1:
print 'Metadata information in the PDF file ' + filename + ': \n'
try:
os.system(CFG_PATH_PDFTK + ' ' + inputfile + ' dump_data')
except Exception:
print 'Problem with inputfile to PDFTK'
# Info file for pdftk
(fd, path_to_info) = tempfile.mkstemp(prefix="wsm_pdf_plugin_info_", \
dir=CFG_TMPDIR)
os.close(fd)
file_in = open(path_to_info, 'w')
if verbose > 5:
print "Saving PDFTK info file to %s" % path_to_info
# User interaction to form the info file
# Main Case: Dictionary received through option -d
if metadata_dictionary:
for tag in metadata_dictionary:
line = 'InfoKey: ' + tag + '\nInfoValue: ' + \
metadata_dictionary[tag] + '\n'
if verbose > 0:
print line
file_in.writelines(line)
else:
data_modified = False
user_input = 'user_input'
print "Entering interactive mode. Choose what you want to do:"
while (user_input):
if not data_modified:
try:
user_input = raw_input('[w]rite / [q]uit\n')
except:
print "Aborting"
return
else:
try:
user_input = raw_input('[w]rite / [q]uit and apply / [a]bort \n')
except:
print "Aborting"
return
if user_input == 'q':
if not data_modified:
return
break
elif user_input == 'w':
try:
tag = raw_input('Tag to update:\n')
value = raw_input('With value:\n')
except:
print "Aborting"
return
# Write to info file
line = 'InfoKey: ' + tag + '\nInfoValue: ' + value + '\n'
data_modified = True
file_in.writelines(line)
elif user_input == 'a':
return
else:
print "Invalid option: "
file_in.close()
(fd, pdf_temp_path) = tempfile.mkstemp(prefix="wsm_pdf_plugin_pdf_", \
dir=CFG_TMPDIR)
os.close(fd)
# Now we call pdftk tool to update the info on a pdf
#try:
cmd_pdftk = '%s %s update_info %s output %s'
(exit_status, output_std, output_err) = \
run_shell_command(cmd_pdftk,
args=(CFG_PATH_PDFTK, inputfile,
path_to_info, pdf_temp_path))
if verbose > 5:
print output_std, output_err
if os.path.exists(pdf_temp_path):
# Move to final destination if it exists
try:
shutil.move(pdf_temp_path, outputfile)
except Exception, err:
raise InvenioWebSubmitFileMetadataRuntimeError("Could not move %s to %s" % \
(pdf_temp_path, outputfile))
else:
# Something bad happened
raise InvenioWebSubmitFileMetadataRuntimeError("Could not update metadata " + output_err)
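The info file handed to `pdftk ... update_info` above is just alternating `InfoKey`/`InfoValue` lines. A small sketch of building that text from a metadata dict (helper name is hypothetical):

```python
def build_info_file(metadata):
    """Render a metadata dict as pdftk update_info InfoKey/InfoValue lines."""
    lines = []
    for tag, value in metadata.items():
        lines.append('InfoKey: %s' % tag)
        lines.append('InfoValue: %s' % value)
    return '\n'.join(lines) + '\n'

info_text = build_info_file({'Title': 'My thesis'})
```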
diff --git a/invenio/legacy/websubmit/file_metadata_plugins/pyexiv2_plugin.py b/invenio/legacy/websubmit/file_metadata_plugins/pyexiv2_plugin.py
index aa3a55f84..5ec1b2047 100644
--- a/invenio/legacy/websubmit/file_metadata_plugins/pyexiv2_plugin.py
+++ b/invenio/legacy/websubmit/file_metadata_plugins/pyexiv2_plugin.py
@@ -1,397 +1,397 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebSubmit Metadata Plugin - This is a plugin to extract/update
metadata from images.
Dependencies: Exiv2
"""
__plugin_version__ = "WebSubmit File Metadata Plugin API 1.0"
import os
import base64
import httplib
import tempfile
import shutil
import pyexiv2
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
from invenio.config import CFG_TMPDIR
from invenio.legacy.websubmit.config import InvenioWebSubmitFileMetadataRuntimeError
def can_read_local(inputfile):
"""
Checks if inputfile is among metadata-readable file types
@param inputfile: path to the image
@type inputfile: string
@rtype: boolean
@return: True if file can be processed
"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
return ext.lower() in ['.jpg', '.tiff', '.jpeg', '.jpe',
'.jfif', '.jfi', '.jif']
def can_read_remote(inputfile):
"""Checks if inputfile is among metadata-readable
file types
@param inputfile: (string) path to the image
@type inputfile: string
@rtype: boolean
@return: true if extension casn be handled"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
return ext.lower() in ['.jpg', '.jpeg', '.jpe',
'.jfif', '.jfi', '.jif']
def can_write_local(inputfile):
"""
Checks if inputfile is among metadata-writable file types
@param inputfile: path to the image
@type inputfile: string
@rtype: boolean
@return: True if file can be processed
"""
# Check file type (0 base, 1 name, 2 ext)
ext = decompose_file(inputfile)[2]
return ext.lower() in ['.jpg', '.tiff', '.jpeg', '.jpe',
'.jfif', '.jfi', '.jif']
def read_metadata_local(inputfile, verbose):
"""
EXIF and IPTC metadata extraction and printing from images
@param inputfile: path to the image
@type inputfile: string
@param verbose: verbosity
@type verbose: int
@rtype: dict
@return: dictionary with metadata
"""
# Load the image
image = pyexiv2.Image(inputfile)
# Read the metadata
image.readMetadata()
image_info = {}
# EXIF metadata
for key in image.exifKeys():
image_info[key] = image.interpretedExifValue(key)
# IPTC metadata
for key in image.iptcKeys():
image_info[key] = repr(image[key])
# Return the dictionary
return image_info
def write_metadata_local(inputfile, outputfile, metadata_dictionary, verbose):
"""
EXIF and IPTC metadata writing to images, with printing of the
previous tag values. If a tag is not yet set it is added
automatically, provided it is a valid EXIF or IPTC tag.
@param inputfile: path to the image
@type inputfile: string
@param outputfile: path to the resulting image
@type outputfile: string
@param verbose: verbosity
@type verbose: int
@param metadata_dictionary: metadata information to update inputfile
@type metadata_dictionary: dict
"""
if inputfile != outputfile:
# Create copy of inputfile
try:
shutil.copy2(inputfile, outputfile)
except Exception, err:
raise InvenioWebSubmitFileMetadataRuntimeError(err)
# Load the image
image = pyexiv2.Image(inputfile)
# Read the metadata
image.readMetadata()
# Main Case: Dictionary received through option -d
if metadata_dictionary:
for tag in metadata_dictionary:
if tag in image.exifKeys() or tag in image.iptcKeys():
# Updating
if verbose > 0:
print "Updating %(tag)s from <%(old_value)s> to <%(new_value)s>" % \
{'tag': tag,
'old_value': image[tag],
'new_value': metadata_dictionary[tag]}
else:
# Adding
if verbose > 0:
print "Adding %(tag)s with value <%(new_value)s>" % \
{'tag': tag,
'new_value': metadata_dictionary[tag]}
try:
image[tag] = metadata_dictionary[tag]
image.writeMetadata()
except Exception:
print 'Tag or Value incorrect'
# Alternative way: User interaction
else:
data_modified = False
user_input = 'user_input'
print "Entering interactive mode. Choose what you want to do:"
while (user_input):
if not data_modified:
try:
user_input = raw_input('[w]rite / [q]uit\n')
except:
print "Aborting"
return
else:
try:
user_input = raw_input('[w]rite / [q]uit and apply / [a]bort \n')
except:
print "Aborting"
return
if user_input == 'q':
if not data_modified:
return
break
elif user_input == 'w':
try:
tag = raw_input('Tag to update (Any valid Exif or Iptc Tag):\n')
value = raw_input('With value:\n')
data_modified = True
except:
print "Aborting"
return
try:
image[tag] = value
except Exception, err:
print 'Tag or Value incorrect'
elif user_input == 'a':
return
else:
print "Invalid option: "
try:
image.writeMetadata()
except Exception, err:
raise InvenioWebSubmitFileMetadataRuntimeError("Could not update metadata: " + err)
def read_metadata_remote(inputfile, loginpw, verbose):
"""
EXIF and IPTC metadata extraction and printing from remote images
@param inputfile: path to the remote image
@type inputfile: string
@param verbose: verbosity
@type verbose: int
@param loginpw: credentials to access secure servers (username:password)
@type loginpw: string
@return: dictionary with metadata
@rtype: dict
"""
# Check that inputfile is an URL
secure = False
pos = inputfile.lower().find('http://')
if pos < 0:
secure = True
pos = inputfile.lower().find('https://')
if pos < 0:
raise InvenioWebSubmitFileMetadataRuntimeError("Inputfile (" + inputfile + ") is " + \
"not an URL, nor remote resource.")
# Check if there is login and password
if loginpw != None:
(userid, passwd) = loginpw.split(':')
# Make HTTPS Connection
domain = inputfile.split('/')[2]
if verbose > 3:
print 'Domain: ', domain
url = inputfile.split(domain)[1]
if verbose > 3:
print 'URL: ', url
# Establish headers
if loginpw != None:
_headers = {"Accept": "*/*",
"Authorization": "Basic " + \
base64.encodestring(userid + ':' + passwd).strip()}
else:
_headers = {"Accept": "*/*"}
conn = None
# Establish connection
# Case HTTPS
if secure:
try:
conn = httplib.HTTPSConnection(domain)
## Request a connection
conn.request("GET", url,
headers = _headers)
except Exception:
# Cannot connect
print 'Could not connect'
# Case HTTP
else:
try:
conn = httplib.HTTPConnection(domain)
## Request a connection
conn.request("GET", url,
headers = _headers)
except Exception:
# Cannot connect
print 'Could not connect'
# Get response
if verbose > 5:
print "Fetching data from remote server."
response = conn.getresponse()
if verbose > 2:
print response.status, response.reason
if response.status == 401:
# Authentication required
raise InvenioWebSubmitFileMetadataRuntimeError("URL requires authentication. Use --loginpw option")
# Read first marker from image
data = response.read(2)
# Check if it is a valid image
if data[0:2] != '\xff\xd8':
raise InvenioWebSubmitFileMetadataRuntimeError("URL does not brings to a valid image file.")
else:
if verbose > 5:
print 'Valid JPEG Standard-based image'
# Start the fake image
path_to_fake = fake_image_init(verbose)
# Continue reading
data = response.read(2)
# Check if we find metadata (EXIF or IPTC)
while data[0:2] != '\xff\xdb':
if data[0:2] == '\xff\xe1' or data[0:2] == '\xff\xed':
marker = data
if verbose > 5:
print 'Metadata Marker->', repr(marker), '\nGetting data'
size = response.read(2)
length = ord(size[0]) * 256 + ord(size[1])
meta = response.read(length-2)
insert_metadata(path_to_fake, marker, size, meta, verbose)
break
else:
data = response.read(2)
# Close connection
conn.close()
# Close fake image
fake_image_close(path_to_fake, verbose)
# Extract metadata once fake image is done
return read_metadata_local(path_to_fake, verbose)
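The marker scan above can be re-sketched in modern Python over a byte stream: after the SOI marker (`FF D8`), read two bytes at a time until an APP1 (`FF E1`, EXIF) or APPD (`FF ED`, IPTC) marker appears, then read its big-endian length (which includes the two length bytes themselves) and return the payload. Like the plugin, this deliberately stops at the first quantization table (`FF DB`); the function name is illustrative.

```python
import io

def scan_for_metadata(stream):
    """Return (marker, payload) for the first EXIF/IPTC segment, or None."""
    if stream.read(2) != b'\xff\xd8':
        raise ValueError("not a JPEG stream")
    data = stream.read(2)
    # Stop at the quantization table marker or end of stream.
    while data and data != b'\xff\xdb':
        if data in (b'\xff\xe1', b'\xff\xed'):
            size = stream.read(2)
            # Big-endian 16-bit length, including the two length bytes.
            length = size[0] * 256 + size[1]
            return data, stream.read(length - 2)
        data = stream.read(2)
    return None

sample = io.BytesIO(b'\xff\xd8\xff\xe1\x00\x06Exif\xff\xdb')
marker, payload = scan_for_metadata(sample)
```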
def fake_image_init(verbose):
"""
Initializes the fake image
@param verbose: verbosity
@type verbose: int
@rtype: string
@return: path to fake image
"""
# Create temp file for fake image
(fd, path_to_fake) = tempfile.mkstemp(prefix='wsm_image_plugin_img_',
dir=CFG_TMPDIR)
os.close(fd)
# Open fake image and write head to it
fake_image = open(path_to_fake, 'a')
image_head = '\xff\xd8\xff\xe0\x00\x10\x4a\x46\x49\x46\x00' + \
'\x01\x01\x01\x00\x48\x00\x48\x00\x00'
fake_image.write(image_head)
fake_image.close()
return path_to_fake
def fake_image_close(path_to_fake, verbose):
"""
Closes the fake image
@param path_to_fake: path to the fake image
@type path_to_fake: string
@param verbose: verbosity
@type verbose: int
"""
# Open fake image and write image structure info
# (Huffman table[s]...) to it
fake_image = open(path_to_fake, 'a')
image_tail = '\xff\xdb\x00\x43\x00\x05\x03\x04\x04\x04\x03\x05' + \
'\x04\x04\x04\x05\x05\x05\x06\x07\x0c\x08\x07\x07' + \
'\x07\x07\x0f\x0b\x0b\x09\x0c\x11\x0f\x12\x12\x11' + \
'\x0f\x11\x11\x13\x16\x1c\x17\x13\x14\x1a\x15\x11' + \
'\x11\x18\x21\x18\x1a\x1d\x1d\x1f\x1f\x1f\x13\x17' + \
'\x22\x24\x22\x1e\x24\x1c\x1e\x1f\x1e\xff\xdb\x00' + \
'\x43\x01\x05\x05\x05\x07\x06\x07\x0e\x08\x08\x0e' + \
'\x1e\x14\x11\x14\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e' + \
'\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e' + \
'\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e' + \
'\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e\x1e' + \
'\x1e\x1e\x1e\x1e\x1e\x1e\xff\xc0\x00\x11\x08\x00' + \
'\x01\x00\x01\x03\x01\x22\x00\x02\x11\x01\x03\x11' + \
'\x01\xff\xc4\x00\x15\x00\x01\x01\x00\x00\x00\x00' + \
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08' + \
'\xff\xc4\x00\x14\x10\x01\x00\x00\x00\x00\x00\x00' + \
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xc4' + \
'\x00\x14\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00' + \
'\x00\x00\x00\x00\x00\x00\x00\x00\xff\xc4\x00\x14' + \
'\x11\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
'\x00\x00\x00\x00\x00\x00\xff\xda\x00\x0c\x03\x01' + \
'\x00\x02\x11\x03\x11\x00\x3f\x00\xb2\xc0\x07\xff\xd9'
fake_image.write(image_tail)
fake_image.close()
def insert_metadata(path_to_fake, marker, size, meta, verbose):
"""
Insert metadata into the fake image
@param path_to_fake: path to the fake image
@type path_to_fake: string
@param marker: JPEG marker
@type marker: string
@param size: size of a JPEG block
@type size: string
@param meta: metadata information
@type meta: string
"""
# Metadata insertion
fake_image = open(path_to_fake, 'a')
fake_image.write(marker)
fake_image.write(size)
fake_image.write(meta)
fake_image.close()
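The fake-image trick above assembles a minimal JPEG from a fixed header, the metadata segment copied off the wire, and a fixed tail, so that the local EXIF/IPTC reader can parse it. A sketch of that assembly (the header/tail constants here are shortened stand-ins, not the full JFIF header and Huffman/quantization tables the plugin writes):

```python
JPEG_SOI = b'\xff\xd8'  # start-of-image marker
JPEG_EOI = b'\xff\xd9'  # end-of-image marker

def build_fake_jpeg(marker, size, meta):
    """Wrap a captured metadata segment in minimal JPEG framing."""
    return JPEG_SOI + marker + size + meta + JPEG_EOI

fake = build_fake_jpeg(b'\xff\xe1', b'\x00\x06', b'Exif')
```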
diff --git a/invenio/legacy/websubmit/functions/Add_Files.py b/invenio/legacy/websubmit/functions/Add_Files.py
index b92cf6248..294e4b60a 100644
--- a/invenio/legacy/websubmit/functions/Add_Files.py
+++ b/invenio/legacy/websubmit/functions/Add_Files.py
@@ -1,34 +1,34 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import os
-from invenio.bibdocfile import BibRecDocs, decompose_file
+from invenio.legacy.bibdocfile.api import BibRecDocs, decompose_file
def Add_Files(parameters, curdir, form, user_info=None):
"""DEPRECATED: Use FFT instead."""
if os.path.exists("%s/files" % curdir):
bibrecdocs = BibRecDocs(sysno)
for current_file in os.listdir("%s/files" % curdir):
fullpath = "%s/files/%s" % (curdir,current_file)
dummy, filename, extension = decompose_file(current_file)
if extension and extension[0] != ".":
extension = '.' + extension
if not bibrecdocs.check_file_exists(fullpath, extension):
bibrecdocs.add_new_file(fullpath, "Main", never_fail=True)
return ""
diff --git a/invenio/legacy/websubmit/functions/Create_Upload_Files_Interface.py b/invenio/legacy/websubmit/functions/Create_Upload_Files_Interface.py
index b6f3cc6b2..5410e3273 100644
--- a/invenio/legacy/websubmit/functions/Create_Upload_Files_Interface.py
+++ b/invenio/legacy/websubmit/functions/Create_Upload_Files_Interface.py
@@ -1,500 +1,500 @@
## $Id: Revise_Files.py,v 1.37 2009/03/26 15:11:05 jerome Exp $
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSubmit function - Displays a generic interface to upload, delete
and revise files.
To be used on par with Move_Uploaded_Files_to_Storage function:
- Create_Upload_Files_Interface records the actions performed by user.
- Move_Uploaded_Files_to_Storage execute the recorded actions.
NOTE:
=====
- Due to the way WebSubmit works, this function can only work when
positioned at step 1 in WebSubmit admin, and
Move_Uploaded_Files_to_Storage is at step 2
FIXME:
======
- One issue: if we allow deletion or renaming, we might lose track of
a bibdoc: someone adds X, renames X->Y, and adds again another file
with name X: when executing actions, we will add the second X, and
rename it to Y
-> need to go back in previous action when renaming... or check
that name has never been used..
"""
__revision__ = "$Id$"
import os
from invenio.config import \
CFG_SITE_LANG
from invenio.base.i18n import gettext_set_language, wash_language
-from invenio.bibdocfile_managedocfiles import create_file_upload_interface
+from invenio.legacy.bibdocfile.managedocfiles import create_file_upload_interface
def Create_Upload_Files_Interface(parameters, curdir, form, user_info=None):
"""
List files for revisions.
You should use Move_Uploaded_Files_to_Storage.py function in your
submission to apply the changes performed by users with this
interface.
@param parameters:(dictionary) - must contain:
+ maxsize: the max size allowed for uploaded files
+ minsize: the minimum size allowed for uploaded files
+ doctypes: the list of doctypes (like 'Main' or 'Additional')
and their description that users can choose from
when adding new files.
- When no value is provided, users cannot add new
files (they can only revise/delete/add formats)
- When a single value is given, it is used as
default doctype for all new documents
Eg:
main=Main document|additional=Figure, schema. etc
('=' separates doctype and description
'|' separates each doctype/description group)
+ restrictions: the list of restrictions (like 'Restricted' or
'No Restriction') and their description that
users can choose from when adding/revising
files. Restrictions can then be configured at
the level of WebAccess.
- When no value is provided, no restriction is
applied
- When a single value is given, it is used as
default restriction for all documents.
- The first value of the list is used as default
restriction if the user is not given the
choice of the restriction. CHOOSE THE ORDER!
Eg:
=No restriction|restr=Restricted
('=' separates restriction and description
'|' separates each restriction/description group)
+ canDeleteDoctypes: the list of doctypes that users are
allowed to delete.
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canReviseDoctypes: the list of doctypes that users are
allowed to revise
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canDescribeDoctypes: the list of doctypes that users are
allowed to describe
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canCommentDoctypes: the list of doctypes that users are
allowed to comment
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canKeepDoctypes: the list of doctypes for which users can
choose to keep previous versions visible when
revising a file (i.e. 'Keep previous version'
checkbox). See also parameter 'keepDefault'.
Note that this parameter is ~ignored when
revising the attributes of a file (comment,
description) without uploading a new
file. See also parameter
Move_Uploaded_Files_to_Storage.forceFileRevision
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canAddFormatDoctypes: the list of doctypes for which users can
add new formats. If there is no value,
then no 'add format' link nor warning
about losing old formats are displayed.
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canRestrictDoctypes: the list of doctypes for which users can
choose the access restrictions when adding or
revising a file. If no value is given:
- no restriction is applied if none is defined
in the 'restrictions' parameter.
- else the *first* value of the 'restrictions'
parameter is used as default restriction.
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canRenameDoctypes: the list of doctypes that users are allowed
to rename (when revising)
Eg:
Main|Additional
('|' separated values)
Use '*' for all doctypes
+ canNameNewFiles: if user can choose the name of the files they
upload (1) or not (0)
+ defaultFilenameDoctypes: Rename uploaded files to admin-chosen
values. List here the files in
current submission directory that
contain the names to use for each doctype.
Eg:
Main=RN|Additional=additional_filename
('=' separates doctype and file in curdir
'|' separates each doctype/file group).
If the same doctype is submitted
several times, a"-%i" suffix is added
to the name defined in the file.
The default filenames are overridden
by user-chosen names if you allow
'canNameNewFiles' or
'canRenameDoctypes'.
+ maxFilesDoctypes: the maximum number of files that users can
upload for each doctype.
Eg:
Main=1|Additional=2
('|' separated values)
Do not specify the doctype here to have an
unlimited number of files for a given
doctype.
+ createRelatedFormats: if uploaded files get converted to
whatever format we can (1) or not (0)
+ keepDefault: the default behaviour for keeping or not previous
version of files when users cannot choose (no
value in canKeepDoctypes): keep (1) or not (0)
Note that this parameter is ignored when revising
the attributes of a file (comment, description)
without uploading a new file. See also parameter
Move_Uploaded_Files_to_Storage.forceFileRevision
+ showLinks: if we display links to files (1) when possible or
not (0)
+ fileLabel: the label for the file field
+ filenameLabel: the label for the file name field
+ descriptionLabel: the label for the description field
+ commentLabel: the label for the comments field
+ restrictionLabel: the label in front of the restrictions list
+ startDoc: the name of a file in curdir that contains some
text/markup to be printed *before* the file revision
box
+ endDoc: the name of a file in curdir that contains some
text/markup to be printed *after* the file revision
box
"""
global sysno
ln = wash_language(form['ln'])
_ = gettext_set_language(ln)
out = ''
## Fetch parameters defined for this function
(minsize, maxsize, doctypes_and_desc, doctypes,
can_delete_doctypes, can_revise_doctypes, can_describe_doctypes,
can_comment_doctypes, can_keep_doctypes, can_rename_doctypes,
can_add_format_to_doctypes, createRelatedFormats_p,
can_name_new_files, keep_default, show_links, file_label,
filename_label, description_label, comment_label, startDoc,
endDoc, restrictions_and_desc, can_restrict_doctypes,
restriction_label, doctypes_to_default_filename,
max_files_for_doctype) = \
wash_function_parameters(parameters, curdir, ln)
try:
recid = int(sysno)
except:
recid = None
out += '<center>'
out += startDoc
out += create_file_upload_interface(recid,
form=form,
print_outside_form_tag=True,
print_envelope=True,
include_headers=True,
ln=ln,
minsize=minsize, maxsize=maxsize,
doctypes_and_desc=doctypes_and_desc,
can_delete_doctypes=can_delete_doctypes,
can_revise_doctypes=can_revise_doctypes,
can_describe_doctypes=can_describe_doctypes,
can_comment_doctypes=can_comment_doctypes,
can_keep_doctypes=can_keep_doctypes,
can_rename_doctypes=can_rename_doctypes,
can_add_format_to_doctypes=can_add_format_to_doctypes,
create_related_formats=createRelatedFormats_p,
can_name_new_files=can_name_new_files,
keep_default=keep_default, show_links=show_links,
file_label=file_label, filename_label=filename_label,
description_label=description_label, comment_label=comment_label,
restrictions_and_desc=restrictions_and_desc,
can_restrict_doctypes=can_restrict_doctypes,
restriction_label=restriction_label,
doctypes_to_default_filename=doctypes_to_default_filename,
max_files_for_doctype=max_files_for_doctype,
sbm_indir=None, sbm_doctype=None, sbm_access=None,
uid=None, sbm_curdir=curdir)[1]
out += endDoc
out += '</center>'
return out
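The doctype parameters described in the docstring above use '|' to separate groups and '=' to separate a key from its value (e.g. `Main=1|Additional=2`), and that parsing is repeated for several parameters below. A minimal standalone sketch of the idea, with a hypothetical helper name:

```python
def parse_doctype_mapping(value):
    # Parse a WebSubmit-style "key=val|key=val" parameter string,
    # e.g. "Main=1|Additional=2", skipping empty segments so that
    # trailing or doubled '|' separators are harmless.
    mapping = {}
    for chunk in value.split('|'):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, val = chunk.partition('=')
        mapping[key.strip()] = val.strip()
    return mapping
```

The actual code uses list comprehensions with `split("=")`, which behaves the same for well-formed input; `partition` merely tolerates '=' inside the description part.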
def wash_function_parameters(parameters, curdir, ln=CFG_SITE_LANG):
"""
Returns the functions (admin-defined) parameters washed and
initialized properly, as a tuple:
Parameters:
see the Create_Upload_Files_Interface(..) docstring
Returns:
tuple (minsize, maxsize, doctypes_and_desc, doctypes,
can_delete_doctypes, can_revise_doctypes,
can_describe_doctypes, can_comment_doctypes, can_keep_doctypes,
can_rename_doctypes, can_add_format_to_doctypes,
createRelatedFormats_p, can_name_new_files, keep_default,
show_links, file_label, filename_label, description_label,
comment_label, startDoc, endDoc, access_restrictions_and_desc,
can_restrict_doctypes, restriction_label,
doctypes_to_default_filename, max_files_for_doctype)
"""
_ = gettext_set_language(ln)
# The min and max files sizes that users can upload
minsize = parameters['minsize']
maxsize = parameters['maxsize']
# The list of doctypes + description that users can select when
# adding new files. If there are no values, then user cannot add
# new files. '|' is used to separate doctypes groups, and '=' to
# separate doctype and description. Eg:
# main=Main document|additional=Figure, schema. etc
doctypes_and_desc = [doctype.strip().split("=") for doctype \
in parameters['doctypes'].split('|') \
if doctype.strip() != '']
doctypes = [doctype for (doctype, desc) in doctypes_and_desc]
doctypes_and_desc = [[doctype, _(desc)] for \
(doctype, desc) in doctypes_and_desc]
# The list of doctypes users are allowed to delete
# (list of values separated by "|")
can_delete_doctypes = [doctype.strip() for doctype \
in parameters['canDeleteDoctypes'].split('|') \
if doctype.strip() != '']
# The list of doctypes users are allowed to revise
# (list of values separated by "|")
can_revise_doctypes = [doctype.strip() for doctype \
in parameters['canReviseDoctypes'].split('|') \
if doctype.strip() != '']
# The list of doctypes users are allowed to describe
# (list of values separated by "|")
can_describe_doctypes = [doctype.strip() for doctype \
in parameters['canDescribeDoctypes'].split('|') \
if doctype.strip() != '']
# The list of doctypes users are allowed to comment
# (list of values separated by "|")
can_comment_doctypes = [doctype.strip() for doctype \
in parameters['canCommentDoctypes'].split('|') \
if doctype.strip() != '']
# The list of doctypes for which users are allowed to decide
# if they want to keep old files or not when revising
# (list of values separated by "|")
can_keep_doctypes = [doctype.strip() for doctype \
in parameters['canKeepDoctypes'].split('|') \
if doctype.strip() != '']
# The list of doctypes users are allowed to rename
# (list of values separated by "|")
can_rename_doctypes = [doctype.strip() for doctype \
in parameters['canRenameDoctypes'].split('|') \
if doctype.strip() != '']
# The mapping from doctype to default filename.
# '|' is used to separate doctypes groups, and '=' to
# separate doctype and file in curdir where the default name is. Eg:
# main=main_filename|additional=additional_filename. etc
default_doctypes_and_curdir_files = [doctype.strip().split("=") for doctype \
in parameters['defaultFilenameDoctypes'].split('|') \
if doctype.strip() != '']
doctypes_to_default_filename = {}
for doctype, curdir_file in default_doctypes_and_curdir_files:
default_filename = read_file(curdir, curdir_file)
if default_filename:
doctypes_to_default_filename[doctype] = os.path.basename(default_filename)
# The maximum number of files that can be uploaded for each doctype
# Eg:
# main=1|additional=3
doctypes_and_max_files = [doctype.strip().split("=") for doctype \
in parameters['maxFilesDoctypes'].split('|') \
if doctype.strip() != '']
max_files_for_doctype = {}
for doctype, max_files in doctypes_and_max_files:
if max_files.isdigit():
max_files_for_doctype[doctype] = int(max_files)
# The list of doctypes for which users are allowed to add new formats
# (list of values separated by "|")
can_add_format_to_doctypes = [doctype.strip() for doctype \
in parameters['canAddFormatDoctypes'].split('|') \
if doctype.strip() != '']
# The list of access restrictions + description that users can
# select when adding new files. If there are no values, no
# restriction is applied. '|' is used to separate access
# restrictions groups, and '=' to separate access restriction and
# description. Eg: main=Main document|additional=Figure,
# schema. etc
access_restrictions_and_desc = [access.strip().split("=") for access \
in parameters['restrictions'].split('|') \
if access.strip() != '']
access_restrictions_and_desc = [[access, _(desc)] for \
(access, desc) in access_restrictions_and_desc]
# The list of doctypes users are allowed to restrict
# (list of values separated by "|")
can_restrict_doctypes = [restriction.strip() for restriction \
in parameters['canRestrictDoctypes'].split('|') \
if restriction.strip() != '']
# If we should create additional formats when applicable (1) or
# not (0)
try:
createRelatedFormats_p = int(parameters['createRelatedFormats'])
except ValueError, e:
createRelatedFormats_p = False
# If users can name the files they add
# Value should be 0 (Cannot rename) or 1 (Can rename)
try:
can_name_new_files = int(parameters['canNameNewFiles'])
except ValueError, e:
can_name_new_files = False
# The default behaviour wrt keeping previous files or not.
# 0 = do not keep, 1 = keep
try:
keep_default = int(parameters['keepDefault'])
except ValueError, e:
keep_default = False
# If we display links to files (1) or not (0)
try:
show_links = int(parameters['showLinks'])
except ValueError, e:
show_links = True
file_label = parameters['fileLabel']
if file_label == "":
file_label = _('Choose a file')
filename_label = parameters['filenameLabel']
if filename_label == "":
filename_label = _('Name')
description_label = parameters['descriptionLabel']
if description_label == "":
description_label = _('Description')
comment_label = parameters['commentLabel']
if comment_label == "":
comment_label = _('Comment')
restriction_label = parameters['restrictionLabel']
if restriction_label == "":
restriction_label = _('Access')
startDoc = parameters['startDoc']
endDoc = parameters['endDoc']
prefix = read_file(curdir, startDoc)
if prefix is None:
prefix = ""
suffix = read_file(curdir, endDoc)
if suffix is None:
suffix = ""
return (minsize, maxsize, doctypes_and_desc, doctypes,
can_delete_doctypes, can_revise_doctypes,
can_describe_doctypes, can_comment_doctypes,
can_keep_doctypes, can_rename_doctypes,
can_add_format_to_doctypes, createRelatedFormats_p,
can_name_new_files, keep_default, show_links, file_label,
filename_label, description_label, comment_label,
prefix, suffix, access_restrictions_and_desc,
can_restrict_doctypes, restriction_label,
doctypes_to_default_filename, max_files_for_doctype)
def read_file(curdir, filename):
"""
Reads a file in curdir.
Returns None if the file does not exist, cannot be read, or is not
really located inside curdir.
"""
try:
file_path = os.path.abspath(os.path.join(curdir, filename))
if not file_path.startswith(curdir):
return None
file_desc = file(file_path, 'r')
content = file_desc.read()
file_desc.close()
except:
content = None
return content
diff --git a/invenio/legacy/websubmit/functions/Insert_Modify_Record.py b/invenio/legacy/websubmit/functions/Insert_Modify_Record.py
index e336034a4..9c8fadec3 100644
--- a/invenio/legacy/websubmit/functions/Insert_Modify_Record.py
+++ b/invenio/legacy/websubmit/functions/Insert_Modify_Record.py
@@ -1,57 +1,57 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import os
import shutil
import time
import tempfile
from invenio.config import \
CFG_TMPDIR
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionError
from invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile
-from invenio.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
def Insert_Modify_Record(parameters, curdir, form, user_info=None):
"""
Modify an existing record using 'curdir/recmysql' and BibUpload's
correct mode. The file must therefore already have been created prior
to the execution of this function, e.g. by "Make_Modify_Record".
This function takes the output of BibConvert and uploads it into
the MySQL bibliographic database.
"""
global rn
sequence_id = bibtask_allocate_sequenceid(curdir)
if os.path.exists(os.path.join(curdir, "recmysqlfmt")):
recfile = "recmysqlfmt"
elif os.path.exists(os.path.join(curdir, "recmysql")):
recfile = "recmysql"
else:
raise InvenioWebSubmitFunctionError("Could not find record file")
initial_file = os.path.join(curdir, recfile)
tmp_fd, final_file = tempfile.mkstemp(dir=CFG_TMPDIR,
prefix="%s_%s" % \
(rn.replace('/', '_'),
time.strftime("%Y-%m-%d_%H:%M:%S")))
os.close(tmp_fd)
shutil.copy(initial_file, final_file)
bibupload_id = task_low_level_submission('bibupload', 'websubmit.Insert_Modify_Record', '-c', final_file, '-P', '3', '-I', str(sequence_id))
open(os.path.join(curdir, 'bibupload_id'), 'w').write(str(bibupload_id))
return ""
diff --git a/invenio/legacy/websubmit/functions/Insert_Record.py b/invenio/legacy/websubmit/functions/Insert_Record.py
index f312680cc..204c6d74a 100644
--- a/invenio/legacy/websubmit/functions/Insert_Record.py
+++ b/invenio/legacy/websubmit/functions/Insert_Record.py
@@ -1,52 +1,52 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import os
import time
import shutil
import tempfile
from invenio.config import \
CFG_TMPDIR
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionError
from invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile
-from invenio.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import task_low_level_submission, bibtask_allocate_sequenceid
def Insert_Record(parameters, curdir, form, user_info=None):
"""
Insert the record in curdir/recmysql using BibUpload. The file must
therefore already have been created prior to the execution of this
function, e.g. by "Make_Record".
"""
global rn
sequence_id = bibtask_allocate_sequenceid(curdir)
if os.path.exists(os.path.join(curdir, "recmysql")):
recfile = "recmysql"
else:
raise InvenioWebSubmitFunctionError("Could not find record file")
initial_file = os.path.join(curdir, recfile)
tmp_fd, final_file = tempfile.mkstemp(dir=CFG_TMPDIR,
prefix="%s_%s" % \
(rn.replace('/', '_'),
time.strftime("%Y-%m-%d_%H:%M:%S")))
os.close(tmp_fd)
shutil.copy(initial_file, final_file)
bibupload_id = task_low_level_submission('bibupload', 'websubmit.Insert_Record', '-r', '-i', final_file, '-P', '3', '-I', str(sequence_id))
open(os.path.join(curdir, 'bibupload_id'), 'w').write(str(bibupload_id))
return ""
diff --git a/invenio/legacy/websubmit/functions/Link_Records.py b/invenio/legacy/websubmit/functions/Link_Records.py
index 44b18c713..d2a714536 100644
--- a/invenio/legacy/websubmit/functions/Link_Records.py
+++ b/invenio/legacy/websubmit/functions/Link_Records.py
@@ -1,191 +1,191 @@
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This function schedules a BibUpload append that will create a symmetric link
between two records, based on the MARC field 787 OTHER RELATIONSHIP ENTRY (R):
Indicators
First Note controller
0 - Display note (in $i)
1 - Do not display note
Subfield Code(s)
$i Relationship information (R) - [CER]
$r Report number
$w Record control number (R) - [CER]
NOTE: Used to link Conference papers and Slides records ($i Conference paper/Slides - $w CDS recid)
Example:
http://cds.cern.ch/record/1372158
7870_ $$iSlides$$rLHCb-TALK-2011-087$$w1353576
We need to include in the submission form for LHCb-PROC a field for the related repnr, from which to create the 7870 field. It would be perfect if at the same time the inverse 7870 field could be inserted in the TALK record:
7870_ $$iConference paper$$rLHCb-PROC-2011-041$$w1372158
"""
import re
import tempfile
import time
import os
from os.path import exists, join
from invenio.legacy.bibrecord import record_xml_output, record_add_field
from invenio.modules.formatter.api import get_tag_from_name
from invenio.legacy.search_engine import search_pattern, get_fieldvalues
from invenio.config import CFG_TMPDIR
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionError
CFG_OTHER_RELATIONSHIP_ENTRY = (get_tag_from_name('other relationship entry') or '787')[:3]
CFG_PRIMARY_REPORTNUMBER = get_tag_from_name('primary report number') or '037__a'
RE_FILENAME = re.compile("\\<pa\\>file\\:(.+)\\<\\/pa\\>", re.I)
def Link_Records(parameters, curdir, form, user_info=None):
"""
This function creates a MARC link between two records: the first is
specified in the edsrn file (or SN), the second in the edsrn2 file,
which can hold either the report number or directly the recid.
In "directRelationship" you should specify either the name of a file
(using <pa>file:filename</pa>) or directly the relationship of the
second record, to be stored in the metadata of the first record.
In the file "reverseRelationship" you can similarly specify the
relationship in the other direction.
"""
global sysno
edsrn = parameters["edsrn"]
edsrn2 = parameters["edsrn2"]
direct_relationship = parameters["directRelationship"]
reverse_relationship = parameters["reverseRelationship"]
keep_original_edsrn2 = parameters.get("keep_original_edsrn2", "True")
if keep_original_edsrn2 == "True":
keep_original_edsrn2 = True
elif keep_original_edsrn2 == "False":
keep_original_edsrn2 = False
else:
keep_original_edsrn2 = True
recid_a = int(sysno)
if exists(join(curdir, edsrn)):
rn_a = open(join(curdir, edsrn)).read().strip()
else:
rn_a = ""
if not rn_a:
try:
recid_a, rn_a = get_recid_and_reportnumber(recid=sysno)
except ValueError, err:
raise InvenioWebSubmitFunctionError("Error in finding the current record and its reportnumber: %s" % err)
if exists(join(curdir, edsrn2)):
rn_b = open(join(curdir, edsrn2)).read().strip()
else:
return ""
if not rn_b:
return ""
if rn_b.isdigit():
recid_b = int(rn_b)
rn_b = ""
recid_b, rn_b = get_recid_and_reportnumber(recid=recid_b)
else:
recid_b, rn_b = get_recid_and_reportnumber(reportnumber=rn_b,
keep_original_reportnumber=keep_original_edsrn2)
g = RE_FILENAME.match(direct_relationship)
if g:
filename = g.group(1)
if exists(join(curdir, filename)):
direct_relationship = open(join(curdir, filename)).read().strip()
if not direct_relationship:
raise InvenioWebSubmitFunctionError("Can not retrieve direct relationship")
g = RE_FILENAME.match(reverse_relationship)
if g:
filename = g.group(1)
if exists(join(curdir, filename)):
reverse_relationship = open(join(curdir, filename)).read().strip()
if not reverse_relationship:
raise InvenioWebSubmitFunctionError("Can not retrieve reverse relationship")
marcxml = _prepare_marcxml(recid_a, rn_a, recid_b, rn_b, reverse_relationship, direct_relationship)
fd, name = tempfile.mkstemp(dir=CFG_TMPDIR, prefix="%s_%s" % \
(rn_a.replace('/', '_'),
time.strftime("%Y-%m-%d_%H:%M:%S")), suffix=".xml")
try:
os.write(fd, marcxml)
finally:
os.close(fd)
bibupload_id = task_low_level_submission('bibupload', 'websubmit.Link_Records', '-a', name, '-P', '3')
open(join(curdir, 'bibupload_link_record_id'), 'w').write(str(bibupload_id))
return ""
def get_recid_and_reportnumber(recid=None, reportnumber=None, keep_original_reportnumber=True):
"""
Given at least a recid or a reportnumber, this function will look into
the system for the matching record and will return a normalized
recid and the primary reportnumber.
@raises ValueError: if no record is matched.
"""
if recid:
## Recid specified receives priority.
recid = int(recid)
values = get_fieldvalues(recid, CFG_PRIMARY_REPORTNUMBER)
if values:
## Let's take whatever reportnumber is stored in the matching record
reportnumber = values[0]
return recid, reportnumber
else:
raise ValueError("The record %s does not have a primary report number" % recid)
elif reportnumber:
## Ok reportnumber specified, let's better try 1st with primary and then
## with other reportnumber
recids = search_pattern(p='%s:"%s"' % (CFG_PRIMARY_REPORTNUMBER, reportnumber))
if not recids:
## Not found as primary
recids = search_pattern(p='reportnumber:"%s"' % reportnumber)
if len(recids) > 1:
raise ValueError('More than one record matches the reportnumber "%s": %s' % (reportnumber, ', '.join(str(recid) for recid in recids)))
elif len(recids) == 1:
recid = list(recids)[0]
if keep_original_reportnumber:
return recid, reportnumber
else:
reportnumbers = get_fieldvalues(recid, CFG_PRIMARY_REPORTNUMBER)
if not reportnumbers:
raise ValueError("The matched record %s does not have a primary report number" % recid)
return recid, reportnumbers[0]
else:
raise ValueError("No records are matched by the provided reportnumber: %s" % reportnumber)
raise ValueError("At least the recid or the reportnumber must be specified")
def _prepare_marcxml(recid_a, rn_a, recid_b, rn_b, what_is_a_for_b, what_is_b_for_a, display_in_a=True, display_in_b=True):
record_a = {}
record_b = {}
record_add_field(record_a, "001", controlfield_value=str(recid_a))
record_add_field(record_a, CFG_OTHER_RELATIONSHIP_ENTRY, ind1=display_in_a and "0" or "1", subfields=[('i', what_is_b_for_a), ('r', rn_b), ('w', str(recid_b))])
record_add_field(record_b, "001", controlfield_value=str(recid_b))
record_add_field(record_b, CFG_OTHER_RELATIONSHIP_ENTRY, ind1=display_in_b and "0" or "1", subfields=[('i', what_is_a_for_b), ('r', rn_a), ('w', str(recid_a))])
return "<collection>\n%s\n%s</collection>" % (record_xml_output(record_a), record_xml_output(record_b))
diff --git a/invenio/legacy/websubmit/functions/Mail_New_Record_Notification.py b/invenio/legacy/websubmit/functions/Mail_New_Record_Notification.py
index 396c2b720..d7088ff67 100644
--- a/invenio/legacy/websubmit/functions/Mail_New_Record_Notification.py
+++ b/invenio/legacy/websubmit/functions/Mail_New_Record_Notification.py
@@ -1,316 +1,316 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""This module contains the WebSubmit function "Mail_New_Record_Notification",
which should be called when a new record has been submitted to the repository
and notified of the fact should be sent by mail to the submitters/requester/
admins/other general managers.
"""
__revision__ = "$Id$"
import os
from invenio.config import CFG_SITE_NAME, CFG_SITE_SUPPORT_EMAIL, CFG_SITE_URL, CFG_SITE_ADMIN_EMAIL, \
CFG_SITE_RECORD
from invenio.legacy.webuser import email_valid_p
from invenio.legacy.websubmit.config import CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN
from invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile
from invenio.ext.email import scheduled_send_email
-from invenio.bibtask import bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import bibtask_allocate_sequenceid
CFG_EMAIL_FROM_ADDRESS = '%s Submission Engine <%s>' % (CFG_SITE_NAME, CFG_SITE_SUPPORT_EMAIL)
def Mail_New_Record_Notification(parameters, curdir, form, user_info=None):
"""
This function sends a mail giving notification about the submission
of a new item to the relevant recipients, including:
+ The record's Submitter(s);
+ The site ADMIN;
+ The record-type's "managers" (signified by the "submit_managers"
parameter);
The mail contains details of the new item's reference number(s), its
title and its author(s). It also contains a link to the item in the
Invenio repository.
@param parameters: (dictionary) - contains the following parameter
strings used by this function:
+ item_status: (string) - the status of the new item. It can be
either "ADDED" (in which case the new item has been integrated
into the repository), or "APPROVAL" (in which case the item is
awaiting a referee's approval before being integrated into the
repository, and the mail should state this fact);
+ mail_submitters: (string) - a flag containing "Y" or "N" (defaulting
to "Y"). Determines whether or not the notification mail will be
sent to the submitters;
+ item_managers: (string) - a comma-separated list of email
addresses, each of which corresponds to a "manager" for the class
of item that has been submitted. These managers will receive the
notification message sent by this function;
+ author_file: (string) - the name of a file that contains the names
of the item's authors (one author per line);
+ title_file: (string) - the name of a file that contains the title
of the new item;
+ owners_file: (string) - the name of a file that contains the email
addresses of the "owners" of the submitted item. I.e. those who
will be classed as "submitters" of the item and will therefore
have modification rights over it. The mail will be sent to these
people. There should be one email-address per line in this file;
+ rn_file1: (string) - the name of the file containing the item's
principal reference number;
+ rn_file2: (string) - the name of the file containing the item's
additional reference number(s) (e.g. sometimes two reference
numbers are allocated during the submission process);
@param curdir: (string) - the current submission's working directory. All
files containing data related to the submission are stored here and
therefore all of the files referred to in the "parameters" dictionary
are considered to be within "curdir";
@param form: (string) - a dictionary-like structure containing the fields
that were present in the WebSubmit submission form;
@return: (string) - an empty string;
"""
global sysno ## (I'm really sorry for that! :-O )
sequence_id = bibtask_allocate_sequenceid(curdir)
## Read items from the parameters array into local vars:
item_status = parameters["item_status"]
mail_submitters = parameters["mail_submitters"]
item_managers = parameters["item_managers"]
author_file = parameters["author_file"]
title_file = parameters["title_file"]
owners_file = parameters["owners_file"]
rn_file1 = parameters["rn_file1"]
rn_file2 = parameters["rn_file2"]
## Now wash the parameters' values:
##
## item_status:
try:
## If item_status isn't "added" or "approval", make it "added" by
## default. Else, keep its value:
item_status = (item_status.upper() in ("ADDED", "APPROVAL") \
and item_status.upper()) or "ADDED"
except AttributeError:
## Oops - item_status wasn't a string (NoneType?) Anyway, default
## it to "ADDED".
item_status = "ADDED"
## mail_submitters:
try:
## If mail_submitters isn't "Y" or "N", make it "Y" by
## default. Else, keep its value:
mail_submitters = (mail_submitters.upper() in ("Y", "N") \
and mail_submitters.upper()) or "Y"
except AttributeError:
## Oops - mail_submitters wasn't a string (NoneType?) Anyway, default
## it to "Y".
mail_submitters = "Y"
## item_managers:
## A string in which the item_managers' email addresses will be stored:
managers_email = ""
try:
## We assume that the email addresses of item managers are
## separated by commas.
item_managers_list = item_managers.split(",")
for manager in item_managers_list:
manager_address = manager.strip()
## Test that this manager's email address is OK, adding it if so:
if email_valid_p(manager_address):
## This address is OK - add it to the string of manager
## addresses:
managers_email += "%s," % manager_address
## Strip the trailing comma from managers_email (if there is one):
managers_email = managers_email.strip().rstrip(",")
except AttributeError:
## Oops - item_managers doesn't seem to be a string? Treat it as
## though it were empty:
managers_email = ""
## author_file:
authors = ""
try:
## Read in the authors from author_file, putting them into the "authors"
## variable, one per line:
fp_author_file = open("%s/%s" % (curdir, author_file), "r")
for author in fp_author_file:
authors += "%s\n" % author.strip()
fp_author_file.close()
except IOError:
## Unable to correctly read from "author_file", Skip it as though
## there were no authors:
authors = "-"
## title_file:
title = ""
try:
## Read in the lines from title_file, putting them into the "title"
## variable on one line:
fp_title_file = open("%s/%s" % (curdir, title_file), "r")
for line in fp_title_file:
title += "%s " % line.strip()
fp_title_file.close()
title = title.strip()
except IOError:
## Unable to correctly read from "title_file", Skip it as though
## there were no title:
title = "-"
## owners_file:
## A string in which the item_owners' email addresses will be stored:
owners_email = ""
try:
fp_owners_file = open("%s/%s" % (curdir, owners_file), "r")
for line in fp_owners_file:
owner_address = line.strip()
## Test that this owner's email address is OK, adding it if so:
if email_valid_p(owner_address):
## This address is OK - add it to the string of item owner
## addresses:
owners_email += "%s," % owner_address
## Strip the trailing comma from owners_email (if there is one):
owners_email = owners_email.strip().rstrip(",")
except IOError:
## Unable to correctly read from "owners_file". Skip it as though
## there were no owners:
owners_email = ""
## Add "SuE" (the submitter) into the list of document "owners":
try:
fp_sue = open("%s/SuE" % curdir, "r")
sue = fp_sue.readline()
fp_sue.close()
except IOError:
sue = ""
else:
if sue.lower() not in owners_email.lower().split(","):
## The submitter is not listed in the "owners" mails,
## add her:
owners_email = "%s,%s" % (sue, owners_email)
owners_email = owners_email.strip().rstrip(",")
## rn_file1 & rn_file2:
reference_numbers = ""
try:
fp_rnfile1 = open("%s/%s" % (curdir, rn_file1), "r")
for line in fp_rnfile1:
reference_number = line.strip()
reference_number = \
reference_number.replace("\n", "").replace("\r", "").\
replace(" ", "")
if reference_number != "":
## Add this reference number into the "reference numbers"
## variable:
reference_numbers += "%s " % reference_number
fp_rnfile1.close()
except IOError:
reference_numbers = ""
try:
fp_rnfile2 = open("%s/%s" % (curdir, rn_file2), "r")
for line in fp_rnfile2:
reference_number = line.strip()
reference_number = \
reference_number.replace("\n", "").replace("\r", "").\
replace(" ", "")
if reference_number != "":
## Add this reference number into the "reference numbers"
## variable:
reference_numbers += "%s " % reference_number
fp_rnfile2.close()
except IOError:
pass
## Strip any trailing whitespace from the reference numbers:
reference_numbers = reference_numbers.strip()
## Now build the email from the information we've collected:
email_txt = """
The following item has been submitted to %(sitename)s:
Reference(s): %(reference)s
Title: %(title)s
Author(s): %(author)s
""" % { 'sitename' : CFG_SITE_NAME,
'reference' : reference_numbers,
'title' : title,
'author' : authors,
}
if item_status == "ADDED":
## The item has been added into the repository.
email_txt += """
It will soon be made available and you will be able to check it at the
following URL:
<%(siteurl)s/%(CFG_SITE_RECORD)s/%(record-id)s>
Please report any problems to <%(sitesupportemail)s>.
""" % { 'siteurl' : CFG_SITE_URL,
'CFG_SITE_RECORD' : CFG_SITE_RECORD,
'record-id' : sysno,
'sitesupportemail' : CFG_SITE_SUPPORT_EMAIL,
}
else:
## The item has not yet been added - instead it awaits the
## approval of a referee. Let the email reflect this detail:
email_txt += """
The item is now awaiting a referee's approval before being integrated
into the repository. You will be alerted by email as soon as a decision
has been taken.
"""
## Finish the message with a signature:
email_txt += """
Thank you for submitting your item into %(sitename)s.
""" % { 'sitename' : CFG_SITE_NAME, }
## Send the email:
if mail_submitters == "Y" and owners_email != "":
## Mail-to is "owners_email":
if managers_email != "":
## Managers should also be copied into the mail:
owners_email += ",%s" % managers_email
## Post the mail:
scheduled_send_email(CFG_EMAIL_FROM_ADDRESS, owners_email, \
"[%s] Submitted" % reference_numbers, \
email_txt, copy_to_admin=CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN, \
other_bibtasklet_arguments=['-I', str(sequence_id)])
elif managers_email != "":
## Although it's not desirable to mail the submitters, if "managers"
## have been given, it is reasonable to mail them:
scheduled_send_email(CFG_EMAIL_FROM_ADDRESS, managers_email, \
"[%s] Submitted" % reference_numbers, \
email_txt, copy_to_admin=CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN, \
other_bibtasklet_arguments=['-I', str(sequence_id)])
elif CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN:
## We don't want to mail the "owners". Let's mail the admin instead:
scheduled_send_email(CFG_EMAIL_FROM_ADDRESS, CFG_SITE_ADMIN_EMAIL, \
"[%s] Submitted" % reference_numbers, email_txt, \
other_bibtasklet_arguments=['-I', str(sequence_id)])
## Return an empty string
return ""
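The dispatch logic above (mail the submitters, optionally CC the managers; otherwise mail the managers alone; otherwise fall back to the admin) can be sketched as a small standalone helper. This is a hypothetical illustration, not part of Invenio; all names are illustrative:

```python
# Hypothetical sketch of the recipient-selection precedence above:
# submitters first (optionally CC'ing managers), then managers alone,
# then the site admin as a last resort.
def pick_recipients(mail_submitters, owners_email, managers_email,
                    copy_to_admin, admin_email):
    """Return the comma-separated recipient list, or None if no mail goes out."""
    if mail_submitters == "Y" and owners_email != "":
        if managers_email != "":
            # Managers are copied into the mail to the owners
            return owners_email + "," + managers_email
        return owners_email
    if managers_email != "":
        return managers_email
    if copy_to_admin:
        return admin_email
    return None
```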
diff --git a/invenio/legacy/websubmit/functions/Mail_Submitter.py b/invenio/legacy/websubmit/functions/Mail_Submitter.py
index 2a02d73ad..9888de88c 100644
--- a/invenio/legacy/websubmit/functions/Mail_Submitter.py
+++ b/invenio/legacy/websubmit/functions/Mail_Submitter.py
@@ -1,136 +1,136 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
##
## Name: Mail_Submitter.py
## Description: function Mail_Submitter
## This function sends a confirmation email to the submitter
## of the document
## Author: T.Baron
##
## PARAMETERS: authorfile: name of the file containing the author
## titleFile: name of the file containing the title
## emailFile: name of the file containing the email
## status: one of "ADDED" (the document has been integrated
## into the database) or "APPROVAL" (an email has
## been sent to a referee - simple approval)
## edsrn: name of the file containing the reference
## newrnin: name of the file containing the 2nd reference
## (if any)
## OUTPUT: HTML
##
import os
import re
from invenio.config import CFG_SITE_NAME, \
CFG_SITE_URL, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_SITE_RECORD
from invenio.legacy.websubmit.config import CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN
from invenio.legacy.websubmit.functions.Shared_Functions import get_nice_bibsched_related_message, ParamFromFile
from invenio.ext.email import scheduled_send_email
-from invenio.bibtask import bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import bibtask_allocate_sequenceid
def Mail_Submitter(parameters, curdir, form, user_info=None):
"""
This function sends an email to the submitter to confirm that the
document they have just submitted has been correctly received.
Parameters:
* authorfile: Name of the file containing the authors of the
document
* titleFile: Name of the file containing the title of the
document
* emailFile: Name of the file containing the email of the
submitter of the document
* status: Depending on the value of this parameter, the function
adds an additional text to the email. This parameter
can be one of: ADDED: The file has been integrated in
the database. APPROVAL: The file has been sent for
approval to a referee. or can stay empty.
* edsrn: Name of the file containing the reference of the
document
* newrnin: Name of the file containing the 2nd reference of the
document (if any)
"""
FROMADDR = '%s Submission Engine <%s>' % (CFG_SITE_NAME,CFG_SITE_SUPPORT_EMAIL)
sequence_id = bibtask_allocate_sequenceid(curdir)
# retrieve report number
edsrn = parameters['edsrn']
newrnin = parameters['newrnin']
fp = open("%s/%s" % (curdir,edsrn),"r")
rn = fp.read()
fp.close()
rn = re.sub("[\n\r]+","",rn)
if newrnin != "" and os.path.exists("%s/%s" % (curdir,newrnin)):
fp = open("%s/%s" % (curdir,newrnin),"r")
additional_rn = fp.read()
fp.close()
additional_rn = re.sub("[\n\r]+","",additional_rn)
fullrn = "%s and %s" % (additional_rn,rn)
else:
fullrn = rn
fullrn = fullrn.replace("\n"," ")
# The title is read from the file specified by 'titlefile'
try:
fp = open("%s/%s" % (curdir,parameters['titleFile']),"r")
m_title = fp.read().replace("\n"," ")
fp.close()
except:
m_title = "-"
# The name of the author is read from the file specified by 'authorfile'
try:
fp = open("%s/%s" % (curdir,parameters['authorfile']),"r")
m_author = fp.read().replace("\n"," ")
fp.close()
except:
m_author = "-"
# The submitter's email address is read from the file specified by 'emailFile'
try:
fp = open("%s/%s" % (curdir,parameters['emailFile']),"r")
m_recipient = fp.read().replace ("\n"," ")
fp.close()
except:
m_recipient = ""
# create email body
email_txt = "The document %s\nTitle: %s\nAuthor(s): %s\n\nhas been correctly received\n\n" % (fullrn,m_title,m_author)
# The user is either informed that the document has been added to the database, or sent for approval
if parameters['status'] == "APPROVAL":
email_txt = email_txt + "An email has been sent to the referee. You will be warned by email as soon as the referee takes his/her decision regarding your document.\n\n"
elif parameters['status'] == "ADDED":
email_txt = email_txt + "It will soon be added to our Document Server.\n\nOnce inserted, you will be able to check the bibliographic information and the quality of the electronic documents at this URL:\n<%s/%s/%s>\nIf you detect an error please let us know by sending an email to %s.\n\n" % (CFG_SITE_URL,CFG_SITE_RECORD,sysno,CFG_SITE_SUPPORT_EMAIL)
email_txt += get_nice_bibsched_related_message(curdir)
email_txt = email_txt + "Thank you for using %s Submission Interface.\n" % CFG_SITE_NAME
## send the mail, if there are any recipients or copy to admin
if m_recipient or CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN:
scheduled_send_email(FROMADDR, m_recipient.strip(), "%s: Document Received" % fullrn, email_txt,
copy_to_admin=CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN,
other_bibtasklet_arguments=['-I', str(sequence_id)])
return ""
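The report-number handling above (read the `edsrn` file, strip newlines, and join an optional second reference) can be isolated into a pure function. A sketch, assuming plain-text files holding one value each:

```python
# Sketch of how Mail_Submitter composes the full report number from the
# 'edsrn' value and an optional second reference (newrnin).
import re

def compose_fullrn(rn_text, additional_rn_text=None):
    """Strip newlines from the reference(s) and join them as the code above does."""
    rn = re.sub("[\n\r]+", "", rn_text)
    if additional_rn_text:
        additional_rn = re.sub("[\n\r]+", "", additional_rn_text)
        return "%s and %s" % (additional_rn, rn)
    return rn
```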
diff --git a/invenio/legacy/websubmit/functions/Move_CKEditor_Files_to_Storage.py b/invenio/legacy/websubmit/functions/Move_CKEditor_Files_to_Storage.py
index 2a41fedab..21bb6ba16 100644
--- a/invenio/legacy/websubmit/functions/Move_CKEditor_Files_to_Storage.py
+++ b/invenio/legacy/websubmit/functions/Move_CKEditor_Files_to_Storage.py
@@ -1,187 +1,187 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebSubmit function - Replaces the links that have been created by the
CKEditor
"""
__revision__ = "$Id$"
import re
import os
import urllib
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
from invenio.config import \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_PREFIX, \
CFG_SITE_RECORD
re_ckeditor_link = re.compile('"(' + CFG_SITE_URL + '|' + CFG_SITE_SECURE_URL + ')' + \
r'/submit/getattachedfile/(?P<uid>\d+)/(?P<type>(image|file|media|flash))/(?P<filename>.*?)"')
def Move_CKEditor_Files_to_Storage(parameters, curdir, form, user_info=None):
"""
Moves the files uploaded via the CKEditor that are linked to
the given field, and replaces these links with URLs 'local' to the
record (recid/files/).
When attaching a file, the editor posts the file to a temporary
drop box accessible via a URL for previews. We want to fetch
these files (via FFT) to integrate them into the record, and change
the links in the record to point to the integrated files.
The function *MUST* be run BEFORE the record has been created
(with Make_Record.py or Make_Modify_Record.py).
You *HAVE* to include the created FFT field (output of this
function) in your BibConvert template.
Parameters:
input_fields - *str* a comma separated list of file names that
should be processed by this element. Eg:
'ABSE,ABSF' in order to process values of the
English and French abstracts
"""
input_filenames = [input_filename for input_filename in \
parameters['input_fields'].split(',') if \
os.path.exists(curdir + os.sep + input_filename)]
processed_paths = []
for input_filename in input_filenames:
input_file = file(curdir + os.sep + input_filename)
input_string = input_file.read()
input_file.close()
def translate_link(match_obj):
"""Replace CKEditor link by 'local' record link. Also
create the FFT for that link"""
file_type = match_obj.group('type')
file_name = match_obj.group('filename')
uid = match_obj.group('uid')
dummy, name, extension = decompose_file(file_name)
new_url = build_url(sysno, name, file_type, extension)
original_location = match_obj.group()[1:-1]
icon_location = original_location
# Prepare FFT that will fetch the file (+ the original
# file in the case of images)
if file_type == 'image':
# Does the original file exist, or do we just have the
# icon? We expect the original file at a well-defined
# location
possible_original_path = os.path.join(CFG_PREFIX,
'var', 'tmp',
'attachfile',
uid,
file_type,
'original',
file_name)
if os.path.exists(possible_original_path):
icon_location = original_location
original_location = possible_original_path
new_url = build_url(sysno, name,
file_type, extension, is_icon=True)
docname = build_docname(name, file_type, extension)
if original_location not in processed_paths:
# Must create an FFT only if we have not yet processed
# the file. This can happen if same image exists on
# the same page (either in two different CKEditor
# instances, or twice in the HTML)
processed_paths.append(original_location)
write_fft(curdir,
original_location,
docname,
icon_location,
doctype=file_type)
return '"' + new_url + '"'
output_string = re_ckeditor_link.sub(translate_link, input_string)
output_file = file(curdir + os.sep + input_filename, 'w')
output_file.write(output_string)
output_file.close()
def build_url(sysno, name, file_type, extension, is_icon=False):
"""
Build the local URL to the file with given parameters
@param sysno: record ID
@param name: base name of the file
@param file_type: as chosen by CKEditor: 'File', 'Image', 'Flash', 'Media'
@param extension: file extension, including '.'
@param is_icon: if True, append the icon subformat query string to the URL
"""
return CFG_SITE_URL + '/'+ CFG_SITE_RECORD +'/' + str(sysno) + \
'/files/' + urllib.quote(build_docname(name, file_type, extension)) + \
(is_icon and '?subformat=icon' or '')
def build_docname(name, file_type, extension):
"""
Build the docname of the file.
In order to ensure uniqueness of the docname, we append the
filetype to the filename: CKEditor ensures filename uniqueness
within each filetype, but does not guarantee that files in
different filetypes will not share the same name.
"""
return name + '_' + file_type + extension
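The docname/URL scheme implemented by `build_docname` and `build_url` can be illustrated in isolation. This sketch is rewritten for Python 3 (the legacy code uses Python 2's `urllib.quote`) and takes the site configuration constants as parameters instead of globals:

```python
# Python 3 illustration of the docname/URL scheme above: the filetype is
# appended to the base name, and only the docname component is URL-quoted.
from urllib.parse import quote

def build_docname(name, file_type, extension):
    return name + '_' + file_type + extension

def build_url(site_url, site_record, sysno, name, file_type, extension,
              is_icon=False):
    return (site_url + '/' + site_record + '/' + str(sysno) + '/files/' +
            quote(build_docname(name, file_type, extension)) +
            ('?subformat=icon' if is_icon else ''))
```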
def write_fft(curdir, file_location, docname, icon_location=None, doctype="image"):
"""
Append a new FFT for the record. Write the result to the FFT file on disk.
May only be used for files attached with CKEditor (i.e. URLs
matching re_ckeditor_link)
"""
if file_location.startswith(CFG_SITE_URL) or \
file_location.startswith(CFG_SITE_SECURE_URL):
# CKEditor does not url-encode filenames, and FFT does not
# like URLs that are not quoted. So do it now (but only for
# file name, in URL context!)
url_parts = file_location.split("/")
try:
file_location = "/".join(url_parts[:-1]) + \
'/' + urllib.quote(url_parts[-1])
except:
pass
if icon_location.startswith(CFG_SITE_URL) or \
icon_location.startswith(CFG_SITE_SECURE_URL):
# Ditto quote file name
url_parts = icon_location.split("/")
try:
icon_location = "/".join(url_parts[:-1]) + \
'/' + urllib.quote(url_parts[-1])
except:
pass
icon_subfield = ''
if icon_location:
icon_subfield = '<subfield code="x">%s</subfield>' % icon_location
fft_file = file(os.path.join(curdir, 'FFT'), 'a')
fft_file.write("""
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(location)s</subfield>
<subfield code="n">%(docname)s</subfield>
<subfield code="t">%(doctype)s</subfield>
%(icon_subfield)s
</datafield>""" % {'location': file_location,
'icon_subfield': icon_subfield,
'doctype': doctype,
'docname': docname})
fft_file.close()
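The quoting step in `write_fft` above only URL-quotes the last path component (the file name), leaving the rest of the URL untouched. A minimal sketch, using Python 3's `urllib.parse.quote` in place of the legacy `urllib.quote`:

```python
# Sketch of the filename-quoting step in write_fft above: only the last
# path component is URL-quoted, the scheme/host/path prefix is left intact.
from urllib.parse import quote

def quote_last_component(url):
    parts = url.split("/")
    return "/".join(parts[:-1]) + "/" + quote(parts[-1])
```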
diff --git a/invenio/legacy/websubmit/functions/Move_Files_Archive.py b/invenio/legacy/websubmit/functions/Move_Files_Archive.py
index bd27b30e8..828328241 100644
--- a/invenio/legacy/websubmit/functions/Move_Files_Archive.py
+++ b/invenio/legacy/websubmit/functions/Move_Files_Archive.py
@@ -1,48 +1,48 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import os
-from invenio.bibdocfile import BibRecDocs, decompose_file, normalize_format
+from invenio.legacy.bibdocfile.api import BibRecDocs, decompose_file, normalize_format
def Move_Files_Archive(parameters, curdir, form, user_info=None):
"""DEPRECATED: Use FFT instead."""
MainDir = "%s/files/MainFiles" % curdir
IncludeDir = "%s/files/AdditionalFiles" % curdir
watcheddirs = {'Main' : MainDir, 'Additional' : IncludeDir}
for type, dir in watcheddirs.iteritems():
if os.path.exists(dir):
formats = {}
files = os.listdir(dir)
files.sort()
for file in files:
dummy, filename, extension = decompose_file(file)
if not formats.has_key(filename):
formats[filename] = []
formats[filename].append(normalize_format(extension))
# first delete all missing files
bibarchive = BibRecDocs(sysno)
existingBibdocs = bibarchive.list_bibdocs(type)
for existingBibdoc in existingBibdocs:
if not formats.has_key(bibarchive.get_docname(existingBibdoc.id)):
existingBibdoc.delete()
# then create/update the new ones
for key in formats.keys():
# instantiate bibdoc object
bibarchive.add_new_file('%s/%s%s' % (dir, key, formats[key]), doctype=type, never_fail=True)
return ""
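The grouping step above collects, per base filename, the list of extensions present in a directory listing. A sketch of that step, approximating `decompose_file`/`normalize_format` with the standard library (`os.path.splitext` plus lowercasing):

```python
# Hypothetical sketch of the formats-grouping loop above: map each base
# filename to the sorted-in, normalized extensions found for it.
import os

def group_formats(files):
    formats = {}
    for f in sorted(files):
        filename, extension = os.path.splitext(f)
        # normalize_format is approximated by lowercasing the extension
        formats.setdefault(filename, []).append(extension.lower())
    return formats
```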
diff --git a/invenio/legacy/websubmit/functions/Move_Files_to_Storage.py b/invenio/legacy/websubmit/functions/Move_Files_to_Storage.py
index b7f475819..ab8dfe804 100644
--- a/invenio/legacy/websubmit/functions/Move_Files_to_Storage.py
+++ b/invenio/legacy/websubmit/functions/Move_Files_to_Storage.py
@@ -1,270 +1,270 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Function for archiving files"""
__revision__ = "$Id$"
-from invenio.bibdocfile import \
+from invenio.legacy.bibdocfile.api import \
BibRecDocs, \
decompose_file, \
InvenioBibDocFileError, \
CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
import os
import re
from invenio.websubmit_icon_creator import create_icon
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionWarning
from invenio.legacy.websubmit.functions.Shared_Functions import get_dictionary_from_string, \
createRelatedFormats
from invenio.ext.logging import register_exception
from invenio.config import CFG_BINDIR
from invenio.legacy.dbquery import run_sql
from invenio.utils.shell import run_shell_command
def Move_Files_to_Storage(parameters, curdir, form, user_info=None):
"""
The function moves files received from the standard submission
form through file input element(s). The documents are assigned a
'doctype' (or category) corresponding to the file input element
(e.g. a file uploaded through 'DEMOPIC_FILE' will go to the
'DEMOPIC_FILE' doctype/category).
Websubmit engine builds the following file organization in the
directory curdir/files:
curdir/files
|
_____________________________________________________________________
| | |
./file input 1 element's name ./file input 2 element's name ....
(for eg. 'DEMOART_MAILFILE') (for eg. 'DEMOART_APPENDIX')
| |
test1.pdf test2.pdf
There must be only one instance of each possible extension (pdf, gz, ...) in each part,
otherwise we may encounter problems when renaming files.
+ parameters['rename']: if given, all the files in curdir/files
are renamed. parameters['rename'] is of the form:
<PA>elemfilename[re]</PA>* where re is a regexp to select (using
re.sub) which part of the elem file has to be selected.
e.g.: <PA>file:TEST_FILE_RN</PA>
+ parameters['documenttype']: if given, other formats are created.
It has 2 possible values: - if "picture", an icon in gif format is created
- if "fulltext", ps, gz, ... formats are created
+ parameters['paths_and_suffixes']: directories to look into and
corresponding suffix to add to every file inside. It must have
the same structure as a Python dictionary of the following form:
{'FrenchAbstract':'french', 'EnglishAbstract':''}
The keys are the file input element name from the form <=>
directories in curdir/files The values associated are the
suffixes which will be added to all the files in
e.g. curdir/files/FrenchAbstract
+ parameters['iconsize']: needed only if 'picture' is selected in
parameters['documenttype']
+ parameters['paths_and_restrictions']: the restrictions to apply
to each uploaded file. The parameter must have the same
structure as a Python dictionary of the following form:
{'DEMOART_APPENDIX':'restricted'}
Files not specified in this parameter are not restricted.
The specified restrictions can include a variable that can be
replaced at runtime, for eg:
{'DEMOART_APPENDIX':'restricted to <PA>file:SuE</PA>'}
+ parameters['paths_and_doctypes']: if a doctype is specified,
the file will be saved under the 'doctype/collection' instead
of under the default doctype/collection given by the name
of the upload element that was used on the websubmit interface.
To configure the doctype in websubmit, enter the value as in a
dictionary, e.g.:
{'PATHS_SWORD_UPL' : 'PUSHED_TO_ARXIV'} -> from
Demo_Export_Via_Sword [DEMOSWR] Document Types
"""
global sysno
paths_and_suffixes = parameters['paths_and_suffixes']
paths_and_restrictions = parameters['paths_and_restrictions']
rename = parameters['rename']
documenttype = parameters['documenttype']
iconsizes = parameters['iconsize'].split(',')
paths_and_doctypes = parameters['paths_and_doctypes']
## Create an instance of BibRecDocs for the current recid(sysno)
bibrecdocs = BibRecDocs(sysno)
paths_and_suffixes = get_dictionary_from_string(paths_and_suffixes)
paths_and_restrictions = get_dictionary_from_string(paths_and_restrictions)
paths_and_doctypes = get_dictionary_from_string(paths_and_doctypes)
## Go through all the directories specified in the keys
## of parameters['paths_and_suffixes']
for path in paths_and_suffixes.keys():
## Check if there is a directory for the current path
if os.path.exists("%s/files/%s" % (curdir, path)):
## Retrieve the restriction to apply to files in this
## directory
restriction = paths_and_restrictions.get(path, '')
restriction = re.sub('<PA>(?P<content>[^<]*)</PA>',
get_pa_tag_content,
restriction)
## Go through all the files in curdir/files/path
for current_file in os.listdir("%s/files/%s" % (curdir, path)):
## retrieve filename and extension
dummy, filename, extension = decompose_file(current_file)
if extension and extension[0] != ".":
extension = '.' + extension
if len(paths_and_suffixes[path]) != 0:
extension = "_%s%s" % (paths_and_suffixes[path], extension)
## Build the new file name if rename parameter has been given
if rename:
filename = re.sub('<PA>(?P<content>[^<]*)</PA>', \
get_pa_tag_content, \
parameters['rename'])
if rename or len(paths_and_suffixes[path]) != 0 :
## Rename the file
try:
# Write the log rename_cmd
fd = open("%s/rename_cmd" % curdir, "a+")
fd.write("%s/files/%s/%s" % (curdir, path, current_file) + " to " +\
"%s/files/%s/%s%s" % (curdir, path, filename, extension) + "\n\n")
## Rename
os.rename("%s/files/%s/%s" % (curdir, path, current_file), \
"%s/files/%s/%s%s" % (curdir, path, filename, extension))
fd.close()
## Save the new name in a text file in curdir so that
## the new filename can be used by templates to create the recmysl
fd = open("%s/%s_RENAMED" % (curdir, path), "w")
fd.write("%s%s" % (filename, extension))
fd.close()
except OSError, err:
msg = "Cannot rename the file.[%s]"
msg %= str(err)
raise InvenioWebSubmitFunctionWarning(msg)
fullpath = "%s/files/%s/%s%s" % (curdir, path, filename, extension)
## Check if there is any existing similar file
if not bibrecdocs.check_file_exists(fullpath, extension):
bibdoc = bibrecdocs.add_new_file(fullpath, doctype=paths_and_doctypes.get(path, path), never_fail=True)
bibdoc.set_status(restriction)
## Fulltext
if documenttype == "fulltext":
additionalformats = createRelatedFormats(fullpath)
if len(additionalformats) > 0:
for additionalformat in additionalformats:
try:
bibrecdocs.add_new_format(additionalformat)
except InvenioBibDocFileError:
pass
## Icon
elif documenttype == "picture":
has_added_default_icon_subformat_p = False
for iconsize in iconsizes:
try:
iconpath, iconname = create_icon({
'input-file' : fullpath,
'icon-scale' : iconsize,
'icon-name' : None,
'icon-file-format' : None,
'multipage-icon' : False,
'multipage-icon-delay' : 100,
'verbosity' : 0,
})
except Exception, e:
register_exception(prefix='Impossible to create icon for %s (record %s)' % (fullpath, sysno), alert_admin=True)
continue
iconpath = os.path.join(iconpath, iconname)
docname = decompose_file(fullpath)[1]
try:
mybibdoc = bibrecdocs.get_bibdoc(docname)
except InvenioBibDocFileError:
mybibdoc = None
if iconpath is not None and mybibdoc is not None:
try:
icon_suffix = iconsize.replace('>', '').replace('<', '').replace('^', '').replace('!', '')
if not has_added_default_icon_subformat_p:
mybibdoc.add_icon(iconpath)
has_added_default_icon_subformat_p = True
else:
mybibdoc.add_icon(iconpath, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + "-" + icon_suffix)
## Save the new icon filename in a text file in curdir so that
## it can be used by templates to create the recmysl
try:
if not has_added_default_icon_subformat_p:
fd = open("%s/%s_ICON" % (curdir, path), "w")
else:
fd = open("%s/%s_ICON_%s" % (curdir, path, iconsize + '_' + icon_suffix), "w")
fd.write(os.path.basename(iconpath))
fd.close()
except OSError, err:
msg = "Cannot store icon filename.[%s]"
msg %= str(err)
raise InvenioWebSubmitFunctionWarning(msg)
except InvenioBibDocFileError, e:
# Most probably icon already existed.
pass
elif mybibdoc is not None:
mybibdoc.delete_icon()
# Update the MARC
bibdocfile_bin = os.path.join(CFG_BINDIR, 'bibdocfile --yes-i-know')
run_shell_command(bibdocfile_bin + " --fix-marc --recid=%s", (str(sysno),))
# Delete the HB BibFormat cache in the DB, so that the fulltext
# links do not point to possible dead files
run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
return ""
def get_pa_tag_content(pa_content):
"""Get content for <PA>XXX</PA>.
@param pa_content: MatchObject for <PA>(.*)</PA>.
@return: the content of the file, possibly filtered by a regular expression:
if pa_content=file[re]:a_file => first line of file a_file matching re
if pa_content=file*p[re]:a_file => all lines of file a_file matching re,
separated by a - (dash) char.
"""
pa_content = pa_content.groupdict()['content']
sep = '-'
out = ''
if pa_content.startswith('file'):
filename = ""
regexp = ""
if "[" in pa_content:
split_index_start = pa_content.find("[")
split_index_stop = pa_content.rfind("]")
regexp = pa_content[split_index_start+1:split_index_stop]
filename = pa_content[split_index_stop+2:]## ]:
else :
filename = pa_content.split(":")[1]
if os.path.exists(os.path.join(curdir, filename)):
fp = open(os.path.join(curdir, filename), 'r')
if pa_content[:5] == "file*":
out = sep.join(map(lambda x: re.split(regexp, x.strip())[-1], fp.readlines()))
else:
out = re.split(regexp, fp.readline().strip())[-1]
fp.close()
return out
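The `<PA>file[re]:name</PA>` directive parsed above splits into an optional regexp (between the outermost brackets) and the file name that follows the closing `]:`. A sketch of just that parsing step (hypothetical helper name):

```python
# Sketch of the directive parsing in get_pa_tag_content above: extract
# the optional regexp and the target file name from a 'file[re]:name'
# or 'file:name' string.
def parse_pa_file_directive(pa_content):
    if "[" in pa_content:
        start = pa_content.find("[")
        stop = pa_content.rfind("]")
        regexp = pa_content[start + 1:stop]
        filename = pa_content[stop + 2:]  # skip the closing ']:'
    else:
        regexp = ""
        filename = pa_content.split(":")[1]
    return regexp, filename
```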
diff --git a/invenio/legacy/websubmit/functions/Move_Photos_to_Storage.py b/invenio/legacy/websubmit/functions/Move_Photos_to_Storage.py
index 11966fc16..1aa594f3f 100644
--- a/invenio/legacy/websubmit/functions/Move_Photos_to_Storage.py
+++ b/invenio/legacy/websubmit/functions/Move_Photos_to_Storage.py
@@ -1,554 +1,554 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSubmit function - Batch photo uploader
To be used with WebSubmit element 'Upload_Photos' or one of its
derivatives in order to create a batch photos uploader.
Requirements:
=============
JQuery:
- jquery.min.js
JQuery UI:
- jquery-ui.min.js
- UI "base" theme:
- jquery.ui.slider.css
- jquery.ui.core.css
- jquery.ui.theme.css
- images
Uploadify 2.0.1 (JQuery plugin):
- jquery.uploadify.min.js
- swfobject.js
- uploadify.css
- cancel.png
- uploadify.swf, uploadify.allglyphs.swf and uploadify.fla
"""
import os
import time
import re
from urllib import quote
from cgi import escape
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
from invenio.config import CFG_BINDIR, CFG_SITE_URL
from invenio.legacy.dbquery import run_sql
from invenio.websubmit_icon_creator import create_icon, InvenioWebSubmitIconCreatorError
-from invenio.bibdocfile_config import CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
+from invenio.legacy.bibdocfile.config import CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT
def Move_Photos_to_Storage(parameters, curdir, form, user_info=None):
"""
The function moves files received from the submission's form
through the PHOTO_MANAGER element and its asynchronous uploads at
CFG_SITE_URL/submit/uploadfile.
Parameters:
@iconsize - Separate multiple sizes with commas. The ImageMagick geometry inputs are supported.
Use type 'geometry' as defined in ImageMagick.
(eg. 320 or 320x240 or 100> or 5%)
Example: "180>,700>" will create two icons, one with maximum dimension 180px, one 700px
@iconformat - Allowed extensions (as defined in websubmit_icon_creator.py) are:
"pdf", "gif", "jpg",
"jpeg", "ps", "png", "bmp"
"eps", "epsi", "epsf"
The PHOTO_MANAGER element builds the following file organization
in the directory curdir::
curdir/
|
______________________________________________________________________
| | |
files/ PHOTO_MANAGER_ICONS icons/
| PHOTO_MANAGER_ORDER |
(user id)/ PHOTO_MANAGER_DELETE (user id)/
| PHOTO_MANAGER_NEW |
NewFile/ PHOTO_MANAGER_DESCRIPTION_X NewFile/
| |
_______________________ _____________________
| | | | | |
photo1.jpg myPhoto.gif ... photo1.jpg myPhoto.gif ...
where the files are:
- PHOTO_MANAGER_ORDER: ordered list of file IDs. One per line.
- PHOTO_MANAGER_ICONS: mappings from file IDs to URL of the icons.
One per line. Separator: /
- PHOTO_MANAGER_NEW: mapping from file ID to filename on disk. Only
applicable to files that have just been
uploaded (i.e. not bibdocfiles). One per
line. Separator: /
- PHOTO_MANAGER_DELETE: list of file IDs that must be deleted. One
per line
- PHOTO_MANAGER_DESCRIPTION_X, where X is file ID: contains photos
descriptions (one per file)
"""
global sysno
icon_sizes = parameters.get('iconsize').split(',')
icon_format = parameters.get('iconformat')
if not icon_format:
icon_format = 'gif'
PHOTO_MANAGER_ICONS = read_param_file(curdir, 'PHOTO_MANAGER_ICONS', split_lines=True)
photo_manager_icons_dict = dict([value.split('/', 1) \
for value in PHOTO_MANAGER_ICONS \
if '/' in value])
PHOTO_MANAGER_ORDER = read_param_file(curdir, 'PHOTO_MANAGER_ORDER', split_lines=True)
photo_manager_order_list = [value for value in PHOTO_MANAGER_ORDER if value.strip()]
PHOTO_MANAGER_DELETE = read_param_file(curdir, 'PHOTO_MANAGER_DELETE', split_lines=True)
photo_manager_delete_list = [value for value in PHOTO_MANAGER_DELETE if value.strip()]
PHOTO_MANAGER_NEW = read_param_file(curdir, 'PHOTO_MANAGER_NEW', split_lines=True)
photo_manager_new_dict = dict([value.split('/', 1) \
for value in PHOTO_MANAGER_NEW \
if '/' in value])
## Create an instance of BibRecDocs for the current recid(sysno)
bibrecdocs = BibRecDocs(sysno)
for photo_id in photo_manager_order_list:
photo_description = read_param_file(curdir, 'PHOTO_MANAGER_DESCRIPTION_' + photo_id)
# We must take different actions depending on whether we deal
# with a file that already exists, or with a new file
if photo_id in photo_manager_new_dict.keys():
# New file
if photo_id not in photo_manager_delete_list:
filename = photo_manager_new_dict[photo_id]
filepath = os.path.join(curdir, 'files', str(user_info['uid']),
'NewFile', filename)
icon_filename = os.path.splitext(filename)[0] + ".gif"
fileiconpath = os.path.join(curdir, 'icons', str(user_info['uid']),
'NewFile', icon_filename)
# Add the file
if os.path.exists(filepath):
_do_log(curdir, "Adding file %s" % filepath)
bibdoc = bibrecdocs.add_new_file(filepath, doctype="picture", never_fail=True)
has_added_default_icon_subformat_p = False
for icon_size in icon_sizes:
# Create icon if needed
try:
(icon_path, icon_name) = create_icon(
{ 'input-file' : filepath,
'icon-name' : icon_filename,
'icon-file-format' : icon_format,
'multipage-icon' : False,
'multipage-icon-delay' : 100,
'icon-scale' : icon_size, # Resize only if width > 300
'verbosity' : 0,
})
fileiconpath = os.path.join(icon_path, icon_name)
except InvenioWebSubmitIconCreatorError, e:
_do_log(curdir, "Icon could not be created to %s: %s" % (filepath, e))
pass
if os.path.exists(fileiconpath):
try:
if not has_added_default_icon_subformat_p:
bibdoc.add_icon(fileiconpath)
has_added_default_icon_subformat_p = True
_do_log(curdir, "Added icon %s" % fileiconpath)
else:
icon_suffix = icon_size.replace('>', '').replace('<', '').replace('^', '').replace('!', '')
bibdoc.add_icon(fileiconpath, subformat=CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + "-" + icon_suffix)
_do_log(curdir, "Added icon %s" % fileiconpath)
except InvenioBibDocFileError, e:
# Most probably icon already existed.
pass
if photo_description and bibdoc:
for file_format in [bibdocfile.get_format() \
for bibdocfile in bibdoc.list_latest_files()]:
bibdoc.set_comment(photo_description, file_format)
_do_log(curdir, "Added comment %s" % photo_description)
else:
# Existing file
bibdocname = bibrecdocs.get_docname(int(photo_id))
if photo_id in photo_manager_delete_list:
# In principle we should not get here, but just in case...
bibrecdocs.delete_bibdoc(bibdocname)
_do_log(curdir, "Deleted %s" % bibdocname)
else:
bibdoc = bibrecdocs.get_bibdoc(bibdocname)
for file_format in [bibdocfile.get_format() \
for bibdocfile in bibdoc.list_latest_files()]:
bibdoc.set_comment(photo_description, file_format)
_do_log(curdir, "Added comment %s" % photo_description)
# Now delete requested files
for photo_id in photo_manager_delete_list:
try:
bibdocname = bibrecdocs.get_docname(int(photo_id))
bibrecdocs.delete_bibdoc(bibdocname)
_do_log(curdir, "Deleted %s" % bibdocname)
except:
# we tried to delete a photo that does not exist (maybe already deleted)
pass
# Update the MARC
_do_log(curdir, "Asking bibdocfile to fix marc")
bibdocfile_bin = os.path.join(CFG_BINDIR, 'bibdocfile --yes-i-know')
os.system(bibdocfile_bin + " --fix-marc --recid=" + str(sysno))
# Delete the HB BibFormat cache in the DB, so that the fulltext
# links do not point to possible dead files
run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
return ""
def read_param_file(curdir, param, split_lines=False):
"Helper function to access files in submission dir"
param_value = ""
path = os.path.join(curdir, param)
try:
if os.path.abspath(path).startswith(curdir):
fd = file(path)
if split_lines:
param_value = [line.strip() for line in fd.readlines()]
else:
param_value = fd.read()
fd.close()
except Exception, e:
_do_log(curdir, 'Could not read %s: %s' % (param, e))
pass
return param_value
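The containment check in read_param_file (only reading a file whose absolute path stays under curdir) is a small path-traversal guard. The following is a standalone sketch of the same idea with hypothetical names, written in modern Python rather than the Python 2 of this module:

```python
import os

def safe_read(basedir, name):
    """Return the contents of basedir/name, or None if the resolved
    path escapes basedir (e.g. via '..') or cannot be read."""
    path = os.path.abspath(os.path.join(basedir, name))
    base = os.path.abspath(basedir)
    # Compare against the base directory plus a separator, so that a
    # sibling such as /tmp/curdir-evil does not pass for /tmp/curdir.
    if not path.startswith(base + os.sep):
        return None
    try:
        with open(path) as fd:
            return fd.read()
    except IOError:
        return None
```

Note that the plain `startswith(curdir)` used above would also accept sibling directories like `curdir-evil`; appending the separator before comparing closes that gap.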
def _do_log(log_dir, msg):
"""
Log what we have done, in case something went wrong.
Nice to compare with bibdocactions.log
Should be removed when the development is over.
"""
log_file = os.path.join(log_dir, 'performed_actions.log')
file_desc = open(log_file, "a+")
file_desc.write("%s --> %s\n" %(time.strftime("%Y-%m-%d %H:%M:%S"), msg))
file_desc.close()
def get_session_id(req, uid, user_info):
"""
Returns the current session id of the user, by any means available.
Raises ValueError if it cannot be found
"""
# Get the session id
## This can be later simplified once user_info object contain 'sid' key
session_id = None
try:
try:
from flask import session
session_id = session.sid
except AttributeError, e:
# req was maybe not available (for eg. when this is run
# through Create_Modify_Interface.py)
session_id = user_info['session']
except Exception, e:
raise ValueError("Cannot retrieve user session")
return session_id
def create_photos_manager_interface(sysno, session_id, uid,
doctype, indir, curdir, access,
can_delete_photos=True,
can_reorder_photos=True,
can_upload_photos=True,
editor_width=None,
editor_height=None,
initial_slider_value=100,
max_slider_value=200,
min_slider_value=80):
"""
Creates and returns the HTML of the photos manager interface for
submissions.
@param sysno: current record id
@param session_id: user session_id (as retrieved by get_session_id(...) )
@param uid: user id
@param doctype: doctype of the submission
@param indir: submission "indir"
@param curdir: submission "curdir"
@param access: submission "access"
@param can_delete_photos: if users can delete photos
@param can_reorder_photos: if users can reorder photos
@param can_upload_photos: if users can upload photos
@param editor_width: width (in pixels) of the editor
@param editor_height: height (in pixels) of the editor
@param initial_slider_value: initial value of the photo size slider
@param max_slider_value: max value of the photo size slider
@param min_slider_value: min value of the photo size slider
"""
out = ''
PHOTO_MANAGER_ICONS = read_param_file(curdir, 'PHOTO_MANAGER_ICONS', split_lines=True)
photo_manager_icons_dict = dict([value.split('/', 1) for value in PHOTO_MANAGER_ICONS if '/' in value])
PHOTO_MANAGER_ORDER = read_param_file(curdir, 'PHOTO_MANAGER_ORDER', split_lines=True)
photo_manager_order_list = [value for value in PHOTO_MANAGER_ORDER if value.strip()]
PHOTO_MANAGER_DELETE = read_param_file(curdir, 'PHOTO_MANAGER_DELETE', split_lines=True)
photo_manager_delete_list = [value for value in PHOTO_MANAGER_DELETE if value.strip()]
PHOTO_MANAGER_NEW = read_param_file(curdir, 'PHOTO_MANAGER_NEW', split_lines=True)
photo_manager_new_dict = dict([value.split('/', 1) for value in PHOTO_MANAGER_NEW if '/' in value])
photo_manager_descriptions_dict = {}
# Compile a regular expression that can match the "default" icon,
# and not larger versions.
CFG_BIBDOCFILE_ICON_SUBFORMAT_RE_DEFAULT = re.compile(CFG_BIBDOCFILE_DEFAULT_ICON_SUBFORMAT + '\Z')
# Load the existing photos from the DB if we are displaying
# this interface for the first time, and if a record exists
if sysno and not PHOTO_MANAGER_ORDER:
bibarchive = BibRecDocs(sysno)
for doc in bibarchive.list_bibdocs():
if doc.get_icon() is not None:
original_url = doc.list_latest_files()[0].get_url()
doc_id = str(doc.get_id())
icon_url = doc.get_icon(subformat_re=CFG_BIBDOCFILE_ICON_SUBFORMAT_RE_DEFAULT).get_url() # Get "default" icon
description = ""
for bibdoc_file in doc.list_latest_files():
#format = bibdoc_file.get_format().lstrip('.').upper()
#url = bibdoc_file.get_url()
#photo_files.append((format, url))
if not description and bibdoc_file.get_comment():
description = escape(bibdoc_file.get_comment())
name = bibarchive.get_docname(doc.id)
photo_manager_descriptions_dict[doc_id] = description
photo_manager_icons_dict[doc_id] = icon_url
photo_manager_order_list.append(doc_id) # FIXME: respect order
# Prepare the list of photos to display.
photos_img = []
for doc_id in photo_manager_order_list:
if not photo_manager_icons_dict.has_key(doc_id):
continue
icon_url = photo_manager_icons_dict[doc_id]
if PHOTO_MANAGER_ORDER:
# Get description from disk only if some changes have been done
description = escape(read_param_file(curdir, 'PHOTO_MANAGER_DESCRIPTION_' + doc_id))
else:
description = escape(photo_manager_descriptions_dict[doc_id])
photos_img.append('''
<li id="%(doc_id)s" style="width:%(initial_slider_value)spx;">
<div class="imgBlock">
<div class="normalLineHeight" style="margin-left:auto;margin-right:auto;display:inline" >
<img id="delete_%(doc_id)s" class="hidden" src="/img/cross_red.gif" alt="Delete" style="position:absolute;top:0;" onclick="delete_photo('%(doc_id)s');"/>
<img src="%(icon_url)s" class="imgIcon"/>
</div>
<div class="normalLineHeight">
<textarea style="width:95%%" id="PHOTO_MANAGER_DESCRIPTION_%(doc_id)s" name="PHOTO_MANAGER_DESCRIPTION_%(doc_id)s">%(description)s</textarea>
</div>
</div>
</li>''' % \
{'initial_slider_value': initial_slider_value,
'doc_id': doc_id,
'icon_url': icon_url,
'description': description})
out += '''
<link rel="stylesheet" href="%(CFG_SITE_URL)s/img/jquery-ui/themes/base/jquery.ui.slider.css" type="text/css" charset="utf-8"/>
<link rel="stylesheet" href="%(CFG_SITE_URL)s/img/jquery-ui/themes/base/jquery.ui.core.css" type="text/css" charset="utf-8"/>
<link rel="stylesheet" href="%(CFG_SITE_URL)s/img/jquery-ui/themes/base/jquery.ui.theme.css" type="text/css" charset="utf-8"/>
<style type="text/css">
#sortable { list-style-type: none; margin: 0; padding: 0; }
#sortable li { margin: auto 3px; padding: 1px; float: left; width: 180px; font-size:small; text-align: center; position: relative;}
#sortable .imgIcon {max-height:95%%;max-width:95%%;margin: 2px;max-height:130px;}
#sortable li div.imgBlock {vertical-align: middle; margin:
auto;display:inline;display:inline-table;display:inline-block;vertical-align:middle;text-align : center; width:100%%;position:relative}
#sortable li div.imgBlock .hidden {display:none;}
%(delete_hover_class)s
.fileUploadQueue{text-align:left; margin: 0 auto; width:300px;}
.normalLineHeight {line-height:normal}
</style>
<div id="uploadedFiles" style="%(hide_photo_viewer)sborder-left:1px solid #555; border-top:1px solid #555;border-right:1px solid #eee;border-bottom:1px solid #eee;overflow:auto;%(editor_height_style)s%(editor_width_style)sbackground-color:#eee;margin:3px;text-align:left;position:relative"><ul id="sortable">%(photos_img)s</ul></div>
<div id="grid_slider" style="%(hide_photo_viewer)swidth:300px;">
<div class='ui-slider-handle'></div>
</div>
<script type="text/javascript" src="%(CFG_SITE_URL)s/js/jquery.uploadify.min.js"></script>
<script type="text/javascript" src="%(CFG_SITE_URL)s/js/swfobject.js"></script>
<script type="text/javascript" src="%(CFG_SITE_URL)s/js/jquery-ui.min.js"></script>
<link rel="stylesheet" href="%(CFG_SITE_URL)s/img/uploadify.css" type="text/css" />
<script type="text/javascript">
$(document).ready(function() {
/* Uploading */
if (%(can_upload_photos)s) {
$('#uploadFile').uploadify({
'uploader': '%(CFG_SITE_URL)s/img/uploadify.swf',
'script': '/submit/uploadfile',
'cancelImg': '%(CFG_SITE_URL)s/img/cancel.png',
'multi' : true,
'auto' : true,
'simUploadLimit': 2,
'scriptData' : {'type': 'File', 'uid': %(uid)s, 'session_id': '%(session_id)s', 'indir': '%(indir)s', 'doctype': '%(doctype)s', 'access': '%(access)s'},
'displayDate': 'percentage',
'buttonText': 'Browse',
'fileDataName': 'NewFile' /* be compatible with CKEditor */,
'onSelectOnce': function(event, data) {
},
'onSelect': function(event, queueID, fileObj, response, data) {
$('#loading').css("visibility","visible");
},
'onAllComplete' : function(event, queueID, fileObj, response, data) {
$('#loading').css("visibility","hidden");
},
/*'onCheck': function(event, checkScript, fileQueue, folder, single) {
return false;
},*/
'onComplete': function(event, queueID, fileObj, response, data) {
$('#grid_slider').css("display","block");
$('#uploadedFiles').css("display","block");
var cur_width = $("#grid_slider").slider('option', 'value');
var response_obj = parse_invenio_response(response);
icon_url = '%(CFG_SITE_URL)s/img/file-icon-blank-96x128.gif'
if ("NewFile" in response_obj) {
filename = response_obj["NewFile"]["name"]
if ('iconName' in response_obj["NewFile"]){
icon_name = response_obj["NewFile"]["iconName"]
icon_url = '%(CFG_SITE_URL)s/submit/getuploadedfile?indir=%(indir)s&doctype=%(doctype)s&access=%(access)s&key=NewFile&icon=1&filename=' + icon_name
}
} else {
return true;
}
$('#sortable').append('<li id="'+ queueID +'" style="width:'+cur_width+'px;"><div class="imgBlock"><div class="normalLineHeight" style="margin-left:auto;margin-right:auto;display:inline" ><img id="delete_'+ queueID +'" class="hidden" src="/img/cross_red.gif" alt="Delete" style="position:absolute;top:0;" onclick="delete_photo(\\''+ queueID +'\\');"/><img src="'+ icon_url +'" class="imgIcon"/></div><div class="normalLineHeight"><textarea style="width:95%%" id="PHOTO_MANAGER_DESCRIPTION_'+ queueID +'" name="PHOTO_MANAGER_DESCRIPTION_'+ queueID +'"></textarea></div></div></li>');
update_order_field();
$('#photo_manager_icons').val($("#photo_manager_icons").val() + '\\n' + queueID + '/' + icon_url);
$('#photo_manager_new').val($("#photo_manager_new").val() + '\\n' + queueID + '/' + filename);
update_CSS();
return true;
}
});
}
/* Resizing */
$("#grid_slider").slider({
value: %(initial_slider_value)s,
max: %(max_slider_value)s,
min: %(min_slider_value)s,
slide: function(event, ui) {
update_CSS();
}
});
/* Update CSS to ensure that existing photos get nicely laid out */
update_CSS();
});
/* Ordering */
$(function() {
if (%(can_reorder_photos)s) {
$("#sortable").sortable();
$("#sortable").bind('sortupdate', function(event, ui) {
update_order_field();
});
}
});
function delete_photo(docid){
if (confirm("Are you sure you want to delete the photo? (The file will be deleted after you apply all the modifications)")) {
$("#" + docid).remove();
$("#photo_manager_delete").val($("#photo_manager_delete").val() + '\\n' + docid);
update_order_field();
}
}
/* CSS-related */
function update_CSS(){
/* Update some style according to the slider size */
var slider_value = $("#grid_slider").slider('option', 'value');
$('#uploadedFiles li').css('width', slider_value+"px");
/*$('#uploadedFiles div.floater').css('width', slider_value+"px");*/
/* Update height attr accordingly so that image get centered.
First we need to get the tallest element of the list.
*/
var max_height = 0;
$('#uploadedFiles li div').each(function() {
this_height = $(this).height();
if(this_height > max_height) {
max_height = this_height;
}
});
$('#uploadedFiles li').css('height',max_height+"px");
$('#uploadedFiles li').css('line-height',max_height+"px");
}
/* Utils */
function update_order_field(){
$("#photo_manager_order").val($("#sortable").sortable('toArray').join('\\n'));
}
function parse_invenio_response(response){
/* Return the javascript object included in
the given Invenio message. Really dirty implementation, but ok
in this very simple scenario */
/*var object_string = response.substring(response.indexOf('<![CDATA[')+9, response.lastIndexOf(']]>'));*/ object_string = response;
var object = {};
eval('object=' + object_string);
return object;
}
</script>
<div style="margin: 0 auto;">
<img src="%(CFG_SITE_URL)s/img/loading.gif" style="visibility: hidden" id="loading"/>
<input type="file" size="40" id="uploadFile" name="PHOTO_FILE" style="margin: 0 auto;%(upload_display)s"/>
</div>
<!--<a href="javascript:$('#uploadFile').fileUploadStart();">Upload Files</a> -->
<textarea id="photo_manager_icons" style="display:none" name="PHOTO_MANAGER_ICONS">%(PHOTO_MANAGER_ICONS)s</textarea>
<textarea id="photo_manager_order" style="display:none" name="PHOTO_MANAGER_ORDER">%(PHOTO_MANAGER_ORDER)s</textarea>
<textarea id="photo_manager_new" style="display:none" name="PHOTO_MANAGER_NEW">%(PHOTO_MANAGER_NEW)s</textarea>
<textarea id="photo_manager_delete" style="display:none" name="PHOTO_MANAGER_DELETE">%(PHOTO_MANAGER_DELETE)s</textarea>
''' % {'CFG_SITE_URL': CFG_SITE_URL,
#'curdir': cgi.escape(quote(curdir, safe="")),#quote(curdir, safe=""),
'uid': uid,
'access': quote(access, safe=""),
'doctype': quote(doctype, safe=""),
'indir': quote(indir, safe=""),
'session_id': quote(session_id, safe=""),
'PHOTO_MANAGER_ICONS': '\n'.join([key + '/' + value for key, value in photo_manager_icons_dict.iteritems()]),
'PHOTO_MANAGER_ORDER': '\n'.join(photo_manager_order_list),
'PHOTO_MANAGER_DELETE': '\n'.join(photo_manager_delete_list),
'PHOTO_MANAGER_NEW': '\n'.join([key + '/' + value for key, value in photo_manager_new_dict.iteritems()]),
'initial_slider_value': initial_slider_value,
'max_slider_value': max_slider_value,
'min_slider_value': min_slider_value,
'photos_img': '\n'.join(photos_img),
'hide_photo_viewer': (len(photos_img) == 0 and len(photo_manager_new_dict.keys()) == 0) and 'display:none;' or '',
'delete_hover_class': can_delete_photos and "#sortable li div.imgBlock:hover .hidden {display:inline;}" or '',
'can_reorder_photos': can_reorder_photos and 'true' or 'false',
'can_upload_photos': can_upload_photos and 'true' or 'false',
'upload_display': not can_upload_photos and 'display: none' or '',
'editor_width_style': editor_width and 'width:%spx;' % editor_width or '',
'editor_height_style': editor_height and 'height:%spx;' % editor_height or ''}
return out
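The PHOTO_MANAGER_* textareas above serialize one "id/value" pair per line, and the reader splits with `split('/', 1)` so that values that themselves contain '/' (icon URLs, for instance) survive the round trip. A minimal sketch of that convention, with hypothetical helper names:

```python
def encode_pairs(mapping):
    """Serialize {id: value} as newline-separated 'id/value' lines."""
    return '\n'.join('%s/%s' % (key, value) for key, value in mapping.items())

def decode_pairs(lines):
    """Parse 'id/value' lines back into a dict; split only on the
    first '/', so values containing '/' (e.g. URLs) are preserved.
    Lines without a '/' are silently ignored, as in the code above."""
    return dict(line.split('/', 1) for line in lines if '/' in line)
```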
diff --git a/invenio/legacy/websubmit/functions/Move_Revised_Files_to_Storage.py b/invenio/legacy/websubmit/functions/Move_Revised_Files_to_Storage.py
index b0d58ae49..2bd09651c 100644
--- a/invenio/legacy/websubmit/functions/Move_Revised_Files_to_Storage.py
+++ b/invenio/legacy/websubmit/functions/Move_Revised_Files_to_Storage.py
@@ -1,420 +1,420 @@
## $Id: Move_Revised_Files_to_Storage.py,v 1.20 2009/03/26 13:48:42 jerome Exp $
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSubmit function - Archives uploaded files
TODO:
- Add parameter 'elementNameToFilename' so that files to revise can
be matched by name instead of doctype.
- Icons are created only for uploaded files, but not for related formats
created on the fly.
"""
__revision__ = "$Id$"
import time
import os
-from invenio.bibdocfile import \
+from invenio.legacy.bibdocfile.api import \
InvenioBibDocFileError, \
BibRecDocs
from invenio.ext.logging import register_exception
from invenio.websubmit_icon_creator import \
create_icon, InvenioWebSubmitIconCreatorError
from invenio.config import CFG_BINDIR
from invenio.legacy.dbquery import run_sql
from invenio.legacy.websubmit.functions.Shared_Functions import \
createRelatedFormats
-from invenio.bibdocfile_managedocfiles import get_description_and_comment
+from invenio.legacy.bibdocfile.managedocfiles import get_description_and_comment
def Move_Revised_Files_to_Storage(parameters, curdir, form, user_info=None):
"""
The function revises the files of a record with the newly uploaded
files.
This function can work only if you can define a mapping from the
WebSubmit element name that uploads the file, to the doctype of
the file. In most cases, the doctype is equivalent to the element
name, or just map to 'Main' doctype. That is typically the case if
you use the Move_Files_to_Storage.py function to upload the files
at submission step. E.g. with the DEMOBOO submission of the
Atlantis Demo site, a file is uploaded via the DEMOBOO_FILE
element/File input, which is mapped to doctype DEMOBOO_FILE.
The function ignores files for which multiple files exist for a
single doctype in the record, or when several files are uploaded
with the same element name. If the record to revise does not have
a corresponding file, the file is inserted instead.
This function is similar to Move_Uploaded_Files_to_Storage.py,
except that Move_Uploaded_Files_to_Storage relies on files
uploaded from the web interface created by
Create_Upload_Files_Interface.py, while this function relies on
the files uploaded by a regular WebSubmit page that you have built
from WebSubmit admin:
Regular WebSubmit interface --(upload file)--> Move_Revised_Files_to_Storage.py
Create_Upload_Files_Interface.py --(upload file)--> Move_Uploaded_Files_to_Storage.py
The main advantage of this function over the
Create_Upload_Files_Interface.py/Move_Uploaded_Files_to_Storage
functions is that it lets you customize the display of your
submission in the way you want, which could be simpler for your
users if you usually upload only a small and fixed number of files
per record. The disadvantages are that this function cannot delete
files, add an alternative format to a file, or handle a variable
number of files; it does not allow setting permissions at the file
level, and does not support user comments, renaming, etc.
@param parameters:(dictionary) - must contain:
+ elementNameToDoctype: maps an element/field name to a doctype.
Eg. the file uploaded from the
DEMOBOO_FILE element (input file tag)
should revise the file with document
type (doctype) "Main":
DEMOBOO_FILE=Main|DEMOBOO_FILE_2=ADDITIONAL
('=' separates element name and doctype
'|' separates each doctype/element name group)
In most cases, the element name == doctype:
DEMOBOO_FILE=DEMOBOO_FILE|DEMOBOO_FILE_2=DEMOBOO_FILE_2
+ createIconDoctypes: the list of doctypes for which an icon
should be created when revising the file.
Eg:
Figure|Graph
('|' separated values)
Use '*' for all doctypes
+ iconsize: size of the icon to create (when applicable)
+ keepPreviousVersionDoctypes: the list of doctypes for which
the function should keep previous
versions visible when revising a
file.
Eg:
Main|Additional
('|' separated values)
Default is all
+ createRelatedFormats: if uploaded files get converted to
whatever format we can (1) or not (0)
"""
# pylint: disable=E0602
# sysno is defined in the WebSubmit functions sandbox.
global sysno
bibrecdocs = BibRecDocs(int(sysno))
# Wash function parameters
(element_name_and_doctype, create_icon_doctypes, iconsize,
keep_previous_version_doctypes, createRelatedFormats_p) = \
wash_function_parameters(parameters, curdir)
for element_name, doctype in element_name_and_doctype:
_do_log(curdir, "Processing " + element_name)
# Check if there is a corresponding file
file_path = os.path.join(curdir, 'files', element_name,
read_file(curdir, element_name))
if file_path and os.path.exists(file_path):
# Now identify which file to revise
files_in_record = bibrecdocs.list_bibdocs(doctype)
if len(files_in_record) == 1:
# Ok, we can revise
bibdoc_name = bibrecdocs.get_docname(files_in_record[0].id)
revise(bibrecdocs, curdir, sysno, file_path,
bibdoc_name, doctype, iconsize,
create_icon_doctypes,
keep_previous_version_doctypes,
createRelatedFormats_p)
elif len(files_in_record) == 0:
# We must add the file
add(bibrecdocs, curdir, sysno, file_path,
doctype, iconsize, create_icon_doctypes,
createRelatedFormats_p)
else:
_do_log(curdir, " %s ignored, because multiple files found for same doctype %s in record %s: %s" %\
(element_name, doctype, sysno,
', '.join(files_in_record)))
else:
_do_log(curdir, " No corresponding file found (%s)" % file_path)
# Update the MARC
bibdocfile_bin = os.path.join(CFG_BINDIR, 'bibdocfile --yes-i-know')
os.system(bibdocfile_bin + " --fix-marc --recid=" + sysno)
# Delete the HB BibFormat cache in the DB, so that the fulltext
# links do not point to possible dead files
run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
# pylint: enable=E0602
def add(bibrecdocs, curdir, sysno, file_path, doctype,
iconsize, create_icon_doctypes, createRelatedFormats_p):
"""
Adds the file using bibdocfile
"""
try:
# Add file
bibdoc = bibrecdocs.add_new_file(file_path,
doctype,
never_fail=True)
_do_log(curdir, ' Added ' + bibrecdocs.get_docname(bibdoc.id) + ': ' + \
file_path)
# Add icon
iconpath = ''
if doctype in create_icon_doctypes or \
'*' in create_icon_doctypes:
iconpath = _create_icon(file_path, iconsize)
if iconpath is not None:
bibdoc.add_icon(iconpath)
_do_log(curdir, ' Added icon to ' + \
bibrecdocs.get_docname(bibdoc.id) + ': ' + iconpath)
# Automatically create additional formats when
# possible.
additional_formats = []
if createRelatedFormats_p:
additional_formats = createRelatedFormats(file_path,
overwrite=False)
for additional_format in additional_formats:
bibdoc.add_new_format(additional_format,
bibrecdocs.get_docname(bibdoc.id))
# Log
_do_log(curdir, ' Added format ' + additional_format + \
' to ' + bibrecdocs.get_docname(bibdoc.id))
except InvenioBibDocFileError, e:
# Format already existed. How come? We should
# have checked this in Create_Upload_Files_Interface.py
register_exception(prefix='Move_Revised_Files_to_Storage ' \
'tried to add already existing file %s ' \
'to record %i. %s' % \
(file_path, sysno, curdir),
alert_admin=True)
def revise(bibrecdocs, curdir, sysno, file_path, bibdoc_name, doctype,
iconsize, create_icon_doctypes,
keep_previous_version_doctypes, createRelatedFormats_p):
"""
Revises the given bibdoc with a new file
"""
try:
# Retrieve the current description and comment, or they
# will be lost when revising
latest_files = bibrecdocs.list_bibdocs(doctype)[0].list_latest_files()
prev_desc, prev_comment = get_description_and_comment(latest_files)
if doctype in keep_previous_version_doctypes:
# Standard procedure, keep previous version
bibdoc = bibrecdocs.add_new_version(file_path,
bibdoc_name,
prev_desc,
prev_comment)
_do_log(curdir, ' Revised ' + bibrecdocs.get_docname(bibdoc.id) + \
' with : ' + file_path)
else:
# Soft-delete previous versions, and add new file
# (we need to get the doctype before deleting)
if bibrecdocs.has_docname_p(bibdoc_name):
# Delete only if bibdoc originally
# existed
bibrecdocs.delete_bibdoc(bibdoc_name)
_do_log(curdir, ' Deleted ' + bibdoc_name)
try:
bibdoc = bibrecdocs.add_new_file(file_path,
doctype,
bibdoc_name,
never_fail=True,
description=prev_desc,
comment=prev_comment)
_do_log(curdir, ' Added ' + bibrecdocs.get_docname(bibdoc.id) + ': ' + \
file_path)
except InvenioBibDocFileError, e:
_do_log(curdir, str(e))
register_exception(prefix='Move_Uploaded_Files_to_Storage ' \
'tried to revise a file %s ' \
'named %s in record %i. %s' % \
(file_path, bibdoc_name, sysno, curdir),
alert_admin=True)
# Add icon
iconpath = ''
if doctype in create_icon_doctypes or \
'*' in create_icon_doctypes:
iconpath = _create_icon(file_path, iconsize)
if iconpath is not None:
bibdoc.add_icon(iconpath)
_do_log(curdir, 'Added icon to ' + \
bibrecdocs.get_docname(bibdoc.id) + ': ' + iconpath)
# Automatically create additional formats when
# possible.
additional_formats = []
if createRelatedFormats_p:
additional_formats = createRelatedFormats(file_path,
overwrite=False)
for additional_format in additional_formats:
bibdoc.add_new_format(additional_format,
bibdoc_name,
prev_desc,
prev_comment)
# Log
_do_log(curdir, ' Added format ' + additional_format + \
' to ' + bibrecdocs.get_docname(bibdoc.id))
except InvenioBibDocFileError, e:
# Format already existed. How come? We should
# have checked this in Create_Upload_Files_Interface.py
register_exception(prefix='Move_Revised_Files_to_Storage ' \
'tried to revise a file %s ' \
'named %s in record %i. %s' % \
(file_path, bibdoc_name, sysno, curdir),
alert_admin=True)
def wash_function_parameters(parameters, curdir):
"""
Returns the function's (admin-defined) parameters washed and
initialized properly, as a tuple:
Parameters:
check Move_Revised_Files_to_Storage(..) docstring
Returns:
tuple (element_name_and_doctype, create_icon_doctypes, iconsize,
keep_previous_version_doctypes, createRelatedFormats_p)
"""
# The mapping element name -> doctype.
# '|' is used to separate mapping groups, and '=' to separate
# element name and doctype.
# Eg: DEMOBOO_FILE=Main|DEMOBOO_FILEADDITIONAL=Additional File
element_name_and_doctype = [mapping.strip().split("=") for mapping \
in parameters['elementNameToDoctype'].split('|') \
if mapping.strip() != '']
# The list of doctypes for which we want to create an icon
# (list of values separated by "|")
create_icon_doctypes = [doctype.strip() for doctype \
in parameters['createIconDoctypes'].split('|') \
if doctype.strip() != '']
# If we should create additional formats when applicable (1) or
# not (0)
try:
createRelatedFormats_p = int(parameters['createRelatedFormats'])
except ValueError, e:
createRelatedFormats_p = False
# Icons size
iconsize = parameters.get('iconsize')
# The list of doctypes for which we want to keep previous versions
# of files visible.
# (list of values separated by "|")
keep_previous_version_doctypes = [doctype.strip() for doctype \
in parameters['keepPreviousVersionDoctypes'].split('|') \
if doctype.strip() != '']
if not keep_previous_version_doctypes:
# Nothing specified: keep all by default
keep_previous_version_doctypes = [doctype for (elem, doctype) \
in element_name_and_doctype]
return (element_name_and_doctype, create_icon_doctypes, iconsize,
keep_previous_version_doctypes, createRelatedFormats_p)
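The elementNameToDoctype parameter parsed above uses '|' between mapping groups and '=' between element name and doctype. A standalone sketch of that parsing, under the same convention (splitting only on the first '=' is slightly more defensive than the plain split used in wash_function_parameters, in case a doctype ever contains '='):

```python
def parse_element_to_doctype(value):
    """Parse 'ELEM1=Doctype1|ELEM2=Doctype2' into a list of
    (element_name, doctype) tuples; blank groups are ignored."""
    return [tuple(mapping.strip().split('=', 1))
            for mapping in value.split('|')
            if mapping.strip()]
```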
def _do_log(log_dir, msg):
"""
Log what we have done, in case something went wrong.
Nice to compare with bibdocactions.log
Should be removed when the development is over.
"""
log_file = os.path.join(log_dir, 'performed_actions.log')
file_desc = open(log_file, "a+")
file_desc.write("%s --> %s\n" %(time.strftime("%Y-%m-%d %H:%M:%S"), msg))
file_desc.close()
def _create_icon(file_path, icon_size, format='gif', verbosity=9):
"""
Creates icon of given file.
Returns path to the icon. If creation fails, return None, and
register exception (send email to admin).
Parameters:
- file_path : *str* full path to the file for which the icon is created
- icon_size : *int* the scaling information to be used for the
creation of the new icon.
- verbosity : *int* the verbosity level under which the program
is to run;
"""
icon_path = None
try:
filename = os.path.splitext(os.path.basename(file_path))[0]
(icon_dir, icon_name) = create_icon(
{'input-file':file_path,
'icon-name': "icon-%s" % filename,
'multipage-icon': False,
'multipage-icon-delay': 0,
'icon-scale': icon_size,
'icon-file-format': format,
'verbosity': verbosity})
icon_path = icon_dir + os.sep + icon_name
except InvenioWebSubmitIconCreatorError, e:
register_exception(prefix='Icon for file %s could not be created: %s' % \
(file_path, str(e)),
alert_admin=False)
return icon_path
def read_file(curdir, filename):
"""
Reads a file in curdir.
Returns None if the file does not exist, cannot be read, or is not
really in curdir
"""
try:
file_path = os.path.abspath(os.path.join(curdir, filename))
if not file_path.startswith(curdir):
return None
file_desc = file(file_path, 'r')
content = file_desc.read()
file_desc.close()
except:
content = None
return content
diff --git a/invenio/legacy/websubmit/functions/Notify_URL.py b/invenio/legacy/websubmit/functions/Notify_URL.py
index d3370dac1..ee6c12ff7 100644
--- a/invenio/legacy/websubmit/functions/Notify_URL.py
+++ b/invenio/legacy/websubmit/functions/Notify_URL.py
@@ -1,116 +1,116 @@
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
-from invenio.bibtask import \
+from invenio.legacy.bibsched.bibtask import \
task_low_level_submission, \
bibtask_allocate_sequenceid
from invenio.legacy.websubmit.functions.Shared_Functions import ParamFromFile
def Notify_URL(parameters, curdir, form, user_info=None):
"""
Access a given URL, and possibly post some content.
Could be used to notify that a record has been fully integrated.
(the URL is only accessed once the BibTask created by this
function runs in BibSched, not when the function is run. The
BibTask uses a task sequence ID to respect ordering of tasks)
If the URL is empty, the notification is skipped.
@param parameters: (dictionary) - contains the following parameter
strings used by this function:
+ url: (string) - the URL to be contacted by this function
(must start with http/https)
If value starts with "FILE:", will look for
the URL in a file on curdir with the given name.
for eg: "FILE:my_url"
(value retrieved when function is run)
+ data: (string) - (optional) the data to be posted at the
given URL. if no value is given, the URL
will be accessed via GET.
If value starts with "FILE:", will look for
the data in a file on curdir with the given name.
for eg: "FILE:my_data"
(value retrieved when function is run)
+ content_type: (string) - (optional) the content-type to use
to post data. Default is 'text/plain'.
Ignored if no data is posted.
+ attempt_times: (int) - (optional) up to how many times shall
we try to contact the URL in case we
fail at contacting it?
+ attempt_sleeptime: (int) - (optional) how many seconds to
sleep between each attempt?
+ admin_emails: (string) - (optional) list of emails (comma-separated
values) to contact in case the URL
cannot be accessed after all attempts.
If value starts with "FILE:", will look for
the emails in a file on curdir with the given name.
for eg: "FILE:my_email"
(value retrieved when function is run)
+ user: (string) - the user to be used to launch the task
(visible in BibSched). If value starts
with "FILE:", will look for the user in a file on
curdir with the given name.
for eg: "FILE:my_user"
(value retrieved when function is run)
"""
other_bibtasklet_arguments = []
sequence_id = bibtask_allocate_sequenceid(curdir)
url = parameters["url"]
data = parameters["data"]
admin_emails = parameters["admin_emails"]
content_type = parameters["content_type"]
attempt_times = parameters["attempt_times"]
attempt_sleeptime = parameters["attempt_sleeptime"]
user = parameters["user"]
# Maybe some params must be read from disk
if url.startswith('FILE:'):
url = ParamFromFile(os.path.join(curdir, url[5:]))
if not url:
return ""
if data.startswith('FILE:'):
data = ParamFromFile(os.path.join(curdir, data[5:]))
if admin_emails.startswith('FILE:'):
admin_emails = ParamFromFile(os.path.join(curdir, admin_emails[5:]))
if user.startswith('FILE:'):
user = ParamFromFile(os.path.join(curdir, user[5:]))
if data:
other_bibtasklet_arguments.extend(("-a", "data=%s" % data))
other_bibtasklet_arguments.extend(("-a", "content_type=%s" % content_type))
return task_low_level_submission(
"bibtasklet", user, "-T", "bst_notify_url",
"-I", str(sequence_id),
"-a", "url=%s" % url,
"-a", "attempt_times=%s" % attempt_times,
"-a", "attempt_sleeptime=%s" % attempt_sleeptime,
"-a", "admin_emails=%s" % admin_emails,
*other_bibtasklet_arguments)
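The "FILE:" convention described in the docstring above (a value of "FILE:name" means "read the real value from the file <curdir>/name") can be sketched as a small standalone helper. This is a hypothetical illustration, not part of Invenio; the `read_file` callable stands in for `ParamFromFile`:

```python
import os

def resolve_param(value, curdir, read_file):
    """Return value, or the contents of curdir/<name> if value is 'FILE:<name>'."""
    if value.startswith('FILE:'):
        # Strip the 'FILE:' prefix and read the named file from curdir.
        return read_file(os.path.join(curdir, value[5:]))
    # Plain value: use it as-is.
    return value
```

With this shape, fixed values pass through untouched while "FILE:"-prefixed ones are resolved lazily from the submission directory.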
diff --git a/invenio/legacy/websubmit/functions/Send_APP_Mail.py b/invenio/legacy/websubmit/functions/Send_APP_Mail.py
index 9f45af60a..2c0d44ee8 100644
--- a/invenio/legacy/websubmit/functions/Send_APP_Mail.py
+++ b/invenio/legacy/websubmit/functions/Send_APP_Mail.py
@@ -1,278 +1,278 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
## Description: function Send_APP_Mail
## This function sends an email informing the original
## submitter of a document that the referee has approved/
## rejected the document. The email is also sent to the
## referee for checking.
## Author: T.Baron
## PARAMETERS:
## newrnin: name of the file containing the 2nd reference
## addressesAPP: email addresses to which the email will
## be sent (additionally to the author)
## categformatAPP: variable needed to derive the addresses
## mentioned above
import os
import re
from invenio.config import CFG_SITE_NAME, \
CFG_SITE_URL, \
CFG_SITE_SUPPORT_EMAIL, \
CFG_CERN_SITE, \
CFG_SITE_RECORD
from invenio.modules.access.control import acc_get_role_users, acc_get_role_id
from invenio.legacy.dbquery import run_sql
from invenio.legacy.websubmit.config import CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN
from invenio.ext.logging import register_exception
from invenio.legacy.search_engine import print_record
from invenio.ext.email import scheduled_send_email
-from invenio.bibtask import bibtask_allocate_sequenceid
+from invenio.legacy.bibsched.bibtask import bibtask_allocate_sequenceid
## The field in which to search for the record submitter/owner's email address:
if CFG_CERN_SITE:
## This is a CERN site - we use 859__f for submitter/record owner's email:
CFG_WEBSUBMIT_RECORD_OWNER_EMAIL = "859__f"
else:
## Non-CERN site. Use 8560_f for submitter/record owner's email:
CFG_WEBSUBMIT_RECORD_OWNER_EMAIL = "8560_f"
def Send_APP_Mail (parameters, curdir, form, user_info=None):
"""
This function sends an email informing the original submitter of a
document that the referee has approved/rejected the document. The
email is also sent to the referee for checking.
Parameters:
* addressesAPP: email addresses of the people who will receive
this email (comma separated list). This parameter may contain
the <CATEG> string, in which case the variable computed from
the [categformatAPP] parameter replaces this string.
eg.: "<CATEG>-email@cern.ch"
* categformatAPP contains a regular expression used to compute
the category of the document given the reference of the
document.
eg.: if [categformatAPP]="TEST-<CATEG>-.*" and the reference
of the document is "TEST-CATEGORY1-2001-001", then the computed
category equals "CATEGORY1"
* newrnin: Name of the file containing the 2nd reference of the
approved document (if any).
* edsrn: Name of the file containing the reference of the
approved document.
"""
global titlevalue,authorvalue, emailvalue,sysno,rn
FROMADDR = '%s Submission Engine <%s>' % (CFG_SITE_NAME,CFG_SITE_SUPPORT_EMAIL)
sequence_id = bibtask_allocate_sequenceid(curdir)
doctype = form['doctype']
titlevalue = titlevalue.replace("\n"," ")
authorvalue = authorvalue.replace("\n","; ")
# variables declaration
categformat = parameters['categformatAPP']
otheraddresses = parameters['addressesAPP']
newrnpath = parameters['newrnin']
## Get the name of the decision file:
try:
decision_filename = parameters['decision_file']
except KeyError:
decision_filename = ""
## Get the name of the comments file:
try:
comments_filename = parameters['comments_file']
except KeyError:
comments_filename = ""
## Now try to read the comments from the comments_filename:
if comments_filename in (None, "", "NULL"):
## We don't have a name for the comments file.
## For backward compatibility reasons, try to read the comments from
## a file called 'COM' in curdir:
if os.path.exists("%s/COM" % curdir):
try:
fh_comments = open("%s/COM" % curdir, "r")
comment = fh_comments.read()
fh_comments.close()
except IOError:
## Unable to open the comments file
exception_prefix = "Error in WebSubmit function " \
"Send_APP_Mail. Tried to open " \
"comments file [%s/COM] but was " \
"unable to." % curdir
register_exception(prefix=exception_prefix)
comment = ""
else:
comment = comment.strip()
else:
comment = ""
else:
## Try to read the comments from the comments file:
if os.path.exists("%s/%s" % (curdir, comments_filename)):
try:
fh_comments = open("%s/%s" % (curdir, comments_filename), "r")
comment = fh_comments.read()
fh_comments.close()
except IOError:
## Oops, unable to open the comments file.
comment = ""
exception_prefix = "Error in WebSubmit function " \
"Send_APP_Mail. Tried to open comments " \
"file [%s/%s] but was unable to." \
% (curdir, comments_filename)
register_exception(prefix=exception_prefix)
else:
comment = comment.strip()
else:
comment = ""
## Now try to read the decision from the decision_filename:
if decision_filename in (None, "", "NULL"):
## We don't have a name for the decision file.
## For backward compatibility reasons, try to read the decision from
## a file called 'decision' in curdir:
if os.path.exists("%s/decision" % curdir):
try:
fh_decision = open("%s/decision" % curdir, "r")
decision = fh_decision.read()
fh_decision.close()
except IOError:
## Unable to open the decision file
exception_prefix = "Error in WebSubmit function " \
"Send_APP_Mail. Tried to open " \
"decision file [%s/decision] but was " \
"unable to." % curdir
register_exception(prefix=exception_prefix)
decision = ""
else:
decision = decision.strip()
else:
decision = ""
else:
## Try to read the decision from the decision file:
try:
fh_decision = open("%s/%s" % (curdir, decision_filename), "r")
decision = fh_decision.read()
fh_decision.close()
except IOError:
## Oops, unable to open the decision file.
decision = ""
exception_prefix = "Error in WebSubmit function " \
"Send_APP_Mail. Tried to open decision " \
"file [%s/%s] but was unable to." \
% (curdir, decision_filename)
register_exception(prefix=exception_prefix)
else:
decision = decision.strip()
if os.path.exists("%s/%s" % (curdir,newrnpath)):
fp = open("%s/%s" % (curdir,newrnpath) , "r")
newrn = fp.read()
fp.close()
else:
newrn = ""
# Document name
res = run_sql("SELECT ldocname FROM sbmDOCTYPE WHERE sdocname=%s", (doctype,))
docname = res[0][0]
# retrieve category
categformat = categformat.replace("<CATEG>", "([^-]*)")
m_categ_search = re.match(categformat, rn)
if m_categ_search is not None:
if len(m_categ_search.groups()) > 0:
## Found a match for the category of this document. Get it:
category = m_categ_search.group(1)
else:
## This document has no category.
category = "unknown"
else:
category = "unknown"
## Get the referee email address:
if CFG_CERN_SITE:
## The referees system in CERN now works with listbox membership.
## List names should take the format
## "service-cds-referee-doctype-category@cern.ch"
## Make sure that your list exists!
## FIXME - to be replaced by a mailing alias in webaccess in the
## future.
referee_listname = "service-cds-referee-%s" % doctype.lower()
if category != "":
referee_listname += "-%s" % category.lower()
referee_listname += "@cern.ch"
addresses = referee_listname
else:
# Build referee's email address
refereeaddress = ""
# Try to retrieve the referee's email from the referee's database
for user in acc_get_role_users(acc_get_role_id("referee_%s_%s" % (doctype,category))):
refereeaddress += user[1] + ","
# And if there is a general referee
for user in acc_get_role_users(acc_get_role_id("referee_%s_*" % doctype)):
refereeaddress += user[1] + ","
refereeaddress = re.sub(",$","",refereeaddress)
# Creation of the mail for the referee
otheraddresses = otheraddresses.replace("<CATEG>",category)
addresses = ""
if refereeaddress != "":
addresses = refereeaddress + ","
if otheraddresses != "":
addresses += otheraddresses
else:
addresses = re.sub(",$","",addresses)
## Add the record's submitter(s) into the list of recipients:
## Get the email address(es) of the record submitter(s)/owner(s) from
## the record itself:
record_owners = print_record(sysno, 'tm', \
[CFG_WEBSUBMIT_RECORD_OWNER_EMAIL]).strip()
if record_owners != "":
record_owners_list = record_owners.split("\n")
record_owners_list = [email.lower().strip() \
for email in record_owners_list]
else:
#if the record owner can not be retrieved from the metadata
#(in case the record has not been inserted yet),
#try to use the global variable emailvalue
try:
record_owners_list = [emailvalue]
except NameError:
record_owners_list = []
record_owners = ",".join([owner for owner in record_owners_list])
if record_owners != "":
addresses += ",%s" % record_owners
if decision == "approve":
mailtitle = "%s has been approved" % rn
mailbody = "The %s %s has been approved." % (docname,rn)
mailbody += "\nIt will soon be accessible here:\n\n<%s/%s/%s>" % (CFG_SITE_URL,CFG_SITE_RECORD,sysno)
else:
mailtitle = "%s has been rejected" % rn
mailbody = "The %s %s has been rejected." % (docname,rn)
if rn != newrn and decision == "approve" and newrn != "":
mailbody += "\n\nIts new reference number is: %s" % newrn
mailbody += "\n\nTitle: %s\n\nAuthor(s): %s\n\n" % (titlevalue,authorvalue)
if comment != "":
mailbody += "Comments from the referee:\n%s\n" % comment
# Send mail to referee if any recipients or copy to admin
if addresses or CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN:
scheduled_send_email(FROMADDR, addresses, mailtitle, mailbody,
copy_to_admin=CFG_WEBSUBMIT_COPY_MAILS_TO_ADMIN,
other_bibtasklet_arguments=['-I', str(sequence_id)])
return ""
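The <CATEG> mechanism documented in Send_APP_Mail's docstring can be sketched in isolation: "<CATEG>" in the format string becomes a capturing group, and matching against the document reference recovers the category (falling back to "unknown"). This is an illustrative extraction of the in-function logic, not an Invenio API:

```python
import re

def extract_category(categformat, reference):
    """Compute a document category by matching reference against categformat."""
    # Replace the <CATEG> placeholder by a capturing group (no dashes allowed).
    pattern = categformat.replace("<CATEG>", "([^-]*)")
    m = re.match(pattern, reference)
    if m is not None and m.groups():
        return m.group(1)
    # No match, or no capturing group: category is unknown.
    return "unknown"
```

For example, with categformat "TEST-<CATEG>-.*" and reference "TEST-CATEGORY1-2001-001" the computed category is "CATEGORY1".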
diff --git a/invenio/legacy/websubmit/functions/Set_Embargo.py b/invenio/legacy/websubmit/functions/Set_Embargo.py
index e851a88e3..9987d0ff2 100644
--- a/invenio/legacy/websubmit/functions/Set_Embargo.py
+++ b/invenio/legacy/websubmit/functions/Set_Embargo.py
@@ -1,58 +1,58 @@
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import os
import time
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
def Set_Embargo(parameters, curdir, form):
"""set the embargo on all the documents of a given record.
@param date_file: the file from which to read the embargo end date.
@param date_format: the format in which the date in L{date_file} is
expected to be found. (default C{%Y-%m-%d})
@note: this function should be used after any file that needs to
be added to a record has been added, that is after any call to
e.g. L{Move_Files_to_Storage}
@note: This function expects C{sysno} to exist and be set to the current record.
(that means it should be called after L{Get_Recid} or L{Create_Recid})
"""
## Let's retrieve the date from date_file.
date_file = parameters['date_file']
if not date_file:
return
date = open(os.path.join(curdir, date_file)).read().strip()
if not date:
return
## Let's retrieve the expected date format.
date_format = parameters['date_format'].strip()
if not date_format:
date_format = '%Y-%m-%d'
## Date normalization.
date = time.strftime("%Y-%m-%d", time.strptime(date, date_format))
## Let's prepare the firerole rule.
firerole = """\
deny until "%s"
allow all
""" % date
## Applying the embargo.
for bibdoc in BibRecDocs(sysno).list_bibdocs():
bibdoc.set_status("firerole: %s" % firerole)
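The date normalization and firerole rule construction performed by Set_Embargo can be sketched as a pure function: parse the incoming date in its declared format, re-emit it as %Y-%m-%d, and wrap it in the "deny until" rule. A minimal illustration, assuming the same rule template as above:

```python
import time

def build_embargo_rule(date, date_format='%Y-%m-%d'):
    """Normalize date to YYYY-MM-DD and build the firerole embargo rule."""
    # strptime parses the declared format; strftime re-emits the canonical one.
    normalized = time.strftime("%Y-%m-%d", time.strptime(date, date_format))
    return 'deny until "%s"\nallow all\n' % normalized
```

This makes the behavior easy to check for non-default formats such as '%d/%m/%Y' before wiring it into a submission.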
diff --git a/invenio/legacy/websubmit/functions/Shared_Functions.py b/invenio/legacy/websubmit/functions/Shared_Functions.py
index bc005c16f..6afe230ff 100644
--- a/invenio/legacy/websubmit/functions/Shared_Functions.py
+++ b/invenio/legacy/websubmit/functions/Shared_Functions.py
@@ -1,268 +1,268 @@
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Functions shared by websubmit_functions"""
__revision__ = "$Id$"
import os
import cgi
import glob
import sys
from logging import DEBUG
from invenio.config import \
CFG_PATH_CONVERT, \
CFG_SITE_LANG
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
from invenio.ext.logging import register_exception
-from invenio.websubmit_file_converter import convert_file, InvenioWebSubmitFileConverterError, get_missing_formats, get_file_converter_logger
+from invenio.legacy.websubmit.file_converter import convert_file, InvenioWebSubmitFileConverterError, get_missing_formats, get_file_converter_logger
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionError
from invenio.legacy.dbquery import run_sql
-from invenio.bibsched import server_pid
+from invenio.legacy.bibsched.scripts.bibsched import server_pid
from invenio.base.i18n import gettext_set_language
from invenio.legacy.search_engine import get_record
from invenio.legacy.bibrecord import record_get_field_values, record_get_field_value
def createRelatedFormats(fullpath, overwrite=True, debug=False):
"""Given a fullpath, this function extracts the file's extension and
finds into which additional formats the file can be converted and converts it.
@param fullpath: (string) complete path to file
@param overwrite: (bool) overwrite already existing formats
Return a list of the paths to the converted files
"""
file_converter_logger = get_file_converter_logger()
old_logging_level = file_converter_logger.getEffectiveLevel()
if debug:
file_converter_logger.setLevel(DEBUG)
try:
createdpaths = []
basedir, filename, extension = decompose_file(fullpath)
extension = extension.lower()
if debug:
print >> sys.stderr, "basedir: %s, filename: %s, extension: %s" % (basedir, filename, extension)
filelist = glob.glob(os.path.join(basedir, '%s*' % filename))
if debug:
print >> sys.stderr, "filelist: %s" % filelist
missing_formats = get_missing_formats(filelist)
if debug:
print >> sys.stderr, "missing_formats: %s" % missing_formats
for path, formats in missing_formats.iteritems():
if debug:
print >> sys.stderr, "... path: %s, formats: %s" % (path, formats)
for aformat in formats:
if debug:
print >> sys.stderr, "...... aformat: %s" % aformat
newpath = os.path.join(basedir, filename + aformat)
if debug:
print >> sys.stderr, "...... newpath: %s" % newpath
try:
convert_file(path, newpath)
createdpaths.append(newpath)
except InvenioWebSubmitFileConverterError, msg:
if debug:
print >> sys.stderr, "...... Exception: %s" % msg
register_exception(alert_admin=True)
finally:
if debug:
file_converter_logger.setLevel(old_logging_level)
return createdpaths
def createIcon(fullpath, iconsize):
"""Given a fullpath, this function extracts the file's extension and
if the format is compatible it converts it to an icon.
@param fullpath: (string) complete path to file
Return the iconpath if successful otherwise None
"""
basedir = os.path.dirname(fullpath)
filename = os.path.basename(fullpath)
filename, extension = os.path.splitext(filename)
if extension == filename:
extension = ""
iconpath = "%s/icon-%s.gif" % (basedir, filename)
if os.path.exists(fullpath) and extension.lower() in ['.pdf', '.gif', '.jpg', '.jpeg', '.ps']:
os.system("%s -scale %s %s %s" % (CFG_PATH_CONVERT, iconsize, fullpath, iconpath))
if os.path.exists(iconpath):
return iconpath
else:
return None
def get_dictionary_from_string(dict_string):
"""Given a string version of a "dictionary", split the string into a
python dictionary.
For example, given the following string:
{'TITLE' : 'EX_TITLE', 'AUTHOR' : 'EX_AUTHOR', 'REPORTNUMBER' : 'EX_RN'}
A dictionary in the following format will be returned:
{
'TITLE' : 'EX_TITLE',
'AUTHOR' : 'EX_AUTHOR',
'REPORTNUMBER' : 'EX_RN',
}
@param dict_string: (string) - the string version of the dictionary.
@return: (dictionary) - the dictionary built from the string.
"""
try:
# Evaluate the dictionary string in an empty local/global
# namespaces. An empty '__builtins__' variable is still
# provided, otherwise Python will add the real one for us,
# which would access to undesirable functions, such as
# 'file()', 'open()', 'exec()', etc.
evaluated_dict = eval(dict_string, {"__builtins__": {}}, {})
except:
evaluated_dict = {}
# Check that returned value is a dict. Do not check with
# isinstance() as we do not even want to match subclasses of dict.
if type(evaluated_dict) is dict:
return evaluated_dict
else:
return {}
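The restricted-eval idea used by get_dictionary_from_string can be shown compactly: passing an explicit empty __builtins__ mapping means the evaluated string cannot reach open(), __import__() and friends, and only a plain dict result is accepted. A condensed sketch of the same logic, for illustration only:

```python
def safe_dict_eval(dict_string):
    """Evaluate a dictionary literal with no builtins; return {} on any failure."""
    try:
        # Empty __builtins__ prevents the string from calling open(), etc.
        value = eval(dict_string, {"__builtins__": {}}, {})
    except Exception:
        return {}
    # type() is dict, not isinstance(): deliberately reject dict subclasses.
    return value if type(value) is dict else {}
```

Note that attempts to call builtins inside the string raise NameError and fall into the except branch, so hostile input degrades to an empty dict.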
def ParamFromFile(afile):
""" Pipe a multi-line file into a single parameter"""
parameter = ''
afile = afile.strip()
if afile == '': return parameter
try:
fp = open(afile, "r")
lines = fp.readlines()
for line in lines:
parameter = parameter + line
fp.close()
except IOError:
pass
return parameter
def write_file(filename, filedata):
"""Open FILENAME and write FILEDATA to it."""
filename1 = filename.strip()
try:
of = open(filename1,'w')
except IOError:
raise InvenioWebSubmitFunctionError('Cannot open ' + filename1 + ' to write')
of.write(filedata)
of.close()
return ""
def get_nice_bibsched_related_message(curdir, ln=CFG_SITE_LANG):
"""
@return: a message suitable to display to the user, explaining the current
status of the system.
@rtype: string
"""
bibupload_id = ParamFromFile(os.path.join(curdir, 'bibupload_id'))
if not bibupload_id:
## No BibUpload scheduled? Then we don't care about bibsched
return ""
## Let's get an estimate about how many processes are waiting in the queue.
## Our bibupload might be somewhere in it, but it's not really so important
## WRT informing the user.
_ = gettext_set_language(ln)
res = run_sql("SELECT id,proc,runtime,status,priority FROM schTASK WHERE (status='WAITING' AND runtime<=NOW()) OR status='SLEEPING'")
pre = _("Note that your submission has been inserted into the bibliographic task queue and is waiting for execution.\n")
if server_pid():
## BibSched is up and running
msg = _("The task queue is currently running in automatic mode, and there are currently %s tasks waiting to be executed. Your record should be available within a few minutes and searchable within an hour or thereabouts.\n") % (len(res))
else:
msg = _("Because of a human intervention or a temporary problem, the task queue is currently set to the manual mode. Your submission is well registered but may take longer than usual before it is fully integrated and searchable.\n")
return pre + msg
def txt2html(msg):
"""Transform newlines into paragraphs."""
rows = msg.split('\n')
rows = [cgi.escape(row) for row in rows]
rows = "<p>" + "</p><p>".join(rows) + "</p>"
return rows
def get_all_values_in_curdir(curdir):
"""
Return a dictionary with all the content of curdir.
@param curdir: the path to the current directory.
@type curdir: string
@return: the content
@rtype: dict
"""
ret = {}
for filename in os.listdir(curdir):
if not filename.startswith('.') and os.path.isfile(os.path.join(curdir, filename)):
ret[filename] = open(os.path.join(curdir, filename)).read().strip()
return ret
def get_current_record(curdir, system_number_file='SN'):
"""
Return the current record (in case it's being modified).
@param curdir: the path to the current directory.
@type curdir: string
@param system_number_file: is the name of the file on disk in curdir, that
is supposed to contain the record id.
@type system_number_file: string
@return: the record
@rtype: as in L{get_record}
"""
if os.path.exists(os.path.join(curdir, system_number_file)):
recid = open(os.path.join(curdir, system_number_file)).read().strip()
if recid:
recid = int(recid)
return get_record(recid)
return {}
def retrieve_field_values(curdir, field_name, separator=None, system_number_file='SN', tag=None):
"""
This is a handy function to retrieve values either from the current
submission directory, when a form has been just submitted, or from
an existing record (e.g. during MBI action).
@param curdir: is the current submission directory.
@type curdir: string
@param field_name: is the form field name that might exists on disk.
@type field_name: string
@param separator: is an optional separator. If it exists, it will be used
to retrieve multiple values contained in the field.
@type separator: string
@param system_number_file: is the name of the file on disk in curdir, that
is supposed to contain the record id.
@type system_number_file: string
@param tag: is the full MARC tag (tag+ind1+ind2+code) that should
contain values. If not specified, only values in curdir will
be retrieved.
@type tag: 6-chars
@return: the field value(s).
@rtype: list of strings.
@note: if field_name exists in curdir it will take precedence over
retrieving the values from the record.
"""
field_file = os.path.join(curdir, field_name)
if os.path.exists(field_file):
field_value = open(field_file).read()
if separator is not None:
return [value.strip() for value in field_value.split(separator) if value.strip()]
else:
return [field_value.strip()]
elif tag is not None:
system_number_file = os.path.join(curdir, system_number_file)
if os.path.exists(system_number_file):
recid = int(open(system_number_file).read().strip())
record = get_record(recid)
if separator:
return record_get_field_values(record, tag[:3], tag[3], tag[4], tag[5])
else:
return [record_get_field_value(record, tag[:3], tag[3], tag[4], tag[5])]
return []
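The tag handling in retrieve_field_values relies on the 6-character MARC tag layout (3-digit tag + ind1 + ind2 + subfield code) being sliced positionally before it is handed to record_get_field_values. A trivial sketch of that decomposition, shown here only to make the slicing explicit:

```python
def split_marc_tag(tag):
    """Split a 6-char MARC tag like '8560_f' into (tag, ind1, ind2, code)."""
    # tag[:3] = field tag, tag[3] = ind1, tag[4] = ind2, tag[5] = subfield code.
    return tag[:3], tag[3], tag[4], tag[5]
```

For instance '8560_f' (the non-CERN record-owner email field above) yields ('856', '0', '_', 'f').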
diff --git a/invenio/legacy/websubmit/functions/Stamp_Replace_Single_File_Approval.py b/invenio/legacy/websubmit/functions/Stamp_Replace_Single_File_Approval.py
index 4a5089608..64d65c482 100644
--- a/invenio/legacy/websubmit/functions/Stamp_Replace_Single_File_Approval.py
+++ b/invenio/legacy/websubmit/functions/Stamp_Replace_Single_File_Approval.py
@@ -1,511 +1,511 @@
## This file is part of Invenio.
## Copyright (C) 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Stamp_Replace_Single_File_Approval: A function to allow a single file
that is already attached to a record to be stamped at approval time.
"""
__revision__ = "$Id$"
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
from invenio.ext.logging import register_exception
from invenio import websubmit_file_stamper
from invenio.legacy.websubmit.config import InvenioWebSubmitFunctionWarning, \
InvenioWebSubmitFunctionError, InvenioWebSubmitFileStamperError
import os.path
import re
import cgi
import time
def Stamp_Replace_Single_File_Approval(parameters, \
curdir, \
form, \
user_info=None):
"""
This function is intended to be called when a document has been
approved and needs to be stamped.
The function should be used when there is ONLY ONE file to be
stamped after approval (for example, the "main file").
The name of the file to be stamped should be known and should be stored
in a file in the submission's working directory (without the extension).
Generally, this will work out fine as the main file is named after the
report number of the document, which will be stored in the report number
file.
@param parameters: (dictionary) - must contain:
+ latex_template: (string) - the name of the LaTeX template that
should be used for the creation of the stamp.
+ latex_template_vars: (string) - a string-ified dictionary
of variables to be replaced in the LaTeX template and the
values (or names of files in curdir containing the values)
with which to replace them. Use prefix 'FILE:' to specify
that the stamped value must be read from a file in
submission directory instead of being a fixed value to
stamp.
E.G.:
{ 'TITLE' : 'FILE:DEMOTHESIS_TITLE',
'DATE' : 'FILE:DEMOTHESIS_DATE'
}
+ file_to_be_stamped: (string) - this is the name of a file in the
submission's working directory that contains the name of the
bibdocfile that is to be stamped.
+ new_file_name: (string) - this is the name of a file in the
submission's working directory that contains the name that is to
be given to the file after it has been stamped. If empty, or if
that file doesn't exist, the file will not be renamed after
stamping.
+ switch_file: (string) - when this value is set, specifies
the name of a file that will switch on/off the
stamping. The stamp will be applied if the file exists in
the submission directory and is not empty. If the file
cannot be found or is empty, the stamp is not applied.
Useful e.g. if you want to let your users control the
stamping with a checkbox on your submission page.
Leave this parameter empty to always stamp by default.
+ stamp: (string) - the type of stamp to be applied to the file.
should be one of:
+ first (only the first page is stamped);
+ all (all pages are stamped);
+ coverpage (a separate cover-page is added to the file as a
first page);
+ layer: (string) - the position of the stamp. Should be one of:
+ background (invisible if original file has a white
-not transparent- background layer)
+ foreground (on top of the stamped file. If the stamp
does not have a transparent background, will hide all
of the document layers)
The default value is 'background'.
"""
############
## Definition of important variables:
############
## The file stamper needs to be called with a dictionary of options of
## the following format:
## { 'latex-template' : "", ## TEMPLATE_NAME
## 'latex-template-var' : {}, ## TEMPLATE VARIABLES
## 'input-file' : "", ## INPUT FILE
## 'output-file' : "", ## OUTPUT FILE
## 'stamp' : "", ## STAMP TYPE
## 'layer' : "", ## LAYER TO STAMP
## 'verbosity' : 0, ## VERBOSITY (we don't care about it)
## }
file_stamper_options = { 'latex-template' : "",
'latex-template-var' : { },
'input-file' : "",
'output-file' : "",
'stamp' : "",
'layer' : "",
'verbosity' : 0,
}
## Check if stamping is enabled
switch_file = parameters.get('switch_file', '')
if switch_file:
# Good, a "switch file" was specified. Check if it exists, and
# if its value is not empty.
if not _read_in_file(os.path.join(curdir, switch_file)):
# File does not exist, or is empty. Silently abort
# stamping.
return ""
## Submission access number:
access = _read_in_file("%s/access" % curdir)
## record ID for the current submission. It is found in the special file
## "SN" (sysno) in curdir:
recid = _read_in_file("%s/SN" % curdir)
try:
recid = int(recid)
except ValueError:
## No record ID. Cannot continue.
err_msg = "Error in Stamp_Replace_Single_File_Approval: " \
"Cannot recover record ID from the submission's working " \
"directory. Stamping cannot be carried out. The " \
"submission ID is [%s]." % cgi.escape(access)
register_exception(prefix=err_msg)
raise InvenioWebSubmitFunctionError(err_msg)
############
## Resolution of function parameters:
############
## The name of the LaTeX template to be used for stamp creation:
latex_template = "%s" % ((type(parameters['latex_template']) is str \
and parameters['latex_template']) or "")
## A string containing the variables/values that should be substituted
## in the final (working) LaTeX template:
latex_template_vars_string = "%s" % \
((type(parameters['latex_template_vars']) is str \
and parameters['latex_template_vars']) or "")
## The type of stamp to be applied to the file(s):
stamp = "%s" % ((type(parameters['stamp']) is str and \
parameters['stamp'].lower()) or "")
## The layer to use for stamping:
try:
layer = parameters['layer']
except KeyError:
layer = "background"
if not layer in ('background', 'foreground'):
layer = "background"
## Get the name of the file to be stamped from the file indicated in
## the file_to_be_stamped parameter:
try:
file_to_stamp_file = parameters['file_to_be_stamped']
except KeyError:
file_to_stamp_file = ""
else:
if file_to_stamp_file is None:
file_to_stamp_file = ""
## Get the "basename" for the file to be stamped (it's mandatory that it
## be in curdir):
file_to_stamp_file = os.path.basename(file_to_stamp_file).strip()
name_file_to_stamp = _read_in_file("%s/%s" % (curdir, file_to_stamp_file))
name_file_to_stamp = name_file_to_stamp.replace("\n", "").replace("\r", "")
##
## Get the name to be given to the file after it has been stamped (if there
## is one.) Once more, it will be found in a file in curdir:
try:
new_file_name_file = parameters['new_file_name']
except KeyError:
new_file_name_file = ""
else:
if new_file_name_file is None:
new_file_name_file = ""
## Get the "basename" for the file containing the new file name. (It's
## mandatory that it be in curdir):
new_file_name_file = os.path.basename(new_file_name_file).strip()
new_file_name = _read_in_file("%s/%s" % (curdir, new_file_name_file))
############
## Begin:
############
##
## If no name could be recovered for the file to stamp, raise a warning:
if name_file_to_stamp == "":
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"It was not possible to recover a valid name for the " \
"file to be stamped. Stamping could not, therefore, be " \
"carried out. The submission ID is [%s]." \
% access
raise InvenioWebSubmitFunctionWarning(wrn_msg)
##
## The file to be stamped is a bibdoc. We will only stamp it (a) if it
## exists; and (b) if it is a PDF file. So, get the path (in the bibdocs
## tree) to the file to be stamped:
##
## First get the object representing the bibdocs belonging to this record:
bibrecdocs = BibRecDocs(recid)
try:
bibdoc_file_to_stamp = bibrecdocs.get_bibdoc("%s" % name_file_to_stamp)
except InvenioBibDocFileError:
## Couldn't get a bibdoc object for this filename. Probably the file
## that we wanted to stamp wasn't attached to this record.
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"It was not possible to recover a bibdoc object for the " \
"filename [%s] when trying to stamp the main file. " \
"Stamping could not be carried out. The submission ID is " \
"[%s] and the record ID is [%s]." \
% (name_file_to_stamp, access, recid)
register_exception(prefix=wrn_msg)
raise InvenioWebSubmitFunctionWarning(wrn_msg)
## Get the BibDocFile object for the PDF version of the bibdoc to be
## stamped:
try:
bibdocfile_file_to_stamp = bibdoc_file_to_stamp.get_file("pdf")
except InvenioBibDocFileError:
## This bibdoc doesn't have a physical file with the extension ".pdf"
## (note the lower-case extension - the bibdocfile library is
## case-sensitive with respect to filenames). Log that there was
## no "pdf" and check for a file with the extension "PDF":
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"It wasn't possible to recover a PDF BibDocFile object " \
"for the file with the name [%s], using the extension " \
"[pdf] - note the lower case - the bibdocfile library " \
"relies upon the case of an extension. The submission ID " \
"is [%s] and the record ID is [%s]. Going to try " \
"looking for a file with a [PDF] extension before giving " \
"up . . . " \
% (name_file_to_stamp, access, recid)
register_exception(prefix=wrn_msg)
try:
bibdocfile_file_to_stamp = bibdoc_file_to_stamp.get_file("PDF")
except InvenioBibDocFileError:
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"It wasn't possible to recover a PDF " \
"BibDocFile object for the file with the name [%s], " \
"using the extension [PDF] - note the upper case. " \
"Had previously tried searching for [pdf] - now " \
"giving up. Stamping could not be carried out. " \
"The submission ID is [%s] and the record ID is [%s]." \
% (name_file_to_stamp, access, recid)
register_exception(prefix=wrn_msg)
raise InvenioWebSubmitFunctionWarning(wrn_msg)
############
## Go ahead and prepare the details for the LaTeX stamp template and its
## variables:
############
## Strip the LaTeX filename into the basename (All templates should be
## in the template repository):
latex_template = os.path.basename(latex_template)
## Convert the string of latex template variables into a dictionary
## of search-term/replacement-term pairs:
latex_template_vars = get_dictionary_from_string(latex_template_vars_string)
## For each of the latex variables, check in `CURDIR' for a file with that
## name. If found, use its contents as the template-variable's value.
## If not, just use the raw value string already held by the template
## variable:
latex_template_varnames = latex_template_vars.keys()
for varname in latex_template_varnames:
## Get this variable's value:
varvalue = latex_template_vars[varname].strip()
if not ((varvalue.find("date(") == 0 and varvalue[-1] == ")") or \
(varvalue.find("include(") == 0 and varvalue[-1] == ")")) \
and varvalue != "":
## We don't want to interfere with date() or include() directives,
## so we only do this if the variable value didn't contain them:
##
## Is this variable value the name of a file in the current
## submission's working directory, from which a literal value for
## use in the template should be extracted? If yes, it will
## begin with "FILE:". If no, we leave the value exactly as it is.
if varvalue.upper().find("FILE:") == 0:
## The value to be used is to be taken from a file. Clean the
## file name and if it's OK, extract that value from the file.
##
seekvalue_fname = varvalue[5:].strip()
seekvalue_fname = os.path.basename(seekvalue_fname).strip()
if seekvalue_fname != "":
## Attempt to extract the value from the file:
if os.access("%s/%s" % (curdir, seekvalue_fname), \
os.R_OK|os.F_OK):
## The file exists. Extract its value:
try:
repl_file_val = \
open("%s/%s" \
% (curdir, seekvalue_fname), "r").readlines()
except IOError:
## The file was unreadable.
err_msg = "Error in Stamp_Replace_Single_File_" \
"Approval: The function attempted to " \
"read a LaTex template variable " \
"value from the following file in the " \
"current submission's working " \
"directory: [%s]. However, an " \
"unexpected error was encountered " \
"when doing so. Please inform the " \
"administrator." \
% seekvalue_fname
register_exception(req=user_info['req'])
raise InvenioWebSubmitFunctionError(err_msg)
else:
final_varval = ""
for line in repl_file_val:
final_varval += line
final_varval = final_varval.rstrip()
## Replace the variable value with that which has
## been read from the file:
latex_template_vars[varname] = final_varval
else:
## The file didn't actually exist in the current
## submission's working directory. Use an empty
## value:
latex_template_vars[varname] = ""
else:
## The filename was not valid.
err_msg = "Error in Stamp_Replace_Single_File_Approval: " \
"The function was configured to read a LaTeX " \
"template variable from a file with the " \
"following instruction: [%s --> %s]. The " \
"filename, however, was not considered valid. " \
"Please report this to the administrator." \
% (varname, varvalue)
raise InvenioWebSubmitFunctionError(err_msg)
## Put the 'fixed' values into the file_stamper_options dictionary:
file_stamper_options['latex-template'] = latex_template
file_stamper_options['latex-template-var'] = latex_template_vars
file_stamper_options['stamp'] = stamp
file_stamper_options['layer'] = layer
## Put the input file and output file into the file_stamper_options
## dictionary:
file_stamper_options['input-file'] = bibdocfile_file_to_stamp.fullpath
file_stamper_options['output-file'] = bibdocfile_file_to_stamp.get_full_name()
##
## Before attempting to stamp the file, log the dictionary of arguments
## that will be passed to websubmit_file_stamper:
try:
fh_log = open("%s/websubmit_file_stamper-calls-options.log" \
% curdir, "a+")
fh_log.write("%s\n" % file_stamper_options)
fh_log.flush()
fh_log.close()
except IOError:
## Unable to log the file stamper options.
exception_prefix = "Unable to write websubmit_file_stamper " \
"options to log file " \
"%s/websubmit_file_stamper-calls-options.log" \
% curdir
register_exception(prefix=exception_prefix)
try:
## Try to stamp the file:
(stamped_file_path_only, stamped_file_name) = \
websubmit_file_stamper.stamp_file(file_stamper_options)
except InvenioWebSubmitFileStamperError:
## It wasn't possible to stamp this file.
## Register the exception along with an informational message:
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"There was a problem stamping the file with the name [%s] " \
"and the fullpath [%s]. The file has not been stamped. " \
"The submission ID is [%s] and the record ID is [%s]." \
% (name_file_to_stamp, \
file_stamper_options['input-file'], \
access, \
recid)
register_exception(prefix=wrn_msg)
raise InvenioWebSubmitFunctionWarning(wrn_msg)
else:
## Stamping was successful. The BibDocFile must now be revised with
## the latest (stamped) version of the file:
file_comment = "Stamped by WebSubmit: %s" \
% time.strftime("%d/%m/%Y", time.localtime())
try:
dummy = \
bibrecdocs.add_new_version("%s/%s" \
% (stamped_file_path_only, \
stamped_file_name), \
name_file_to_stamp, \
comment=file_comment, \
flags=('STAMPED', ))
except InvenioBibDocFileError:
## Unable to revise the file with the newly stamped version.
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval: " \
"After having stamped the file with the name [%s] " \
"and the fullpath [%s], it wasn't possible to revise " \
"that file with the newly stamped version. Stamping " \
"was unsuccessful. The submission ID is [%s] and the " \
"record ID is [%s]." \
% (name_file_to_stamp, \
file_stamper_options['input-file'], \
access, \
recid)
register_exception(prefix=wrn_msg)
raise InvenioWebSubmitFunctionWarning(wrn_msg)
else:
## File revised. If the file should be renamed after stamping,
## do so.
if new_file_name != "":
try:
bibrecdocs.change_name(newname=new_file_name, docid=bibdoc_file_to_stamp.id)
except (IOError, InvenioBibDocFileError):
## Unable to change the name
wrn_msg = "Warning in Stamp_Replace_Single_File_Approval" \
": After having stamped and revised the file " \
"with the name [%s] and the fullpath [%s], it " \
"wasn't possible to rename it to [%s]. The " \
"submission ID is [%s] and the record ID is " \
"[%s]." \
% (name_file_to_stamp, \
file_stamper_options['input-file'], \
new_file_name, \
access, \
recid)
register_exception(prefix=wrn_msg)
## Finished.
return ""
def get_dictionary_from_string(dict_string):
"""Given a string version of a "dictionary", split the string into a
Python dictionary.
For example, given the following string:
{'TITLE' : 'EX_TITLE', 'AUTHOR' : 'EX_AUTHOR', 'REPORTNUMBER' : 'EX_RN'}
A dictionary in the following format will be returned:
{
'TITLE' : 'EX_TITLE',
'AUTHOR' : 'EX_AUTHOR',
'REPORTNUMBER' : 'EX_RN',
}
@param dict_string: (string) - the string version of the dictionary.
@return: (dictionary) - the dictionary built from the string.
"""
## First, strip off the leading and trailing spaces and braces:
dict_string = dict_string.strip(" {}")
## Next, split the string on commas (,) that have not been escaped
## So, the following string: """'hello' : 'world', 'click' : 'here'"""
## will be split into the following list:
## ["'hello' : 'world'", " 'click' : 'here'"]
##
## However, the string """'hello\, world' : '!', 'click' : 'here'"""
## will be split into: ["'hello\, world' : '!'", " 'click' : 'here'"]
## I.e. the comma that was escaped in the string has been kept.
##
## So basically, first split on unescaped commas:
key_vals = re.split(r'(?<!\\),', dict_string)
## Now we should have a list of "key" : "value" terms. For each of them,
## check that it is well-formed. If it is not in the format
## "Key" : "Value" (quotes are optional), discard it. Unlike the comma
## split above, each term is split on the first colon (:) ONLY.
final_dictionary = {}
for key_value_string in key_vals:
## Split the pair apart, based on the first ":":
key_value_pair = key_value_string.split(":", 1)
## check that the length of the new list is 2:
if len(key_value_pair) != 2:
## There was a problem with the splitting - skip this pair
continue
## The split was made.
## strip white-space, single-quotes and double-quotes from around the
## key and value pairs:
key_term = key_value_pair[0].strip(" '\"")
value_term = key_value_pair[1].strip(" '\"")
## Is the left-side (key) term empty?
if len(key_term) == 0:
continue
## Now, add the search-replace pair to the dictionary of
## search-replace terms:
final_dictionary[key_term] = value_term
return final_dictionary
def _read_in_file(filepath):
"""Read the contents of a file into a string in memory.
@param filepath: (string) - the path to the file to be read in.
@return: (string) - the contents of the file.
"""
if filepath != "" and \
os.path.exists("%s" % filepath):
try:
fh_filepath = open("%s" % filepath, "r")
file_contents = fh_filepath.read()
fh_filepath.close()
except IOError:
register_exception()
file_contents = ""
else:
file_contents = ""
return file_contents
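The legacy helper above opens the file without a context manager (harmless under CPython's reference counting, but fragile elsewhere). An equivalent sketch in more modern style - the name `read_in_file` is hypothetical, and the error handling mirrors `_read_in_file` (missing or unreadable files yield an empty string):

```python
import os

def read_in_file(filepath):
    """Return the contents of filepath, or "" if it is missing/unreadable."""
    if not filepath or not os.path.exists(filepath):
        return ""
    try:
        # The with-statement guarantees the handle is closed, even on error.
        with open(filepath, "r") as fh:
            return fh.read()
    except IOError:
        return ""
```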
diff --git a/invenio/legacy/websubmit/inveniounoconv.py b/invenio/legacy/websubmit/inveniounoconv.py
index 4ced1394f..fdf2e7f84 100644
--- a/invenio/legacy/websubmit/inveniounoconv.py
+++ b/invenio/legacy/websubmit/inveniounoconv.py
@@ -1,1187 +1,1187 @@
#!@OPENOFFICE_PYTHON@
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Run-Unoconv-as-nobody wrapper.
"""
### This program is free software; you can redistribute it and/or modify
### it under the terms of the GNU General Public License as published by
### the Free Software Foundation; version 2 only
###
### This program is distributed in the hope that it will be useful,
### but WITHOUT ANY WARRANTY; without even the implied warranty of
### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
### GNU General Public License for more details.
###
### You should have received a copy of the GNU General Public License
### along with this program; if not, write to the Free Software
### Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
### Copyright 2007-2010 Dag Wieers <dag@wieers.com>
from distutils.version import LooseVersion
import getopt
import glob
import os
import subprocess
import sys
import time
import signal
import errno
from invenio.flaskshell import *
-from invenio.websubmit_file_converter import CFG_OPENOFFICE_TMPDIR
+from invenio.legacy.websubmit.file_converter import CFG_OPENOFFICE_TMPDIR
CFG_SOFFICE_PID = os.path.join(CFG_OPENOFFICE_TMPDIR, 'soffice.pid')
__version__ = "$Revision$"
# $Source$
VERSION = '0.6'
doctypes = ('document', 'graphics', 'presentation', 'spreadsheet')
global convertor, office, ooproc, product
ooproc = None
exitcode = 0
class Office:
def __init__(self, basepath, urepath, unopath, pyuno, binary, python, pythonhome):
self.basepath = basepath
self.urepath = urepath
self.unopath = unopath
self.pyuno = pyuno
self.binary = binary
self.python = python
self.pythonhome = pythonhome
def __str__(self):
return self.basepath
def __repr__(self):
return self.basepath
### The first thing we ought to do is find a suitable Office installation
### with a compatible pyuno library that we can import.
###
### See: http://user.services.openoffice.org/en/forum/viewtopic.php?f=45&t=36370&p=166783
def find_offices():
ret = []
extrapaths = []
### Try using UNO_PATH first (in many incarnations, we'll see what sticks)
if 'UNO_PATH' in os.environ:
extrapaths += [ os.environ['UNO_PATH'],
os.path.dirname(os.environ['UNO_PATH']),
os.path.dirname(os.path.dirname(os.environ['UNO_PATH'])) ]
else:
if os.name in ( 'nt', 'os2' ):
if 'PROGRAMFILES' in os.environ.keys():
extrapaths += glob.glob(os.environ['PROGRAMFILES']+'\\LibreOffice*') + \
glob.glob(os.environ['PROGRAMFILES']+'\\OpenOffice.org*')
if 'PROGRAMFILES(X86)' in os.environ.keys():
extrapaths += glob.glob(os.environ['PROGRAMFILES(X86)']+'\\LibreOffice*') + \
glob.glob(os.environ['PROGRAMFILES(X86)']+'\\OpenOffice.org*')
elif os.name in ( 'mac', ) or sys.platform in ( 'darwin', ):
extrapaths += [ '/Applications/LibreOffice.app/Contents',
'/Applications/NeoOffice.app/Contents',
'/Applications/OpenOffice.org.app/Contents' ]
else:
extrapaths += glob.glob('/usr/lib*/libreoffice*') + \
glob.glob('/usr/lib*/openoffice*') + \
glob.glob('/usr/lib*/ooo*') + \
glob.glob('/opt/libreoffice*') + \
glob.glob('/opt/openoffice*') + \
glob.glob('/opt/ooo*') + \
glob.glob('/usr/local/libreoffice*') + \
glob.glob('/usr/local/openoffice*') + \
glob.glob('/usr/local/ooo*') + \
glob.glob('/usr/local/lib/libreoffice*')
### Find a working set for python UNO bindings
for basepath in extrapaths:
if os.name in ( 'nt', 'os2' ):
officelibraries = ( 'pyuno.pyd', )
officebinaries = ( 'soffice.exe' ,)
pythonbinaries = ( 'python.exe', )
pythonhomes = ()
elif os.name in ( 'mac', ) or sys.platform in ( 'darwin', ):
officelibraries = ( 'pyuno.so', 'pyuno.dylib' )
officebinaries = ( 'soffice.bin', )
pythonbinaries = ( 'python.bin', 'python' )
pythonhomes = ( 'OOoPython.framework/Versions/*/lib/python*', )
else:
officelibraries = ( 'pyuno.so', )
officebinaries = ( 'soffice.bin', )
pythonbinaries = ( 'python.bin', 'python', )
pythonhomes = ( 'python-core-*', )
### Older LibreOffice/OpenOffice and Windows use basis-link/ or basis/
libpath = 'error'
for basis in ( 'basis-link', 'basis', '' ):
for lib in officelibraries:
if os.path.isfile(os.path.join(basepath, basis, 'program', lib)):
libpath = os.path.join(basepath, basis, 'program')
officelibrary = os.path.join(libpath, lib)
info(3, "Found %s in %s" % (lib, libpath))
# Break the inner loop...
break
# Continue if the inner loop wasn't broken.
else:
continue
# Inner loop was broken, break the outer.
break
else:
continue
### MacOSX has its soffice binaries installed in the MacOS subdirectory, not program
unopath = 'error'
for basis in ( 'basis-link', 'basis', '' ):
for bin in officebinaries:
if os.path.isfile(os.path.join(basepath, basis, 'program', bin)):
unopath = os.path.join(basepath, basis, 'program')
officebinary = os.path.join(unopath, bin)
info(3, "Found %s in %s" % (bin, unopath))
# Break the inner loop...
break
# Continue if the inner loop wasn't broken.
else:
continue
# Inner loop was broken, break the outer.
break
else:
continue
### Windows/MacOSX may not provide or need a URE/lib directory
urepath = ''
for basis in ( 'basis-link', 'basis', '' ):
for ure in ( 'ure-link', 'ure', 'URE', '' ):
if os.path.isfile(os.path.join(basepath, basis, ure, 'lib', 'unorc')):
urepath = os.path.join(basepath, basis, ure)
info(3, "Found %s in %s" % ('unorc', os.path.join(urepath, 'lib')))
# Break the inner loop...
break
# Continue if the inner loop wasn't broken.
else:
continue
# Inner loop was broken, break the outer.
break
pythonhome = None
for home in pythonhomes:
if glob.glob(os.path.join(libpath, home)):
pythonhome = glob.glob(os.path.join(libpath, home))[0]
info(3, "Found %s in %s" % (home, pythonhome))
break
# if not os.path.isfile(os.path.join(basepath, program, officebinary)):
# continue
# info(3, "Found %s in %s" % (officebinary, os.path.join(basepath, program)))
# if not glob.glob(os.path.join(basepath, basis, program, 'python-core-*')):
# continue
for pythonbinary in pythonbinaries:
if os.path.isfile(os.path.join(unopath, pythonbinary)):
info(3, "Found %s in %s" % (pythonbinary, unopath))
ret.append(Office(basepath, urepath, unopath, officelibrary, officebinary,
os.path.join(unopath, pythonbinary), pythonhome))
else:
info(3, "Considering %s" % basepath)
ret.append(Office(basepath, urepath, unopath, officelibrary, officebinary,
sys.executable, None))
return ret
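The nested searches in `find_offices()` lean on Python's `for`/`else` clause ("Continue if the inner loop wasn't broken") to break out of two loops at once. A minimal standalone illustration of that idiom, with hypothetical names standing in for the `os.path.isfile` probes:

```python
def find_first_existing(basepaths, names, existing):
    """Return the first (basepath, name) pair present in `existing`,
    mimicking the nested-loop search in find_offices(). `existing`
    stands in for the os.path.isfile checks."""
    for basepath in basepaths:
        for name in names:
            if (basepath, name) in existing:
                # Break the inner loop...
                break
        else:
            # Inner loop ran to completion (no break): try next basepath.
            continue
        # Inner loop was broken: propagate the "break" to the outer loop.
        return (basepath, name)
    return None
```

The `else` on a `for` loop runs only when the loop finishes without `break`, so `continue` there skips to the next outer iteration, while a broken inner loop falls through to the `return`.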
def office_environ(office):
### Set PATH so that crash_report is found
os.environ['PATH'] = os.path.join(office.basepath, 'program') + os.pathsep + os.environ['PATH']
### Set UNO_PATH so that "officehelper.bootstrap()" can find soffice executable:
os.environ['UNO_PATH'] = office.unopath
### Set URE_BOOTSTRAP so that "uno.getComponentContext()" bootstraps a complete
### UNO environment
if os.name in ( 'nt', 'os2' ):
os.environ['URE_BOOTSTRAP'] = 'vnd.sun.star.pathname:' + os.path.join(office.basepath, 'program', 'fundamental.ini')
else:
os.environ['URE_BOOTSTRAP'] = 'vnd.sun.star.pathname:' + os.path.join(office.basepath, 'program', 'fundamentalrc')
### Set LD_LIBRARY_PATH so that "import pyuno" finds libpyuno.so:
if 'LD_LIBRARY_PATH' in os.environ:
os.environ['LD_LIBRARY_PATH'] = office.unopath + os.pathsep + \
os.path.join(office.urepath, 'lib') + os.pathsep + \
os.environ['LD_LIBRARY_PATH']
else:
os.environ['LD_LIBRARY_PATH'] = office.unopath + os.pathsep + \
os.path.join(office.urepath, 'lib')
if office.pythonhome:
for libpath in ( os.path.join(office.pythonhome, 'lib'),
os.path.join(office.pythonhome, 'lib', 'lib-dynload'),
os.path.join(office.pythonhome, 'lib', 'lib-tk'),
os.path.join(office.pythonhome, 'lib', 'site-packages'),
office.unopath):
sys.path.insert(0, libpath)
else:
### Still needed for system python using LibreOffice UNO bindings
### Although we prefer to use a system UNO binding in this case
sys.path.append(office.unopath)
def debug_office():
if 'URE_BOOTSTRAP' in os.environ:
print >> sys.stderr, 'URE_BOOTSTRAP=%s' % os.environ['URE_BOOTSTRAP']
if 'UNO_PATH' in os.environ:
print >> sys.stderr, 'UNO_PATH=%s' % os.environ['UNO_PATH']
if 'UNO_TYPES' in os.environ:
print >> sys.stderr, 'UNO_TYPES=%s' % os.environ['UNO_TYPES']
print >> sys.stderr, 'PATH=%s' % os.environ['PATH']
if 'PYTHONHOME' in os.environ:
print >> sys.stderr, 'PYTHONHOME=%s' % os.environ['PYTHONHOME']
if 'PYTHONPATH' in os.environ:
print >> sys.stderr, 'PYTHONPATH=%s' % os.environ['PYTHONPATH']
if 'LD_LIBRARY_PATH' in os.environ:
print >> sys.stderr, 'LD_LIBRARY_PATH=%s' % os.environ['LD_LIBRARY_PATH']
def python_switch(office):
if office.pythonhome:
os.environ['PYTHONHOME'] = office.pythonhome
os.environ['PYTHONPATH'] = os.path.join(office.pythonhome, 'lib') + os.pathsep + \
os.path.join(office.pythonhome, 'lib', 'lib-dynload') + os.pathsep + \
os.path.join(office.pythonhome, 'lib', 'lib-tk') + os.pathsep + \
os.path.join(office.pythonhome, 'lib', 'site-packages') + os.pathsep + \
office.unopath
os.environ['HOME'] = CFG_OPENOFFICE_TMPDIR
os.environ['UNO_PATH'] = office.unopath
info(3, "-> Switching from %s to %s" % (sys.executable, office.python))
if os.name in ('nt', 'os2'):
### os.execv is broken on Windows and can't properly parse command line
### arguments and executable name if they contain whitespaces. subprocess
### fixes that behavior.
ret = subprocess.call([office.python] + sys.argv[0:])
sys.exit(ret)
else:
### Set LD_LIBRARY_PATH so that "import pyuno" finds libpyuno.so:
if 'LD_LIBRARY_PATH' in os.environ:
os.environ['LD_LIBRARY_PATH'] = office.unopath + os.pathsep + \
os.path.join(office.urepath, 'lib') + os.pathsep + \
os.environ['LD_LIBRARY_PATH']
else:
os.environ['LD_LIBRARY_PATH'] = office.unopath + os.pathsep + \
os.path.join(office.urepath, 'lib')
try:
os.execvpe(office.python, [office.python, ] + sys.argv[0:], os.environ)
except OSError:
### Mac OS X versions prior to 10.6 do not support execv in
### a process that contains multiple threads. Instead of
### re-executing in the current process, start a new one
### and cause the current process to exit. This isn't
### ideal since the new process is detached from the parent
### terminal and thus cannot easily be killed with ctrl-C,
### but it's better than not being able to autoreload at
### all.
### Unfortunately the errno returned in this case does not
### appear to be consistent, so we can't easily check for
### this error specifically.
ret = os.spawnvpe(os.P_WAIT, office.python, [office.python, ] + sys.argv[0:], os.environ)
sys.exit(ret)
class Fmt:
def __init__(self, doctype, name, extension, summary, filter):
self.doctype = doctype
self.name = name
self.extension = extension
self.summary = summary
self.filter = filter
def __str__(self):
return "%s [.%s]" % (self.summary, self.extension)
def __repr__(self):
return "%s/%s" % (self.name, self.doctype)
class FmtList:
def __init__(self):
self.list = []
def add(self, doctype, name, extension, summary, filter):
self.list.append(Fmt(doctype, name, extension, summary, filter))
def byname(self, name):
ret = []
for fmt in self.list:
if fmt.name == name:
ret.append(fmt)
return ret
def byextension(self, extension):
ret = []
for fmt in self.list:
if os.extsep + fmt.extension == extension:
ret.append(fmt)
return ret
def bydoctype(self, doctype, name):
ret = []
for fmt in self.list:
if fmt.name == name and fmt.doctype == doctype:
ret.append(fmt)
return ret
def display(self, doctype):
print >> sys.stderr, "The following list of %s formats are currently available:\n" % doctype
for fmt in self.list:
if fmt.doctype == doctype:
print >> sys.stderr, " %-8s - %s" % (fmt.name, fmt)
print >> sys.stderr
fmts = FmtList()
### TextDocument
fmts.add('document', 'bib', 'bib', 'BibTeX', 'BibTeX_Writer') ### 22
fmts.add('document', 'doc', 'doc', 'Microsoft Word 97/2000/XP', 'MS Word 97') ### 29
fmts.add('document', 'doc6', 'doc', 'Microsoft Word 6.0', 'MS WinWord 6.0') ### 24
fmts.add('document', 'doc95', 'doc', 'Microsoft Word 95', 'MS Word 95') ### 28
fmts.add('document', 'docbook', 'xml', 'DocBook', 'DocBook File') ### 39
fmts.add('document', 'docx', 'docx', 'Microsoft Office Open XML', 'Office Open XML Text')
fmts.add('document', 'docx7', 'docx', 'Microsoft Office Open XML', 'MS Word 2007 XML')
fmts.add('document', 'fodt', 'fodt', 'OpenDocument Text (Flat XML)', 'OpenDocument Text Flat XML')
fmts.add('document', 'html', 'html', 'HTML Document (OpenOffice.org Writer)', 'HTML (StarWriter)') ### 3
fmts.add('document', 'latex', 'ltx', 'LaTeX 2e', 'LaTeX_Writer') ### 31
fmts.add('document', 'mediawiki', 'txt', 'MediaWiki', 'MediaWiki')
fmts.add('document', 'odt', 'odt', 'ODF Text Document', 'writer8') ### 10
fmts.add('document', 'ooxml', 'xml', 'Microsoft Office Open XML', 'MS Word 2003 XML') ### 11
fmts.add('document', 'ott', 'ott', 'Open Document Text', 'writer8_template') ### 21
fmts.add('document', 'pdb', 'pdb', 'AportisDoc (Palm)', 'AportisDoc Palm DB')
fmts.add('document', 'pdf', 'pdf', 'Portable Document Format', 'writer_pdf_Export') ### 18
fmts.add('document', 'psw', 'psw', 'Pocket Word', 'PocketWord File')
fmts.add('document', 'rtf', 'rtf', 'Rich Text Format', 'Rich Text Format') ### 16
fmts.add('document', 'sdw', 'sdw', 'StarWriter 5.0', 'StarWriter 5.0') ### 23
fmts.add('document', 'sdw4', 'sdw', 'StarWriter 4.0', 'StarWriter 4.0') ### 2
fmts.add('document', 'sdw3', 'sdw', 'StarWriter 3.0', 'StarWriter 3.0') ### 20
fmts.add('document', 'stw', 'stw', 'Open Office.org 1.0 Text Document Template', 'writer_StarOffice_XML_Writer_Template') ### 9
fmts.add('document', 'sxw', 'sxw', 'Open Office.org 1.0 Text Document', 'StarOffice XML (Writer)') ### 1
fmts.add('document', 'text', 'txt', 'Text Encoded', 'Text (encoded)') ### 26
fmts.add('document', 'txt', 'txt', 'Text', 'Text') ### 34
fmts.add('document', 'uot', 'uot', 'Unified Office Format text','UOF text') ### 27
fmts.add('document', 'vor', 'vor', 'StarWriter 5.0 Template', 'StarWriter 5.0 Vorlage/Template') ### 6
fmts.add('document', 'vor4', 'vor', 'StarWriter 4.0 Template', 'StarWriter 4.0 Vorlage/Template') ### 5
fmts.add('document', 'vor3', 'vor', 'StarWriter 3.0 Template', 'StarWriter 3.0 Vorlage/Template') ### 4
fmts.add('document', 'xhtml', 'html', 'XHTML Document', 'XHTML Writer File') ### 33
### WebDocument
fmts.add('web', 'etext', 'txt', 'Text Encoded (OpenOffice.org Writer/Web)', 'Text (encoded) (StarWriter/Web)') ### 14
fmts.add('web', 'html10', 'html', 'OpenOffice.org 1.0 HTML Template', 'writer_web_StarOffice_XML_Writer_Web_Template') ### 11
fmts.add('web', 'html', 'html', 'HTML Document', 'HTML') ### 2
fmts.add('web', 'html', 'html', 'HTML Document Template', 'writerweb8_writer_template') ### 13
fmts.add('web', 'mediawiki', 'txt', 'MediaWiki', 'MediaWiki_Web') ### 9
fmts.add('web', 'pdf', 'pdf', 'PDF - Portable Document Format', 'writer_web_pdf_Export') ### 10
fmts.add('web', 'sdw3', 'sdw', 'StarWriter 3.0 (OpenOffice.org Writer/Web)', 'StarWriter 3.0 (StarWriter/Web)') ### 3
fmts.add('web', 'sdw4', 'sdw', 'StarWriter 4.0 (OpenOffice.org Writer/Web)', 'StarWriter 4.0 (StarWriter/Web)') ### 4
fmts.add('web', 'sdw', 'sdw', 'StarWriter 5.0 (OpenOffice.org Writer/Web)', 'StarWriter 5.0 (StarWriter/Web)') ### 5
fmts.add('web', 'txt', 'txt', 'OpenOffice.org Text (OpenOffice.org Writer/Web)', 'writerweb8_writer') ### 12
fmts.add('web', 'text10', 'txt', 'OpenOffice.org 1.0 Text Document (OpenOffice.org Writer/Web)', 'writer_web_StarOffice_XML_Writer') ### 15
fmts.add('web', 'text', 'txt', 'Text (OpenOffice.org Writer/Web)', 'Text (StarWriter/Web)') ### 8
fmts.add('web', 'vor4', 'vor', 'StarWriter/Web 4.0 Template', 'StarWriter/Web 4.0 Vorlage/Template') ### 6
fmts.add('web', 'vor', 'vor', 'StarWriter/Web 5.0 Template', 'StarWriter/Web 5.0 Vorlage/Template') ### 7
### Spreadsheet
fmts.add('spreadsheet', 'csv', 'csv', 'Text CSV', 'Text - txt - csv (StarCalc)') ### 16
fmts.add('spreadsheet', 'dbf', 'dbf', 'dBASE', 'dBase') ### 22
fmts.add('spreadsheet', 'dif', 'dif', 'Data Interchange Format', 'DIF') ### 5
fmts.add('spreadsheet', 'fods', 'fods', 'OpenDocument Spreadsheet (Flat XML)', 'OpenDocument Spreadsheet Flat XML')
fmts.add('spreadsheet', 'html', 'html', 'HTML Document (OpenOffice.org Calc)', 'HTML (StarCalc)') ### 7
fmts.add('spreadsheet', 'ods', 'ods', 'ODF Spreadsheet', 'calc8') ### 15
fmts.add('spreadsheet', 'ooxml', 'xml', 'Microsoft Excel 2003 XML', 'MS Excel 2003 XML') ### 23
fmts.add('spreadsheet', 'ots', 'ots', 'ODF Spreadsheet Template', 'calc8_template') ### 14
fmts.add('spreadsheet', 'pdf', 'pdf', 'Portable Document Format', 'calc_pdf_Export') ### 34
fmts.add('spreadsheet', 'pxl', 'pxl', 'Pocket Excel', 'Pocket Excel')
fmts.add('spreadsheet', 'sdc', 'sdc', 'StarCalc 5.0', 'StarCalc 5.0') ### 31
fmts.add('spreadsheet', 'sdc4', 'sdc', 'StarCalc 4.0', 'StarCalc 4.0') ### 11
fmts.add('spreadsheet', 'sdc3', 'sdc', 'StarCalc 3.0', 'StarCalc 3.0') ### 29
fmts.add('spreadsheet', 'slk', 'slk', 'SYLK', 'SYLK') ### 35
fmts.add('spreadsheet', 'stc', 'stc', 'OpenOffice.org 1.0 Spreadsheet Template', 'calc_StarOffice_XML_Calc_Template') ### 2
fmts.add('spreadsheet', 'sxc', 'sxc', 'OpenOffice.org 1.0 Spreadsheet', 'StarOffice XML (Calc)') ### 3
fmts.add('spreadsheet', 'uos', 'uos', 'Unified Office Format spreadsheet', 'UOF spreadsheet') ### 9
fmts.add('spreadsheet', 'vor3', 'vor', 'StarCalc 3.0 Template', 'StarCalc 3.0 Vorlage/Template') ### 18
fmts.add('spreadsheet', 'vor4', 'vor', 'StarCalc 4.0 Template', 'StarCalc 4.0 Vorlage/Template') ### 19
fmts.add('spreadsheet', 'vor', 'vor', 'StarCalc 5.0 Template', 'StarCalc 5.0 Vorlage/Template') ### 20
fmts.add('spreadsheet', 'xhtml', 'xhtml', 'XHTML', 'XHTML Calc File') ### 26
fmts.add('spreadsheet', 'xls', 'xls', 'Microsoft Excel 97/2000/XP', 'MS Excel 97') ### 12
fmts.add('spreadsheet', 'xls5', 'xls', 'Microsoft Excel 5.0', 'MS Excel 5.0/95') ### 8
fmts.add('spreadsheet', 'xls95', 'xls', 'Microsoft Excel 95', 'MS Excel 95') ### 10
fmts.add('spreadsheet', 'xlt', 'xlt', 'Microsoft Excel 97/2000/XP Template', 'MS Excel 97 Vorlage/Template') ### 6
fmts.add('spreadsheet', 'xlt5', 'xlt', 'Microsoft Excel 5.0 Template', 'MS Excel 5.0/95 Vorlage/Template') ### 28
fmts.add('spreadsheet', 'xlt95', 'xlt', 'Microsoft Excel 95 Template', 'MS Excel 95 Vorlage/Template') ### 21
### Graphics
fmts.add('graphics', 'bmp', 'bmp', 'Windows Bitmap', 'draw_bmp_Export') ### 21
fmts.add('graphics', 'emf', 'emf', 'Enhanced Metafile', 'draw_emf_Export') ### 15
fmts.add('graphics', 'eps', 'eps', 'Encapsulated PostScript', 'draw_eps_Export') ### 48
fmts.add('graphics', 'fodg', 'fodg', 'OpenDocument Drawing (Flat XML)', 'OpenDocument Drawing Flat XML')
fmts.add('graphics', 'gif', 'gif', 'Graphics Interchange Format', 'draw_gif_Export') ### 30
fmts.add('graphics', 'html', 'html', 'HTML Document (OpenOffice.org Draw)', 'draw_html_Export') ### 37
fmts.add('graphics', 'jpg', 'jpg', 'Joint Photographic Experts Group', 'draw_jpg_Export') ### 3
fmts.add('graphics', 'met', 'met', 'OS/2 Metafile', 'draw_met_Export') ### 43
fmts.add('graphics', 'odd', 'odd', 'OpenDocument Drawing', 'draw8') ### 6
fmts.add('graphics', 'otg', 'otg', 'OpenDocument Drawing Template', 'draw8_template') ### 20
fmts.add('graphics', 'pbm', 'pbm', 'Portable Bitmap', 'draw_pbm_Export') ### 14
fmts.add('graphics', 'pct', 'pct', 'Mac Pict', 'draw_pct_Export') ### 41
fmts.add('graphics', 'pdf', 'pdf', 'Portable Document Format', 'draw_pdf_Export') ### 28
fmts.add('graphics', 'pgm', 'pgm', 'Portable Graymap', 'draw_pgm_Export') ### 11
fmts.add('graphics', 'png', 'png', 'Portable Network Graphic', 'draw_png_Export') ### 2
fmts.add('graphics', 'ppm', 'ppm', 'Portable Pixelmap', 'draw_ppm_Export') ### 5
fmts.add('graphics', 'ras', 'ras', 'Sun Raster Image', 'draw_ras_Export') ### 31
fmts.add('graphics', 'std', 'std', 'OpenOffice.org 1.0 Drawing Template', 'draw_StarOffice_XML_Draw_Template') ### 53
fmts.add('graphics', 'svg', 'svg', 'Scalable Vector Graphics', 'draw_svg_Export') ### 50
fmts.add('graphics', 'svm', 'svm', 'StarView Metafile', 'draw_svm_Export') ### 55
fmts.add('graphics', 'swf', 'swf', 'Macromedia Flash (SWF)', 'draw_flash_Export') ### 23
fmts.add('graphics', 'sxd', 'sxd', 'OpenOffice.org 1.0 Drawing', 'StarOffice XML (Draw)') ### 26
fmts.add('graphics', 'sxd3', 'sxd', 'StarDraw 3.0', 'StarDraw 3.0') ### 40
fmts.add('graphics', 'sxd5', 'sxd', 'StarDraw 5.0', 'StarDraw 5.0') ### 44
fmts.add('graphics', 'sxw', 'sxw', 'StarOffice XML (Draw)', 'StarOffice XML (Draw)')
fmts.add('graphics', 'tiff', 'tiff', 'Tagged Image File Format', 'draw_tif_Export') ### 13
fmts.add('graphics', 'vor', 'vor', 'StarDraw 5.0 Template', 'StarDraw 5.0 Vorlage') ### 36
fmts.add('graphics', 'vor3', 'vor', 'StarDraw 3.0 Template', 'StarDraw 3.0 Vorlage') ### 35
fmts.add('graphics', 'wmf', 'wmf', 'Windows Metafile', 'draw_wmf_Export') ### 8
fmts.add('graphics', 'xhtml', 'xhtml', 'XHTML', 'XHTML Draw File') ### 45
fmts.add('graphics', 'xpm', 'xpm', 'X PixMap', 'draw_xpm_Export') ### 19
### Presentation
fmts.add('presentation', 'bmp', 'bmp', 'Windows Bitmap', 'impress_bmp_Export') ### 15
fmts.add('presentation', 'emf', 'emf', 'Enhanced Metafile', 'impress_emf_Export') ### 16
fmts.add('presentation', 'eps', 'eps', 'Encapsulated PostScript', 'impress_eps_Export') ### 17
fmts.add('presentation', 'fodp', 'fodp', 'OpenDocument Presentation (Flat XML)', 'OpenDocument Presentation Flat XML')
fmts.add('presentation', 'gif', 'gif', 'Graphics Interchange Format', 'impress_gif_Export') ### 18
fmts.add('presentation', 'html', 'html', 'HTML Document (OpenOffice.org Impress)', 'impress_html_Export') ### 43
fmts.add('presentation', 'jpg', 'jpg', 'Joint Photographic Experts Group', 'impress_jpg_Export') ### 19
fmts.add('presentation', 'met', 'met', 'OS/2 Metafile', 'impress_met_Export') ### 20
fmts.add('presentation', 'odg', 'odg', 'ODF Drawing (Impress)', 'impress8_draw') ### 29
fmts.add('presentation', 'odp', 'odp', 'ODF Presentation', 'impress8') ### 9
fmts.add('presentation', 'otp', 'otp', 'ODF Presentation Template', 'impress8_template') ### 38
fmts.add('presentation', 'pbm', 'pbm', 'Portable Bitmap', 'impress_pbm_Export') ### 21
fmts.add('presentation', 'pct', 'pct', 'Mac Pict', 'impress_pct_Export') ### 22
fmts.add('presentation', 'pdf', 'pdf', 'Portable Document Format', 'impress_pdf_Export') ### 23
fmts.add('presentation', 'pgm', 'pgm', 'Portable Graymap', 'impress_pgm_Export') ### 24
fmts.add('presentation', 'png', 'png', 'Portable Network Graphic', 'impress_png_Export') ### 25
fmts.add('presentation', 'potm', 'potm', 'Microsoft PowerPoint 2007/2010 XML Template', 'Impress MS PowerPoint 2007 XML Template')
fmts.add('presentation', 'pot', 'pot', 'Microsoft PowerPoint 97/2000/XP Template', 'MS PowerPoint 97 Vorlage') ### 3
fmts.add('presentation', 'ppm', 'ppm', 'Portable Pixelmap', 'impress_ppm_Export') ### 26
fmts.add('presentation', 'pptx', 'pptx', 'Microsoft PowerPoint 2007/2010 XML', 'Impress MS PowerPoint 2007 XML') ### 36
fmts.add('presentation', 'pps', 'pps', 'Microsoft PowerPoint 97/2000/XP (Autoplay)', 'MS PowerPoint 97 Autoplay') ### 36
fmts.add('presentation', 'ppt', 'ppt', 'Microsoft PowerPoint 97/2000/XP', 'MS PowerPoint 97') ### 36
fmts.add('presentation', 'pwp', 'pwp', 'PlaceWare', 'placeware_Export') ### 30
fmts.add('presentation', 'ras', 'ras', 'Sun Raster Image', 'impress_ras_Export') ### 27
fmts.add('presentation', 'sda', 'sda', 'StarDraw 5.0 (OpenOffice.org Impress)', 'StarDraw 5.0 (StarImpress)') ### 8
fmts.add('presentation', 'sdd', 'sdd', 'StarImpress 5.0', 'StarImpress 5.0') ### 6
fmts.add('presentation', 'sdd3', 'sdd', 'StarDraw 3.0 (OpenOffice.org Impress)', 'StarDraw 3.0 (StarImpress)') ### 42
fmts.add('presentation', 'sdd4', 'sdd', 'StarImpress 4.0', 'StarImpress 4.0') ### 37
fmts.add('presentation', 'sxd', 'sxd', 'OpenOffice.org 1.0 Drawing (OpenOffice.org Impress)', 'impress_StarOffice_XML_Draw') ### 31
fmts.add('presentation', 'sti', 'sti', 'OpenOffice.org 1.0 Presentation Template', 'impress_StarOffice_XML_Impress_Template') ### 5
fmts.add('presentation', 'svg', 'svg', 'Scalable Vector Graphics', 'impress_svg_Export') ### 14
fmts.add('presentation', 'svm', 'svm', 'StarView Metafile', 'impress_svm_Export') ### 13
fmts.add('presentation', 'swf', 'swf', 'Macromedia Flash (SWF)', 'impress_flash_Export') ### 34
fmts.add('presentation', 'sxi', 'sxi', 'OpenOffice.org 1.0 Presentation', 'StarOffice XML (Impress)') ### 41
fmts.add('presentation', 'tiff', 'tiff', 'Tagged Image File Format', 'impress_tif_Export') ### 12
fmts.add('presentation', 'uop', 'uop', 'Unified Office Format presentation', 'UOF presentation') ### 4
fmts.add('presentation', 'vor', 'vor', 'StarImpress 5.0 Template', 'StarImpress 5.0 Vorlage') ### 40
fmts.add('presentation', 'vor3', 'vor', 'StarDraw 3.0 Template (OpenOffice.org Impress)', 'StarDraw 3.0 Vorlage (StarImpress)') ### 1
fmts.add('presentation', 'vor4', 'vor', 'StarImpress 4.0 Template', 'StarImpress 4.0 Vorlage') ### 39
fmts.add('presentation', 'vor5', 'vor', 'StarDraw 5.0 Template (OpenOffice.org Impress)', 'StarDraw 5.0 Vorlage (StarImpress)') ### 2
fmts.add('presentation', 'wmf', 'wmf', 'Windows Metafile', 'impress_wmf_Export') ### 11
fmts.add('presentation', 'xhtml', 'xml', 'XHTML', 'XHTML Impress File') ### 33
fmts.add('presentation', 'xpm', 'xpm', 'X PixMap', 'impress_xpm_Export') ### 10
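### The entries registered above go into a format registry that is later
### queried by name, extension, or doctype. The following is a minimal
### stand-in sketch of that registry; the real Fmt/Fmts classes are defined
### earlier in this script, and the simplified structure here is an assumption
### for illustration only.

```python
import os

# Simplified stand-in for unoconv's Fmt/Fmts classes (assumed structure).
class Fmt(object):
    def __init__(self, doctype, name, extension, summary, filter):
        self.doctype = doctype
        self.name = name
        self.extension = extension
        self.summary = summary
        self.filter = filter

class Fmts(list):
    def add(self, doctype, name, extension, summary, filter):
        self.append(Fmt(doctype, name, extension, summary, filter))

    def byname(self, name):
        # All formats registered under this short name (may span doctypes).
        return [fmt for fmt in self if fmt.name == name]

    def byextension(self, ext):
        # Lookup by extension including the leading separator, e.g. ".png".
        return [fmt for fmt in self if os.extsep + fmt.extension == ext]

fmts = Fmts()
fmts.add('graphics', 'png', 'png', 'Portable Network Graphic', 'draw_png_Export')
fmts.add('presentation', 'png', 'png', 'Portable Network Graphic', 'impress_png_Export')
```

### Later code such as fmts.byextension(os.extsep + op.format) then resolves
### an output format to the office filter name used for export.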
class Options:
def __init__(self, args):
self.connection = None
self.debug = False
self.doctype = None
self.exportfilter = []
self.filenames = []
self.format = None
self.importfilter = ""
self.listener = False
self.nolaunch = False
self.kill = False
self.output = None
self.password = None
self.pipe = None
self.port = '2002'
self.server = 'localhost'
self.showlist = False
self.stdout = False
self.template = None
self.timeout = 6
self.verbose = 0
self.remove = None
### Get options from the commandline
try:
opts, args = getopt.getopt (args, 'c:Dd:e:f:hi:Llko:np:s:T:t:vr:',
['connection=', 'debug', 'doctype=', 'export=', 'format=',
'help', 'import=', 'listener', 'kill', 'no-launch', 'output=',
'outputpath=', 'password=', 'pipe=', 'port=', 'server=',
'timeout=', 'show', 'stdout', 'template=', 'verbose',
'version', 'remove='] )
except getopt.error, exc:
print 'unoconv: %s, try unoconv -h for a list of all the options' % str(exc)
sys.exit(255)
for opt, arg in opts:
if opt in ['-h', '--help']:
self.usage()
print
self.help()
sys.exit(1)
elif opt in ['-c', '--connection']:
self.connection = arg
elif opt in ['--debug']:
self.debug = True
elif opt in ['-d', '--doctype']:
self.doctype = arg
elif opt in ['-e', '--export']:
l = arg.split('=')
if len(l) == 2:
(name, value) = l
if value in ('True', 'true'):
self.exportfilter.append( PropertyValue( name, 0, True, 0 ) )
elif value in ('False', 'false'):
self.exportfilter.append( PropertyValue( name, 0, False, 0 ) )
else:
try:
self.exportfilter.append( PropertyValue( name, 0, int(value), 0 ) )
except ValueError:
self.exportfilter.append( PropertyValue( name, 0, value, 0 ) )
else:
print >> sys.stderr, 'Warning: Option %s cannot be parsed, ignoring.' % arg
# self.exportfilter = arg
elif opt in ['-f', '--format']:
self.format = arg
elif opt in ['-i', '--import']:
self.importfilter = arg
elif opt in ['-l', '--listener']:
self.listener = True
elif opt in ['-k', '--kill']:
self.kill = True
elif opt in ['-n', '--no-launch']:
self.nolaunch = True
elif opt in ['-o', '--output']:
self.output = arg
elif opt in ['--outputpath']:
print >> sys.stderr, 'Warning: This option is deprecated by --output.'
self.output = arg
elif opt in ['--password']:
self.password = arg
elif opt in ['--pipe']:
self.pipe = arg
elif opt in ['-p', '--port']:
self.port = arg
elif opt in ['-s', '--server']:
self.server = arg
elif opt in ['--show']:
self.showlist = True
elif opt in ['--stdout']:
self.stdout = True
elif opt in ['-t', '--template']:
self.template = arg
elif opt in ['-T', '--timeout']:
self.timeout = int(arg)
elif opt in ['-v', '--verbose']:
self.verbose = self.verbose + 1
elif opt in ['-r', '--remove']:
self.remove = arg
elif opt in ['--version']:
self.version()
sys.exit(255)
### Enable verbosity
if self.verbose >= 2:
print >> sys.stderr, 'Verbosity set to level %d' % self.verbose
self.filenames = args
if self.remove:
if os.path.exists(self.remove):
os.remove(self.remove)
print >> sys.stderr, "%s file created by OpenOffice was successfully removed." % self.remove
sys.stderr.flush()
sys.exit(0)
if self.kill:
from invenio.utils.shell import run_shell_command
run_shell_command('killall %s', [os.path.basename(office.binary)])
time.sleep(1)
run_shell_command('killall -9 %s', [os.path.basename(office.binary)])
print >> sys.stderr, 'soffice.bin was hopefully already killed.'
sys.exit(0)
if not self.listener and not self.showlist and self.doctype != 'list' and not self.filenames:
print >> sys.stderr, 'unoconv: you have to provide a filename as argument'
print >> sys.stderr, 'Try `unoconv -h\' for more information.'
sys.exit(255)
### Set connection string
if not self.connection:
if not self.pipe:
self.connection = "socket,host=%s,port=%s;urp;StarOffice.ComponentContext" % (self.server, self.port)
# self.connection = "socket,host=%s,port=%s;urp;" % (self.server, self.port)
else:
self.connection = "pipe,name=%s;urp;StarOffice.ComponentContext" % (self.pipe)
### Make it easier for people to use a doctype (first letter is enough)
if self.doctype:
for doctype in doctypes:
if doctype.startswith(self.doctype):
self.doctype = doctype
### Check if the user request to see the list of formats
if self.showlist or self.format == 'list':
if self.doctype:
fmts.display(self.doctype)
else:
for t in doctypes:
fmts.display(t)
sys.exit(0)
### If no format was specified, probe it or provide it
if not self.format:
l = sys.argv[0].split('2')
if len(l) == 2:
self.format = l[1]
else:
self.format = 'pdf'
def version(self):
### Get office product information
product = uno.getComponentContext().ServiceManager.createInstance("com.sun.star.configuration.ConfigurationProvider").createInstanceWithArguments("com.sun.star.configuration.ConfigurationAccess", UnoProps(nodepath="/org.openoffice.Setup/Product"))
print 'unoconv %s' % VERSION
print 'Written by Dag Wieers <dag@wieers.com>'
print 'Patched to run within Invenio by <info@invenio-software.org>'
print 'Homepage at http://dag.wieers.com/home-made/unoconv/'
print
print 'platform %s/%s' % (os.name, sys.platform)
print 'python %s' % sys.version
print product.ooName, product.ooSetupVersion
print
print 'build revision $Rev$'
def usage(self):
print >> sys.stderr, 'usage: unoconv [options] file [file2 ..]'
def help(self):
print >> sys.stderr, '''Convert from and to any format supported by LibreOffice
unoconv options:
-c, --connection=string use a custom connection string
-d, --doctype=type specify document type
(document, graphics, presentation, spreadsheet)
-e, --export=name=value set export filter options
eg. -e PageRange=1-2
-f, --format=format specify the output format
-i, --import=string set import filter option string
eg. -i utf8
-l, --listener start a permanent listener to use by unoconv clients
-k, --kill kill any listener on the local machine (Invenio)
-r, --remove=filename remove a file created by LibreOffice (Invenio)
-n, --no-launch fail if no listener is found (default: launch one)
-o, --output=name output basename, filename or directory
--pipe=name alternative method of connection using a pipe
-p, --port=port specify the port (default: 2002)
to be used by client or listener
--password=string provide a password to decrypt the document
-s, --server=server specify the server address (default: localhost)
to be used by client or listener
--show list the available output formats
--stdout write output to stdout
-t, --template=file import the styles from template (.ott)
-T, --timeout=secs timeout after secs if connection to listener fails
-v, --verbose be more and more verbose (-vvv for debugging)
'''
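### The format probing at the end of Options.__init__ supports invoking
### unoconv through a symlink: if sys.argv[0] contains a "2" (e.g. a
### hypothetical "odt2doc" symlink), the part after it becomes the output
### format, otherwise "pdf" is the default. A standalone sketch of that rule:

```python
def probe_format(argv0):
    # Mirror of the argv[0] probing above: "odt2doc" -> "doc"; anything
    # without exactly one "2" falls back to the default "pdf".
    parts = argv0.split('2')
    if len(parts) == 2:
        return parts[1]
    return 'pdf'
```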
class Convertor:
def __init__(self):
global exitcode, ooproc, office, product
unocontext = None
### Do the LibreOffice component dance
self.context = uno.getComponentContext()
self.svcmgr = self.context.ServiceManager
resolver = self.svcmgr.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", self.context)
### Test for an existing connection
info(3, 'Connection type: %s' % op.connection)
try:
unocontext = resolver.resolve("uno:%s" % op.connection)
except NoConnectException, e:
# info(3, "Existing listener not found.\n%s" % e)
info(3, "Existing listener not found.")
if op.nolaunch:
die(113, "Existing listener not found. Unable to start listener by parameters. Aborting.")
### Start our own OpenOffice instance
info(3, "Launching our own listener using %s." % office.binary)
try:
product = self.svcmgr.createInstance("com.sun.star.configuration.ConfigurationProvider").createInstanceWithArguments("com.sun.star.configuration.ConfigurationAccess", UnoProps(nodepath="/org.openoffice.Setup/Product"))
if product.ooName != "LibreOffice" or LooseVersion(product.ooSetupVersion) <= LooseVersion('3.3'):
ooproc = subprocess.Popen([office.binary, "-headless", "-invisible", "-nocrashreport", "-nodefault", "-nofirststartwizard", "-nologo", "-norestore", "-accept=%s" % op.connection], env=os.environ)
else:
ooproc = subprocess.Popen([office.binary, "--headless", "--invisible", "--nocrashreport", "--nodefault", "--nofirststartwizard", "--nologo", "--norestore", "--accept=%s" % op.connection], env=os.environ)
info(2, '%s listener successfully started. (pid=%s)' % (product.ooName, ooproc.pid))
### Try connection to it for op.timeout seconds (flaky OpenOffice)
timeout = 0
while timeout <= op.timeout:
### Is it already/still running ?
retcode = ooproc.poll()
if retcode is not None:
info(3, "Process %s (pid=%s) exited with %s." % (office.binary, ooproc.pid, retcode))
break
try:
unocontext = resolver.resolve("uno:%s" % op.connection)
break
except NoConnectException:
time.sleep(0.5)
timeout += 0.5
except:
raise
else:
error("Failed to connect to %s (pid=%s) in %d seconds.\n%s" % (office.binary, ooproc.pid, op.timeout, e))
except Exception, e:
error("Launch of %s failed.\n%s" % (office.binary, e))
raise
if not unocontext:
die(251, "Unable to connect or start own listener. Aborting.")
### And some more LibreOffice magic
unosvcmgr = unocontext.ServiceManager
self.desktop = unosvcmgr.createInstanceWithContext("com.sun.star.frame.Desktop", unocontext)
self.cwd = unohelper.systemPathToFileUrl( os.getcwd() )
### List all filters
# self.filters = unosvcmgr.createInstanceWithContext( "com.sun.star.document.FilterFactory", unocontext)
# for filter in self.filters.getElementNames():
# print filter
# #print dir(filter), dir(filter.format)
def getformat(self, inputfn):
doctype = None
### Get the output format from mapping
if op.doctype:
outputfmt = fmts.bydoctype(op.doctype, op.format)
else:
outputfmt = fmts.byname(op.format)
if not outputfmt:
outputfmt = fmts.byextension(os.extsep + op.format)
### If no doctype given, check list of acceptable formats for input file ext doctype
### FIXME: This should go into the for-loop to match each individual input filename
if outputfmt:
inputext = os.path.splitext(inputfn)[1]
inputfmt = fmts.byextension(inputext)
if inputfmt:
for fmt in outputfmt:
if inputfmt[0].doctype == fmt.doctype:
doctype = inputfmt[0].doctype
outputfmt = fmt
break
else:
outputfmt = outputfmt[0]
# print >> sys.stderr, 'unoconv: format `%s\' is part of multiple doctypes %s, selecting `%s\'.' % (format, [fmt.doctype for fmt in outputfmt], outputfmt[0].doctype)
else:
outputfmt = outputfmt[0]
### No format found, throw error
if not outputfmt:
if doctype:
print >> sys.stderr, 'unoconv: format [%s/%s] is not known to unoconv.' % (op.doctype, op.format)
else:
print >> sys.stderr, 'unoconv: format [%s] is not known to unoconv.' % op.format
die(1)
return outputfmt
def convert(self, inputfn):
global exitcode
document = None
outputfmt = self.getformat(inputfn)
if op.verbose > 0:
print >> sys.stderr, 'Input file:', inputfn
if not os.path.exists(inputfn):
print >> sys.stderr, 'unoconv: file `%s\' does not exist.' % inputfn
exitcode = 1
try:
### Import phase
phase = "import"
### Load inputfile
inputprops = UnoProps(Hidden=True, ReadOnly=True, UpdateDocMode=QUIET_UPDATE, FilterOptions=op.importfilter)
# if op.password:
# info = UnoProps(algorithm-name="PBKDF2", salt="salt", iteration-count=1024, hash="hash")
# inputprops += UnoProps(ModifyPasswordInfo=info)
inputurl = unohelper.absolutize(self.cwd, unohelper.systemPathToFileUrl(inputfn))
# print dir(self.desktop)
document = self.desktop.loadComponentFromURL( inputurl , "_blank", 0, inputprops )
if not document:
raise UnoException("The document '%s' could not be opened." % inputurl, None)
### Import style template
phase = "import-style"
if op.template:
if os.path.exists(op.template):
info(1, "Template file: %s" % op.template)
templateprops = UnoProps(OverwriteStyles=True)
templateurl = unohelper.absolutize(self.cwd, unohelper.systemPathToFileUrl(op.template))
document.StyleFamilies.loadStylesFromURL(templateurl, templateprops)
else:
print >> sys.stderr, 'unoconv: template file `%s\' does not exist.' % op.template
exitcode = 1
### Update document links
phase = "update-links"
try:
document.updateLinks()
except AttributeError:
# the document doesn't implement the XLinkUpdate interface
pass
### Update document indexes
phase = "update-indexes"
try:
document.refresh()
indexes = document.getDocumentIndexes()
except AttributeError:
# the document doesn't implement the XRefreshable and/or
# XDocumentIndexesSupplier interfaces
pass
else:
for i in range(0, indexes.getCount()):
indexes.getByIndex(i).update()
info(1, "Selected output format: %s" % outputfmt)
info(2, "Selected office filter: %s" % outputfmt.filter)
info(2, "Used doctype: %s" % outputfmt.doctype)
### Export phase
phase = "export"
outputprops = UnoProps(FilterName=outputfmt.filter, OutputStream=OutputStream(), Overwrite=True)
# PropertyValue( "FilterData" , 0, ( PropertyValue( "SelectPdfVersion" , 0, 1 , uno.getConstantByName( "com.sun.star.beans.PropertyState.DIRECT_VALUE" ) ) ), uno.getConstantByName( "com.sun.star.beans.PropertyState.DIRECT_VALUE" ) ),
### Cannot use UnoProps for FilterData property
if op.exportfilter:
outputprops += ( PropertyValue( "FilterData", 0, uno.Any("[]com.sun.star.beans.PropertyValue", tuple( op.exportfilter ), ), 0 ), )
if outputfmt.filter == 'Text (encoded)':
outputprops += UnoProps(FilterOptions="UTF8, LF")
elif outputfmt.filter == 'Text':
outputprops += UnoProps(FilterOptions="UTF8")
elif outputfmt.filter == 'Text - txt - csv (StarCalc)':
outputprops += UnoProps(FilterOptions="44,34,0")
elif outputfmt.filter in ('writer_pdf_Export', 'impress_pdf_Export', 'calc_pdf_Export', 'draw_pdf_Export'):
outputprops += UnoProps(SelectPdfVersion=1)
if not op.stdout:
(outputfn, ext) = os.path.splitext(inputfn)
if not op.output:
outputfn = outputfn + os.extsep + outputfmt.extension
elif os.path.isdir(op.output):
outputfn = os.path.join(op.output, os.path.basename(outputfn) + os.extsep + outputfmt.extension)
elif len(op.filenames) > 1:
outputfn = op.output + os.extsep + outputfmt.extension
else:
outputfn = op.output
outputurl = unohelper.absolutize( self.cwd, unohelper.systemPathToFileUrl(outputfn) )
info(1, "Output file: %s" % outputfn)
else:
outputurl = "private:stream"
try:
document.storeToURL(outputurl, tuple(outputprops) )
except IOException, e:
from invenio.ext.logging import get_pretty_traceback
print >> sys.stderr, get_pretty_traceback()
raise UnoException("Unable to store document to %s with properties %s. Exception: %s" % (outputurl, outputprops, e), None)
phase = "dispose"
document.dispose()
document.close(True)
except SystemError, e:
error("unoconv: SystemError during %s phase: %s" % (phase, e))
exitcode = 1
except RuntimeException, e:
error("unoconv: RuntimeException during %s phase: Office probably died. %s" % (phase, e))
exitcode = 6
except DisposedException, e:
error("unoconv: DisposedException during %s phase: Office probably died. %s" % (phase, e))
exitcode = 7
except IllegalArgumentException, e:
error("UNO IllegalArgument during %s phase: Source file cannot be read. %s" % (phase, e))
exitcode = 8
except IOException, e:
# for attr in dir(e): print '%s: %s', (attr, getattr(e, attr))
error("unoconv: IOException during %s phase: %s" % (phase, e.Message))
exitcode = 3
except CannotConvertException, e:
# for attr in dir(e): print '%s: %s', (attr, getattr(e, attr))
error("unoconv: CannotConvertException during %s phase: %s" % (phase, e.Message))
exitcode = 4
except UnoException, e:
if hasattr(e, 'ErrCode'):
error("unoconv: UnoException during %s phase in %s (ErrCode %d)" % (phase, repr(e.__class__), e.ErrCode))
exitcode = e.ErrCode
pass
if hasattr(e, 'Message'):
error("unoconv: UnoException during %s phase: %s" % (phase, e.Message))
exitcode = 5
else:
error("unoconv: UnoException during %s phase in %s" % (phase, repr(e.__class__)))
exitcode = 2
pass
class Listener:
def __init__(self):
global product
info(1, "Start listener on %s:%s" % (op.server, op.port))
self.context = uno.getComponentContext()
self.svcmgr = self.context.ServiceManager
try:
resolver = self.svcmgr.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", self.context)
product = self.svcmgr.createInstance("com.sun.star.configuration.ConfigurationProvider").createInstanceWithArguments("com.sun.star.configuration.ConfigurationAccess", UnoProps(nodepath="/org.openoffice.Setup/Product"))
try:
unocontext = resolver.resolve("uno:%s" % op.connection)
except NoConnectException, e:
pass
else:
info(1, "Existing %s listener found, nothing to do." % product.ooName)
return
if product.ooName != "LibreOffice" or LooseVersion(product.ooSetupVersion) <= LooseVersion('3.3'):
subprocess.call([office.binary, "-headless", "-invisible", "-nocrashreport", "-nodefault", "-nologo", "-nofirststartwizard", "-norestore", "-accept=%s" % op.connection], env=os.environ)
else:
subprocess.call([office.binary, "--headless", "--invisible", "--nocrashreport", "--nodefault", "--nologo", "--nofirststartwizard", "--norestore", "--accept=%s" % op.connection], env=os.environ)
except Exception, e:
error("Launch of %s failed.\n%s" % (office.binary, e))
else:
info(1, "Existing %s listener found, nothing to do." % product.ooName)
def error(msg):
"Output error message"
print >> sys.stderr, msg
def info(level, msg):
"Output info message"
if 'op' not in globals():
pass
elif op.verbose >= 3 and level >= 3:
print >> sys.stderr, "DEBUG:", msg
elif not op.stdout and level <= op.verbose:
print >> sys.stdout, msg
elif level <= op.verbose:
print >> sys.stderr, msg
def die(ret, msg=None):
"Print optional error and exit with errorcode"
global convertor, ooproc, office
if msg:
error('Error: %s' % msg)
### Did we start our own listener instance ?
if not op.listener and ooproc and convertor:
### If there is a GUI now attached to the instance, disable listener
if convertor.desktop.getCurrentFrame():
info(2, 'Trying to stop %s GUI listener.' % product.ooName)
try:
if product.ooName != "LibreOffice" or LooseVersion(product.ooSetupVersion) <= LooseVersion('3.3'):
subprocess.Popen([office.binary, "-headless", "-invisible", "-nocrashreport", "-nodefault", "-nofirststartwizard", "-nologo", "-norestore", "-unaccept=%s" % op.connection], env=os.environ)
else:
subprocess.Popen([office.binary, "--headless", "--invisible", "--nocrashreport", "--nodefault", "--nofirststartwizard", "--nologo", "--norestore", "--unaccept=%s" % op.connection], env=os.environ)
ooproc.wait()
info(2, '%s listener successfully disabled.' % product.ooName)
except Exception, e:
error("Terminate using %s failed.\n%s" % (office.binary, e))
### If there is no GUI attached to the instance, terminate instance
else:
info(3, 'Terminating %s instance.' % product.ooName)
try:
convertor.desktop.terminate()
except DisposedException:
info(2, '%s instance unsuccessfully closed, sending TERM signal.' % product.ooName)
try:
ooproc.terminate()
except AttributeError:
os.kill(ooproc.pid, 15)
info(3, 'Waiting for %s instance to exit.' % product.ooName)
ooproc.wait()
### LibreOffice processes may get stuck and we have to kill them
### Is it still running ?
if ooproc.poll() is None:
info(1, '%s instance still running, please investigate...' % product.ooName)
ooproc.wait()
info(2, '%s instance unsuccessfully terminated, sending KILL signal.' % product.ooName)
try:
ooproc.kill()
except AttributeError:
os.kill(ooproc.pid, 9)
info(3, 'Waiting for %s with pid %s to disappear.' % (product.ooName, ooproc.pid))
ooproc.wait()
# allow Python GC to garbage collect pyuno object *before* exit call
# which avoids random segmentation faults --vpa
convertor = None
sys.exit(ret)
def main():
global convertor, exitcode
convertor = None
try:
if op.listener:
listener = Listener()
if op.filenames:
convertor = Convertor()
for inputfn in op.filenames:
convertor.convert(inputfn)
except NoConnectException, e:
error("unoconv: could not find an existing connection to LibreOffice at %s:%s." % (op.server, op.port))
if op.connection:
info(0, "Please start a LibreOffice instance on server '%s' by doing:\n\n unoconv --listener --server %s --port %s\n\nor alternatively:\n\n soffice -nologo -nodefault -accept=\"%s\"" % (op.server, op.server, op.port, op.connection))
else:
info(0, "Please start a LibreOffice instance on server '%s' by doing:\n\n unoconv --listener --server %s --port %s\n\nor alternatively:\n\n soffice -nologo -nodefault -accept=\"socket,host=%s,port=%s;urp;\"" % (op.server, op.server, op.port, op.server, op.port))
info(0, "Please start a soffice instance on server '%s' by doing:\n\n soffice -nologo -nodefault -accept=\"socket,host=localhost,port=%s;urp;\"" % (op.server, op.port))
exitcode = 1
# except UnboundLocalError:
# die(252, "Failed to connect to remote listener.")
except OSError:
error("Warning: failed to launch Office suite. Aborting.")
### Main entrance
if __name__ == '__main__':
os.environ['HOME'] = CFG_OPENOFFICE_TMPDIR
exitcode = 0
info(3, 'sysname=%s, platform=%s, python=%s, python-version=%s' % (os.name, sys.platform, sys.executable, sys.version))
for of in find_offices():
if of.python != sys.executable and not sys.executable.startswith(of.basepath):
python_switch(of)
office_environ(of)
# debug_office()
try:
import uno, unohelper
office = of
break
except:
# debug_office()
print >> sys.stderr, "unoconv: Cannot find a suitable pyuno library and python binary combination in %s" % of
print >> sys.stderr, "ERROR:", sys.exc_info()[1]
print >> sys.stderr
else:
# debug_office()
print >> sys.stderr, "unoconv: Cannot find a suitable office installation on your system."
print >> sys.stderr, "ERROR: Please locate your office installation and send your feedback to:"
print >> sys.stderr, " http://github.com/dagwieers/unoconv/issues"
sys.exit(1)
### Now that we have found a working pyuno library, let's import some classes
from com.sun.star.beans import PropertyValue
from com.sun.star.connection import NoConnectException
from com.sun.star.document.UpdateDocMode import QUIET_UPDATE
from com.sun.star.lang import DisposedException, IllegalArgumentException
from com.sun.star.io import IOException, XOutputStream
from com.sun.star.script import CannotConvertException
from com.sun.star.uno import Exception as UnoException
from com.sun.star.uno import RuntimeException
### And now that we have those classes, build on them
class OutputStream( unohelper.Base, XOutputStream ):
def __init__( self ):
self.closed = 0
def closeOutput(self):
self.closed = 1
def writeBytes( self, seq ):
sys.stdout.write( seq.value )
def flush( self ):
pass
def UnoProps(**args):
props = []
for key in args:
prop = PropertyValue()
prop.Name = key
prop.Value = args[key]
props.append(prop)
return tuple(props)
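### For illustration, the same keyword-to-PropertyValue expansion can be
### exercised with a stand-in PropertyValue class, since the real one is
### only available once pyuno is importable; the stand-in below is an
### assumption, not the com.sun.star.beans API.

```python
class PropertyValue(object):
    # Minimal stand-in for com.sun.star.beans.PropertyValue.
    Name = None
    Value = None

def UnoProps(**args):
    # Same shape as the helper above: one PropertyValue per keyword argument.
    props = []
    for key in args:
        prop = PropertyValue()
        prop.Name = key
        prop.Value = args[key]
        props.append(prop)
    return tuple(props)

props = UnoProps(Hidden=True, ReadOnly=True)
```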
op = Options(sys.argv[1:])
info(2, "Using office base path: %s" % office.basepath)
info(2, "Using office binary path: %s" % office.unopath)
try:
main()
except KeyboardInterrupt, e:
die(6, 'Exiting on user request')
except:
from invenio.ext.logging import register_exception
register_exception(alert_admin=True)
die(exitcode)
diff --git a/invenio/legacy/websubmit/webinterface.py b/invenio/legacy/websubmit/webinterface.py
index 60824760f..24c3c1910 100644
--- a/invenio/legacy/websubmit/webinterface.py
+++ b/invenio/legacy/websubmit/webinterface.py
@@ -1,929 +1,929 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__lastupdated__ = """$Date$"""
__revision__ = "$Id$"
import os
import errno
import time
import cgi
import sys
import shutil
from urllib import urlencode
from invenio.config import \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_SITE_LANG, \
CFG_SITE_NAME, \
CFG_SITE_URL, \
CFG_SITE_SECURE_URL, \
CFG_WEBSUBMIT_STORAGEDIR, \
CFG_PREFIX, \
CFG_CERN_SITE
from invenio.utils import apache
from invenio.legacy.dbquery import run_sql
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.control import acc_is_role
from invenio.legacy.webpage import warning_page
from invenio.legacy.webuser import getUid, page_not_authorized, collect_user_info, \
isGuestUser
from invenio.ext.legacy.handler import wash_urlargd, WebInterfaceDirectory
from invenio.utils.url import make_canonical_urlargd, redirect_to_url
from invenio.base.i18n import gettext_set_language
-from invenio.bibdocfile import stream_file, \
+from invenio.legacy.bibdocfile.api import stream_file, \
decompose_file, propose_next_docname
from invenio.ext.logging import register_exception
from invenio.utils.html import is_html_text_editor_installed
from invenio.websubmit_icon_creator import create_icon, InvenioWebSubmitIconCreatorError
from invenio.ckeditor_invenio_connector import process_CKEditor_upload, send_response
import invenio.legacy.template
websubmit_templates = invenio.legacy.template.load('websubmit')
-from invenio.websearchadminlib import get_detailed_page_tabs
+from invenio.legacy.websearch.adminlib import get_detailed_page_tabs
from invenio.utils.json import json, CFG_JSON_AVAILABLE
import invenio.legacy.template
from flask import session
webstyle_templates = invenio.legacy.template.load('webstyle')
websearch_templates = invenio.legacy.template.load('websearch')
from invenio.legacy.websubmit.engine import home, action, interface, endaction, makeCataloguesTable
class WebInterfaceSubmitPages(WebInterfaceDirectory):
_exports = ['summary', 'sub', 'direct', '', 'attachfile', 'uploadfile', \
'getuploadedfile', 'upload_video', ('continue', 'continue_')]
def uploadfile(self, req, form):
"""
Similar to /submit, but only considers files. Suitable for
asynchronous JavaScript uploads. Should be used to upload a
single file.
Also try to create an icon, and return URL to file(s) + icon(s)
Authentication is performed based on session ID passed as
parameter instead of cookie-based authentication, due to the
use of this URL by the Flash plugin (to upload multiple files
at once), which does not route cookies.
FIXME: consider adding /deletefile and /modifyfile functions +
parsing of additional parameters to rename files, add
comments, restrictions, etc.
"""
argd = wash_urlargd(form, {
'doctype': (str, ''),
'access': (str, ''),
'indir': (str, ''),
'session_id': (str, ''),
'rename': (str, ''),
})
curdir = None
if not form.has_key("indir") or \
not form.has_key("doctype") or \
not form.has_key("access"):
raise apache.SERVER_RETURN(apache.HTTP_BAD_REQUEST)
else:
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
argd['indir'],
argd['doctype'],
argd['access'])
user_info = collect_user_info(req)
if form.has_key("session_id"):
# Are we uploading using Flash, which does not transmit
# cookies? Then we expect to receive the session_id as a
# form parameter. First check that the IP addresses do not
# mismatch.
uid = session.uid
user_info = collect_user_info(uid)
try:
act_fd = file(os.path.join(curdir, 'act'))
action = act_fd.read()
act_fd.close()
except:
action = ""
# Is user authorized to perform this action?
(auth_code, auth_message) = acc_authorize_action(uid, "submit",
authorized_if_no_roles=not isGuestUser(uid),
verbose=0,
doctype=argd['doctype'],
act=action)
if acc_is_role("submit", doctype=argd['doctype'], act=action) and auth_code != 0:
# User cannot submit
raise apache.SERVER_RETURN(apache.HTTP_UNAUTHORIZED)
else:
# Process the upload and get the response
added_files = {}
for key, formfields in form.items():
filename = key.replace("[]", "")
file_to_open = os.path.join(curdir, filename)
if hasattr(formfields, "filename") and formfields.filename:
dir_to_open = os.path.abspath(os.path.join(curdir,
'files',
str(user_info['uid']),
key))
try:
assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key))
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except OSError, e:
if e.errno != errno.EEXIST:
# If the issue is only that directory
# already exists, then continue, else
# report
register_exception(req=req, alert_admin=True)
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
filename = formfields.filename
## Before saving the file to disc, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
filename = filename.strip()
if filename != "":
# Check that file does not already exist
n = 1
while os.path.exists(os.path.join(dir_to_open, filename)):
#dirname, basename, extension = decompose_file(new_destination_path)
basedir, name, extension = decompose_file(filename)
new_name = propose_next_docname(name)
filename = new_name + extension
# This may be dangerous if the file size is bigger than the available memory
fp = open(os.path.join(dir_to_open, filename), "w")
fp.write(formfields.file.read())
fp.close()
fp = open(os.path.join(curdir, "lastuploadedfile"), "w")
fp.write(filename)
fp.close()
fp = open(file_to_open, "w")
fp.write(filename)
fp.close()
try:
# Create icon
(icon_path, icon_name) = create_icon(
{ 'input-file' : os.path.join(dir_to_open, filename),
'icon-name' : filename, # extension stripped automatically
'icon-file-format' : 'gif',
'multipage-icon' : False,
'multipage-icon-delay' : 100,
'icon-scale' : "300>", # Resize only if width > 300
'verbosity' : 0,
})
icons_dir = os.path.join(os.path.join(curdir,
'icons',
str(user_info['uid']),
key))
if not os.path.exists(icons_dir):
# Create uid/icons dir if needed
try:
os.makedirs(icons_dir)
except OSError, e:
if e.errno != errno.EEXIST:
# If the issue is only that
# directory already exists,
# then continue, else report
register_exception(req=req, alert_admin=True)
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
os.rename(os.path.join(icon_path, icon_name),
os.path.join(icons_dir, icon_name))
added_files[key] = {'name': filename,
'iconName': icon_name}
except InvenioWebSubmitIconCreatorError, e:
# We could not create the icon
added_files[key] = {'name': filename}
continue
else:
raise apache.SERVER_RETURN(apache.HTTP_BAD_REQUEST)
# Send our response
if CFG_JSON_AVAILABLE:
return json.dumps(added_files)
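The collision loop in uploadfile above renames an incoming file until it no longer clashes with an existing one, via decompose_file and propose_next_docname. A minimal self-contained sketch of the same idea (using hypothetical local helpers, not the Invenio API) is:

```python
import os

def propose_free_name(directory, filename):
    """Append _1, _2, ... before the extension until the name is free.

    Mirrors the decompose_file/propose_next_docname loop above, with
    hypothetical local logic instead of the Invenio helpers.
    """
    name, ext = os.path.splitext(filename)
    candidate, n = filename, 0
    while os.path.exists(os.path.join(directory, candidate)):
        n += 1
        candidate = "%s_%d%s" % (name, n, ext)
    return candidate
```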
def upload_video(self, req, form):
"""
A clone of uploadfile but for (large) videos.
Does not copy the uploaded file to the websubmit directory.
Instead, the path to the file is stored inside the submission directory.
"""
def gcd(a, b):
""" the euclidean algorithm """
while a:
a, b = b % a, a
return b
from invenio.modules.encoder.extract import extract_frames
from invenio.modules.encoder.config import CFG_BIBENCODE_WEBSUBMIT_ASPECT_SAMPLE_DIR, CFG_BIBENCODE_WEBSUBMIT_ASPECT_SAMPLE_FNAME
from invenio.modules.encoder.encode import determine_aspect
from invenio.modules.encoder.utils import probe
from invenio.modules.encoder.metadata import ffprobe_metadata
from invenio.legacy.websubmit.config import CFG_WEBSUBMIT_TMP_VIDEO_PREFIX
argd = wash_urlargd(form, {
'doctype': (str, ''),
'access': (str, ''),
'indir': (str, ''),
'session_id': (str, ''),
'rename': (str, ''),
})
curdir = None
if not form.has_key("indir") or \
not form.has_key("doctype") or \
not form.has_key("access"):
raise apache.SERVER_RETURN(apache.HTTP_BAD_REQUEST)
else:
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
argd['indir'],
argd['doctype'],
argd['access'])
user_info = collect_user_info(req)
if form.has_key("session_id"):
# Are we uploading using Flash, which does not transmit
# cookies? Then we expect to receive the session_id as a
# form parameter. First check that the IP addresses do not
# mismatch.
uid = session.uid
user_info = collect_user_info(uid)
try:
act_fd = file(os.path.join(curdir, 'act'))
action = act_fd.read()
act_fd.close()
except:
action = ""
# Is user authorized to perform this action?
(auth_code, auth_message) = acc_authorize_action(uid, "submit",
authorized_if_no_roles=not isGuestUser(uid),
verbose=0,
doctype=argd['doctype'],
act=action)
if acc_is_role("submit", doctype=argd['doctype'], act=action) and auth_code != 0:
# User cannot submit
raise apache.SERVER_RETURN(apache.HTTP_UNAUTHORIZED)
else:
# Process the upload and get the response
json_response = {}
for key, formfields in form.items():
filename = key.replace("[]", "")
if hasattr(formfields, "filename") and formfields.filename:
dir_to_open = os.path.abspath(os.path.join(curdir,
'files',
str(user_info['uid']),
key))
try:
assert(dir_to_open.startswith(CFG_WEBSUBMIT_STORAGEDIR))
except AssertionError:
register_exception(req=req, prefix='curdir="%s", key="%s"' % (curdir, key))
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
if not os.path.exists(dir_to_open):
try:
os.makedirs(dir_to_open)
except OSError, e:
if e.errno != errno.EEXIST:
# If the issue is only that directory
# already exists, then continue, else
# report
register_exception(req=req, alert_admin=True)
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
filename = formfields.filename
## Before saving the file to disc, wash the filename (in particular
## washing away UNIX and Windows (e.g. DFS) paths):
filename = os.path.basename(filename.split('\\')[-1])
filename = filename.strip()
if filename != "":
# Check that file does not already exist
while os.path.exists(os.path.join(dir_to_open, filename)):
#dirname, basename, extension = decompose_file(new_destination_path)
basedir, name, extension = decompose_file(filename)
new_name = propose_next_docname(name)
filename = new_name + extension
#-------------#
# VIDEO STUFF #
#-------------#
## Remove all previous uploads
filelist = os.listdir(os.path.split(formfields.file.name)[0])
for afile in filelist:
if argd['access'] in afile:
os.remove(os.path.join(os.path.split(formfields.file.name)[0], afile))
## Check if the file is a readable video
## We must exclude all image and audio formats that are readable by ffprobe
## os.path.splitext keeps the leading dot, so strip it before matching
if (os.path.splitext(filename)[1][1:].lower() in ['jpg', 'jpeg', 'gif', 'tiff', 'bmp', 'png', 'tga',
'jp2', 'j2k', 'jpf', 'jpm', 'mj2', 'biff', 'cgm',
'exif', 'img', 'mng', 'pic', 'pict', 'raw', 'wmf', 'jpe', 'jif',
'jfif', 'jfi', 'tif', 'webp', 'svg', 'ai', 'ps', 'psd',
'wav', 'mp3', 'pcm', 'aiff', 'au', 'flac', 'wma', 'm4a', 'wv', 'oga',
'm4b', 'm4p', 'm4r', 'aac', 'mp4', 'vox', 'amr', 'snd']
or not probe(formfields.file.name)):
formfields.file.close()
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
## We have no "delete" attribute in Python 2.4
if sys.hexversion < 0x2050000:
## We need to rename first and create a dummy file
## Rename the temporary file for the garbage collector
new_tmp_fullpath = os.path.split(formfields.file.name)[0] + "/" + CFG_WEBSUBMIT_TMP_VIDEO_PREFIX + argd['access'] + "_" + os.path.split(formfields.file.name)[1]
os.rename(formfields.file.name, new_tmp_fullpath)
dummy = open(formfields.file.name, "w")
dummy.close()
formfields.file.close()
else:
# Mark the NamedTemporaryFile as not to be deleted
formfields.file.delete = False
formfields.file.close()
## Rename the temporary file for the garbage collector
new_tmp_fullpath = os.path.split(formfields.file.name)[0] + "/" + CFG_WEBSUBMIT_TMP_VIDEO_PREFIX + argd['access'] + "_" + os.path.split(formfields.file.name)[1]
os.rename(formfields.file.name, new_tmp_fullpath)
# Write the path to the temp file to a file in STORAGEDIR
fp = open(os.path.join(dir_to_open, "filepath"), "w")
fp.write(new_tmp_fullpath)
fp.close()
fp = open(os.path.join(dir_to_open, "filename"), "w")
fp.write(filename)
fp.close()
## We are going to extract some thumbnails for websubmit ##
sample_dir = os.path.join(curdir, 'files', str(user_info['uid']), CFG_BIBENCODE_WEBSUBMIT_ASPECT_SAMPLE_DIR)
try:
## Remove old thumbnails
shutil.rmtree(sample_dir)
except OSError:
register_exception(req=req, alert_admin=False)
try:
os.makedirs(os.path.join(curdir, 'files', str(user_info['uid']), sample_dir))
except OSError:
register_exception(req=req, alert_admin=False)
try:
extract_frames(input_file=new_tmp_fullpath,
output_file=os.path.join(sample_dir, CFG_BIBENCODE_WEBSUBMIT_ASPECT_SAMPLE_FNAME),
size="600x600",
numberof=5)
json_response['frames'] = []
for extracted_frame in os.listdir(sample_dir):
json_response['frames'].append(extracted_frame)
except:
## If the frame extraction fails, something was bad with the video
os.remove(new_tmp_fullpath)
register_exception(req=req, alert_admin=False)
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
## Try to detect the aspect. if this fails, the video is not readable
## or a wrong file might have been uploaded
try:
(aspect, width, height) = determine_aspect(new_tmp_fullpath)
if aspect:
aspx, aspy = aspect.split(':')
else:
the_gcd = gcd(width, height)
aspx = str(width / the_gcd)
aspy = str(height / the_gcd)
json_response['aspx'] = aspx
json_response['aspy'] = aspy
except TypeError:
## If the aspect detection completely fails
os.remove(new_tmp_fullpath)
register_exception(req=req, alert_admin=False)
raise apache.SERVER_RETURN(apache.HTTP_FORBIDDEN)
## Try to extract some metadata from the video container
metadata = ffprobe_metadata(new_tmp_fullpath)
json_response['meta_title'] = metadata['format'].get('TAG:title')
json_response['meta_description'] = metadata['format'].get('TAG:description')
json_response['meta_year'] = metadata['format'].get('TAG:year')
json_response['meta_author'] = metadata['format'].get('TAG:author')
## Empty file name
else:
raise apache.SERVER_RETURN(apache.HTTP_BAD_REQUEST)
## We found our file, we can break the loop
break
# Send our response
if CFG_JSON_AVAILABLE:
dumped_response = json.dumps(json_response)
# store the response in the websubmit directory
# this is needed if the submission is not finished and continued later
response_dir = os.path.join(curdir, 'files', str(user_info['uid']), "response")
try:
os.makedirs(response_dir)
except OSError:
# register_exception(req=req, alert_admin=False)
pass
fp = open(os.path.join(response_dir, "response"), "w")
fp.write(dumped_response)
fp.close()
return dumped_response
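The inner gcd helper in upload_video reduces a pixel geometry to its aspect ratio when ffprobe does not report one. In isolation the computation looks like this (using math.gcd instead of the hand-rolled Euclid loop):

```python
from math import gcd

def reduce_aspect(width, height):
    """Return the reduced aspect ratio (aspx, aspy) of a frame size,
    e.g. 1920x1080 -> (16, 9)."""
    g = gcd(width, height)
    return width // g, height // g
```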
def getuploadedfile(self, req, form):
"""
Stream uploaded files.
For the moment, restrict to files in ./curdir/files/uid or
./curdir/icons/uid directory, so that we are sure we stream
files only to the user who uploaded them.
"""
argd = wash_urlargd(form, {'indir': (str, None),
'doctype': (str, None),
'access': (str, None),
'icon': (int, 0),
'key': (str, None),
'filename': (str, None),
'nowait': (int, 0)})
if None in argd.values():
raise apache.SERVER_RETURN(apache.HTTP_BAD_REQUEST)
uid = getUid(req)
if argd['icon']:
file_path = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
argd['indir'],
argd['doctype'],
argd['access'],
'icons',
str(uid),
argd['key'],
argd['filename']
)
else:
file_path = os.path.join(CFG_WEBSUBMIT_STORAGEDIR,
argd['indir'],
argd['doctype'],
argd['access'],
'files',
str(uid),
argd['key'],
argd['filename']
)
abs_file_path = os.path.abspath(file_path)
if abs_file_path.startswith(CFG_WEBSUBMIT_STORAGEDIR):
# Check if file exist. Note that icon might not yet have
# been created.
if not argd['nowait']:
for i in range(5):
if os.path.exists(abs_file_path):
return stream_file(req, abs_file_path)
time.sleep(1)
else:
if os.path.exists(abs_file_path):
return stream_file(req, abs_file_path)
# Send error 404 in all other cases
raise apache.SERVER_RETURN(apache.HTTP_NOT_FOUND)
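getuploadedfile (like uploadfile above) defends against path traversal by normalising the joined path with os.path.abspath and then checking that it still lies under the storage directory. The pattern in isolation, with a hypothetical storage root:

```python
import os

def safe_join(storage_root, *parts):
    """Join *parts under storage_root; return None if the result escapes it.

    os.path.abspath collapses any ../ components, so a prefix check on
    the normalised path rejects traversal attempts.
    """
    candidate = os.path.abspath(os.path.join(storage_root, *parts))
    root = os.path.abspath(storage_root)
    if candidate == root or candidate.startswith(root + os.sep):
        return candidate
    return None
```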
def attachfile(self, req, form):
"""
Process requests received from CKEditor to upload files.
If the uploaded file is an image, create an icon version
"""
if not is_html_text_editor_installed():
return apache.HTTP_NOT_FOUND
if not form.has_key('type'):
form['type'] = 'File'
if not form.has_key('upload') or \
not form['type'] in \
['File', 'Image', 'Flash', 'Media']:
#return apache.HTTP_NOT_FOUND
pass
filetype = form['type'].lower()
uid = getUid(req)
# URL where the file can be fetched after upload
user_files_path = '%(CFG_SITE_URL)s/submit/getattachedfile/%(uid)s' % \
{'uid': uid,
'CFG_SITE_URL': CFG_SITE_URL,
'filetype': filetype}
# Path to directory where uploaded files are saved
user_files_absolute_path = '%(CFG_PREFIX)s/var/tmp/attachfile/%(uid)s/%(filetype)s' % \
{'uid': uid,
'CFG_PREFIX': CFG_PREFIX,
'filetype': filetype}
try:
os.makedirs(user_files_absolute_path)
except:
pass
user_info = collect_user_info(req)
(auth_code, auth_message) = acc_authorize_action(user_info, 'attachsubmissionfile')
msg = ""
if user_info['email'] == 'guest':
# User is guest: must login prior to upload
msg = 'Please login before uploading file.'
elif auth_code:
# User cannot submit
msg = 'Sorry, you are not allowed to submit files.'
## elif len(form['upload']) != 1:
## msg = 'Sorry, you must upload one single file'
else:
# Process the upload and get the response
(msg, uploaded_file_path, uploaded_file_name, uploaded_file_url, callback_function) = \
process_CKEditor_upload(form, uid, user_files_path, user_files_absolute_path)
if uploaded_file_path:
# Create an icon
if form.get('type','') == 'Image':
try:
(icon_path, icon_name) = create_icon(
{ 'input-file' : uploaded_file_path,
'icon-name' : os.path.splitext(uploaded_file_name)[0],
'icon-file-format' : os.path.splitext(uploaded_file_name)[1][1:] or 'gif',
'multipage-icon' : False,
'multipage-icon-delay' : 100,
'icon-scale' : "700>", # Resize only if width > 700
'verbosity' : 0,
})
# Move original file to /original dir, and replace it with icon file
original_user_files_absolute_path = os.path.join(user_files_absolute_path,
'original')
if not os.path.exists(original_user_files_absolute_path):
# Create /original dir if needed
os.mkdir(original_user_files_absolute_path)
os.rename(uploaded_file_path,
original_user_files_absolute_path + os.sep + uploaded_file_name)
os.rename(icon_path + os.sep + icon_name,
uploaded_file_path)
except InvenioWebSubmitIconCreatorError, e:
pass
user_files_path += '/' + filetype + '/' + uploaded_file_name
else:
user_files_path = ''
if not msg:
msg = 'No valid file found'
# Send our response
send_response(req, msg, user_files_path, callback_function)
def _lookup(self, component, path):
""" This handler is invoked for the dynamic URLs (for getting
and putting attachments) Eg:
/submit/getattachedfile/41336978/image/myfigure.png
/submit/attachfile/41336978/image/myfigure.png
"""
if component == 'getattachedfile' and len(path) > 2:
uid = path[0] # uid of the submitter
file_type = path[1] # file, image, flash or media (as
# defined by CKEditor)
if file_type in ['file', 'image', 'flash', 'media']:
file_name = '/'.join(path[2:]) # the filename
def answer_get(req, form):
"""Accessing files attached to submission."""
form['file'] = file_name
form['type'] = file_type
form['uid'] = uid
return self.getattachedfile(req, form)
return answer_get, []
# All other cases: file not found
return None, []
def getattachedfile(self, req, form):
"""
Returns a file uploaded to the submission 'drop box' by the
CKEditor.
"""
argd = wash_urlargd(form, {'file': (str, None),
'type': (str, None),
'uid': (int, 0)})
# Can user view this record, i.e. can user access its
# attachments?
uid = getUid(req)
user_info = collect_user_info(req)
if not argd['file'] is None:
# Prepare path to file on disk. Normalize the path so that
# ../ and other dangerous components are removed.
path = os.path.abspath(CFG_PREFIX + '/var/tmp/attachfile/' + \
'/' + str(argd['uid']) + \
'/' + argd['type'] + '/' + argd['file'])
# Check that we are really accessing the attachments
# directory, for the declared record.
if path.startswith(CFG_PREFIX + '/var/tmp/attachfile/') and os.path.exists(path):
return stream_file(req, path)
# Send error 404 in all other cases
return(apache.HTTP_NOT_FOUND)
def continue_(self, req, form):
"""
Continue an interrupted submission.
"""
args = wash_urlargd(form, {'access': (str, ''), 'doctype': (str, '')})
ln = args['ln']
_ = gettext_set_language(ln)
access = args['access']
doctype = args['doctype']
if not access or not doctype:
return warning_page(_("Sorry, invalid arguments"), req=req, ln=ln)
user_info = collect_user_info(req)
email = user_info['email']
res = run_sql("SELECT action, status FROM sbmSUBMISSIONS WHERE id=%s AND email=%s and doctype=%s", (access, email, doctype))
if res:
action, status = res[0]
if status == 'finished':
return warning_page(_("Note: the requested submission has already been completed"), req=req, ln=ln)
redirect_to_url(req, CFG_SITE_SECURE_URL + '/submit/direct?' + urlencode({
'sub': action + doctype,
'access': access}))
return warning_page(_("Sorry, you don't seem to have initiated a submission with the provided access number"), req=req, ln=ln)
def direct(self, req, form):
"""Directly redirected to an initialized submission."""
args = wash_urlargd(form, {'sub': (str, ''),
'access' : (str, '')})
sub = args['sub']
access = args['access']
ln = args['ln']
_ = gettext_set_language(ln)
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "direct",
navmenuid='submit')
myQuery = req.args
if not sub:
return warning_page(_("Sorry, 'sub' parameter missing..."), req, ln=ln)
res = run_sql("SELECT docname,actname FROM sbmIMPLEMENT WHERE subname=%s", (sub,))
if not res:
return warning_page(_("Sorry. Cannot analyse parameter"), req, ln=ln)
else:
# get document type
doctype = res[0][0]
# get action name
action = res[0][1]
# retrieve other parameter values
params = dict(form)
# find existing access number
if not access:
# create 'unique' access number
pid = os.getpid()
now = time.time()
access = "%i_%s" % (now, pid)
# retrieve 'dir' value
res = run_sql ("SELECT dir FROM sbmACTION WHERE sactname=%s", (action,))
dir = res[0][0]
mainmenu = req.headers_in.get('referer')
params['access'] = access
params['act'] = action
params['doctype'] = doctype
params['startPg'] = '1'
params['mainmenu'] = mainmenu
params['ln'] = ln
params['indir'] = dir
url = "%s/submit?%s" % (CFG_SITE_SECURE_URL, urlencode(params))
redirect_to_url(req, url)
def sub(self, req, form):
"""DEPRECATED: /submit/sub is deprecated now, so raise email to the admin (but allow submission to continue anyway)"""
args = wash_urlargd(form, {'password': (str, '')})
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../sub/",
navmenuid='submit')
try:
raise DeprecationWarning, 'submit/sub handler has been used. Please use submit/direct. e.g. "submit/sub?RN=123@SBIFOO" -> "submit/direct?RN=123&sub=SBIFOO"'
except DeprecationWarning:
register_exception(req=req, alert_admin=True)
ln = args['ln']
_ = gettext_set_language(ln)
#DEMOBOO_RN=DEMO-BOOK-2008-001&ln=en&password=1223993532.26572%40APPDEMOBOO
params = dict(form)
password = args['password']
if password:
del params['password']
if "@" in password:
params['access'], params['sub'] = password.split('@', 1)
else:
params['sub'] = password
else:
args = str(req.args).split('@')
if len(args) > 1:
params = {'sub' : args[-1]}
args = '@'.join(args[:-1])
params.update(cgi.parse_qs(args))
else:
return warning_page(_("Sorry, invalid URL..."), req, ln=ln)
url = "%s/submit/direct?%s" % (CFG_SITE_SECURE_URL, urlencode(params, doseq=True))
redirect_to_url(req, url)
def summary(self, req, form):
args = wash_urlargd(form, {
'doctype': (str, ''),
'act': (str, ''),
'access': (str, ''),
'indir': (str, '')})
ln = args['ln']
uid = getUid(req)
if uid == -1 or CFG_ACCESS_CONTROL_LEVEL_SITE >= 1:
return page_not_authorized(req, "../summary",
navmenuid='submit')
t = ""
curdir = os.path.join(CFG_WEBSUBMIT_STORAGEDIR, args['indir'], args['doctype'], args['access'])
try:
assert(curdir == os.path.abspath(curdir))
except AssertionError:
register_exception(req=req, alert_admin=True, prefix='Possible cracking tentative: indir="%s", doctype="%s", access="%s"' % (args['indir'], args['doctype'], args['access']))
return warning_page("Invalid parameters", req, ln)
subname = "%s%s" % (args['act'], args['doctype'])
res = run_sql("select sdesc,fidesc,pagenb,level from sbmFIELD where subname=%s "
"order by pagenb,fieldnb", (subname,))
nbFields = 0
values = []
for arr in res:
if arr[0] != "":
val = {
'mandatory' : (arr[3] == 'M'),
'value' : '',
'page' : arr[2],
'name' : arr[0],
}
if os.path.exists(os.path.join(curdir, arr[1])):
fd = open(os.path.join(curdir, arr[1]),"r")
value = fd.read()
fd.close()
value = value.replace("\n"," ")
value = value.replace("Select:","")
else:
value = ""
val['value'] = value
values.append(val)
return websubmit_templates.tmpl_submit_summary(
ln = args['ln'],
values = values,
)
def index(self, req, form):
args = wash_urlargd(form, {
'c': (str, CFG_SITE_NAME),
'doctype': (str, ''),
'act': (str, ''),
'startPg': (str, "1"),
'access': (str, ''),
'mainmenu': (str, ''),
'fromdir': (str, ''),
'nextPg': (str, ''),
'nbPg': (str, ''),
'curpage': (str, '1'),
'step': (str, '0'),
'mode': (str, 'U'),
})
## Strip whitespace from beginning and end of doctype and action:
args["doctype"] = args["doctype"].strip()
args["act"] = args["act"].strip()
def _index(req, c, ln, doctype, act, startPg, access,
mainmenu, fromdir, nextPg, nbPg, curpage, step,
mode):
auth_args = {}
if doctype:
auth_args['doctype'] = doctype
if act:
auth_args['act'] = act
uid = getUid(req)
if CFG_CERN_SITE:
## HACK BEGIN: this is a hack for CMS and ATLAS draft
user_info = collect_user_info(req)
if doctype == 'CMSPUB' and act == "" and 'cds-admin [CERN]' not in user_info['group'] and not user_info['email'].lower() == 'cds.support@cern.ch':
if isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri, 'ln' : args['ln']}, {}))
, norobot=True)
if 'cms-publication-committee-chair [CERN]' not in user_info['group']:
return page_not_authorized(req, "../submit", text="In order to access this submission interface you need to be a member of the CMS Publication Committee Chair group.",
navmenuid='submit')
elif doctype == 'ATLPUB' and 'cds-admin [CERN]' not in user_info['group'] and not user_info['email'].lower() == 'cds.support@cern.ch':
if isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri, 'ln' : args['ln']}, {}))
, norobot=True)
if 'atlas-gen [CERN]' not in user_info['group']:
return page_not_authorized(req, "../submit", text="In order to access this submission interface you need to be a member of ATLAS.",
navmenuid='submit')
## HACK END
if doctype == "":
catalogues_text, at_least_one_submission_authorized, submission_exists = makeCataloguesTable(req, ln=CFG_SITE_LANG)
if not at_least_one_submission_authorized and submission_exists:
if isGuestUser(uid):
return redirect_to_url(req, "%s/youraccount/login%s" % (
CFG_SITE_SECURE_URL,
make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri, 'ln' : args['ln']}, {}))
, norobot=True)
else:
return page_not_authorized(req, "../submit",
uid=uid,
navmenuid='submit')
return home(req, catalogues_text, c, ln)
elif act == "":
return action(req, c, ln, doctype)
elif int(step)==0:
return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage)
else:
return endaction(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage, step, mode)
return _index(req, **args)
# Answer to both /submit/ and /submit
__call__ = index
## def retrieve_most_recent_attached_file(file_path):
## """
## Retrieve the latest file that has been uploaded with the
## CKEditor. This is the only way to retrieve files that the
## CKEditor has renamed after the upload.
## Eg: 'prefix/image.jpg' was uploaded but did already
## exist. CKEditor silently renamed it to 'prefix/image(1).jpg':
## >>> retrieve_most_recent_attached_file('prefix/image.jpg')
## 'prefix/image(1).jpg'
## """
## (base_path, filename) = os.path.split(file_path)
## base_name = os.path.splitext(filename)[0]
## file_ext = os.path.splitext(filename)[1][1:]
## most_recent_filename = filename
## i = 0
## while True:
## i += 1
## possible_filename = "%s(%d).%s" % \
## (base_name, i, file_ext)
## if os.path.exists(base_path + os.sep + possible_filename):
## most_recent_filename = possible_filename
## else:
## break
## return os.path.join(base_path, most_recent_filename)
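The commented-out helper above probes for CKEditor's "(n)" rename suffix to find the most recently stored copy of an attachment. A working sketch of the same probe (assuming the suffix convention described in its docstring):

```python
import os

def most_recent_attached_file(file_path):
    """Return 'name(n).ext' with the highest consecutive n that exists
    on disk, falling back to the original path if no renamed copy exists."""
    base_path, filename = os.path.split(file_path)
    base_name, dot_ext = os.path.splitext(filename)
    most_recent = filename
    i = 0
    while True:
        i += 1
        candidate = "%s(%d)%s" % (base_name, i, dot_ext)
        if os.path.exists(os.path.join(base_path, candidate)):
            most_recent = candidate
        else:
            break
    return os.path.join(base_path, most_recent)
```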
diff --git a/invenio/legacy/wsgi/__init__.py b/invenio/legacy/wsgi/__init__.py
index c40ffd047..09aa5069d 100644
--- a/invenio/legacy/wsgi/__init__.py
+++ b/invenio/legacy/wsgi/__init__.py
@@ -1,621 +1,621 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""mod_python->WSGI Framework"""
import sys
import os
import re
import cgi
import gc
import inspect
from fnmatch import fnmatch
from urlparse import urlparse, urlunparse
from wsgiref.util import FileWrapper
from invenio.legacy.wsgi.utils import table
from invenio.utils.apache import \
HTTP_STATUS_MAP, SERVER_RETURN, OK, DONE, \
HTTP_NOT_FOUND, HTTP_INTERNAL_SERVER_ERROR
from invenio.config import CFG_WEBDIR, CFG_SITE_LANG, \
CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST, CFG_DEVEL_SITE, CFG_SITE_URL, \
CFG_SITE_SECURE_URL, CFG_WEBSTYLE_REVERSE_PROXY_IPS
from invenio.ext.logging import register_exception
from invenio.utils.datastructures import flatten_multidict
## TODO for future reimplementation of stream_file
-#from invenio.bibdocfile import StreamFileException
+#from invenio.legacy.bibdocfile.api import StreamFileException
from flask import request, after_this_request
## Magic regexp to search for usage of CFG_SITE_URL within src/href or
## any src usage of an external website
_RE_HTTPS_REPLACES = re.compile(r"\b((?:src\s*=|url\s*\()\s*[\"']?)http\://", re.I)
## Regexp to verify that the IP starts with a number (filter cases where 'unknown')
## It is faster to verify only the start (585 ns) compared with verifying
## the whole ip address - re.compile('^\d+\.\d+\.\d+\.\d+$') (1.01 µs)
_RE_IPADDRESS_START = re.compile("^\d+\.")
def _http_replace_func(match):
## src external_site -> CFG_SITE_SECURE_URL/sslredirect/external_site
return match.group(1) + CFG_SITE_SECURE_URL + '/sslredirect/'
_ESCAPED_CFG_SITE_URL = cgi.escape(CFG_SITE_URL, True)
_ESCAPED_CFG_SITE_SECURE_URL = cgi.escape(CFG_SITE_SECURE_URL, True)
def https_replace(html):
html = html.replace(_ESCAPED_CFG_SITE_URL, _ESCAPED_CFG_SITE_SECURE_URL)
return _RE_HTTPS_REPLACES.sub(_http_replace_func, html)
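https_replace rewrites inline http:// references in src attributes and url() calls so that they are served through /sslredirect/. The regex behaviour can be reproduced with a local copy of the pattern (a hypothetical host stands in for CFG_SITE_SECURE_URL):

```python
import re

SECURE_URL = "https://example.org"  # stands in for CFG_SITE_SECURE_URL
_RE = re.compile(r"\b((?:src\s*=|url\s*\()\s*[\"']?)http\://", re.I)

def https_rewrite(html):
    """Route src=/url( references to external http:// sites via /sslredirect/."""
    return _RE.sub(lambda m: m.group(1) + SECURE_URL + '/sslredirect/', html)
```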
class InputProcessed(object):
"""
Auxiliary class used when reading input.
@see: <http://www.wsgi.org/wsgi/Specifications/handling_post_forms>.
"""
def read(self, *args):
raise EOFError('The wsgi.input stream has already been consumed')
readline = readlines = __iter__ = read
from werkzeug import BaseResponse, ResponseStreamMixin, \
CommonResponseDescriptorsMixin
class Response(BaseResponse, ResponseStreamMixin,
CommonResponseDescriptorsMixin):
"""
Full featured response object implementing :class:`ResponseStreamMixin`
to add support for the `stream` property.
"""
class SimulatedModPythonRequest(object):
"""
mod_python like request object.
Minimum and cleaned implementation to make moving out of mod_python
easy.
@see: <http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html>
"""
def __init__(self, environ, start_response):
self.response = Response()
self.__environ = environ
self.__start_response = start_response
self.__response_sent_p = False
self.__content_type_set_p = False
self.__buffer = ''
self.__low_level_headers = []
self.__filename = None
self.__disposition_type = None
self.__bytes_sent = 0
self.__allowed_methods = []
self.__cleanups = []
self.headers_out = {'Cache-Control': None}
#self.headers_out.update(dict(request.headers))
## See: <http://www.python.org/dev/peps/pep-0333/#the-write-callable>
self.__write = None
self.__write_error = False
self.__errors = environ['wsgi.errors']
self.__headers_in = table([])
self.__tainted = False
self.__is_https = self.__environ.get('wsgi.url_scheme') == 'https'
self.__replace_https = False
self.track_writings = False
self.__what_was_written = ""
self.__cookies_out = {}
self.g = {} ## global dictionary in case it's needed
for key, value in environ.iteritems():
if key.startswith('HTTP_'):
self.__headers_in[key[len('HTTP_'):].replace('_', '-')] = value
if environ.get('CONTENT_LENGTH'):
self.__headers_in['content-length'] = environ['CONTENT_LENGTH']
if environ.get('CONTENT_TYPE'):
self.__headers_in['content-type'] = environ['CONTENT_TYPE']
def get_wsgi_environ(self):
return self.__environ
def get_post_form(self):
""" Returns only POST form. """
self.__tainted = True
form = flatten_multidict(request.values)
if request.files:
form.update(request.files.to_dict())
return form
def get_response_sent_p(self):
return self.__response_sent_p
def get_low_level_headers(self):
return self.__low_level_headers
def get_buffer(self):
return self.__buffer
def write(self, string, flush=1):
if isinstance(string, unicode):
self.__buffer += string.encode('utf8')
else:
self.__buffer += string
if flush:
self.flush()
def flush(self):
self.send_http_header()
if self.__buffer:
self.__bytes_sent += len(self.__buffer)
try:
if not self.__write_error:
if self.__replace_https:
self.__write(https_replace(self.__buffer))
else:
if self.__buffer:
self.__write(self.__buffer)
if self.track_writings:
if self.__replace_https:
self.__what_was_written += https_replace(self.__buffer)
else:
self.__what_was_written += self.__buffer
except IOError, err:
if "failed to write data" in str(err) or "client connection closed" in str(err):
## Let's just log this exception without alerting the admin:
register_exception(req=self)
self.__write_error = True ## This flag exists just to avoid
## reporting further errors to the admin later on.
else:
raise
self.__buffer = ''
def set_content_type(self, content_type):
self.__content_type_set_p = True
self.response.content_type = content_type
if self.__is_https:
if content_type.startswith("text/html") or content_type.startswith("application/rss+xml"):
self.__replace_https = True
def get_content_type(self):
return self.response.content_type
def send_http_header(self):
for (k, v) in self.__low_level_headers:
self.response.headers[k] = v
for k, v in self.headers_out.iteritems():
self.response.headers[k] = v
self.__write = self.response.stream.write
def get_unparsed_uri(self):
return '?'.join([self.__environ['PATH_INFO'], self.__environ['QUERY_STRING']])
def get_uri(self):
return request.environ['PATH_INFO']
def get_headers_in(self):
return request.headers
def get_subprocess_env(self):
return self.__environ
def add_common_vars(self):
pass
def get_args(self):
return request.environ['QUERY_STRING']
def get_remote_ip(self):
if 'X-FORWARDED-FOR' in self.__headers_in and \
self.__headers_in.get('X-FORWARDED-SERVER', '') == \
self.__headers_in.get('X-FORWARDED-HOST', '') == \
urlparse(CFG_SITE_URL)[1]:
# we are using proxy setup
if self.__environ.get('REMOTE_ADDR') in CFG_WEBSTYLE_REVERSE_PROXY_IPS:
# we trust this proxy
ip_list = self.__headers_in['X-FORWARDED-FOR'].split(',')
for ip in ip_list:
if _RE_IPADDRESS_START.match(ip):
return ip
# no IP has the correct format, return a default IP
return '10.0.0.10'
else:
# we don't trust this proxy
register_exception(prefix="You are running in a proxy configuration, but the " + \
"CFG_WEBSTYLE_REVERSE_PROXY_IPS variable does not contain " + \
"the IP of your proxy, thus the remote IP addresses of your " + \
"clients are not trusted. Please configure this variable.",
alert_admin=True)
return '10.0.0.11'
return request.remote_addr
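The proxy-trust logic above can be sketched standalone. This is an illustrative sketch only: the helper name `client_ip` and the regex shape assumed for `_RE_IPADDRESS_START` are mine, and unlike the original it strips whitespace around the forwarded addresses.

```python
import re

# Assumed shape of _RE_IPADDRESS_START: a dotted-quad prefix.
_RE_IP_START = re.compile(r"^\d+\.\d+\.\d+\.\d+")

def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Return the first well-formed IP from X-Forwarded-For when the
    direct peer is a trusted proxy, else the peer address itself."""
    if remote_addr in trusted_proxies and forwarded_for:
        for ip in forwarded_for.split(','):
            ip = ip.strip()
            if _RE_IP_START.match(ip):
                return ip
    return remote_addr
```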
def get_remote_host(self):
return request.environ.get('REMOTE_HOST', # apache
request.environ.get('HTTP_HOST',
'0.0.0.0')) # not found
def get_header_only(self):
return request.environ['REQUEST_METHOD'] == 'HEAD'
def set_status(self, status):
self.response.status_code = status
def get_status(self):
return self.response.status_code
def get_wsgi_status(self):
return '%s %s' % (self.response.status_code,
HTTP_STATUS_MAP.get(int(self.response.status_code),
'Explanation not available'))
def sendfile(self, path, offset=0, the_len=-1):
try:
self.send_http_header()
file_to_send = open(path)
file_to_send.seek(offset)
file_wrapper = FileWrapper(file_to_send)
count = 0
if the_len < 0:
for chunk in file_wrapper:
count += len(chunk)
self.__bytes_sent += len(chunk)
self.__write(chunk)
else:
for chunk in file_wrapper:
if the_len >= len(chunk):
the_len -= len(chunk)
count += len(chunk)
self.__bytes_sent += len(chunk)
self.__write(chunk)
else:
count += the_len
self.__bytes_sent += the_len
self.__write(chunk[:the_len])
break
except IOError, err:
if "failed to write data" in str(err) or "client connection closed" in str(err):
## Let's just log this exception without alerting the admin:
register_exception(req=self)
else:
raise
return self.__bytes_sent
def set_content_length(self, content_length):
if content_length is not None:
self.response.headers['content-length'] = str(content_length)
else:
del self.response.headers['content-length']
def is_https(self):
return self.__is_https
def get_method(self):
return request.environ['REQUEST_METHOD']
def get_hostname(self):
return request.environ.get('HTTP_HOST', '')
def set_filename(self, filename):
self.__filename = filename
if self.__disposition_type is None:
self.__disposition_type = 'inline'
self.response.headers['content-disposition'] = '%s; filename=%s' % (self.__disposition_type, self.__filename)
def set_encoding(self, encoding):
if encoding:
self.response.headers['content-encoding'] = str(encoding)
else:
del self.response.headers['content-encoding']
def get_bytes_sent(self):
return self.__bytes_sent
def log_error(self, message):
self.__errors.write(message.strip() + '\n')
def get_content_type_set_p(self):
return self.__content_type_set_p and \
bool(self.response.headers['content-type'])
def allow_methods(self, methods, reset=0):
if reset:
self.__allowed_methods = []
self.__allowed_methods += [method.upper().strip() for method in methods]
def get_allowed_methods(self):
return self.__allowed_methods
def readline(self, hint=None):
try:
return request.stream.readline(hint)
except TypeError:
## The hint param is not part of the WSGI PEP, although
## it would be great to exploit it when reading forms
## with large files, in order to avoid filling up the memory.
## Too bad it's not there :-(
return request.stream.readline()
def readlines(self, hint=None):
return request.stream.readlines(hint)
def read(self, hint=None):
return request.stream.read(hint)
def register_cleanup(self, callback, data=None):
@after_this_request
def f(response):
callback(data)
def get_cleanups(self):
return self.__cleanups
def get_referer(self):
return request.referrer
def get_what_was_written(self):
return self.__what_was_written
def __str__(self):
from pprint import pformat
out = ""
for key in dir(self):
try:
if not callable(getattr(self, key)) and not key.startswith("_SimulatedModPythonRequest") and not key.startswith('__'):
out += 'req.%s: %s\n' % (key, pformat(getattr(self, key)))
except:
pass
return out
def get_original_wsgi_environment(self):
"""
Return the original WSGI environment used to initialize this request
object.
@return: environ, start_response
@raise AssertionError: in case the environment has been altered, i.e.
either the input has been consumed or something has already been
written to the output.
"""
assert not self.__tainted, "The original WSGI environment is tainted since at least req.write or req.form has been used."
return self.__environ, self.__start_response
content_type = property(get_content_type, set_content_type)
unparsed_uri = property(get_unparsed_uri)
uri = property(get_uri)
headers_in = property(get_headers_in)
subprocess_env = property(get_subprocess_env)
args = property(get_args)
header_only = property(get_header_only)
status = property(get_status, set_status)
method = property(get_method)
hostname = property(get_hostname)
filename = property(fset=set_filename)
encoding = property(fset=set_encoding)
bytes_sent = property(get_bytes_sent)
content_type_set_p = property(get_content_type_set_p)
allowed_methods = property(get_allowed_methods)
response_sent_p = property(get_response_sent_p)
form = property(get_post_form)
remote_ip = property(get_remote_ip)
remote_host = property(get_remote_host)
referer = property(get_referer)
what_was_written = property(get_what_was_written)
def alert_admin_for_server_status_p(status, referer):
"""
Check the configuration variable
CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST to see if the exception should
be registered and the admin should be alerted.
"""
status = str(status)
for pattern in CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST:
pattern = pattern.lower()
must_have_referer = False
if pattern.endswith('r'):
## e.g. "404 r"
must_have_referer = True
pattern = pattern[:-1].strip() ## -> "404"
if fnmatch(status, pattern) and (not must_have_referer or referer):
return True
return False
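The matching above can be exercised standalone; this is a hedged sketch (the function name `should_alert` and the sample pattern lists are mine, not Invenio's): each entry in the alert list is an fnmatch-style glob against the status code, and a trailing "r" additionally requires a non-empty referer.

```python
from fnmatch import fnmatch

def should_alert(status, referer, patterns):
    """Return True if `status` matches a glob in `patterns`; a
    trailing 'r' on a pattern also requires a non-empty referer."""
    status = str(status)
    for pattern in patterns:
        pattern = pattern.lower()
        must_have_referer = False
        if pattern.endswith('r'):
            must_have_referer = True
            pattern = pattern[:-1].strip()
        if fnmatch(status, pattern) and (not must_have_referer or referer):
            return True
    return False

# "5*" alerts on any 5xx; "404 r" alerts on 404 only with a referer.
```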
def application(environ, start_response, handler=None):
"""
Entry point for wsgi.
"""
## Needed for mod_wsgi, see: <http://code.google.com/p/modwsgi/wiki/ApplicationIssues>
req = SimulatedModPythonRequest(environ, start_response)
try:
if handler is None:
from invenio.ext.legacy.layout import invenio_handler
invenio_handler(req)
else:
handler(req)
req.flush()
## TODO for future reimplementation of stream_file
#except StreamFileException as e:
# return e.value
except SERVER_RETURN, status:
redirection, = status.args
from werkzeug.wrappers import BaseResponse
if isinstance(redirection, BaseResponse):
return redirection
status = int(str(status))
if status == 404:
from werkzeug.exceptions import NotFound
raise NotFound()
if status not in (OK, DONE):
req.status = status
req.headers_out['content-type'] = 'text/html'
admin_to_be_alerted = alert_admin_for_server_status_p(status,
req.headers_in.get('referer'))
if admin_to_be_alerted:
register_exception(req=req, alert_admin=True)
if not req.response_sent_p:
start_response(req.get_wsgi_status(), req.get_low_level_headers(), sys.exc_info())
map(req.write, generate_error_page(req, admin_to_be_alerted))
req.flush()
finally:
##for (callback, data) in req.get_cleanups():
## callback(data)
#if hasattr(req, '_session'):
# ## The session handler saves for caching a request_wrapper
# ## in req.
# ## This saves req as an attribute, creating a circular
# ## reference.
# ## Since we have reached the end of the request handler
# ## we can safely drop the request_wrapper so to avoid
# ## memory leaks.
# delattr(req, '_session')
#if hasattr(req, '_user_info'):
# ## For the same reason we can delete the user_info.
# delattr(req, '_user_info')
## as suggested in
## <http://www.python.org/doc/2.3.5/lib/module-gc.html>
del gc.garbage[:]
return req.response
def generate_error_page(req, admin_was_alerted=True, page_already_started=False):
"""
Returns an iterable with the error page to be sent to the user browser.
"""
from invenio.legacy.webpage import page
from invenio.legacy import template
webstyle_templates = template.load('webstyle')
ln = req.form.get('ln', CFG_SITE_LANG)
if page_already_started:
return [webstyle_templates.tmpl_error_page(status=req.get_wsgi_status(), ln=ln, admin_was_alerted=admin_was_alerted)]
else:
return [page(title=req.get_wsgi_status(), body=webstyle_templates.tmpl_error_page(status=req.get_wsgi_status(), ln=ln, admin_was_alerted=admin_was_alerted), language=ln, req=req)]
def is_static_path(path):
"""
Returns True if path corresponds to an exsting file under CFG_WEBDIR.
@param path: the path.
@type path: string
@return: True if path corresponds to an exsting file under CFG_WEBDIR.
@rtype: bool
"""
path = os.path.abspath(CFG_WEBDIR + path)
if path.startswith(CFG_WEBDIR) and os.path.isfile(path):
return path
return None
def is_mp_legacy_publisher_path(path):
"""
Checks path corresponds to an exsting Python file under CFG_WEBDIR.
@param path: the path.
@type path: string
@return: the path of the module to load and the function to call there.
@rtype: tuple
"""
path = path.split('/')
for index, component in enumerate(path):
if component.endswith('.py'):
possible_module = os.path.abspath(CFG_WEBDIR + os.path.sep + os.path.sep.join(path[:index + 1]))
possible_handler = '/'.join(path[index + 1:]).strip()
if possible_handler.startswith('_'):
return None, None
if not possible_handler:
possible_handler = 'index'
if os.path.exists(possible_module) and possible_module.startswith(CFG_WEBDIR):
return (possible_module, possible_handler)
else:
return None, None
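The path-splitting step above can be illustrated in isolation. This sketch deliberately omits the filesystem checks (`os.path.exists`, the `CFG_WEBDIR` prefix test); the name `split_legacy_path` is mine.

```python
def split_legacy_path(path):
    """Split '/a/b.py/handler' into the module path up to the first
    '.py' component and the handler name (defaulting to 'index');
    handlers starting with '_' are refused."""
    parts = path.split('/')
    for index, component in enumerate(parts):
        if component.endswith('.py'):
            module = '/'.join(parts[:index + 1])
            handler = '/'.join(parts[index + 1:]).strip()
            if handler.startswith('_'):
                return None, None  # private handlers are refused
            return module, handler or 'index'
    return None, None
```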
def mp_legacy_publisher(req, possible_module, possible_handler):
"""
mod_python legacy publisher minimum implementation.
"""
from invenio.session import get_session
from invenio.ext.legacy.handler import CFG_HAS_HTTPS_SUPPORT, CFG_FULL_HTTPS
the_module = open(possible_module).read()
module_globals = {}
exec(the_module, module_globals)
if possible_handler in module_globals and callable(module_globals[possible_handler]):
from invenio.ext.legacy.handler import _check_result
## req is the required first parameter of any handler
expected_args = list(inspect.getargspec(module_globals[possible_handler])[0])
if not expected_args or 'req' != expected_args[0]:
## req was not the first argument. Too bad!
raise SERVER_RETURN, HTTP_NOT_FOUND
## req.form must be cast to a dict because of Python 2.4 and earlier;
## otherwise any object exposing the mapping interface could be
## used with the magic **
form = dict()
for key, value in req.form.items():
## FIXME: this is a backward compatibility workaround
## because most of the old administration web handlers
## expect parameters to be of type str.
## When the legacy publisher is removed, all this
## pain will go away anyway :-)
if isinstance(value, unicode):
form[key] = value.encode('utf8')
else:
## NOTE: this is a workaround for e.g. legacy webupload,
## which is still using the legacy publisher and expects to
## receive a file (Field) instance instead of a string.
form[key] = value
if (CFG_FULL_HTTPS or CFG_HAS_HTTPS_SUPPORT and get_session(req).need_https) and not req.is_https():
from invenio.utils.url import redirect_to_url
# We need to isolate the part of the URI that is after
# CFG_SITE_URL, and append that to our CFG_SITE_SECURE_URL.
original_parts = urlparse(req.unparsed_uri)
plain_prefix_parts = urlparse(CFG_SITE_URL)
secure_prefix_parts = urlparse(CFG_SITE_SECURE_URL)
# Compute the new path
plain_path = original_parts[2]
plain_path = secure_prefix_parts[2] + \
plain_path[len(plain_prefix_parts[2]):]
# ...and recompose the complete URL
final_parts = list(secure_prefix_parts)
final_parts[2] = plain_path
final_parts[-3:] = original_parts[-3:]
target = urlunparse(final_parts)
redirect_to_url(req, target)
try:
return _check_result(req, module_globals[possible_handler](req, **form))
except TypeError, err:
if ("%s() got an unexpected keyword argument" % possible_handler) in str(err) or ('%s() takes at least' % possible_handler) in str(err):
inspected_args = inspect.getargspec(module_globals[possible_handler])
expected_args = list(inspected_args[0])
expected_defaults = list(inspected_args[3])
expected_args.reverse()
expected_defaults.reverse()
register_exception(req=req, prefix="Wrong GET parameter set in calling a legacy publisher handler for %s: expected_args=%s, found_args=%s" % (possible_handler, repr(expected_args), repr(req.form.keys())), alert_admin=CFG_DEVEL_SITE)
cleaned_form = {}
for index, arg in enumerate(expected_args):
if arg == 'req':
continue
if index < len(expected_defaults):
cleaned_form[arg] = form.get(arg, expected_defaults[index])
else:
cleaned_form[arg] = form.get(arg, None)
return _check_result(req, module_globals[possible_handler](req, **cleaned_form))
else:
raise
else:
raise SERVER_RETURN, HTTP_NOT_FOUND
diff --git a/invenio/legacy/wsgi/utils.py b/invenio/legacy/wsgi/utils.py
index e5d89514e..dd5b8735c 100644
--- a/invenio/legacy/wsgi/utils.py
+++ b/invenio/legacy/wsgi/utils.py
@@ -1,887 +1,887 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
mod_python -> WSGI framework utilities.
This code has been taken from the original mod_python source code and
rearranged here to ease the migration from mod_python to WSGI.
The code taken from mod_python is under the following License.
"""
# Copyright 2004 Apache Software Foundation
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License.
#
# Originally developed by Gregory Trubetskoy.
#
# $Id: apache.py 468216 2006-10-27 00:54:12Z grahamd $
try:
import threading
except:
import dummy_threading as threading
from wsgiref.headers import Headers
import time
import re
import os
import cgi
import cStringIO
import tempfile
from types import TypeType, ClassType, BuiltinFunctionType, MethodType, ListType
from invenio.config import CFG_TMPDIR, CFG_TMPSHAREDDIR
from invenio.utils.apache import \
SERVER_RETURN, \
HTTP_LENGTH_REQUIRED, \
HTTP_BAD_REQUEST, \
InvenioWebInterfaceWSGIContentLenghtError, \
InvenioWebInterfaceWSGIContentTypeError, \
InvenioWebInterfaceWSGIContentMD5Error
# Cache for values of PythonPath that have been seen already.
_path_cache = {}
_path_cache_lock = threading.Lock()
class table(Headers):
add = Headers.add_header
iteritems = Headers.items
def __getitem__(self, name):
ret = Headers.__getitem__(self, name)
if ret is None:
return ''
else:
return str(ret)
## Some functions made public
exists_config_define = lambda dummy: True
## Some constants
class metaCookie(type):
def __new__(cls, clsname, bases, clsdict):
_valid_attr = (
"version", "path", "domain", "secure",
"comment", "expires", "max_age",
# RFC 2965
"commentURL", "discard", "port",
# Microsoft Extension
"httponly" )
# _valid_attr + property values
# (note __slots__ is a new Python feature, it
# prevents any other attribute from being set)
__slots__ = _valid_attr + ("name", "value", "_value",
"_expires", "__data__")
clsdict["_valid_attr"] = _valid_attr
clsdict["__slots__"] = __slots__
def set_expires(self, value):
if type(value) == type(""):
# if it's a string, it should be
# valid format as per Netscape spec
try:
t = time.strptime(value, "%a, %d-%b-%Y %H:%M:%S GMT")
except ValueError:
raise ValueError, "Invalid expires time: %s" % value
t = time.mktime(t)
else:
# otherwise assume it's a number
# representing time as from time.time()
t = value
value = time.strftime("%a, %d-%b-%Y %H:%M:%S GMT",
time.gmtime(t))
self._expires = "%s" % value
def get_expires(self):
return self._expires
clsdict["expires"] = property(fget=get_expires, fset=set_expires)
return type.__new__(cls, clsname, bases, clsdict)
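The `set_expires` logic above round-trips between epoch seconds and the Netscape cookie date format via `time.strftime`/`time.strptime`; a tiny standalone check of that format string (the variable names are mine):

```python
import time

# Round-trip the Netscape cookie "expires" format used by set_expires:
# epoch seconds -> formatted string -> parsed struct_time.
fmt = "%a, %d-%b-%Y %H:%M:%S GMT"
stamp = time.strftime(fmt, time.gmtime(0))
parsed = time.strptime(stamp, fmt)
print(stamp)
```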
class Cookie(object):
"""
This class implements the basic Cookie functionality. Note that
unlike the Python Standard Library Cookie class, this class represents
a single cookie (not a list of Morsels).
"""
__metaclass__ = metaCookie
DOWNGRADE = 0
IGNORE = 1
EXCEPTION = 3
def parse(Class, str, **kw):
"""
Parse a Cookie or Set-Cookie header value, and return
a dict of Cookies. Note: the string should NOT include the
header name, only the value.
"""
dict = _parse_cookie(str, Class, **kw)
return dict
parse = classmethod(parse)
def __init__(self, name, value, **kw):
"""
This constructor takes at least a name and value as the
arguments, as well as optionally any of allowed cookie attributes
as defined in the existing cookie standards.
"""
self.name, self.value = name, value
for k in kw:
setattr(self, k.lower(), kw[k])
# subclasses can use this for internal stuff
self.__data__ = {}
def __str__(self):
"""
Provides the string representation of the Cookie suitable for
sending to the browser. Note that the actual header name will
not be part of the string.
This method makes no attempt to automatically double-quote
strings that contain special characters, even though the RFC's
dictate this. This is because doing so seems to confuse most
browsers out there.
"""
result = ["%s=%s" % (self.name, self.value)]
# pylint: disable=E1101
# The attribute _valid_attr is provided by the metaclass 'metaCookie'.
for name in self._valid_attr:
if hasattr(self, name):
if name in ("secure", "discard", "httponly"):
result.append(name)
else:
result.append("%s=%s" % (name, getattr(self, name)))
# pylint: enable=E1101
return "; ".join(result)
def __repr__(self):
return '<%s: %s>' % (self.__class__.__name__,
str(self))
# This is a simplified and in some places corrected
# (at least I think it is) pattern from standard lib Cookie.py
_cookiePattern = re.compile(
r"(?x)" # Verbose pattern
r"[,\ ]*" # space/comma (RFC2616 4.2) before attr-val is eaten
r"(?P<key>" # Start of group 'key'
r"[^;\ =]+" # anything but ';', ' ' or '='
r")" # End of group 'key'
r"\ *(=\ *)?" # a space, then may be "=", more space
r"(?P<val>" # Start of group 'val'
r'"(?:[^\\"]|\\.)*"' # a doublequoted string
r"|" # or
r"[^;]*" # any word or empty string
r")" # End of group 'val'
r"\s*;?" # probably ending in a semi-colon
)
def _parse_cookie(str, Class, names=None):
# XXX problem is we should allow duplicate
# strings
result = {}
matchIter = _cookiePattern.finditer(str)
for match in matchIter:
key, val = match.group("key"), match.group("val")
# We just ditch the cookies names which start with a dollar sign since
# those are in fact RFC2965 cookies attributes. See bug [#MODPYTHON-3].
if key[0] != '$' and (names is None or key in names):
result[key] = Class(key, val)
return result
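A minimal demonstration of the cookie-header grammar handled above; the regex is a condensed copy of `_cookiePattern`, and the dict-of-strings output stands in for the Cookie objects the real `_parse_cookie` builds (the helper name `parse_cookie_header` is mine).

```python
import re

_pattern = re.compile(
    r"[,\ ]*(?P<key>[^;\ =]+)\ *(=\ *)?"
    r'(?P<val>"(?:[^\\"]|\\.)*"|[^;]*)\s*;?')

def parse_cookie_header(value):
    """Parse a Cookie header value into a name->value dict,
    dropping '$'-prefixed RFC 2965 attributes."""
    result = {}
    for match in _pattern.finditer(value):
        key, val = match.group('key'), match.group('val')
        if not key.startswith('$'):
            result[key] = val
    return result
```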
_RE_BAD_MSIE = re.compile(r"MSIE\s+(\d+\.\d+)")
def add_cookies(req, cookies):
"""
Set one or more cookies in the outgoing headers and add a cache
directive so that caches don't cache the cookies.
"""
if not req.headers_out.has_key("Set-Cookie"):
g = _RE_BAD_MSIE.search(req.headers_in.get('User-Agent', "MSIE 6.0"))
bad_msie = g and float(g.group(1)) < 9.0
if not (bad_msie and req.is_https()):
req.headers_out.add("Cache-Control", 'no-cache="set-cookie"')
for cookie in cookies:
req.headers_out.add("Set-Cookie", str(cookie))
def get_cookies(req, Class=Cookie, **kw):
"""
A shorthand for retrieving and parsing cookies given
a Cookie class. The class must be one of the classes from
this module.
"""
if not req.headers_in.has_key("cookie"):
return {}
cookies = req.headers_in["cookie"]
if type(cookies) == type([]):
cookies = '; '.join(cookies)
return Class.parse(cookies, **kw)
def get_cookie(req, name, Class=Cookie, **kw):
cookies = get_cookies(req, Class, names=[name], **kw)
if cookies.has_key(name):
return cookies[name]
parse_qs = cgi.parse_qs
parse_qsl = cgi.parse_qsl
# Maximum line length for reading (roughly 64KB).
# Fixes memory errors when uploading large files such as 700+MB ISOs.
readBlockSize = 65368
""" The classes below are a (almost) a drop-in replacement for the
standard cgi.py FieldStorage class. They should have pretty much the
same functionality.
These classes differ in that unlike cgi.FieldStorage, they are not
recursive. The class FieldStorage contains a list of instances of
Field class. Field class is incapable of storing anything in it.
These objects should be considerably faster than the ones in cgi.py
because they do not expect a CGI environment, and are
optimized specifically for Apache and mod_python.
"""
class Field:
def __init__(self, name, *args, **kwargs):
self.name = name
# Some third party packages such as Trac create
# instances of the Field object and insert it
# directly into the list of form fields. To
# maintain backward compatibility check for
# where more than just a field name is supplied
# and invoke an additional initialisation step
# to process the arguments. Ideally, third party
# code should use the add_field() method of the
# form, but if they need to maintain backward
# compatibility with older versions of mod_python
# they will have no choice but to use the old
# way of doing things, and thus we need this code
# for the foreseeable future to cope with that.
if args or kwargs:
self.__bc_init__(*args, **kwargs)
def __bc_init__(self, file, ctype, type_options,
disp, disp_options, headers = {}):
self.file = file
self.type = ctype
self.type_options = type_options
self.disposition = disp
self.disposition_options = disp_options
if disp_options.has_key("filename"):
self.filename = disp_options["filename"]
else:
self.filename = None
self.headers = headers
def __repr__(self):
"""Return printable representation."""
return "Field(%s, %s)" % (`self.name`, `self.value`)
def __getattr__(self, name):
if name != 'value':
raise AttributeError, name
if self.file:
self.file.seek(0)
value = self.file.read()
self.file.seek(0)
else:
value = None
return value
def __del__(self):
self.file.close()
class StringField(str):
""" This class is basically a string with
added attributes for compatibility with std lib cgi.py. Basically, this
works the opposite of Field, as it stores its data in a string, but creates
a file on demand. Field creates a value on demand and stores data in a file.
"""
filename = None
headers = {}
ctype = "text/plain"
type_options = {}
disposition = None
disp_options = None
def __new__(cls, value):
'''Create StringField instance. You'll have to set name yourself.'''
obj = str.__new__(cls, value)
obj.value = value
return obj
def __str__(self):
return str.__str__(self)
def __getattr__(self, name):
if name != 'file':
raise AttributeError, name
self.file = cStringIO.StringIO(self.value)
return self.file
def __repr__(self):
"""Return printable representation (to pass unit tests)."""
return "Field(%s, %s)" % (`self.name`, `self.value`)
class FieldList(list):
def __init__(self):
self.__table = None
list.__init__(self)
def table(self):
if self.__table is None:
self.__table = {}
for item in self:
if item.name in self.__table:
self.__table[item.name].append(item)
else:
self.__table[item.name] = [item]
return self.__table
def __delitem__(self, *args):
self.__table = None
return list.__delitem__(self, *args)
def __delslice__(self, *args):
self.__table = None
return list.__delslice__(self, *args)
def __iadd__(self, *args):
self.__table = None
return list.__iadd__(self, *args)
def __imul__(self, *args):
self.__table = None
return list.__imul__(self, *args)
def __setitem__(self, *args):
self.__table = None
return list.__setitem__(self, *args)
def __setslice__(self, *args):
self.__table = None
return list.__setslice__(self, *args)
def append(self, *args):
self.__table = None
return list.append(self, *args)
def extend(self, *args):
self.__table = None
return list.extend(self, *args)
def insert(self, *args):
self.__table = None
return list.insert(self, *args)
def pop(self, *args):
self.__table = None
return list.pop(self, *args)
def remove(self, *args):
self.__table = None
return list.remove(self, *args)
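FieldList above memoizes a name-to-items index in `table()` and drops it on every mutating call. A compact sketch of the same cache-invalidation pattern, under the assumption that items are keyed by their first element (the class name `IndexedList` is mine):

```python
class IndexedList(list):
    """A list that caches a derived index and invalidates it on mutation."""

    def __init__(self, *args):
        list.__init__(self, *args)
        self._index = None

    def table(self):
        # Rebuild the index lazily, only after a mutation cleared it.
        if self._index is None:
            self._index = {}
            for item in self:
                self._index.setdefault(item[0], []).append(item)
        return self._index

    def append(self, item):
        self._index = None  # mutation invalidates the cached index
        list.append(self, item)
```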
class FieldStorage:
def __init__(self, req, keep_blank_values=0, strict_parsing=0, file_callback=None, field_callback=None, to_tmp_shared=False):
#
# Whenever readline is called ALWAYS use the max size EVEN when
# not expecting a long line. - this helps protect against
# malformed content from exhausting memory.
#
self.list = FieldList()
self.wsgi_input_consumed = False
# always process GET-style parameters
if req.args:
pairs = parse_qsl(req.args, keep_blank_values)
for pair in pairs:
self.add_field(pair[0], pair[1])
if req.method != "POST":
return
try:
clen = int(req.headers_in["content-length"])
except (KeyError, ValueError):
# absent content-length is not acceptable
raise SERVER_RETURN, HTTP_LENGTH_REQUIRED
self.clen = clen
self.count = 0
if not req.headers_in.has_key("content-type"):
ctype = "application/x-www-form-urlencoded"
else:
ctype = req.headers_in["content-type"]
if ctype.startswith("application/x-www-form-urlencoded"):
pairs = parse_qsl(req.read(clen), keep_blank_values)
self.wsgi_input_consumed = True
for pair in pairs:
self.add_field(pair[0], pair[1])
return
elif not ctype.startswith("multipart/"):
# we don't understand this content-type
return
self.wsgi_input_consumed = True
# figure out boundary
try:
i = ctype.lower().rindex("boundary=")
boundary = ctype[i+9:]
if len(boundary) >= 2 and boundary[0] == boundary[-1] == '"':
boundary = boundary[1:-1]
boundary = re.compile("--" + re.escape(boundary) + "(--)?\r?\n")
except ValueError:
raise SERVER_RETURN, HTTP_BAD_REQUEST
# read until boundary
self.read_to_boundary(req, boundary, None)
end_of_stream = False
while not end_of_stream and not self.eof():
## parse headers
ctype, type_options = "text/plain", {}
disp, disp_options = None, {}
headers = table([])
line = req.readline(readBlockSize)
self.count += len(line)
if self.eof():
end_of_stream = True
match = boundary.match(line)
if (not line) or match:
# We stop if we reached the end of the stream or a stop
# boundary (which means '--' after the boundary); we
# continue to the next part if we reached a simple
# boundary. In either case this would mean the entity is
# malformed, but we're tolerating it anyway.
end_of_stream = (not line) or (match.group(1) is not None)
continue
skip_this_part = False
while line not in ('\r','\r\n'):
nextline = req.readline(readBlockSize)
self.count += len(nextline)
if self.eof():
end_of_stream = True
while nextline and nextline[0] in [ ' ', '\t']:
line = line + nextline
nextline = req.readline(readBlockSize)
self.count += len(nextline)
if self.eof():
end_of_stream = True
# we read the headers until we reach an empty line
# NOTE : a single \n would mean the entity is malformed, but
# we're tolerating it anyway
h, v = line.split(":", 1)
headers.add(h, v)
h = h.lower()
if h == "content-disposition":
disp, disp_options = parse_header(v)
elif h == "content-type":
ctype, type_options = parse_header(v)
#
# NOTE: FIX up binary rubbish sent as content type
# from Microsoft IE 6.0 when sending a file which
# does not have a suffix.
#
if ctype.find('/') == -1:
ctype = 'application/octet-stream'
line = nextline
match = boundary.match(line)
if (not line) or match:
# We stop if we reached the end of the stream or a
# stop boundary (which means '--' after the
# boundary); we continue to the next part if we
# reached a simple boundary. In either case this
# would mean the entity is malformed, but we're
# tolerating it anyway.
skip_this_part = True
end_of_stream = (not line) or (match.group(1) is not None)
break
if skip_this_part:
continue
if disp_options.has_key("name"):
name = disp_options["name"]
else:
name = None
# create a file object
# is this a file?
if disp_options.has_key("filename"):
if file_callback and callable(file_callback):
file = file_callback(disp_options["filename"])
else:
if to_tmp_shared:
file = tempfile.NamedTemporaryFile(dir=CFG_TMPSHAREDDIR)
else:
file = tempfile.NamedTemporaryFile(dir=CFG_TMPDIR)
else:
if field_callback and callable(field_callback):
file = field_callback()
else:
file = cStringIO.StringIO()
# read it in
self.read_to_boundary(req, boundary, file)
if self.eof():
end_of_stream = True
file.seek(0)
# make a Field
if disp_options.has_key("filename"):
field = Field(name)
field.filename = disp_options["filename"]
else:
field = StringField(file.read())
field.name = name
field.file = file
field.type = ctype
field.type_options = type_options
field.disposition = disp
field.disposition_options = disp_options
field.headers = headers
self.list.append(field)
def add_field(self, key, value):
"""Insert a field as key/value pair"""
item = StringField(value)
item.name = key
self.list.append(item)
def __setitem__(self, key, value):
table = self.list.table()
if table.has_key(key):
items = table[key]
for item in items:
self.list.remove(item)
item = StringField(value)
item.name = key
self.list.append(item)
def read_to_boundary(self, req, boundary, file):
previous_delimiter = None
while not self.eof():
line = req.readline(readBlockSize)
self.count += len(line)
if not line:
# end of stream
if file is not None and previous_delimiter is not None:
file.write(previous_delimiter)
return True
match = boundary.match(line)
if match:
# the line is the boundary, so we bail out;
# if the last two chars are '--' it is the end of the entity
return match.group(1) is not None
if line[-2:] == '\r\n':
# the line ends with a \r\n, which COULD be part
# of the next boundary. We write the previous line delimiter
# then we write the line without \r\n and save it for the next
# iteration if it was not part of the boundary
if file is not None:
if previous_delimiter is not None: file.write(previous_delimiter)
file.write(line[:-2])
previous_delimiter = '\r\n'
elif line[-1:] == '\r':
# the line ends with \r, which is only possible if
# readBlockSize bytes have been read. In that case the
# \r COULD be part of the next boundary, so we save it
# for the next iteration
assert len(line) == readBlockSize
if file is not None:
if previous_delimiter is not None: file.write(previous_delimiter)
file.write(line[:-1])
previous_delimiter = '\r'
elif line == '\n' and previous_delimiter == '\r':
# the line is a single \n and we were in the middle of a \r\n,
# so we complete the delimiter
previous_delimiter = '\r\n'
else:
if file is not None:
if previous_delimiter is not None: file.write(previous_delimiter)
file.write(line)
previous_delimiter = None
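The delimiter bookkeeping above is subtle: a line's trailing `\r\n` may actually be part of the next boundary, so it is held back and only written once the following line turns out not to be the boundary. A minimal, self-contained sketch of the same idea (hypothetical helper name, operating on an in-memory list of lines rather than the request stream):

```python
def strip_trailing_newline_before_boundary(lines, boundary):
    """Hypothetical, in-memory illustration of the technique used by
    read_to_boundary: each line's trailing '\r\n' is withheld ('pending')
    and only written once the next line turns out not to be the boundary,
    because the delimiter that precedes the boundary belongs to the
    protocol, not to the uploaded data."""
    out = []
    pending = ''  # delimiter withheld from the previous line
    for line in lines:
        if line.rstrip('\r\n') == boundary:
            break  # boundary reached: the pending delimiter is discarded
        out.append(pending)
        if line.endswith('\r\n'):
            out.append(line[:-2])
            pending = '\r\n'
        else:
            out.append(line)
            pending = ''
    return ''.join(out)

body = strip_trailing_newline_before_boundary(
    ['hello\r\n', 'world\r\n', '--AaB03x\r\n'], '--AaB03x')
# body -> 'hello\r\nworld'
```

Note how the `\r\n` after `world` never reaches the output, exactly as in the streaming version above.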
def eof(self):
return self.clen <= self.count
def __getitem__(self, key):
"""Dictionary style indexing."""
found = self.list.table()[key]
if len(found) == 1:
return found[0]
else:
return found
def get(self, key, default):
try:
return self.__getitem__(key)
except (TypeError, KeyError):
return default
def keys(self):
"""Dictionary style keys() method."""
return self.list.table().keys()
def __iter__(self):
return iter(self.keys())
def __repr__(self):
return repr(self.list.table())
def has_key(self, key):
"""Dictionary style has_key() method."""
return (key in self.list.table())
__contains__ = has_key
def __len__(self):
"""Dictionary style len(x) support."""
return len(self.list.table())
def getfirst(self, key, default=None):
""" return the first value received """
try:
return self.list.table()[key][0]
except KeyError:
return default
def getlist(self, key):
""" return a list of received values """
try:
return self.list.table()[key]
except KeyError:
return []
def items(self):
"""Dictionary-style items(), except that items are returned in the same
order as they were supplied in the form."""
return [(item.name, item) for item in self.list]
def __delitem__(self, key):
table = self.list.table()
values = table[key]
for value in values:
self.list.remove(value)
def clear(self):
self.list = FieldList()
def parse_header(line):
"""Parse a Content-type like header.
Return the main content-type and a dictionary of options.
"""
plist = map(lambda a: a.strip(), line.split(';'))
key = plist[0].lower()
del plist[0]
pdict = {}
for p in plist:
i = p.find('=')
if i >= 0:
name = p[:i].strip().lower()
value = p[i+1:].strip()
if len(value) >= 2 and value[0] == value[-1] == '"':
value = value[1:-1]
pdict[name] = value
return key, pdict
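For illustration, a Python 3 re-implementation of `parse_header` with an example input (shown only to make the return shape concrete; the Python 2 original above behaves the same way):

```python
def parse_header_demo(line):
    """Parse a Content-Type style header into (main value, options dict).

    Self-contained Python 3 re-implementation of parse_header above,
    for illustration only.
    """
    parts = [p.strip() for p in line.split(';')]
    key = parts.pop(0).lower()
    options = {}
    for p in parts:
        i = p.find('=')
        if i >= 0:
            name = p[:i].strip().lower()
            value = p[i + 1:].strip()
            # Strip surrounding double quotes, if any
            if len(value) >= 2 and value[0] == value[-1] == '"':
                value = value[1:-1]
            options[name] = value
    return key, options

key, opts = parse_header_demo('multipart/form-data; boundary="AaB03x"; charset=utf-8')
# key  -> 'multipart/form-data'
# opts -> {'boundary': 'AaB03x', 'charset': 'utf-8'}
```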
def apply_fs_data(object, fs, **args):
"""
Apply FieldStorage data to an object - the object must be
callable. Examine the args, match them with fs data,
then call the object and return the result.
"""
# we need to weed out unexpected keyword arguments
# and for that we need to get a list of them. There
# are a few options for callable objects here:
fc = None
expected = []
if hasattr(object, "func_code"):
# function
fc = object.func_code
expected = fc.co_varnames[0:fc.co_argcount]
elif hasattr(object, 'im_func'):
# method
fc = object.im_func.func_code
expected = fc.co_varnames[1:fc.co_argcount]
elif type(object) in (TypeType,ClassType):
# class
fc = object.__init__.im_func.func_code
expected = fc.co_varnames[1:fc.co_argcount]
elif type(object) is BuiltinFunctionType:
# builtin
fc = None
expected = []
elif hasattr(object, '__call__'):
# callable object
if type(object.__call__) is MethodType:
fc = object.__call__.im_func.func_code
expected = fc.co_varnames[1:fc.co_argcount]
else:
# abuse of objects to create hierarchy
return apply_fs_data(object.__call__, fs, **args)
# add form data to args
for field in fs.list:
if field.filename:
val = field
else:
val = field.value
args.setdefault(field.name, []).append(val)
# replace lists with single values
for arg in args:
if ((type(args[arg]) is ListType) and
(len(args[arg]) == 1)):
args[arg] = args[arg][0]
# remove unexpected args unless co_flags & 0x08,
# meaning function accepts **kw syntax
if fc is None:
args = {}
elif not (fc.co_flags & 0x08):
for name in args.keys():
if name not in expected:
del args[name]
return object(**args)
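The `co_flags & 0x08` test above checks the CO_VARKEYWORDS flag, i.e. whether the callable accepts `**kwargs`. A small Python 3 sketch of the same introspection (using `__code__` in place of the Python 2 `func_code`; the handler names are made up for the demo):

```python
# CO_VARKEYWORDS is the 0x08 bit tested by apply_fs_data above: it is set
# on a code object when the function accepts **kwargs.
CO_VARKEYWORDS = 0x08

def accepts_kwargs(func):
    """Python 3 equivalent of the co_flags test used above."""
    return bool(func.__code__.co_flags & CO_VARKEYWORDS)

# Hypothetical handlers, only to demonstrate the two cases:
def handler_a(q, page):
    return q, page

def handler_b(q, **extra):
    return q, extra

# Expected argument names, collected the same way apply_fs_data does:
names = handler_a.__code__.co_varnames[:handler_a.__code__.co_argcount]
# names -> ('q', 'page'); accepts_kwargs(handler_a) -> False,
# accepts_kwargs(handler_b) -> True
```

When the flag is clear, `apply_fs_data` drops every form field whose name is not in `names`; when it is set, everything is passed through.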
RE_CDISPOSITION_FILENAME = re.compile(r'filename=(?P<filename>[\w\.]*)')
def handle_file_post(req, allowed_mimetypes=None):
"""
Handle the POST of a file.
@return: a tuple with the full path to the file saved on disk,
and its mimetype as provided by the request.
@rtype: (string, string)
"""
- from invenio.bibdocfile import decompose_file, md5
+ from invenio.legacy.bibdocfile.api import decompose_file, md5
## We retrieve the length
clen = req.headers_in["Content-Length"]
if clen is None:
raise InvenioWebInterfaceWSGIContentLenghtError("Content-Length header is missing")
try:
clen = int(clen)
assert (clen > 1)
except (ValueError, AssertionError):
raise InvenioWebInterfaceWSGIContentLenghtError("Content-Length header should contain a positive integer")
## Let's take the content type
ctype = req.headers_in["Content-Type"]
if allowed_mimetypes and ctype not in allowed_mimetypes:
raise InvenioWebInterfaceWSGIContentTypeError("Content-Type not in allowed list of content types: %s" % allowed_mimetypes)
## Let's optionally accept a suggested filename
suffix = prefix = ''
g = RE_CDISPOSITION_FILENAME.search(req.headers_in.get("Content-Disposition", ""))
if g:
dummy, prefix, suffix = decompose_file(g.group("filename"))
## Let's optionally accept an MD5 hash (and use it later for comparison)
cmd5 = req.headers_in.get("Content-MD5")
if cmd5:
the_md5 = md5()
## Ok. We can initialize the file
fd, path = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=CFG_TMPDIR)
the_file = os.fdopen(fd, 'w')
## Let's read the file
while True:
chunk = req.read(max(10240, clen))
if len(chunk) < clen:
## We expected to read at least clen (which is different than 0)
## but chunk was shorter! Gosh! Error! Panic!
the_file.close()
os.close(fd)
os.remove(path)
raise InvenioWebInterfaceWSGIContentLenghtError("File shorter than what specified in Content-Length")
if cmd5:
## MD5 was in the header let's compute it
the_md5.update(chunk)
## And let's definitively write the content to disk :-)
the_file.write(chunk)
clen -= len(chunk)
if clen == 0:
## That's it. Everything was read.
break
if cmd5 and the_md5.hexdigest().lower() != cmd5.strip().lower():
## Let's check the MD5
the_file.close()
os.close(fd)
os.remove(path)
raise InvenioWebInterfaceWSGIContentMD5Error("MD5 checksum does not match")
## Let's clean everything up
the_file.close()
return (path, ctype)
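The streaming pattern in `handle_file_post` -- read the body in pieces, update an MD5 digest as you go, and discard the file on a short read or checksum mismatch -- can be sketched as a self-contained Python 3 helper. The helper name is hypothetical; it reads bounded chunks with `min(...)` rather than the single large read above, and raises plain `IOError` instead of the Invenio-specific exception classes:

```python
import hashlib
import io
import os
import tempfile

def save_with_md5_check(stream, clen, expected_md5_hex, chunk_size=10240):
    """Stream a body of known length to a temp file while computing its
    MD5; delete the file and raise on a short read or checksum mismatch."""
    digest = hashlib.md5()
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as out:
            remaining = clen
            while remaining > 0:
                chunk = stream.read(min(chunk_size, remaining))
                if not chunk:
                    raise IOError("body shorter than Content-Length")
                digest.update(chunk)
                out.write(chunk)
                remaining -= len(chunk)
        if digest.hexdigest().lower() != expected_md5_hex.strip().lower():
            raise IOError("MD5 checksum does not match")
        return path
    except Exception:
        os.remove(path)
        raise

data = b"some uploaded bytes"
saved = save_with_md5_check(io.BytesIO(data), len(data),
                            hashlib.md5(data).hexdigest())
os.remove(saved)
```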
diff --git a/invenio/modules/access/scripts/webaccessadmin.py b/invenio/modules/access/scripts/webaccessadmin.py
index 3a606860f..f129caff1 100644
--- a/invenio/modules/access/scripts/webaccessadmin.py
+++ b/invenio/modules/access/scripts/webaccessadmin.py
@@ -1,122 +1,123 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
## 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
__revision__ = "$Id$"
import getopt
import sys
from invenio.base.helpers import with_app_context
def usage(exitcode=1, msg=""):
"""Prints usage info."""
if msg:
print >> sys.stderr, "Error: %s." % msg
print >> sys.stderr
print >> sys.stderr, """Usage: %s [options]
General options:
-h, --help\t\tprint this help
-V, --version\t\tprint version number
Authentication options:
-u, --user=USER\tUser name needed to perform the administrative task
Option to administrate authorizations:
-a, --add\t\tadd default authorization settings
-c, --compile\t\tcompile firewall like role definitions (FireRole)
-r, --reset\t\treset to default settings
-D, --demo\t\tto be used with -a or -r in order to consider demo site authorizations
""" % sys.argv[0]
sys.exit(exitcode)
@with_app_context()
def main():
"""Main function that analyzes command line input and calls whatever
is appropriate. """
from invenio.modules.access.firerole import repair_role_definitions
- import invenio.access_control_admin as acca
+ from invenio.modules.access.control import (acc_add_default_settings,
+ acc_reset_default_settings)
from invenio.base.globals import cfg
- from invenio.bibtask import authenticate
+ from invenio.legacy.bibsched.bibtask import authenticate
from invenio.modules.access.local_config import DEF_DEMO_USER_ROLES, \
DEF_DEMO_ROLES, DEF_DEMO_AUTHS
## parse command line:
# set user-defined options:
options = {'user' : '', 'reset' : 0, 'compile' : 0, 'add' : 0, 'demo' : 0}
try:
opts, args = getopt.getopt(sys.argv[1:], "hVu:racD",
["help", "version", "user=",
"reset", "add", "compile", "demo"])
except getopt.GetoptError, err:
usage(1, err)
try:
for opt in opts:
if opt[0] in ("-h", "--help"):
usage(0)
elif opt[0] in ("-V", "--version"):
print __revision__
sys.exit(0)
elif opt[0] in ("-u", "--user"):
options["user"] = opt[1]
elif opt[0] in ("-r", "--reset"):
options["reset"] = 1
elif opt[0] in ("-a", "--add"):
options["add"] = 1
elif opt[0] in ("-c", "--compile"):
options["compile"] = 1
elif opt[0] in ("-D", "--demo"):
options["demo"] = 1
else:
usage(1)
if options['add'] or options['reset'] or options['compile']:
#if acca.acc_get_action_id('cfgwebaccess'):
# # Action exists hence authentication works :-)
# options['user'] = authenticate(options['user'],
# authorization_msg="WebAccess Administration",
# authorization_action="cfgwebaccess")
if options['reset'] and options['demo']:
- acca.acc_reset_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']], DEF_DEMO_USER_ROLES, DEF_DEMO_ROLES, DEF_DEMO_AUTHS)
+ acc_reset_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']], DEF_DEMO_USER_ROLES, DEF_DEMO_ROLES, DEF_DEMO_AUTHS)
print "Reset default demo site settings."
elif options['reset']:
- acca.acc_reset_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']])
+ acc_reset_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']])
print "Reset default settings."
elif options['add'] and options['demo']:
- acca.acc_add_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']], DEF_DEMO_USER_ROLES, DEF_DEMO_ROLES, DEF_DEMO_AUTHS)
+ acc_add_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']], DEF_DEMO_USER_ROLES, DEF_DEMO_ROLES, DEF_DEMO_AUTHS)
print "Added default demo site settings."
elif options['add']:
- acca.acc_add_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']])
+ acc_add_default_settings([cfg['CFG_SITE_ADMIN_EMAIL']])
print "Added default settings."
if options['compile']:
repair_role_definitions()
print "Compiled firewall like role definitions."
else:
usage(1, "You must specify at least one command")
except StandardError, e:
from invenio.ext.logging import register_exception
register_exception()
usage(1, e)
return
### okay, here we go:
if __name__ == '__main__':
main()
diff --git a/invenio/modules/apikeys/models.py b/invenio/modules/apikeys/models.py
index 1cb6ec544..e89e1d587 100644
--- a/invenio/modules/apikeys/models.py
+++ b/invenio/modules/apikeys/models.py
@@ -1,329 +1,329 @@
# -*- coding: utf-8 -*-
#
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Web API Key database models.
"""
# General imports.
from werkzeug import cached_property
from urlparse import parse_qs, urlparse, urlunparse
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm.exc import NoResultFound
import hmac
import time
import re
try:
from uuid import uuid4
except ImportError:
import random
def uuid4():
return "%x" % random.getrandbits(16*8)
from urllib import urlencode, basejoin
from invenio.base.globals import cfg
from invenio.utils.hash import sha1
from invenio.ext.sqlalchemy import db
# Create your models here.
from invenio.modules.accounts.models import User
class WebAPIKey(db.Model):
"""Represents a Web API Key record."""
__tablename__ = 'webapikey'
# There are three status keys that must be present: OK, REMOVED and REVOKED.
# Their exact values don't matter at all.
CFG_WEB_API_KEY_STATUS = {'OK': 'OK',
'REMOVED': 'REMOVED',
'REVOKED': 'REVOKED',
'WARNING': 'WARNING',
}
@cached_property
def allowed_url(self):
"""List of allowed urls."""
return [(re.compile(_url), _authorized_time, _need_timestamp)
for _url, _authorized_time, _need_timestamp in
cfg.get('CFG_WEB_API_KEY_ALLOWED_URL', [])]
id = db.Column(db.String(150), primary_key=True, nullable=False)
secret = db.Column(db.String(150), nullable=False)
id_user = db.Column(db.Integer(15, unsigned=True), db.ForeignKey(User.id),
nullable=False)
status = db.Column(db.String(25), nullable=False,
server_default='OK', index=True)
description = db.Column(db.String(255), nullable=True)
@classmethod
def create_new(cls, uid, key_description=None):
"""
Creates a new REST API key / secret key pair for the user, using the
uuid4 function.
@param uid: User's id for the new REST API key
@type uid: int
@param key_description: User's description for the REST API key
@type key_description: string
"""
key_id = str(uuid4())
key_secret = str(uuid4())
while True:
try:
new_key = WebAPIKey(id=key_id, secret=key_secret, id_user=uid,
description=key_description)
db.session.add(new_key)
db.session.commit()
break
except IntegrityError:
key_id = str(uuid4())
@classmethod
def show_keys(cls, uid, diff_status=None):
"""
Makes a query to the DB to obtain all the user's REST API keys
@param uid: User's id
@type uid: int
@param diff_status: This string indicates whether the query will show
all the REST API keys or only the ones that are still active (useful in
the admin part)
@type diff_status: string
@return: Tuples with the id, description and status of the user's REST API
keys
"""
if diff_status is None:
diff_status = cls.CFG_WEB_API_KEY_STATUS['REMOVED']
return db.session.query(WebAPIKey.id, WebAPIKey.description, WebAPIKey.status).\
filter(WebAPIKey.id_user == uid,
WebAPIKey.status != diff_status).all()
@classmethod
def mark_as(cls, key_id, status):
"""
When the user wants to remove one of his keys, this function sets the
status of that key to REMOVED: the user no longer sees the key, but the
admin user still sees it, can compute statistics with it, etc.
@param key_id: The id of the REST key that will be "removed"
@type key_id: string
"""
assert status in cls.CFG_WEB_API_KEY_STATUS
cls.query.filter_by(id=key_id).\
update({'status': status})
@classmethod
def get_available(cls, uid=None, apikey=None):
"""
Search for all the available REST API keys, i.e. all the user's keys
that are not marked as REMOVED or REVOKED
@param uid: The user id
@type uid: int
@param apikey: the apikey/id
@return: WebAPIKey objects
"""
filters = {}
if uid is not None:
filters['id_user'] = uid
if apikey is not None:
filters['id'] = apikey
return cls.query.\
filter_by(**filters). \
filter(WebAPIKey.status != cls.CFG_WEB_API_KEY_STATUS['REMOVED'],
WebAPIKey.status != cls.CFG_WEB_API_KEY_STATUS['REVOKED']
).all()
@classmethod
def get_server_signature(cls, secret, url):
from flask import request
secret = str(secret)
if request.base_url not in url:
url = basejoin(request.base_url, url)
return hmac.new(secret, url, sha1).hexdigest()
@classmethod
def acc_get_uid_from_request(cls):
"""
Looks in the database for the secret that matches the API key in the
request. If the REST API key is found and the signature is correct,
returns the user's id.
@return: the user's uid if everything goes well, -1 otherwise
"""
- from invenio.webstat import register_customevent
+ from invenio.legacy.webstat.api import register_customevent
from flask import request
api_key = signature = timestamp = None
# Get the params from the GET/POST request
if 'apikey' in request.values:
api_key = request.values['apikey']
if 'signature' in request.values:
signature = request.values['signature']
if 'timestamp' in request.values:
timestamp = request.values['timestamp']
# Check if the request is well built
if api_key is None or signature is None:
return -1
# Remove signature from the url params
path = request.base_url
url_req = request.url
parsed_url = urlparse(url_req)
params = parse_qs(parsed_url.query)
params = dict([(i, j[0]) for i, j in list(params.items())])
try:
del params['signature']
except KeyError: # maybe signature was in post params
pass
# Reconstruct the url
query = urlencode(sorted(params.items(), key=lambda x: x[0]))
url_req = urlunparse((parsed_url.scheme,
parsed_url.netloc,
parsed_url.path,
parsed_url.params,
query,
parsed_url.fragment))
authorized_time = None
need_timestamp = False
# NB: allowed_url is an instance-level cached_property while this is a
# classmethod, so it has to be reached through an instance
for url, authorized_time, need_timestamp in cls().allowed_url:
if url.match(url_req) is not None:
break
if need_timestamp and timestamp is None:
return -1
if authorized_time is None:
return -1
if authorized_time != 0 and need_timestamp:
time_lapse = time.time() - float(timestamp)
if time_lapse > authorized_time or time_lapse < 0:
return -1
keys = cls.get_available(apikey=api_key)
if not len(keys):
return -1
key = keys[0]
uid = key.id_user
secret_key = key.secret
server_signature = cls.get_server_signature(secret_key, url_req)
if signature == server_signature:
#If the signature is fine, log the key activity and return the UID
register_customevent("apikeyusage", [uid, api_key, path, url_req])
return uid
else:
return -1
@classmethod
def build_web_request(cls, path, params=None, uid=-1, api_key=None, timestamp=True):
"""
Build a new request that uses REST authentication.
1. Add your REST API key to the params
2. Add the current timestamp to the params, if needed
3. Sort the query string params
4. Merge path and the sorted query string to a single string
5. Create a HMAC-SHA1 signature of this string using your secret key as the key
6. Append the hex-encoded signature to your query string
@note: If the api_key parameter is None, this method performs a search
in the database using the uid parameter to get one of the user's REST
API keys. If the user has one or more usable REST API keys, this method
uses the first one found.
@param path: uri of the request until the "?" (i.e.: /search)
@type path: string
@param params: All the params of the request (i.e.: req.args or a dictionary
with the param name as key)
@type params: string or dict
@param api_key: User REST API key
@type api_key: string
@param uid: User's id to do the search for the REST API key
@type uid: int
@param timestamp: Indicates if timestamp is needed in the request
@type timestamp: boolean
@return: Signed request string or, in case of error, ''
"""
if params is None:
params = {}
if not isinstance(params, dict):
if len(params) != 0 and params[0] == '?':
params = params.replace('?', '')
params = parse_qs(params)
params = dict([(i, j[0]) for i, j in list(params.items())])
if api_key:
params['apikey'] = api_key
elif uid > 0:
keys = cls.get_available(uid=uid)
if len(keys):
api_key = keys[0][0]
params['apikey'] = api_key
else:
return ''
else:
return ''
if timestamp:
params['timestamp'] = str(int(time.time()))
parsed_url = urlparse(path)
query = urlencode(sorted(params.items(), key=lambda x: x[0]))
url = urlunparse((parsed_url.scheme,
parsed_url.netloc,
parsed_url.path,
parsed_url.params,
query,
parsed_url.fragment))
try:
secret_key = cls.query.filter_by(id=api_key).one().secret
except NoResultFound:
return ''
signature = cls.get_server_signature(secret_key, url)
params['signature'] = signature
query = urlencode(params)
return urlunparse((parsed_url.scheme,
parsed_url.netloc,
parsed_url.path,
parsed_url.params,
query,
parsed_url.fragment))
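A client calling such an API has to reproduce steps 1-6 of `build_web_request`. A minimal, self-contained Python 3 sketch of the client side (it assumes the caller already holds a key id and secret, and omits the database lookup the method performs):

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode, urlparse, urlunparse

def sign_request(path, params, api_key, secret, with_timestamp=True):
    """Client-side sketch of steps 1-6: add the key (and timestamp), sort
    the query string, HMAC-SHA1 the full URL with the secret, and append
    the hex signature as the 'signature' parameter."""
    params = dict(params)
    params['apikey'] = api_key
    if with_timestamp:
        params['timestamp'] = str(int(time.time()))
    parsed = urlparse(path)
    # The signature is computed over the URL with a *sorted* query string...
    query = urlencode(sorted(params.items()))
    url = urlunparse((parsed.scheme, parsed.netloc, parsed.path,
                      parsed.params, query, parsed.fragment))
    signature = hmac.new(secret.encode(), url.encode(),
                         hashlib.sha1).hexdigest()
    # ...but, as in build_web_request, the final URL is not re-sorted.
    params['signature'] = signature
    return urlunparse((parsed.scheme, parsed.netloc, parsed.path,
                       parsed.params, urlencode(params), parsed.fragment))

signed = sign_request('/search', {'p': 'ellis'}, 'key-id', 'key-secret',
                      with_timestamp=False)
# signed looks like '/search?p=ellis&apikey=key-id&signature=<40 hex chars>'
```

On the server side, `acc_get_uid_from_request` strips `signature`, rebuilds the sorted URL the same way and compares digests.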
diff --git a/invenio/modules/bulletin/format_elements/bfe_webjournal_articles_overview.py b/invenio/modules/bulletin/format_elements/bfe_webjournal_articles_overview.py
index bb426754a..0834c2d7b 100644
--- a/invenio/modules/bulletin/format_elements/bfe_webjournal_articles_overview.py
+++ b/invenio/modules/bulletin/format_elements/bfe_webjournal_articles_overview.py
@@ -1,488 +1,488 @@
# -*- coding: utf-8 -*-
## $Id: bfe_webjournal_MainArticleOverview.py,v 1.28 2009/02/12 10:00:57 jerome Exp $
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebJournal Element - Creates an overview of all the articles of a
certain category in one specific issue.
"""
import re
import os
import urllib, urllib2
try:
from PIL import Image
PIL_imported = True
except ImportError:
PIL_imported = False
from invenio.modules.formatter.engine import BibFormatObject
from invenio.utils.html import HTMLWasher, remove_html_markup
from invenio.base.i18n import gettext_set_language
from invenio.config import \
CFG_ACCESS_CONTROL_LEVEL_SITE, \
CFG_TMPDIR, \
CFG_SITE_LANG
from invenio.webjournal_utils import \
cache_index_page, \
get_index_page_from_cache, \
parse_url_string, \
make_journal_url, \
get_journal_articles, \
issue_is_later_than, \
get_current_issue
from invenio.webjournal_utils import \
img_pattern, \
header_pattern, \
header_pattern2, \
para_pattern
from invenio.utils.url import create_html_link
-from invenio.bibdocfile import decompose_file
+from invenio.legacy.bibdocfile.api import decompose_file
def format_element(bfo, number_of_featured_articles="1",
number_of_articles_with_image="3", new_articles_first='yes',
image_px_width="300", small_image_px_width="200",
subject_to_css_class_kb="WebJournalSubject2CSSClass",
link_image_to_article='yes', image_alignment='left'):
"""
Creates an overview of all the articles of a certain category in one
specific issue.
Note the following:
<ul>
<li>The element considers only the latest issue: when viewing
archives of your journal, readers will see the newest articles of
the latest issue, not the ones of the issue they are looking
at</li>
<li>This is not an index of the articles of the latest issue: it
displays only <b>new</b> articles, that is articles that have never
appeared in a previous issue</li>
<li>This element produces a table-based layout, in order to have a
more or less readable HTML alert when sent to some email clients
(Outlook 2007)</li>
<li>When producing the HTML output of images, this element tries to
insert the width and height attributes to the img tag: this is
necessary in order to produce nice HTML alerts. This dimension
therefore overrides any dimension defined in the CSS. The Python
Image Library (PIL) should be installed for this element to
recognize the size of images.</li>
</ul>
@param number_of_featured_articles: the max number of records with emphasized title
@param number_of_articles_with_image: the max number of records for which their image is displayed
@param new_articles_first: if 'yes', display new articles before other articles
@param image_px_width: (integer) width of first image featured on this page
@param small_image_px_width: (integer) width of small images featured on this page
@param subject_to_css_class_kb: knowledge base that maps 595__a to a CSS class
@param link_image_to_article: if 'yes', link image (if any) to article
@param image_alignment: 'left', 'center' or 'right'. To help rendering in Outlook.
"""
args = parse_url_string(bfo.user_info['uri'])
journal_name = args["journal_name"]
this_issue_number = args["issue"]
category_name = args["category"]
verbose = args["verbose"]
ln = bfo.lang
_ = gettext_set_language(ln)
if image_px_width.isdigit():
image_px_width = int(image_px_width)
else:
image_px_width = None
if small_image_px_width.isdigit():
small_image_px_width = int(small_image_px_width)
else:
small_image_px_width = None
# We want to put emphasis on the n first articles (which are not
# new)
if number_of_featured_articles.isdigit():
number_of_featured_articles = int(number_of_featured_articles)
else:
number_of_featured_articles = 0
# Only n first articles will display images
if number_of_articles_with_image.isdigit():
number_of_articles_with_image = int(number_of_articles_with_image)
else:
number_of_articles_with_image = 0
# Help image alignment without CSS, to have better rendering in Outlook
img_align = ''
if image_alignment:
img_align = 'align="%s"' % image_alignment
# Try to get the page from cache. Only if issue is older or equal
# to latest release.
latest_released_issue = get_current_issue(ln, journal_name)
if verbose == 0 and not issue_is_later_than(this_issue_number,
latest_released_issue):
cached_html = get_index_page_from_cache(journal_name, category_name,
this_issue_number, ln)
if cached_html:
return cached_html
out = '<table border="0" cellpadding="0" cellspacing="0">'
# Get the id list
ordered_articles = get_journal_articles(journal_name,
this_issue_number,
category_name,
newest_first=new_articles_first.lower() == 'yes')
new_articles_only = False
if ordered_articles.keys() and max(ordered_articles.keys()) < 0:
# If there are only new articles, don't bother marking them as
# new
new_articles_only = True
order_numbers = ordered_articles.keys()
order_numbers.sort()
img_css_class = "featuredImageScale"
for order_number in order_numbers:
for article_id in ordered_articles[order_number]:
# A record is considered as new if its position is
# negative and there are some non-new articles
article_is_new = (order_number < 0 and not new_articles_only)
temp_rec = BibFormatObject(article_id)
title = ''
if ln == "fr":
title = temp_rec.field('246_1a')
if title == '':
title = temp_rec.field('245__a')
else:
title = temp_rec.field('245__a')
if title == '':
title = temp_rec.field('246_1a')
# Get CSS class (if relevant)
notes = temp_rec.fields('595__a')
css_classes = [temp_rec.kb(subject_to_css_class_kb, note, None) \
for note in notes]
css_classes = [css_class for css_class in css_classes \
if css_class is not None]
if article_is_new:
css_classes.append('new')
# Maybe we want to force image to appear?
display_image_on_index = False
if 'display_image_on_index' in notes:
display_image_on_index = True
# Build generic link to this article
article_link = make_journal_url(bfo.user_info['uri'], {'recid':str(article_id),
'ln': bfo.lang})
# Build the "more" link
more_link = '''<a class="readMore" title="link to the article" href="%s"> &gt;&gt; </a>
''' % (article_link)
# If we should display an image along with the text,
# prepare it here
img = ''
if (number_of_articles_with_image > 0 and \
not article_is_new) or display_image_on_index:
img = _get_feature_image(temp_rec, ln)
if img != "":
# Now we will try to identify image size in order
# to resize it in the HTML for a nicer rendering
# of the HTML alert in email clients (Outlook wants
# both height and width)
img_width = None
img_height = None
small_img_width = None
small_img_height = None
width_and_height = ''
if PIL_imported:
try:
local_img = os.path.join(CFG_TMPDIR,
'webjournal_' + \
''.join([char for char in img \
if char.isalnum()]))
if len(local_img) > 255:
# Shorten to 255 chars
local_img = local_img[0:100] + '_' + local_img[156:]
if not os.path.exists(local_img):
# Too bad, must download entire image for PIL
content_type = get_content_type(img)
if 'image' in content_type:
(local_img, headers) = urllib.urlretrieve(img, local_img)
img_file = Image.open(local_img) # IOError if not readable image
else:
raise IOError('Not an image')
else:
img_file = Image.open(local_img) # IOError if not readable image
except IOError, e:
pass
else:
orig_img_width = img_file.size[0]
orig_img_height = img_file.size[1]
# Then scale according to user-defined width
## First image
ratio = float(orig_img_width) / image_px_width
img_width = image_px_width
img_height = int(orig_img_height / ratio)
## Other smaller images
ratio = float(orig_img_width) / small_image_px_width
small_img_width = small_image_px_width
small_img_height = int(orig_img_height / ratio)
# Note that we cannot reuse the nice phl, ph and
# phr classes to put a frame around the image:
# this is not supported in Outlook 2007 when HTML
# alert is sent.
if not img_css_class == "featuredImageScale":
# Not first image: display smaller
img_width = small_img_width
img_height = small_img_height
if img_width and img_height:
width_and_height = 'width="%i" height="%i"' % \
(img_width, img_height)
img = '<img alt="" class="%s" src="%s" %s %s/>' % \
(img_css_class, img, img_align, width_and_height)
number_of_articles_with_image -= 1
# Next images will be displayed smaller
img_css_class = "featuredImageScaleSmall"
# Determine size of the title
header_tag_size = '3'
if number_of_featured_articles > 0 and \
not article_is_new:
# n first articles are especially featured
header_tag_size = '2'
number_of_featured_articles -= 1
# Finally create the output. Two different outputs
# depending on if we have text to display or not
text = ''
if not article_is_new:
text = _get_feature_text(temp_rec, ln)
# Link image to article if wanted
if link_image_to_article.lower() == 'yes':
img = create_html_link(urlbase=article_link,
link_label=img,
urlargd={})
if text != '':
out += '''
<tr><td class="article">
<h%(header_tag_size)s class="%(css_classes)s articleTitle" style="clear:both;">
<a title="link to the article" href="%(article_link)s">%(title)s</a>
</h%(header_tag_size)s>
<div class="articleBody">
%(img)s
%(text)s
%(more_link)s
</div>
</td></tr>
''' % {'article_link': article_link,
'title': title,
'img': img,
'text': text,
'more_link': more_link,
'css_classes': ' '.join(css_classes),
'header_tag_size': header_tag_size}
else:
out += '''
<tr><td class="article">
<h%(header_tag_size)s class="%(css_classes)s articleTitle" style="clear:both;">
<a title="link to the article" href="%(article_link)s">%(title)s</a>&nbsp;&nbsp;
%(more_link)s
</h%(header_tag_size)s>
%(img)s
</td></tr>
''' % {'article_link': article_link,
'title': title,
'more_link': more_link,
'img': img,
'css_classes': ' '.join(css_classes),
'header_tag_size': header_tag_size}
out += '</table>'
if verbose == 0 and not CFG_ACCESS_CONTROL_LEVEL_SITE == 2 :
cache_index_page(out, journal_name, category_name,
this_issue_number, ln)
return out
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
def _get_feature_image(record, ln=CFG_SITE_LANG):
"""
Looks for an image that can be featured on the article overview page.
"""
src = ''
if ln == "fr":
article = ''.join(record.fields('590__b'))
if not article:
article = ''.join(record.fields('520__b'))
else:
article = ''.join(record.fields('520__b'))
if not article:
article = ''.join(record.fields('590__b'))
image = re.search(img_pattern, article)
if image:
src = image.group("image")
if not src:
# Look for an attached image
icons = [icon for icon in record.fields('8564_q') if \
(decompose_file(icon)[2] in ['jpg', 'jpeg', 'png', 'gif'])]
if icons:
src = icons[0]
return src
def _get_first_sentence_or_part(header_text):
"""
Tries to cut the text at the end of the first sentence found after the
first 100 chars, or at a whitespace shortly after char 200. Otherwise
returns the first 250 chars.
"""
header_text = header_text.lstrip()
first_sentence = header_text[100:].find(".")
if first_sentence == -1:
# try question mark
first_sentence = header_text[100:].find("?")
if first_sentence == -1:
# try exclamation mark
first_sentence = header_text[100:].find("!")
if first_sentence != -1 and first_sentence < 250:
return "%s." % header_text[:(100+first_sentence)]
else:
an_empty_space = header_text[200:].find(" ")
if an_empty_space != -1 and an_empty_space < 300:
return "%s..." % header_text[:(200+an_empty_space)]
else:
return "%s..." % header_text[:250]
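The truncation heuristic in `_get_first_sentence_or_part` can be exercised in isolation; a minimal, self-contained sketch of the same logic (the function and parameter names here are illustrative, not part of the module, and the constants mirror the module's tuning values):

```python
def first_sentence_or_part(text, offset=100, cap=250):
    """Cut text at the first sentence terminator found after `offset`,
    falling back to a whitespace cut after char 200, then a hard cut
    at `cap` chars."""
    text = text.lstrip()
    pos = -1
    # Try '.', then '?', then '!' — same order as the element above.
    for mark in ('.', '?', '!'):
        pos = text[offset:].find(mark)
        if pos != -1:
            break
    if pos != -1 and pos < cap:
        return "%s." % text[:offset + pos]
    space = text[200:].find(" ")
    if space != -1 and space < 300:
        return "%s..." % text[:200 + space]
    return "%s..." % text[:cap]
```

Starting the search at `offset` keeps the cut from landing on punctuation too close to the beginning of the text.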
def _get_feature_text(record, language):
"""
Looks for a text (header) that can be featured on the article overview
page.
"""
washer = HTMLWasher()
header_text = ""
# Check if there is a header
if language == "fr":
header = record.field('590__a')
if header.strip() in \
['', '<br/>', '<!--HTML--><br />', '<!--HTML-->']:
header = record.field('520__a')
else:
header = record.field('520__a')
if header.strip() in \
['', '<br/>', '<!--HTML--><br />', '<!--HTML-->']:
header = record.field('590__a')
header = washer.wash(html_buffer=header,
allowed_tag_whitelist=[],
allowed_attribute_whitelist=[])
if header != "":
header_text = header
else:
if language == "fr":
article = record.fields('590__b')
if not article or \
(len(article) == 1 and \
article[0].strip() in \
['', '<br />', '<!--HTML--><br />', '<!--HTML-->']):
article = record.fields('520__b')
else:
article = record.fields('520__b')
if not article or \
(len(article) == 1 and \
article[0].strip() in \
['', '<br />', '<!--HTML--><br />', '<!--HTML-->']):
article = record.fields('590__b')
try:
article = article[0]
except:
return ''
match_obj = re.search(header_pattern, article)
if not match_obj:
match_obj = re.search(header_pattern2, article)
try:
header_text = match_obj.group("header")
header_text = washer.wash(html_buffer=header_text,
allowed_tag_whitelist=['a'],
allowed_attribute_whitelist=['href',
'target',
'class'])
if header_text == "":
raise Exception
except:
article = article.replace(header_text, '')
article = article.replace('<p/>', '')
article = article.replace('<p>&nbsp;</p>', '')
match_obj = re.search(para_pattern, article)
try:
# get the first paragraph
header_text = match_obj.group("paragraph")
try:
header_text = washer.wash(html_buffer=header_text,
allowed_tag_whitelist=[],
allowed_attribute_whitelist=[])
except:
# Could not parse the HTML correctly. Fall back
# to this safer function, which produces poorer
# results.
header_text = remove_html_markup(header_text)
if header_text.strip() == "":
raise Exception
else:
if len(header_text) > 250:
header_text = _get_first_sentence_or_part(header_text)
except:
# as a last resort, get the first sentence
try:
article = washer.wash(article,
allowed_tag_whitelist=[],
allowed_attribute_whitelist=[])
except:
# Could not parse the HTML correctly. Fall back
# to this safer function, which produces poorer
# results.
article = remove_html_markup(article)
header_text = _get_first_sentence_or_part(article)
return header_text
def get_content_type(url):
"""
Returns the content-type of the given URL.
Returns an empty string if the content-type could not be resolved.
@param url: URL for which we would like to get the content-type
@type url: string
@return: the content-type of the given URL
@rtype: string
"""
req = urllib2.Request(url)
try:
response = urllib2.urlopen(req)
return response.info().getheader('content-type')
except Exception:
return ''
diff --git a/invenio/modules/bulletin/format_elements/bfe_webjournal_trackback_auto_discovery.py b/invenio/modules/bulletin/format_elements/bfe_webjournal_trackback_auto_discovery.py
index 579c328b0..52cc34b34 100644
--- a/invenio/modules/bulletin/format_elements/bfe_webjournal_trackback_auto_discovery.py
+++ b/invenio/modules/bulletin/format_elements/bfe_webjournal_trackback_auto_discovery.py
@@ -1,48 +1,48 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
WebJournal Element - return trackback auto discovery tag
"""
import cgi
from invenio.webjournal_utils import parse_url_string
-from invenio.weblinkback_templates import get_trackback_auto_discovery_tag
+from invenio.legacy.weblinkback.templates import get_trackback_auto_discovery_tag
from invenio.config import CFG_WEBLINKBACK_TRACKBACK_ENABLED
def format_element(bfo):
"""
Return the trackback auto-discovery tag if recid != -1; return "" when recid == -1 (e.g. on index pages).
"""
html = ""
if CFG_WEBLINKBACK_TRACKBACK_ENABLED:
# Retrieve context (journal, issue and category) from URI
args = parse_url_string(bfo.user_info['uri'])
recid = args["recid"]
if recid != -1:
html = get_trackback_auto_discovery_tag(recid)
return html
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
diff --git a/invenio/modules/comments/static/css/comments/comments.css b/invenio/modules/comments/static/css/comments/comments.css
index 155c4b5d9..bf3495cb3 100644
--- a/invenio/modules/comments/static/css/comments/comments.css
+++ b/invenio/modules/comments/static/css/comments/comments.css
@@ -1,7 +1,11 @@
.comments .collapsed i.icon-chevron-down {
background-position: -456px -72px !important;
}
.comments hr {
- margin: 0px;
+ margin: 8px;
+}
+
+.comments li {
+ min-height: 10px;
}
diff --git a/invenio/modules/comments/templates/comments/comments.html b/invenio/modules/comments/templates/comments/comments.html
index 1775616c4..8e5160e8c 100644
--- a/invenio/modules/comments/templates/comments/comments.html
+++ b/invenio/modules/comments/templates/comments/comments.html
@@ -1,175 +1,178 @@
{#
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
#}
{%- if not request.is_xhr -%}
{% extends "records/base.html" %}
{%- endif -%}
{% css url_for('comments.static', filename='css/comments/comments.css'), '10-comments' %}
{% js url_for('comments.static', filename='js/comments/collapse.js'), '10-comments' %}
{% block record_content %}
<div class="page-header">
{{ format_record(recid, 'hs', ln=g.ln)|safe }}
</div>
<div class="page-header">
<h4>
{{ _("Comments") }}
<small>
{% if current_user.is_guest %}
<a class="btn pull-right" href="{{ url_for('webaccount.login', referer=request.url) }}">
<i class="icon-pencil"></i> {{ _('write comment') }}
</a>
{% else %}
<a class="btn pull-right" data-toggle="modal" href="{{ url_for('comments.add_comment', recid=recid) }}">
<i class="icon-pencil"></i> {{ _('write comment') }}
</a>
{% endif %}
</small>
</h4>
</div>
{%- if comments -%}
<ul class="comments unstyled">
{%- for c in comments recursive -%}
<li name="{{ c.id }}">
<a class="collapse-comment pull-left{{ ' collapsed' if c.is_collapsed(current_user.get_id()) }}"
style="margin-right: 5px;"
data-toggle="collapse"
data-target="#collapse-{{ c.id }}"
href="{#{ url_for('comments.toggle', recid=recid, id=c.id) }#}">
<i class="icon-chevron-down"></i>
</a>
<h4>
- {{ c.title }}
- <small>
- {%- if c.nb_votes_total > 0 -%}
- {%- set votes = c.nb_votes_yes-(c.nb_votes_total-c.nb_votes_yes) -%}
- {%- if votes > 0 -%}
- / <span class="badge badge-success">+{{ votes }}</span>
- {%- elif votes < 0 -%}
- / <span class="badge badge-important">{{ votes }}</span>
+ {{ c.title }}
+ <small>
+ {%- if c.nb_votes_total > 0 -%}
+ {%- set votes = c.nb_votes_yes-(c.nb_votes_total-c.nb_votes_yes) -%}
+ {%- if votes > 0 -%}
+ / <span class="badge badge-success">+{{ votes }}</span>
+ {%- elif votes < 0 -%}
+ / <span class="badge badge-important">{{ votes }}</span>
+ {%- endif -%}
+ {%- endif -%}
+ <a class="pull-right" title="{{ _('Permalink to this comment') }}" href="#{{ c.id }}">¶</a>
+ </small>
+ {%- if not c.title -%}
+ &nbsp;
{%- endif -%}
- {%- endif -%}
- <a class="pull-right" title="{{ _('Permalink to this comment') }}" href="#{{ c.id }}">¶</a>
- </small>
</h4>
<div id="collapse-{{ c.id }}"
data-action="{{ url_for('comments.toggle', recid=recid, id=c.id) }}"
class="collapse{{ ' in' if not c.is_collapsed(current_user.get_id()) }}">
<blockquote>
<p style="font-size:90%;">
{{ c.body|quoted_txt2html(
indent_html=(
'<span style="border-left: 3px solid #CCC; padding-left:5px;">',
'</span>'))|safe }}
</p>
<small>
{%- if c.user -%}
<img src="{{ c.user.email|gravatar(size=14, default=url_for('static', filename='img/user-icon-1-16x16.gif', _external=True)) }}" alt="avatar"/>
<a href="{{ url_for('webmessage.add', sent_to_user_nicks=c.user.nickname) }}">
{{ c.user.nickname }}
</a>
{%- else -%}
<img src="/img/user-icon-1-16x16.gif" alt="avatar"/>
{{ _('Guest') }}
{%- endif -%} &nbsp;
- <i class="icon-time"></i> {{ c.date_creation }}
- <i class="icon-pencil"></i>
<a data-toggle="modal" href="{{ url_for('comments.add_comment', recid=recid, in_reply=c.id) }}">
{{ _('reply') }}
</a>
- <i class="icon-question-sign"></i> {{ _('Was it helpful?') }}
<a href="{{ url_for('comments.vote', recid=recid, id=c.id, value=1,
referer=request.url
) }}">
<i class="icon-thumbs-up"></i>
{{ _('yes') }}
</a> /
<a href="{{ url_for('comments.vote', recid=recid, id=c.id, value=-1,
referer=request.url
) }}">
<i class="icon-thumbs-down"></i>
{{ _('no') }}
</a>
-
<a href="{{ url_for('comments.report', recid=recid, id=c.id) }}">
<i class="icon-exclamation-sign"></i>
{{ _('report abuse') }}
</a>
</small>
</blockquote>
{%- if c.replies -%}
<ul class="unstyled" style="padding-left: 20px;">
{{ loop(c.replies) }}
</ul>
{%- endif -%}
</div>
{% if not loop.last %}
<hr/>
{% endif %}
</li>
{%- endfor -%}
</ul>
{% if current_user.is_guest %}
<a class="btn pull-right" href="{{ url_for('webaccount.login', referer=request.url) }}">
<i class="icon-pencil"></i> {{ _('write comment') }}
</a>
{% else %}
<a class="btn pull-right" data-toggle="modal" href="{{ url_for('comments.add_comment', recid=recid) }}">
<i class="icon-pencil"></i> {{ _('write comment') }}
</a>
{% endif %}
{%- else -%}
<div class="alert alert-info">
{{ _('There are no comments. Be the first to comment on this record.') }}
</div>
{%- endif -%}
<div style="clear:both"></div>
<hr/>
{%- if record.user_comment_subscritions|length -%}
<div class="alert">
{%- set info_subs = _('%(open_tag)s Unsubscribe %(close_tag)s from this discussion. You will not receive any new comments by email.') % {
'open_tag': '<strong><a href="'+url_for('comments.unsubscribe', recid=recid)+'"><i class="icon-trash"></i>',
'close_tag':'</a></strong>'} -%}
{{ info_subs|safe }}
</div>
{%- else -%}
<div class="alert alert-info">
{%- set info_subs = _('%(open_tag)s Subscribe %(close_tag)s to this discussion. You will then receive all new comments by email.') % {
'open_tag': '<strong><a href="'+url_for('comments.subscribe', recid=recid)+'"><i class="icon-envelope"></i>',
'close_tag':'</a></strong>'} -%}
{{ info_subs|safe }}
</div>
{%- endif -%}
{% endblock %}
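The vote badge in the template above derives a net score from the stored totals only (`nb_votes_yes` and `nb_votes_total`); the same arithmetic as a quick standalone sketch (the helper name is illustrative):

```python
def net_votes(nb_votes_yes, nb_votes_total):
    # yes-votes minus no-votes, where no-votes = total - yes
    return nb_votes_yes - (nb_votes_total - nb_votes_yes)
```

A positive result renders the green `badge-success`, a negative one the red `badge-important`, and zero shows no badge at all.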
diff --git a/invenio/modules/comments/views.py b/invenio/modules/comments/views.py
index bd31cbfcd..f65d897ab 100644
--- a/invenio/modules/comments/views.py
+++ b/invenio/modules/comments/views.py
@@ -1,336 +1,336 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Comments Flask Blueprint"""
from datetime import datetime
import socket
from flask import g, render_template, request, flash, redirect, url_for, \
current_app, abort, Blueprint
from invenio.ext.sqlalchemy import db
from invenio.utils.mail import email_quote_txt
from .models import CmtRECORDCOMMENT, CmtSUBSCRIPTION, \
CmtACTIONHISTORY
from .forms import AddCmtRECORDCOMMENTForm, AddCmtRECORDCOMMENTFormReview
from invenio.base.i18n import _
from invenio.base.decorators import templated
from flask.ext.login import current_user, login_required
from invenio.ext.menu import register_menu
from invenio.ext.breadcrumb import register_breadcrumb
from invenio.ext.principal import permission_required
#from invenio.config import CFG_SITE_RECORD
CFG_SITE_RECORD = 'record'
-from .config import CFG_WEBCOMMENT_ACTION_CODE
+from invenio.base.globals import cfg
blueprint = Blueprint('comments', __name__, url_prefix="/" + CFG_SITE_RECORD,
template_folder='templates', static_folder='static')
from invenio.modules.records.views import request_record
def log_comment_action(action_code, id, recid, uid=None):
action = CmtACTIONHISTORY(
id_cmtRECORDCOMMENT=id,
id_bibrec=recid,
id_user=uid or current_user.get_id(),
client_host=socket.inet_aton(request.remote_addr),
action_time=datetime.now(),
action_code=action_code)
db.session.add(action)
db.session.commit()
class CommentRights(object):
def __init__(self, comment, uid=None):
self.id = comment
self.uid = uid or current_user.get_id()
self.id_collection = 0 # FIXME
def authorize_action(self, *args, **kwargs):
from invenio.modules.access.engine import acc_authorize_action
return acc_authorize_action(*args, **kwargs)
def can_perform_action(self, action=None):
cond = CmtACTIONHISTORY.id_user == self.uid \
if self.uid > 0 else \
CmtACTIONHISTORY.client_host == socket.inet_aton(request.remote_addr)
- if action in CFG_WEBCOMMENT_ACTION_CODE:
+ if action in cfg['CFG_WEBCOMMENT_ACTION_CODE']:
cond = db.and_(cond, CmtACTIONHISTORY.action_code ==
- CFG_WEBCOMMENT_ACTION_CODE[action])
+ cfg['CFG_WEBCOMMENT_ACTION_CODE'][action])
return CmtACTIONHISTORY.query.filter(
CmtACTIONHISTORY.id_cmtRECORDCOMMENT == self.id, cond).\
count() == 0
def can_view_restricted_comment(self, restriction):
#restriction = self.comment.restriction
if restriction == "":
return (0, '')
return self.authorize_action(
self.uid,
'viewrestrcomment',
status=restriction)
def can_send_comment(self):
return self.authorize_action(
self.uid,
'sendcomment',
authorized_if_no_roles=True,
collection=self.id_collection)
def can_attach_comment_file(self):
return self.authorize_action(
self.uid,
'attachcommentfile',
authorized_if_no_roles=False,
collection=self.id_collection)
@blueprint.route('/<int:recid>/comments/add', methods=['GET', 'POST'])
@request_record
@login_required
@permission_required('sendcomment', authorized_if_no_roles=True,
collection=lambda: g.collection.id)
def add_comment(recid):
uid = current_user.get_id()
in_reply = request.args.get('in_reply', type=int)
if in_reply is not None:
comment = CmtRECORDCOMMENT.query.get(in_reply)
if comment is None or comment.id_bibrec != recid or comment.is_deleted:
abort(401)
c = CmtRECORDCOMMENT()
c.title = _('Re: ') + comment.title
c.body = email_quote_txt(comment.body or '')
c.in_reply_to_id_cmtRECORDCOMMENT = in_reply
form = AddCmtRECORDCOMMENTForm(request.form, obj=c)
return render_template('comments/add.html', form=form)
form = AddCmtRECORDCOMMENTForm(request.values)
if form.validate_on_submit():
c = CmtRECORDCOMMENT()
form.populate_obj(c)
c.id_bibrec = recid
c.id_user = uid
c.date_creation = datetime.now()
c.star_score = 0
try:
db.session.add(c)
db.session.commit()
flash(_('Comment was sent'), "info")
return redirect(url_for('comments.comments', recid=recid))
except:
db.session.rollback()
return render_template('comments/add.html', form=form)
@blueprint.route('/<int:recid>/reviews/add', methods=['GET', 'POST'])
@request_record
@login_required
@permission_required('sendcomment', authorized_if_no_roles=True,
collection=lambda: g.collection.id)
def add_review(recid):
uid = current_user.get_id()
form = AddCmtRECORDCOMMENTFormReview(request.values)
if form.validate_on_submit():
c = CmtRECORDCOMMENT()
form.populate_obj(c)
c.id_bibrec = recid
c.id_user = uid
c.date_creation = datetime.now()
try:
db.session.add(c)
db.session.commit()
flash(_('Review was sent'), "info")
return redirect(url_for('comments.reviews', recid=recid))
except:
db.session.rollback()
return render_template('comments/add_review.html', form=form)
@blueprint.route('/<int:recid>/comments', methods=['GET', 'POST'])
@request_record
def comments(recid):
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import \
mail_cookie_create_authorize_action
from .api import check_user_can_view_comments
auth_code, auth_msg = check_user_can_view_comments(current_user, recid)
if auth_code and current_user.is_guest:
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {
'collection': g.collection})
url_args = {'action': cookie, 'ln': g.ln, 'referer': request.referrer}
flash(_("Authorization failure"), 'error')
return redirect(url_for('webaccount.login', **url_args))
elif auth_code:
flash(auth_msg, 'error')
abort(401)
# FIXME check restricted discussion
comments = CmtRECORDCOMMENT.query.filter(db.and_(
CmtRECORDCOMMENT.id_bibrec == recid,
CmtRECORDCOMMENT.in_reply_to_id_cmtRECORDCOMMENT == 0,
CmtRECORDCOMMENT.star_score == 0
)).order_by(CmtRECORDCOMMENT.date_creation).all()
return render_template('comments/comments.html', comments=comments)
@blueprint.route('/<int:recid>/reviews', methods=['GET', 'POST'])
@request_record
def reviews(recid):
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.modules.access.mailcookie import \
mail_cookie_create_authorize_action
from .api import check_user_can_view_comments
auth_code, auth_msg = check_user_can_view_comments(current_user, recid)
if auth_code and current_user.is_guest:
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {
'collection': g.collection})
url_args = {'action': cookie, 'ln': g.ln, 'referer': request.referrer}
flash(_("Authorization failure"), 'error')
return redirect(url_for('webaccount.login', **url_args))
elif auth_code:
flash(auth_msg, 'error')
abort(401)
comments = CmtRECORDCOMMENT.query.filter(db.and_(
CmtRECORDCOMMENT.id_bibrec == recid,
CmtRECORDCOMMENT.in_reply_to_id_cmtRECORDCOMMENT == 0,
CmtRECORDCOMMENT.star_score > 0
)).order_by(CmtRECORDCOMMENT.date_creation).all()
return render_template('comments/reviews.html', comments=comments)
@blueprint.route('/<int:recid>/report/<int:id>', methods=['GET', 'POST'])
@request_record
def report(recid, id):
if CommentRights(id).can_perform_action():
CmtRECORDCOMMENT.query.filter(CmtRECORDCOMMENT.id == id).update(dict(
nb_abuse_reports=CmtRECORDCOMMENT.nb_abuse_reports + 1),
synchronize_session='fetch')
- log_comment_action(CFG_WEBCOMMENT_ACTION_CODE['REPORT_ABUSE'], id, recid)
+ log_comment_action(cfg['CFG_WEBCOMMENT_ACTION_CODE']['REPORT_ABUSE'], id, recid)
flash(_('Comment has been reported.'), 'success')
else:
flash(_('Comment has been already reported.'), 'error')
return redirect(url_for('comments.comments', recid=recid))
@blueprint.route('/<int:recid>/vote/<int:id>/<value>',
methods=['GET', 'POST'])
@request_record
def vote(recid, id, value):
if CommentRights(id).can_perform_action():
value = 1 if int(value) > 0 else 0
CmtRECORDCOMMENT.query.filter(
CmtRECORDCOMMENT.id == id).update(dict(
nb_votes_total=CmtRECORDCOMMENT.nb_votes_total + 1,
nb_votes_yes=CmtRECORDCOMMENT.nb_votes_yes + value),
synchronize_session='fetch')
- log_comment_action(CFG_WEBCOMMENT_ACTION_CODE['VOTE'], id, recid)
+ log_comment_action(cfg['CFG_WEBCOMMENT_ACTION_CODE']['VOTE'], id, recid)
flash(_('Thank you for your vote.'), 'success')
else:
flash(_('You can not vote for this comment.'), 'error')
return redirect(url_for('comments.comments', recid=recid))
@blueprint.route('/<int:recid>/toggle/<int:id>', methods=['GET', 'POST'])
@login_required
@request_record
def toggle(recid, id, show=None):
uid = current_user.get_id()
comment = CmtRECORDCOMMENT.query.get_or_404(id)
assert(comment.id_bibrec == recid)
if show is None:
show = 1 if comment.is_collapsed(uid) else 0
if show:
comment.expand(uid)
else:
comment.collapse(uid)
if not request.is_xhr:
return redirect(url_for('comments.comments', recid=recid))
else:
return 'OK'
@blueprint.route('/<int:recid>/comments/subscribe', methods=['GET', 'POST'])
@login_required
@request_record
def subscribe(recid):
uid = current_user.get_id()
subscription = CmtSUBSCRIPTION(id_bibrec=recid, id_user=uid,
creation_time=datetime.now())
try:
db.session.add(subscription)
db.session.commit()
flash(_('You have been successfully subscribed'), 'success')
except:
flash(_('You are already subscribed'), 'error')
return redirect(url_for('.comments', recid=recid))
@blueprint.route('/<int:recid>/comments/unsubscribe', methods=['GET', 'POST'])
@blueprint.route('/comments/unsubscribe', methods=['GET', 'POST'])
@login_required
def unsubscribe(recid=None):
uid = current_user.get_id()
if recid is None:
recid = request.values.getlist('recid', type=int)
else:
recid = [recid]
current_app.logger.info(recid)
try:
db.session.query(CmtSUBSCRIPTION).filter(db.and_(
CmtSUBSCRIPTION.id_bibrec.in_(recid),
CmtSUBSCRIPTION.id_user == uid
)).delete(synchronize_session=False)
db.session.commit()
flash(_('You have been successfully unsubscribed'), 'success')
except:
flash(_('You are already unsubscribed'), 'error')
if len(recid) == 1:
return redirect(url_for('.comments', recid=recid[0]))
else:
return redirect(url_for('.subscriptions'))
@blueprint.route('/comments/subscriptions', methods=['GET', 'POST'])
@login_required
@templated('comments/subscriptions.html')
@register_menu(blueprint, 'personalize.comment_subscriptions',
_('Your comment subscriptions'), order=20)
@register_breadcrumb(blueprint, '.', _("Your comment subscriptions"))
def subscriptions():
uid = current_user.get_id()
subscriptions = CmtSUBSCRIPTION.query.filter(
CmtSUBSCRIPTION.id_user == uid).all()
return dict(subscriptions=subscriptions)
diff --git a/invenio/modules/deposit/storage.py b/invenio/modules/deposit/storage.py
index c4760c418..e21433d12 100644
--- a/invenio/modules/deposit/storage.py
+++ b/invenio/modules/deposit/storage.py
@@ -1,236 +1,236 @@
# -*- coding: utf-8 -*-
#
# This file is part of Invenio.
# Copyright (C) 2013 CERN.
#
# Invenio is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version.
#
# Invenio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Invenio; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Storage abstraction layer for WebDeposit.
"""
import uuid
import hashlib
from fs import opener
from fs import path
import urllib2
try:
from invenio.config import CFG_WEBDEPOSIT_MAX_UPLOAD_SIZE
except ImportError:
CFG_WEBDEPOSIT_MAX_UPLOAD_SIZE = 104857600 # 100MB
class UploadError(IOError):
pass
class ExternalFile(object):
"""
Wrapper around a URL to make it behave like a file which can be passed to
the storage layer
"""
def __init__(self, url, filename):
- from invenio.bibdocfile import open_url, \
+ from invenio.legacy.bibdocfile.api import open_url, \
InvenioBibdocfileUnauthorizedURL
try:
self._file = open_url(url, headers={})
self.filename = None
info = self._file.info()
content_disposition = info.getheader('Content-Disposition')
if content_disposition:
for item in content_disposition.split(';'):
item = item.strip()
if item.startswith('filename='):
self.filename = item[len('filename="'):-len('"')]
if not self.filename:
self.filename = filename
try:
size = int(info.getheader('Content-length'))
if size > CFG_WEBDEPOSIT_MAX_UPLOAD_SIZE:
raise UploadError("File too big")
except Exception:
pass
except InvenioBibdocfileUnauthorizedURL, e:
raise UploadError(str(e))
except urllib2.URLError, e:
raise UploadError('URL could not be opened: %s' % str(e))
def close(self):
self._file.close()
def read(self):
return self._file.read()
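The Content-Disposition handling in `ExternalFile.__init__` can be factored into a small testable helper; a minimal sketch of the same slicing approach (the helper name is illustrative, and it assumes the quoted `filename="..."` form the code above expects, not the full RFC 6266 grammar):

```python
def filename_from_content_disposition(content_disposition):
    """Extract the quoted filename from a Content-Disposition header value.

    Returns None when no filename="..." parameter is present.
    """
    for item in content_disposition.split(';'):
        item = item.strip()
        if item.startswith('filename='):
            # Strip the leading 'filename="' and the trailing '"'.
            return item[len('filename="'):-len('"')]
    return None
```

Unquoted or RFC 5987 (`filename*=`) variants would need extra handling; the slicing above silently mangles them.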
class Storage(object):
"""
Default storage backend
"""
_fsdir = None
def __init__(self, fs_path):
self.fs_path = fs_path
@property
def storage(self):
""" Get the pyFilesystem object for the backend path """
if self._fsdir is None:
# Opens a directory, creates it if needed, and ensures
# it is writeable.
self._fsdir = opener.fsopendir(
self.fs_path, writeable=True, create_dir=True
)
return self._fsdir
def unique_filename(self, filename):
""" Generate a unique secure filename """
return str(uuid.uuid4()) + "-" + filename
def save(self, incoming_file, filename, unique_name=True,
with_checksum=True):
""" Store the incoming file """
if unique_name:
filename = self.unique_filename(filename)
fs_file = self.storage.open(filename, 'wb')
checksum = None
f_bytes = incoming_file.read()
fs_file.write(f_bytes)
if with_checksum:
m = hashlib.md5()
m.update(f_bytes)
checksum = m.hexdigest()
fs_file.close()
# Create complete file path and return it
return (
path.join(self.fs_path, filename),
self.storage.getsize(filename),
checksum,
with_checksum,
)
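`Storage.save` optionally records an MD5 checksum computed over the full byte string; the core of that bookkeeping, as a standalone sketch (the function name is illustrative):

```python
import hashlib

def md5_checksum(f_bytes):
    # Same digest Storage.save stores alongside the written file.
    m = hashlib.md5()
    m.update(f_bytes)
    return m.hexdigest()
```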
@staticmethod
def delete(fs_path):
""" Delete the file on storage """
(dirurl, filename) = opener.pathsplit(fs_path)
fs = opener.fsopendir(dirurl)
fs.remove(filename)
@staticmethod
def is_local(fs_path):
""" Determine if file is a local file """
(dirurl, filename) = opener.pathsplit(fs_path)
fs = opener.fsopendir(dirurl)
return fs.hassyspath(filename)
@staticmethod
def get_url(fs_path):
""" Get a URL for the file """
(dirurl, filename) = opener.pathsplit(fs_path)
fs = opener.fsopendir(dirurl)
return fs.getpathurl(filename)
@staticmethod
def get_syspath(fs_path):
""" Get a local system path to the file """
(dirurl, filename) = opener.pathsplit(fs_path)
fs = opener.fsopendir(dirurl)
return fs.getsyspath(filename)
class DepositionStorage(Storage):
"""
Deposition storage backend that will save files to a
folder (<CFG_WEBDEPOSIT_UPLOAD_FOLDER>/<deposition_id>/).
"""
def __init__(self, deposition_id):
from invenio.config import CFG_WEBDEPOSIT_STORAGEDIR
self.fs_path = path.join(
CFG_WEBDEPOSIT_STORAGEDIR,
str(deposition_id)
)
class ChunkedDepositionStorage(DepositionStorage):
"""
Chunked storage backend, capable of handling storage of a file
in multiple chunks. Otherwise similar to DepositionStorage.
"""
def chunk_filename(self, filename, chunks, chunk):
return "%s_%s_%s" % (
filename,
chunks,
chunk,
)
def save(self, incoming_file, filename, chunk=None, chunks=None):
try:
# Generate chunked file name
chunk = int(chunk)
chunks = int(chunks)
except (ValueError, TypeError):
raise UploadError("Invalid chunk value: %s" % chunk)
# Store chunk
chunk_filename = self.chunk_filename(filename, chunks, chunk)
res = super(ChunkedDepositionStorage, self).save(
incoming_file, chunk_filename, unique_name=False,
with_checksum=False,
)
# Only merge the chunks once the last chunk arrives
if chunk != chunks-1:
return res
# Get the chunks
file_chunks = self.storage.listdir(
wildcard=self.chunk_filename(
filename, chunks, '*'
)
)
file_chunks.sort(key=lambda x: int(x.split("_")[-1]))
# Write the chunks into one file
filename = self.unique_filename(filename)
fs_file = self.storage.open(filename, 'wb')
m = hashlib.md5()
for c in file_chunks:
fs_c = self.storage.open(c, 'rb')
f_bytes = fs_c.read()
fs_file.write(f_bytes)
fs_c.close()
m.update(f_bytes)
# Remove each chunk right after appending to main file, to
# minimize storage usage.
self.storage.remove(c)
fs_file.close()
checksum = m.hexdigest()
return (
path.join(self.fs_path, filename),
self.storage.getsize(filename),
checksum,
True
)
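The chunk merge in `ChunkedDepositionStorage` relies on the `<name>_<chunks>_<chunk>` naming scheme and a numeric sort on the trailing index; sketched in isolation (helper names are illustrative):

```python
def chunk_filename(filename, chunks, chunk):
    # Mirrors ChunkedDepositionStorage.chunk_filename.
    return "%s_%s_%s" % (filename, chunks, chunk)

def order_chunks(names):
    # Sort chunk files by their trailing numeric index so that
    # lexicographic quirks (e.g. "10" before "2") cannot reorder data.
    return sorted(names, key=lambda x: int(x.split("_")[-1]))
```

The numeric key is what makes uploads of more than ten chunks safe; a plain string sort would interleave `_10` between `_1` and `_2`.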
diff --git a/invenio/modules/deposit/tasks.py b/invenio/modules/deposit/tasks.py
index 838599623..af71ff5cd 100644
--- a/invenio/modules/deposit/tasks.py
+++ b/invenio/modules/deposit/tasks.py
@@ -1,263 +1,263 @@
# -*- coding: utf-8 -*-
#
# This file is part of Invenio.
# Copyright (C) 2012, 2013 CERN.
#
# Invenio is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version.
#
# Invenio is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Invenio; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
"""
import os
from tempfile import mkstemp
from flask import current_app, abort
from flask.ext.login import current_user
-from invenio.bibtask import task_low_level_submission, \
+from invenio.legacy.bibsched.bibtask import task_low_level_submission, \
bibtask_allocate_sequenceid
from invenio.legacy.bibfield.bibfield_jsonreader import JsonReader
from invenio.config import CFG_TMPSHAREDDIR
from invenio.legacy.dbquery import run_sql
from invenio.modules.deposit.models import Deposition, Agent, \
DepositionDraftCacheManager
from invenio.ext.logging import register_exception
try:
from invenio.pidstore_model import PersistentIdentifier
HAS_PIDSUPPORT = True
except ImportError:
HAS_PIDSUPPORT = False
def authorize_user(action, **params):
"""
Check if current user is authorized to perform the action.
"""
def _authorize_user(obj, dummy_eng):
from invenio.modules.access.engine import acc_authorize_action
auth, message = acc_authorize_action(
current_user.get_id(),
action,
**dict((k, v() if callable(v) else v)
for (k, v) in params.items()))
if auth != 0:
current_app.logger.info(message)
abort(401)
return _authorize_user
def prefill_draft(form_class, draft_id='_default', clear=True):
"""
Fill draft values with values from pre-filled cache
"""
def _prefill_draft(obj, eng):
draft_cache = DepositionDraftCacheManager.get()
if draft_cache.has_data():
d = Deposition(obj)
draft_cache.fill_draft(
d, draft_id, form_class=form_class, clear=clear
)
d.update()
return _prefill_draft
def render_form(form_class, draft_id='_default'):
"""
Renders a form if the draft associated with it has not yet been completed.
:param form_class: The form class which should be rendered.
:param draft_id: The name of the draft to create. Must be specified if you
put more than two ``render_form``'s in your deposition workflow.
"""
def _render_form(obj, eng):
d = Deposition(obj)
draft = d.get_or_create_draft(draft_id, form_class=form_class)
if draft.is_completed():
eng.jumpCallForward(1)
else:
form = draft.get_form(validate_draft=draft.validate)
form.validate = True
d.set_render_context(dict(
template_name_or_list=form.get_template(),
deposition=d,
deposition_type=(
None if d.type.is_default() else d.type.get_identifier()
),
uuid=d.id,
draft=draft,
form=form,
my_depositions=Deposition.get_depositions(
current_user, type=d.type
),
))
d.update()
eng.halt('Wait for form submission.')
return _render_form
def create_recid():
"""
Create a new record id.
"""
def _create_recid(obj, dummy_eng):
d = Deposition(obj)
sip = d.get_latest_sip(include_sealed=False)
if sip is None:
raise Exception("No submission information package found.")
if 'recid' not in sip.metadata:
sip.metadata['recid'] = run_sql(
"INSERT INTO bibrec (creation_date, modification_date) "
"VALUES (NOW(), NOW())"
)
d.update()
return _create_recid
def mint_pid(pid_field='doi', pid_creator=None, pid_store_type='doi',
existing_pid_checker=None):
"""
Register a persistent identifier internally.
:param pid_field: The recjson key for where to look for a pre-reserved pid.
Defaults to 'doi'.
:param pid_creator: Callable taking one argument (the recjson) that when
called will generate and return a pid string.
:param pid_store_type: The PID store type. Defaults to 'doi'.
:param existing_pid_checker: A callable taking two arguments
(pid_str, recjson) that will check if a pid found using ``pid_field``
should be registered or not.
"""
if not HAS_PIDSUPPORT:
def _mint_pid_dummy(dummy_obj, dummy_eng):
pass
return _mint_pid_dummy
def _mint_pid(obj, dummy_eng):
d = Deposition(obj)
recjson = d.get_latest_sip(include_sealed=False).metadata
if 'recid' not in recjson:
raise Exception("'recid' not found in sip metadata.")
pid_text = None
pid = recjson.get(pid_field, None)
if not pid:
# No pid found in recjson, so create new pid with user supplied
# function.
pid_text = recjson[pid_field] = pid_creator(recjson)
else:
# Pid found - check if it should be minted
if existing_pid_checker and existing_pid_checker(pid, recjson):
pid_text = pid
# Create and assign the pid internally - actual registration will happen
# asynchronously later.
if pid_text:
current_app.logger.info("Registering pid %s" % pid_text)
pid_obj = PersistentIdentifier.create(pid_store_type, pid_text)
if pid_obj is None:
pid_obj = PersistentIdentifier.get(pid_store_type, pid_text)
try:
pid_obj.assign("rec", recjson['recid'])
except Exception:
register_exception(alert_admin=True)
d.update()
return _mint_pid
def prepare_sip():
"""
Prepare a submission information package
"""
def _prepare_sip(obj, dummy_eng):
d = Deposition(obj)
sip = d.get_latest_sip(include_sealed=False)
if sip is None:
sip = d.create_sip()
sip.metadata['fft'] = sip.metadata['files']
del sip.metadata['files']
sip.agents = [Agent(role='creator', from_request_context=True)]
d.update()
return _prepare_sip
def finalize_record_sip():
"""
Finalizes the SIP by generating the MARC and storing it in the SIP.
"""
def _finalize_sip(obj, dummy_eng):
d = Deposition(obj)
sip = d.get_latest_sip(include_sealed=False)
jsonreader = JsonReader()
for k, v in sip.metadata.items():
jsonreader[k] = v
sip.package = jsonreader.legacy_export_as_marc()
current_app.logger.info(jsonreader['__error_messages'])
current_app.logger.info(sip.package)
d.update()
return _finalize_sip
def upload_record_sip():
"""
Generates the record from the stored MARC.
The function requires the MARC to have been generated first, i.e.
finalize_record_sip() must have been called successfully before.
"""
def create(obj, dummy_eng):
current_app.logger.info("Upload sip")
d = Deposition(obj)
sip = d.get_latest_sip(include_sealed=False)
sip.seal()
tmp_file_fd, tmp_file_path = mkstemp(
prefix="webdeposit-%s-%s" % (d.id, sip.uuid),
suffix='.xml',
dir=CFG_TMPSHAREDDIR,
)
os.write(tmp_file_fd, sip.package)
os.close(tmp_file_fd)
# Trick to have access to task_sequence_id in subsequent tasks.
d.workflow_object.task_sequence_id = bibtask_allocate_sequenceid()
task_low_level_submission(
'bibupload', 'webdeposit',
'-r' if 'recid' in sip.metadata else '-i', tmp_file_path, '-P5',
'-I', str(d.workflow_object.task_sequence_id)
)
d.update()
return create
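A minimal sketch of how the task factories above fit together (not the real Invenio workflow engine): each factory (``authorize_user``, ``prefill_draft``, ``render_form``, ...) returns a callable taking ``(obj, eng)``, and a deposition workflow is just an ordered list of such callables. The names and the toy runner below are illustrative stand-ins.

```python
def make_task(name, log):
    """Build an (obj, eng) task that records its execution, mirroring the
    factory pattern used by the workflow tasks above."""
    def _task(obj, eng):
        log.append(name)
    return _task

def run_workflow(tasks, obj, eng):
    # A real engine also supports eng.halt() and eng.jumpCallForward();
    # this toy runner just calls each task in order.
    for task in tasks:
        task(obj, eng)

log = []
workflow = [make_task(name, log) for name in ("authorize", "prefill", "render")]
run_workflow(workflow, obj=object(), eng=None)
```

The factory pattern lets workflow definitions bind parameters (form class, draft id, PID settings) at declaration time while deferring all database access to execution time.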
diff --git a/invenio/modules/encoder/batch_engine.py b/invenio/modules/encoder/batch_engine.py
index 7fcbfea17..198545666 100644
--- a/invenio/modules/encoder/batch_engine.py
+++ b/invenio/modules/encoder/batch_engine.py
@@ -1,760 +1,760 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Bibencode batch processing submodule"""
from string import Template
from pprint import pprint
import os
import shutil
import uuid
from pprint import pformat
-from invenio.bibtask import (
+from invenio.legacy.bibsched.bibtask import (
task_update_progress,
write_message,
task_low_level_submission
)
-from invenio.bibdocfile import BibRecDocs, compose_file, compose_format, decompose_file
+from invenio.legacy.bibdocfile.api import BibRecDocs, compose_file, compose_format, decompose_file
from invenio.legacy.search_engine import (
record_exists,
get_collection_reclist,
search_pattern,
get_fieldvalues
)
from invenio.modules.encoder.encode import encode_video, assure_quality
from invenio.modules.encoder.extract import extract_frames
from invenio.modules.encoder.profiles import (
get_encoding_profile,
get_extract_profile
)
-from invenio.bibdocfilecli import cli_fix_marc
+from invenio.legacy.bibdocfile.cli import cli_fix_marc
from invenio.modules.encoder.metadata import (
pbcore_metadata
)
from invenio.modules.encoder.utils import getval, chose2, generate_timestamp
from invenio.modules.encoder.config import (
CFG_BIBENCODE_DAEMON_DIR_NEWJOBS,
CFG_BIBENCODE_PBCORE_MARC_XSLT,
CFG_BIBENCODE_ASPECT_RATIO_MARC_FIELD
)
from invenio.ext.email import send_email
from invenio.base.i18n import gettext_set_language
from invenio.legacy.webuser import emailUnique, get_user_preferences
from invenio.modules.formatter.engines.xslt import format
from invenio.utils.json import json, json_decode_file
import invenio.config
## Stored messages for email notifications
global _BATCH_STEP, _BATCH_STEPS
_BATCH_STEP = 1
_BATCH_STEPS = 1
global _MSG_HISTORY, _UPD_HISTORY
_MSG_HISTORY = []
_UPD_HISTORY = []
def _notify_error_admin(batch_job,
email_admin=invenio.config.CFG_SITE_ADMIN_EMAIL):
"""Sends a notification email to the specified address, containing
admin-only information. Is called by process_batch_job() if an error
occurred during the processing.
@param email_admin: email address of the admin
@type email_admin: string
"""
if not email_admin:
return
template = ("BibEncode batch processing has reported an error during the "
"execution of a job within the batch description <br/><br/>"
"This is the batch description: <br/><br/>"
"%(batch_description)s <br/><br/>"
"This is the message log: <br/><br/>"
"%(message_log)s")
html_text = template % {"batch_description": pformat(batch_job).replace("\n", "<br/>"),
"message_log": "\n".join(_MSG_HISTORY)}
text = html_text.replace("<br/>", "\n")
send_email(fromaddr=invenio.config.CFG_SITE_ADMIN_EMAIL,
toaddr=email_admin,
subject="Error during BibEncode batch processing",
content=text,
html_content=html_text)
def _notify_error_user(email_user, original_filename, recid, submission_title, ln=invenio.config.CFG_SITE_LANG):
"""Sends an error notification to the specified address of the user.
Is called by process_batch_job() if an error occurred during the processing.
@param email_user: email address of the user
@type email_user: string
"""
if not email_user:
return
uid = emailUnique(email_user)
if uid != -1 and uid != 0:
language = getval(get_user_preferences(uid), "language")
if language:
ln = language
_ = gettext_set_language(ln)
rec_url = invenio.config.CFG_SITE_URL + "/record/" + str(recid)
template = ("<br/>" +
_("We are sorry, a problem has occurred during the processing of"
" your video upload%(submission_title)s.") +
"<br/><br/>" +
_("The file you uploaded was %(input_filename)s.") +
"<br/><br/>" +
_("Your video might not be fully available until intervention.") +
"<br/>" +
_("You can check the status of your video here: %(record_url)s.") +
"<br/>" +
_("You might want to take a look at "
" %(guidelines_url)s"
" and modify or redo your submission."))
text = template % {"input_filename": "%s" % original_filename,
"submission_title": " %s" % submission_title,
"record_url": "%s" % rec_url,
"guidelines_url": "localhost"}
text = text.replace("<br/>", "\n")
html_text = template % {"input_filename": "<strong>%s</strong>" % original_filename,
"submission_title": " <strong>%s</strong>" % submission_title,
"record_url": "<a href=\"%s\">%s</a>" % (rec_url, rec_url),
"guidelines_url": "<a href=\"localhost\">%s</a>" % _("the video guidelines")}
send_email(fromaddr=invenio.config.CFG_SITE_ADMIN_EMAIL,
toaddr=email_user,
subject="Problem during the processing of your video",
content=text,
html_content=html_text
)
def _notify_success_user(email_user, original_filename, recid, submission_title, ln=invenio.config.CFG_SITE_LANG):
"""Sends a success notification to the specified address of the user.
Is called by process_batch_job() if the processing was successful.
@param email_user: email address of the user
@type email_user: string
"""
uid = emailUnique(email_user)
if uid != -1 and uid != 0:
language = getval(get_user_preferences(uid), "language")
if language:
ln = language
_ = gettext_set_language(ln)
rec_url = invenio.config.CFG_SITE_URL + "/record/" + str(recid)
template = ("<br/>" +
_("Your video submission%(submission_title)s was successfully processed.") +
"<br/><br/>" +
_("The file you uploaded was %(input_filename)s.") +
"<br/><br/>" +
_("Your video is now available here: %(record_url)s.") +
"<br/>" +
_("If the video's quality is not as expected, you might want to take "
"a look at %(guidelines_url)s"
" and modify or redo your submission."))
text = template % {"input_filename": "%s" % original_filename,
"submission_title": " %s" % submission_title,
"record_url": "%s" % rec_url,
"guidelines_url": "localhost"}
text = text.replace("<br/>", "\n")
html_text = template % {"input_filename": "<strong>%s</strong>" % original_filename,
"submission_title": " <strong>%s</strong>" % submission_title,
"record_url": "<a href=\"%s\">%s</a>" % (rec_url, rec_url),
"guidelines_url": "<a href=\"localhost\">%s</a>" % _("the video guidelines")}
send_email(fromaddr=invenio.config.CFG_SITE_ADMIN_EMAIL,
toaddr=email_user,
subject="Your video submission is now complete",
content=text,
html_content=html_text
)
def _task_update_overall_status(message):
""" Generates an overall update message for the BibEncode task.
Stores the messages in a global list for notifications
@param message: the message that should be printed as task status
@type message: string
"""
message = "[%d/%d]%s" % (_BATCH_STEP, _BATCH_STEPS, message)
task_update_progress(message)
global _UPD_HISTORY
_UPD_HISTORY.append(message)
def _task_write_message(message):
""" Stores the messages in a global list for notifications
@param message: the message that should be printed as task status
@type message: string
"""
write_message(message)
global _MSG_HISTORY
_MSG_HISTORY.append(message)
def clean_job_for_quality(batch_job_dict, fallback=True):
"""
Removes jobs from the batch description that are not suitable for the master
video's quality. It applies only to encoding jobs!
@param batch_job_dict: the dict containing the batch description
@type batch_job_dict: dict
@param fallback: whether jobs marked as fallback may be used
@type fallback: bool
@return: the cleaned dict
@rtype: dict
"""
survived_jobs = []
fallback_jobs = []
other_jobs = []
for job in batch_job_dict['jobs']:
if job['mode'] == 'encode':
if getval(job, 'fallback') and fallback:
fallback_jobs.append(job)
if getval(job, 'enforce'):
survived_jobs.append(job)
else:
profile = None
if getval(job, 'profile'):
profile = get_encoding_profile(job['profile'])
if assure_quality(input_file=batch_job_dict['input'],
aspect=chose2('aspect', job, profile),
target_width=chose2('width', job, profile),
target_height=chose2('height', job, profile),
target_bitrate=chose2('videobitrate', job, profile)):
survived_jobs.append(job)
else:
other_jobs.append(job)
if survived_jobs:
survived_jobs.extend(other_jobs)
new_jobs = survived_jobs
else:
fallback_jobs.extend(other_jobs)
new_jobs = fallback_jobs
batch_job_dict['jobs'] = new_jobs
return batch_job_dict
def create_update_jobs_by_collection(
batch_template_file,
collection,
job_directory=CFG_BIBENCODE_DAEMON_DIR_NEWJOBS):
""" Creates the job description files to update a whole collection
@param batch_template_file: fullpath to the template for the update
@type batch_template_file: string
@param collection: name of the collection that should be updated
@type collection: string
@param job_directory: fullpath to the directory storing the job files
@type job_directory: string
"""
recids = get_collection_reclist(collection)
return create_update_jobs_by_recids(recids, batch_template_file,
job_directory)
def create_update_jobs_by_search(pattern,
batch_template_file,
job_directory=CFG_BIBENCODE_DAEMON_DIR_NEWJOBS
):
""" Creates the job description files to update all records that fit a
search pattern. Be aware of the search limitations!
@param pattern: The pattern to search for
@type pattern: string
@param batch_template_file: fullpath to the template for the update
@type batch_template_file: string
@param job_directory: fullpath to the directory storing the job files
@type job_directory: string
"""
recids = search_pattern(p=pattern)
return create_update_jobs_by_recids(recids, batch_template_file,
job_directory)
def create_update_jobs_by_recids(recids,
batch_template_file,
job_directory=CFG_BIBENCODE_DAEMON_DIR_NEWJOBS
):
""" Creates the job description files to update all given recids
@param recids: Iterable set of recids
@type recids: iterable
@param batch_template_file: fullpath to the template for the update
@type batch_template_file: string
@param job_directory: fullpath to the directory storing the job files
@type job_directory: string
"""
batch_template = json_decode_file(batch_template_file)
for recid in recids:
task_update_progress("Creating Update Job for %d" % recid)
write_message("Creating Update Job for %d" % recid)
job = dict(batch_template)
job['recid'] = recid
timestamp = generate_timestamp()
job_filename = "update_%d_%s.job" % (recid, timestamp)
create_job_from_dictionary(job, job_filename, job_directory)
return 1
def create_job_from_dictionary(
job_dict,
job_filename=None,
job_directory=CFG_BIBENCODE_DAEMON_DIR_NEWJOBS
):
""" Creates a job from a given dictionary
@param job_dict: Dictionary that contains the job description
@type job_dict: dictionary
@param job_filename: Filename for the job
@type job_filename: string
@param job_directory: fullpath to the directory storing the job files
@type job_directory: string
"""
if not job_filename:
job_filename = str(uuid.uuid4())
if not job_filename.endswith(".job"):
job_filename += ".job"
job_fullpath = os.path.join(job_directory, job_filename)
job_string = json.dumps(job_dict, sort_keys=False, indent=4)
with open(job_fullpath, "w") as job_file:
job_file.write(job_string)
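For illustration, a standalone sketch of the ``.job`` file creation above (a hypothetical local mirror of ``create_job_from_dictionary()``; the temp directory stands in for ``CFG_BIBENCODE_DAEMON_DIR_NEWJOBS``):

```python
import json
import os
import tempfile
import uuid

def write_job_file(job_dict, job_directory, job_filename=None):
    """Serialise a job dict to a pretty-printed .job file and return its
    full path, defaulting the filename to a fresh UUID."""
    if not job_filename:
        job_filename = str(uuid.uuid4())
    if not job_filename.endswith(".job"):
        job_filename += ".job"
    job_fullpath = os.path.join(job_directory, job_filename)
    with open(job_fullpath, "w") as job_file:
        job_file.write(json.dumps(job_dict, sort_keys=False, indent=4))
    return job_fullpath

tmpdir = tempfile.mkdtemp()
path = write_job_file({"mode": "encode", "recid": 1}, tmpdir, "demo")
```

The daemon later picks such files up by their ``.job`` suffix, so the suffix normalisation matters more than the filename itself.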
def sanitise_batch_job(batch_job):
""" Checks the correctness of the batch job dictionary and additionally
sanitises some values.
@param batch_job: The batch description dictionary
@type batch_job: dictionary
"""
def san_bitrate(bitrate):
""" Sanitizes bitrates
"""
if isinstance(bitrate, str):
if bitrate.endswith('k'):
try:
bitrate = int(bitrate[:-1])
bitrate *= 1000
return bitrate
except ValueError:
raise Exception("Could not parse bitrate")
raise Exception("Could not parse bitrate")
elif isinstance(bitrate, int):
return bitrate
else:
raise Exception("Could not parse bitrate")
if not getval(batch_job, 'update_from_master'):
if not getval(batch_job, 'input'):
raise Exception("No input file in batch description")
if not getval(batch_job, 'recid'):
raise Exception("No recid in batch description")
if not getval(batch_job, 'jobs'):
raise Exception("No job list in batch description")
if getval(batch_job, 'update_from_master'):
if (not getval(batch_job, 'bibdoc_master_comment') and
not getval(batch_job, 'bibdoc_master_description') and
not getval(batch_job, 'bibdoc_master_subformat')):
raise Exception("If update_from_master is set, a comment or"
" description or subformat for matching must be given")
if getval(batch_job, 'marc_snippet'):
if not os.path.exists(getval(batch_job, 'marc_snippet')):
raise Exception("The MARC snippet file %s was not found" %
getval(batch_job, 'marc_snippet'))
for job in batch_job['jobs']:
if job['mode'] == 'encode':
if getval(job, 'videobitrate'):
job['videobitrate'] = san_bitrate(getval(job, 'videobitrate'))
if getval(job, 'audiobitrate'):
job['audiobitrate'] = san_bitrate(getval(job, 'audiobitrate'))
return batch_job
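The bitrate sanitiser can be exercised in isolation; this is a hedged, standalone re-implementation (using ``isinstance`` rather than the original ``type()`` comparisons) showing the two accepted input shapes:

```python
def san_bitrate(bitrate):
    """Standalone sketch of the sanitiser above: ints pass through,
    strings like '600k' (kilobits) expand to plain integers."""
    if isinstance(bitrate, str):
        if bitrate.endswith("k"):
            try:
                return int(bitrate[:-1]) * 1000
            except ValueError:
                raise Exception("Could not parse bitrate")
        raise Exception("Could not parse bitrate")
    if isinstance(bitrate, int):
        return bitrate
    raise Exception("Could not parse bitrate")

expanded = san_bitrate("600k")    # '600k' expands to 600000
passed_through = san_bitrate(800000)
```

Normalising to integers here means the encoding jobs downstream never have to parse the 'k' suffix themselves.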
def process_batch_job(batch_job_file):
""" Processes a batch job description dictionary
@param batch_job_file: a fullpath to a batch job file
@type batch_job_file: string
@return: 1 if the process was successful, 0 if not
@rtype: int
"""
def upload_marcxml_file(marcxml):
""" Creates a temporary marcxml file and sends it to bibupload
"""
xml_filename = 'bibencode_'+ str(batch_job['recid']) + '_' + str(uuid.uuid4()) + '.xml'
xml_filename = os.path.join(invenio.config.CFG_TMPSHAREDDIR, xml_filename)
with open(xml_filename, 'w') as xml_file:
xml_file.write(marcxml)
targs = ['-c', xml_filename]
task_low_level_submission('bibupload', 'bibencode', *targs)
#---------#
# GENERAL #
#---------#
_task_write_message("----------- Handling Master -----------")
## Check the validity of the batch file here
batch_job = json_decode_file(batch_job_file)
## Sanitise batch description and raise errors
batch_job = sanitise_batch_job(batch_job)
## Check if the record exists
if record_exists(batch_job['recid']) < 1:
raise Exception("Record not found")
recdoc = BibRecDocs(batch_job['recid'])
#--------------------#
# UPDATE FROM MASTER #
#--------------------#
## We want to add new stuff to the video's record, using the master as input
if getval(batch_job, 'update_from_master'):
found_master = False
bibdocs = recdoc.list_bibdocs()
for bibdoc in bibdocs:
bibdocfiles = bibdoc.list_all_files()
for bibdocfile in bibdocfiles:
comment = bibdocfile.get_comment()
description = bibdocfile.get_description()
subformat = bibdocfile.get_subformat()
m_comment = getval(batch_job, 'bibdoc_master_comment', comment)
m_description = getval(batch_job, 'bibdoc_master_description', description)
m_subformat = getval(batch_job, 'bibdoc_master_subformat', subformat)
if (comment == m_comment and
description == m_description and
subformat == m_subformat):
found_master = True
batch_job['input'] = bibdocfile.get_full_path()
## Get the aspect ratio from the record
try:
## Assumes pbcore metadata mapping
batch_job['aspect'] = get_fieldvalues(batch_job['recid'], CFG_BIBENCODE_ASPECT_RATIO_MARC_FIELD)[0]
except IndexError:
pass
break
if found_master:
break
if not found_master:
_task_write_message("Video master for record %d not found"
% batch_job['recid'])
task_update_progress("Video master for record %d not found"
% batch_job['recid'])
## Maybe send an email?
return 1
## Clean the job to do no upscaling etc
if getval(batch_job, 'assure_quality'):
batch_job = clean_job_for_quality(batch_job)
global _BATCH_STEPS
_BATCH_STEPS = len(batch_job['jobs'])
## Generate the docname from the input filename's name or given name
bibdoc_video_docname, bibdoc_video_extension = decompose_file(batch_job['input'])[1:]
if not bibdoc_video_extension or getval(batch_job, 'bibdoc_master_extension'):
bibdoc_video_extension = getval(batch_job, 'bibdoc_master_extension')
if getval(batch_job, 'bibdoc_master_docname'):
bibdoc_video_docname = getval(batch_job, 'bibdoc_master_docname')
write_message("Creating BibDoc for %s" % bibdoc_video_docname)
## If the bibdoc exists, receive it
if bibdoc_video_docname in recdoc.get_bibdoc_names():
bibdoc_video = recdoc.get_bibdoc(bibdoc_video_docname)
## Create a new bibdoc if it does not exist
else:
bibdoc_video = recdoc.add_bibdoc(docname=bibdoc_video_docname)
## Get the directory of the newly created bibdoc to copy stuff there
bibdoc_video_directory = bibdoc_video.get_base_dir()
#--------#
# MASTER #
#--------#
if not getval(batch_job, 'update_from_master'):
if getval(batch_job, 'add_master'):
## Generate the right name for the master
## The master should be hidden first and then renamed
## when it is really available
## !!! FIX !!!
_task_write_message("Adding %s master to the BibDoc"
% bibdoc_video_docname)
master_format = compose_format(
bibdoc_video_extension,
getval(batch_job, 'bibdoc_master_subformat', 'master')
)
## If a file of the same format is there, something is wrong, remove it!
## it might be caused by a previous corrupted submission etc.
if bibdoc_video.format_already_exists_p(master_format):
bibdoc_video.delete_file(master_format, 1)
bibdoc_video.add_file_new_format(
batch_job['input'],
version=1,
description=getval(batch_job, 'bibdoc_master_description'),
comment=getval(batch_job, 'bibdoc_master_comment'),
docformat=master_format
)
#-----------#
# JOBS LOOP #
#-----------#
return_code = 1
global _BATCH_STEP
for job in batch_job['jobs']:
_task_write_message("----------- Job %s of %s -----------"
% (_BATCH_STEP, _BATCH_STEPS))
## Try to substitute docname with master docname
if getval(job, 'bibdoc_docname'):
job['bibdoc_docname'] = Template(job['bibdoc_docname']).safe_substitute({'bibdoc_master_docname': bibdoc_video_docname})
#-------------#
# TRANSCODING #
#-------------#
if job['mode'] == 'encode':
## Skip the job if assure_quality is not set and marked as fallback
if not getval(batch_job, 'assure_quality') and getval(job, 'fallback'):
continue
if getval(job, 'profile'):
profile = get_encoding_profile(job['profile'])
else:
profile = None
## We need an extension defined for the video container
bibdoc_video_extension = getval(job, 'extension',
getval(profile, 'extension'))
if not bibdoc_video_extension:
raise Exception("No container/extension defined")
## Get the docname and subformat
bibdoc_video_subformat = getval(job, 'bibdoc_subformat')
bibdoc_slave_video_docname = getval(job, 'bibdoc_docname', bibdoc_video_docname)
## The subformat is incompatible with ffmpeg's naming convention
## We do the encoding without and rename it afterwards
bibdoc_video_fullpath = compose_file(
bibdoc_video_directory,
bibdoc_slave_video_docname,
bibdoc_video_extension
)
_task_write_message("Transcoding %s to %s;%s" % (bibdoc_slave_video_docname,
bibdoc_video_extension,
bibdoc_video_subformat))
## We encode now directly into the bibdocs directory
encoding_result = encode_video(
input_file=batch_job['input'],
output_file=bibdoc_video_fullpath,
acodec=getval(job, 'audiocodec'),
vcodec=getval(job, 'videocodec'),
abitrate=getval(job, 'audiobitrate'),
vbitrate=getval(job, 'videobitrate'),
resolution=getval(job, 'resolution'),
passes=getval(job, 'passes', 1),
special=getval(job, 'special'),
specialfirst=getval(job, 'specialfirst'),
specialsecond=getval(job, 'specialsecond'),
metadata=getval(job, 'metadata'),
width=getval(job, 'width'),
height=getval(job, 'height'),
aspect=getval(batch_job, 'aspect'), # Aspect for every job
profile=getval(job, 'profile'),
update_fnc=_task_update_overall_status,
message_fnc=_task_write_message
)
return_code &= encoding_result
## only on success
if encoding_result:
## Rename it, adding the subformat
os.rename(bibdoc_video_fullpath,
compose_file(bibdoc_video_directory,
bibdoc_video_extension,
bibdoc_video_subformat,
1,
bibdoc_slave_video_docname)
)
bibdoc_video._build_file_list()
bibdoc_video_format = compose_format(bibdoc_video_extension,
bibdoc_video_subformat)
if getval(job, 'bibdoc_comment'):
bibdoc_video.set_comment(getval(job, 'bibdoc_comment'),
bibdoc_video_format)
if getval(job, 'bibdoc_description'):
bibdoc_video.set_description(getval(job, 'bibdoc_description'),
bibdoc_video_format)
#------------#
# EXTRACTION #
#------------#
# if there are multiple extraction jobs, all the produced files
# with the same name will be in the same bibdoc! Make sure that
# you use different subformats or docname templates to avoid
# conflicts.
if job['mode'] == 'extract':
if getval(job, 'profile'):
profile = get_extract_profile(job['profile'])
else:
profile = {}
bibdoc_frame_subformat = getval(job, 'bibdoc_subformat')
_task_write_message("Extracting frames to temporary directory")
tmpdir = invenio.config.CFG_TMPDIR + "/" + str(uuid.uuid4())
os.mkdir(tmpdir)
#Move this to the batch description
bibdoc_frame_docname = getval(job, 'bibdoc_docname', bibdoc_video_docname)
tmpfname = (tmpdir + "/" + bibdoc_frame_docname + '.'
+ getval(profile, 'extension',
getval(job, 'extension', 'jpg')))
extraction_result = extract_frames(input_file=batch_job['input'],
output_file=tmpfname,
size=getval(job, 'size'),
positions=getval(job, 'positions'),
numberof=getval(job, 'numberof'),
width=getval(job, 'width'),
height=getval(job, 'height'),
aspect=getval(batch_job, 'aspect'),
profile=getval(job, 'profile'),
update_fnc=_task_update_overall_status,
)
return_code &= extraction_result
## only on success:
if extraction_result:
## For every filename in the temporary directory, create a bibdoc
## that contains all sizes of the extracted frame
files = os.listdir(tmpdir)
for filename in files:
## The docname was altered by BibEncode extract through substitution
## Retrieve it from the filename again
bibdoc_frame_docname, bibdoc_frame_extension = os.path.splitext(filename)
_task_write_message("Creating new bibdoc for %s" % bibdoc_frame_docname)
## If the bibdoc exists, receive it
if bibdoc_frame_docname in recdoc.get_bibdoc_names():
bibdoc_frame = recdoc.get_bibdoc(bibdoc_frame_docname)
## Create a new bibdoc if it does not exist
else:
bibdoc_frame = recdoc.add_bibdoc(docname=bibdoc_frame_docname)
## The filename including path from tmpdir
fname = os.path.join(tmpdir, filename)
bibdoc_frame_format = compose_format(bibdoc_frame_extension, bibdoc_frame_subformat)
## Same as with the master: if the format already exists,
## override it, because something went wrong before
if bibdoc_frame.format_already_exists_p(bibdoc_frame_format):
bibdoc_frame.delete_file(bibdoc_frame_format, 1)
_task_write_message("Adding %s jpg;%s to BibDoc"
% (bibdoc_frame_docname,
getval(job, 'bibdoc_subformat')))
bibdoc_frame.add_file_new_format(
fname,
version=1,
description=getval(job, 'bibdoc_description'),
comment=getval(job, 'bibdoc_comment'),
docformat=bibdoc_frame_format)
## Remove the temporary folders
_task_write_message("Removing temporary directory")
shutil.rmtree(tmpdir)
_BATCH_STEP = _BATCH_STEP + 1
#-----------------#
# FIX BIBDOC/MARC #
#-----------------#
_task_write_message("----------- Handling MARCXML -----------")
## Fix the BibDoc for all the videos previously created
_task_write_message("Updating BibDoc of %s" % bibdoc_video_docname)
bibdoc_video._build_file_list()
## Fix the MARC
_task_write_message("Fixing MARC")
cli_fix_marc({}, [batch_job['recid']], False)
if getval(batch_job, 'collection'):
## Make the record visible by moving in from the collection
marcxml = ("<record><controlfield tag=\"001\">%d</controlfield>"
"<datafield tag=\"980\" ind1=\" \" ind2=\" \">"
"<subfield code=\"a\">%s</subfield></datafield></record>"
) % (batch_job['recid'], batch_job['collection'])
upload_marcxml_file(marcxml)
#---------------------#
# ADD MASTER METADATA #
#---------------------#
if getval(batch_job, 'add_master_metadata'):
_task_write_message("Adding master metadata")
pbcore = pbcore_metadata(input_file = getval(batch_job, 'input'),
pbcoreIdentifier = batch_job['recid'],
aspect_override = getval(batch_job, 'aspect'))
marcxml = format(pbcore, CFG_BIBENCODE_PBCORE_MARC_XSLT)
upload_marcxml_file(marcxml)
#------------------#
# ADD MARC SNIPPET #
#------------------#
if getval(batch_job, 'marc_snippet'):
marc_snippet = open(getval(batch_job, 'marc_snippet'))
marcxml = marc_snippet.read()
marc_snippet.close()
upload_marcxml_file(marcxml)
#--------------#
# DELETE INPUT #
#--------------#
if getval(batch_job, 'delete_input'):
_task_write_message("Deleting input file")
# only if successful
if return_code:
# only if input matches pattern
if getval(batch_job, 'delete_input_pattern', '') in getval(batch_job, 'input'):
try:
os.remove(getval(batch_job, 'input'))
except OSError:
pass
#--------------#
# NOTIFICATION #
#--------------#
## Send Notification emails on errors
if not return_code:
if getval(batch_job, 'notify_user'):
_notify_error_user(getval(batch_job, 'notify_user'),
getval(batch_job, 'submission_filename', batch_job['input']),
getval(batch_job, 'recid'),
getval(batch_job, 'submission_title', ""))
_task_write_message("Notify user because of an error")
if getval(batch_job, 'notify_admin'):
_task_write_message("Notify admin because of an error")
if isinstance(getval(batch_job, 'notify_admin'), str):
_notify_error_admin(batch_job,
getval(batch_job, 'notify_admin'))
else:
_notify_error_admin(batch_job)
else:
if getval(batch_job, 'notify_user'):
_task_write_message("Notify user because of success")
_notify_success_user(getval(batch_job, 'notify_user'),
getval(batch_job, 'submission_filename', batch_job['input']),
getval(batch_job, 'recid'),
getval(batch_job, 'submission_title', ""))
return 1
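To tie the batch-processing steps together, a minimal batch description that would pass the required-key checks of ``sanitise_batch_job()`` looks like the dict below. The checker is a simplified local stand-in (it only covers the non-update, top-level-key part of the real sanitiser), and all values are purely illustrative:

```python
def check_batch(batch_job):
    """Simplified stand-in for sanitise_batch_job(): verify the required
    top-level keys of a non-update batch description."""
    if not batch_job.get("update_from_master"):
        for key in ("input", "recid", "jobs"):
            if not batch_job.get(key):
                raise Exception("No %s in batch description" % key)
    return batch_job

# Field names come from this module; the values are hypothetical.
batch = {
    "input": "/tmp/master.mp4",
    "recid": 123,
    "jobs": [
        {"mode": "encode", "profile": "mp4-sd", "videobitrate": "600k"},
        {"mode": "extract", "numberof": 5},
    ],
}
checked = check_batch(batch)
```

With ``update_from_master`` set instead, the real sanitiser relaxes these checks but then requires at least one of the master comment/description/subformat matchers.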
diff --git a/invenio/modules/encoder/daemon.py b/invenio/modules/encoder/daemon.py
index 865f6c85c..55b417e45 100644
--- a/invenio/modules/encoder/daemon.py
+++ b/invenio/modules/encoder/daemon.py
@@ -1,158 +1,158 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Bibencode daemon submodule"""
import os
import re
import shutil
from invenio.utils.json import json_decode_file
from invenio.modules.encoder.utils import generate_timestamp, getval
-from invenio.bibtask import (
+from invenio.legacy.bibsched.bibtask import (
task_low_level_submission,
task_get_task_param,
write_message,
task_update_progress
)
from invenio.modules.encoder.config import (
CFG_BIBENCODE_DAEMON_DIR_NEWJOBS,
CFG_BIBENCODE_DAEMON_DIR_OLDJOBS
)
## Globals used to generate a unique task name
_TASKID = None
_TIMESTAMP = generate_timestamp()
_NUMBER = 0
def has_signature(string_to_check):
""" Checks if the given string has the signature of a job file
"""
sig_re = re.compile("^.*\.job$")
if sig_re.match(string_to_check):
return True
else:
return False
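The signature regex above amounts to a suffix check on the `.job` extension; a minimal standalone sketch (the name `has_job_signature` and the module-level constant are illustrative, not from this module):

```python
import re

# Same pattern as the has_signature check above, hoisted to a constant.
JOB_SIG_RE = re.compile(r"^.*\.job$")

def has_job_signature(name):
    """Return True when the given file name ends with the .job extension."""
    return bool(JOB_SIG_RE.match(name))
```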
def job_to_args(job):
""" Maps the key-value pairs of the job file to CLI arguments for a
low-level task submission
@param job: job dictionary to process
@type job: dictionary
"""
argument_mapping = {
'profile': '-p',
'input': '--input',
'output': '--output',
'mode': '--mode',
'acodec': '--acodec',
'vcodec': '--vcodec',
'abitrate': '--abitrate',
'vbitrate': '--vbitrate',
'size': '--resolution',
'passes': '--passes',
'special': '--special',
'specialfirst': '--specialfirst',
'specialsecond': '--specialsecond',
'numberof': '--number',
'positions': '--positions',
'dump': '--dump',
'write': '--write',
'new_job_folder': '--newjobfolder',
'old_job_folder': '--oldjobfolder',
'recid': '--recid',
'collection': '--collection',
'search': '--search'
}
args = []
## Set a unique name for the task, this way there can be more than
## one bibencode task running at the same time
task_unique_name = '%(mode)s-%(tid)d-%(ts)s-%(num)d' % {
'mode': job['mode'],
'tid': _TASKID,
'ts': _TIMESTAMP,
'num': _NUMBER
}
args.append('-N')
args.append(task_unique_name)
## Transform the pairs of the job dictionary to CLI arguments
for key in job:
if key in argument_mapping:
args.append(argument_mapping[key]) # This is the new key
args.append(job[key]) # This is the value from the job file
return args
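The key-to-flag translation above can be sketched in isolation. The mapping below is a trimmed, hypothetical subset of the real `argument_mapping`, and `job_to_cli_args` is an illustrative name:

```python
# Hypothetical trimmed subset of the job-key to CLI-flag mapping above.
ARGUMENT_MAPPING = {
    'profile': '-p',
    'input': '--input',
    'output': '--output',
    'mode': '--mode',
}

def job_to_cli_args(job):
    """Flatten a job dictionary into a CLI argument list,
    silently skipping keys that have no mapped flag."""
    args = []
    for key, value in job.items():
        if key in ARGUMENT_MAPPING:
            args.append(ARGUMENT_MAPPING[key])
            args.append(value)
    return args
```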
def launch_task(args):
""" Launches the job as a new bibtask through the low-level submission
interface
"""
return task_low_level_submission('bibencode', 'bibencode:daemon', *args)
def process_batch(jobfile_path):
""" Processes the job if it is a batch job
@param jobfile_path: fullpath to the batchjob file
@type jobfile_path: string
@return: True if the task was successfully launched, False if not
@rtype: bool
"""
args = []
task_unique_name = '%(mode)s-%(tid)d-%(ts)s-%(num)d' % {
'mode': 'batch',
'tid': _TASKID,
'ts': _TIMESTAMP,
'num': _NUMBER
}
args.append('-N')
args.append(task_unique_name)
args.append('-m')
args.append('batch')
args.append('-i')
args.append(jobfile_path)
return launch_task(args)
def watch_directory(new_job_dir=CFG_BIBENCODE_DAEMON_DIR_NEWJOBS,
old_job_dir=CFG_BIBENCODE_DAEMON_DIR_OLDJOBS):
""" Checks a folder job files, parses and executes them
@param new_job_dir: path to the directory with new jobs
@type new_job_dir: string
@param old_job_dir: path to the directory where the old jobs are moved
@type old_job_dir: string
"""
global _NUMBER, _TASKID
write_message('Checking directory %s for new jobs' % new_job_dir)
task_update_progress('Checking for new jobs')
_TASKID = task_get_task_param('task_id')
files = os.listdir(new_job_dir)
for file in files:
file_fullpath = os.path.join(new_job_dir, file)
if has_signature(file_fullpath):
write_message('New Job found: %s' % file)
job = json_decode_file(file_fullpath)
if not getval(job, 'isbatch'):
args = job_to_args(job)
if not launch_task(args):
write_message('Error submitting task')
else:
## We need the job description for the batch engine
## So we need to use the new path inside the oldjobs dir
process_batch(os.path.join(old_job_dir, file))
## Move the file to the done dir
shutil.move(file_fullpath, os.path.join(old_job_dir, file))
## Update number for next job
_NUMBER += 1
return 1
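The scan-handle-archive loop in `watch_directory` can be reduced to a small sketch; `process_new_jobs` and its callback parameters are illustrative names, not part of the module:

```python
import os
import shutil

def process_new_jobs(new_dir, old_dir, is_job, handle):
    """Minimal sketch of the watch_directory loop above: scan a
    directory, hand every job file to `handle`, then archive it."""
    for name in os.listdir(new_dir):
        path = os.path.join(new_dir, name)
        if is_job(name):
            handle(path)
            # Move the processed file to the done directory
            shutil.move(path, os.path.join(old_dir, name))
```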
diff --git a/invenio/modules/encoder/encode.py b/invenio/modules/encoder/encode.py
index f1e0f930f..58cd33708 100644
--- a/invenio/modules/encoder/encode.py
+++ b/invenio/modules/encoder/encode.py
@@ -1,614 +1,614 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibEncode encoding submodule"""
-from invenio.bibtask import (
+from invenio.legacy.bibsched.bibtask import (
write_message,
task_update_progress,
)
from invenio.modules.encoder.config import (
CFG_BIBENCODE_FFMPEG_ENCODING_LOG,
CFG_BIBENCODE_FFMPEG_PASSLOGFILE_PREFIX,
CFG_BIBENCODE_FFMPEG_METADATA_ARGUMENT,
CFG_BIBENCODE_FFMPEG_ENCODE_TIME
)
from invenio.modules.encoder.utils import (
timecode_to_seconds,
generate_timestamp,
chose,
getval,
aspect_string_to_float
)
from invenio.modules.encoder.profiles import get_encoding_profile
from invenio.modules.encoder.metadata import (
ffprobe_metadata,
mediainfo_metadata
)
from invenio.config import CFG_PATH_FFMPEG
import time
import os
import subprocess
import uuid
def _filename_log(output_filename, nofpass=1):
""" Constructs the filename including path for the encoding err file
@param output_filename: name of the video file to be created
@type output_filename: string
@param nofpass: number of encoding passes
@type nofpass: int
@return: the constructed log filename
@rtype: string
"""
fname = os.path.split(output_filename)[1]
fname = os.path.splitext(fname)[0]
return CFG_BIBENCODE_FFMPEG_ENCODING_LOG % (generate_timestamp() +
"_" + fname + "_%d" % nofpass)
def determine_aspect(input_file):
""" Checks video metadata to find the display aspect ratio.
Returns None if the DAR is not stored in the video container.
@param input_file: full path of the video
@type input_file: string
"""
videoinfo = ffprobe_metadata(input_file)
if not videoinfo:
return None
for stream in videoinfo['streams']:
if stream['codec_type'] == 'video':
fwidth = int(stream['width'])
fheight = int(stream['height'])
if 'display_aspect_ratio' in stream:
return (stream['display_aspect_ratio'], fwidth, fheight)
return (None, fwidth, fheight)
def determine_resolution_preserving_aspect(input_file, width=None,
height=None, aspect=None):
""" Determines the right resolution for a given width or height while
preserving the aspect ratio.
@param input_file: full path of the video
@type input_file: string
@param width: The proposed width for the new size.
@type width: int
@param height: The proposed height for the new size
@type height: int
@param aspect: Override aspect ratio determined from the input file
@type aspect: float or "4:3" like string
@return: An FFMPEG compatible size string '640x480'
@rtype: string
"""
def _make_even(number):
""" Resolutions need to be even numbers for some video encoders.
We simply increase the resolution by one pixel if it is not even.
"""
if number % 2 != 0:
return number+1
else:
return number
if aspect:
if isinstance(aspect, str):
aspect_ratio = aspect_string_to_float(aspect)
elif isinstance(aspect, float):
aspect_ratio = aspect
else:
raise ValueError("aspect must be a float or a '4:3' style string")
else:
aspect_ratio_tuple = determine_aspect(input_file)
if aspect_ratio_tuple[0] is None:
aspect_ratio = float(aspect_ratio_tuple[1]) / float(aspect_ratio_tuple[2])
else:
aspect_ratio = aspect_string_to_float(aspect_ratio_tuple[0])
nresolution = None
if width and not height:
## The resolution has to fit exactly the width
nheight = int(width / aspect_ratio)
nheight = _make_even(nheight)
nresolution = "%dx%d" % (width, nheight)
elif height and not width:
## The resolution has to fit exactly the height
nwidth = int(height * aspect_ratio)
nwidth = _make_even(nwidth)
nresolution = "%dx%d" % (nwidth, height)
elif width and height:
## The resolution has to fit within both parameters, seen as a maximum
nwidth = width
nheight = height
new_aspect_ratio = float(width) / float(height)
if aspect_ratio > new_aspect_ratio:
nheight = int(width / aspect_ratio)
else:
nwidth = int(height * aspect_ratio)
nheight = _make_even(nheight)
nwidth = _make_even(nwidth)
nresolution = "%dx%d" % (nwidth, nheight)
else:
## Return the original size in square pixels
## original height * aspect_ratio
nwidth = aspect_ratio_tuple[2] * aspect_ratio
nwidth = _make_even(nwidth)
nresolution = "%dx%d" % (nwidth, aspect_ratio_tuple[2])
return nresolution
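The width/height arithmetic above reduces to "scale one side by the aspect ratio, then bump the result to an even number". A minimal sketch of the fixed-width case (helper names are illustrative, not from the module):

```python
def make_even(n):
    """Some encoders require even frame dimensions; round up odd values."""
    return n + 1 if n % 2 else n

def fit_width(width, aspect_ratio):
    """Derive an FFmpeg 'WxH' size string for a fixed width while
    preserving the given aspect ratio."""
    return "%dx%d" % (width, make_even(int(width / aspect_ratio)))
```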
def assure_quality(input_file, aspect=None, target_width=None,
target_height=None, target_bitrate=None,
accept_anamorphic=True, tolerance=0.95):
"""
Checks if the original video material would support the target resolution
and/or bitrate.
@param input_file: full path of the video to check
@type input_file: string
@param aspect: the aspect ratio as override
@type aspect: float
@param target_width: width of the new video
@type target_width: int
@param target_height: height of the new video
@type target_height: int
@param target_bitrate: bitrate of the new video in bit/s.
@type target_bitrate: int
@param accept_anamorphic: whether anamorphic storage should be taken into account
@type accept_anamorphic: bool
@return: True if the video supports the quality, False if not
@rtype: bool
"""
if target_bitrate:
target_bitrate = int(target_bitrate * tolerance)
if target_height:
target_height = int(target_height * tolerance)
if target_width:
target_width = int(target_width * tolerance)
## First get the size and aspect using ffprobe
## ffprobe is more reliable in this case than mediainfo
aspect_ratio_tuple = determine_aspect(input_file)
fwidth = aspect_ratio_tuple[1]
fheight = aspect_ratio_tuple[2]
if not aspect:
aspect = aspect_ratio_tuple[0]
## Get the bitrate with mediainfo now, because it is more reliable
## than ffprobe in this case
fbitrate = None
videoinfo = mediainfo_metadata(input_file)
for track in videoinfo:
if track['kind_of_stream'] == 'Video':
fbitrate = getval(track, 'bit_rate')
break
if fbitrate:
fbitrate = int(fbitrate)
# This adapts anamorphic videos.
# If it is stored anamorphic, calculate the real width from the height
# we can use our determine_resolution function for this
if accept_anamorphic and aspect:
fwidth = determine_resolution_preserving_aspect(
input_file=input_file,
width=None,
height=fheight,
aspect=aspect).split('x')[0]
fwidth = int(fwidth)
if target_height and target_width:
if target_width > fwidth or target_height > fheight:
return False
elif target_height:
if target_height > fheight:
return False
elif target_width:
if target_width > fwidth:
return False
if target_bitrate:
## If the video bitrate is unreadable, assume it is ok and our library
## has problems reading it out
if fbitrate and target_bitrate > fbitrate:
return False
return True
def encode_video(input_file, output_file,
acodec=None, vcodec=None,
abitrate=None, vbitrate=None,
resolution=None,
passes=1,
special=None, specialfirst=None, specialsecond=None,
metadata=None,
width=None, height=None, aspect=None,
profile=None,
update_fnc=task_update_progress,
message_fnc=write_message
):
""" Starts an ffmpeg encoding process based on the given parameters.
The encoding is run as a subprocess. The progress of the subprocess is
continuously written to the given messaging functions. In a normal case,
these should be the ones of BibTask.
@param input_file: Path to the input video.
@type input_file: string
@param output_file: Path to the output file. If no parameters other
than the input and output files are given, FFmpeg tries to auto-discover the right codecs
for the given file extension. In this case, every other aspect like
resolution and bitrates will be the same as in the input video.
@type output_file: string
@param acodec: The audio codec to use. This must be an available codec of
libavcodec within FFmpeg.
@type acodec: string
@param vcodec: The video codec to use. This must be an available codec of
libavcodec within FFmpeg.
@type vcodec: string
@param abitrate: Bitrate of the audio stream. In bit/s.
@type abitrate: int
@param vbitrate: Bitrate of the video stream. In bit/s.
@type vbitrate: int
@param resolution: Fixed size of the frames in the transcoded video.
FFmpeg notation: 'WxH' or preset like 'vga'. See also 'width'
@param passes: Number of encoding passes. Either 1 or 2.
@type passes: int
@param special: Additional FFmpeg parameters.
@type special: string
@param specialfirst: Additional FFmpeg parameters for the first pass.
The 'special' parameter is ignored if this is not 'None'
@type specialfirst: string
@param specialsecond: Additional FFmpeg parameters for the second pass.
The 'special' parameter is ignored if this is not 'None'
@type specialsecond: string
@param metadata: Metadata that should be added to the transcoded video.
This must be a dictionary. As with all metadata in FFmpeg, there is no
guarantee that the metadata specified in the dictionary will really be added
to the file, because it largely depends on the container format and its
supported fields.
@type metadata: dict
@param width: Instead of giving a fixed resolution, you can use width and
height as dimensional constrains. The algorithm will try to preserve the
original aspect and fit the new frame size into the given dimensions.
@type width: int
@param height: see 'width'
@type height: int
@param aspect: A float representing the aspect ratio of the video:
4:3 equals 1.33 and 16:9 equals 1.77.
This is a fallback in case the algorithm fails to determine the real aspect
ratio from the video. See also 'width'
@type aspect: float or "4:3" like string
@param profile: A profile to use. The priority is on the parameters
directly given to the function.
@type profile: string
@param update_fnc: A function called to display or log the encoding
status. This function must accept a string.
@type update_fnc: function
@param message_fnc: A function to log important messages or errors.
This function must accept a string.
@type message_fnc: function
@return: True if the encoding was successful, False if not
@rtype: boolean
"""
def encode(command):
""" Subfunction to run the acutal encoding
"""
## Start process
process = subprocess.Popen(command,
stderr=log_file_handle,
close_fds=True)
## While the process is running
time.sleep(1)
while process.poll() is None:
# Update the status in bibsched
update_status()
time.sleep(4)
## If the process was terminated
if process.poll() == -15:
# Encoding was terminated by system
message_fnc("FFMPEG was terminated")
update_fnc(" FFMPEG was terminated")
return 0
## If there was an error during encoding
if process.poll() == 1:
update_fnc(" An FFMPEG error has appeared, see log")
message_fnc("An FFMPEG error has appeared encoding %s" % output_file)
message_fnc("Command was: %s" % ' '.join(command))
message_fnc("Last lines of the FFmpeg log:")
## open the logfile again and retrieve its size
log_file_handle2 = open(log_file_name, 'rb')
size = os.fstat(log_file_handle2.fileno())[6]
## Read the last lines
log_file_handle2.seek(-min(size, 10000), 2)
lastlines = log_file_handle2.read().splitlines()[-5:]
for line in lastlines:
message_fnc(line)
return 0
## If everything went fine
if process.poll() == 0:
message_fnc("Encoding of %s done" % output_file)
update_fnc("Encoding of %s done" % output_file)
return 1
def build_command(nofpass=1):
""" Builds the ffmpeg command according to the function params
"""
def insert(key, value):
""" Shortcut for inserting parameters into the arg list
"""
base_args.insert(-1, key)
base_args.insert(-1, value)
## Determine base command arguments from the pass to run
base_args = None
if passes == 1:
base_args = [CFG_PATH_FFMPEG, '-y', '-i', input_file, output_file]
elif passes == 2:
if nofpass == 1:
base_args = [CFG_PATH_FFMPEG, '-y', '-i', input_file,
'-pass', '1', '-passlogfile', pass_log_file,
'-an', '-f', 'rawvideo', '/dev/null']
elif nofpass == 2:
base_args = [CFG_PATH_FFMPEG, '-y', '-i', input_file,
'-pass', '2', '-passlogfile',
pass_log_file, output_file]
## Insert additional arguments
if acodec is not None:
insert('-acodec', acodec)
if vcodec is not None:
insert('-vcodec', vcodec)
if abitrate is not None:
insert('-b:a', str(abitrate))
if vbitrate is not None:
insert('-b:v', str(vbitrate))
## If a resolution is given
if resolution:
insert('-s', resolution)
## If not, you can give width and height and generate the resolution
else:
## Use our new function to get the size of the input
nresolution = determine_resolution_preserving_aspect(input_file,
width,
height,
aspect)
insert('-s', nresolution)
## Metadata additions
if isinstance(metadata, dict):
## build metadata arguments for ffmpeg
for key, value in metadata.iteritems():
if value is not None:
meta_arg = (
CFG_BIBENCODE_FFMPEG_METADATA_ARGUMENT % (key, value)
)
insert("-metadata", meta_arg)
## Special argument additions
if passes == 1:
if special is not None:
for val in special.split():
base_args.insert(-1, val)
elif passes == 2:
if nofpass == 1:
if specialfirst is not None:
for val in specialfirst.split():
base_args.insert(-1, val)
if nofpass == 2:
if specialsecond is not None:
for val in specialsecond.split():
base_args.insert(-1, val)
return base_args
def update_status():
""" Parses the encoding status and updates the task in bibsched
"""
def graphical(value):
""" Converts a percentage value to a nice graphical representation
"""
## If the given value is a valid percentage
if value >= 0 and value <= 100:
## This is to get nice, aligned output in bibsched
oval = str(value).zfill(3)
return (
"[" + "#"*(value/10) + " "*(10-(value/10)) +
"][%d/%d] %s%%" % (nofpass, passes, oval)
)
else:
## Sometimes the parsed values from FFMPEG are totally off.
## Or maybe needed values are not available for the given video.
## In this case there is no estimate.
return "[ no est. ][%d/%d] " % (nofpass, passes)
## init variables
time_string = '0.0'
percentage_done = -1
## try to read the encoding log
try:
filehandle = open(log_file_name, 'rb')
except IOError:
message_fnc("Error opening %s" % log_file_name)
update_fnc("Could not open encoding log")
return
## Check the size of the file before reading from the end
size = os.path.getsize(log_file_name)
if not size:
return
## Go to the end of the log
filehandle.seek(-min(10000, size), 2)
chunk = filehandle.read()
lines = chunk.splitlines()
## try to parse the status
for line in reversed(lines):
if CFG_BIBENCODE_FFMPEG_ENCODE_TIME.match(line):
time_string = (
CFG_BIBENCODE_FFMPEG_ENCODE_TIME.match(line).groups()
)[0]
break
filehandle.close()
try:
percentage_done = int(timecode_to_seconds(time_string) / total_seconds * 100)
except Exception:
percentage_done = -1
## Now update the bibsched progress
opath, ofile = os.path.split(output_file)
if len(opath) > 8:
opath = "..." + opath[-8:]
ohint = opath + '/' + ofile
update_fnc(graphical(percentage_done) + " > " + ohint)
#------------------#
# PROFILE HANDLING #
#------------------#
if profile:
profile = get_encoding_profile(profile)
acodec = chose(acodec, 'audiocodec', profile)
vcodec = chose(vcodec, 'videocodec', profile)
abitrate = chose(abitrate, 'audiobitrate', profile)
vbitrate = chose(vbitrate, 'videobitrate', profile)
resolution = chose(resolution, 'resolution', profile)
passes = getval(profile, 'passes', 1)
special = chose(special, 'special', profile)
specialfirst = chose(specialfirst, 'special_firstpass', profile)
specialsecond = chose(specialsecond, 'special_secondpass', profile)
metadata = chose(metadata, 'metadata', profile)
width = chose(width, 'width', profile)
height = chose(height, 'height', profile)
aspect = chose(aspect, 'aspect', profile)
#----------#
# ENCODING #
#----------#
## Mark Task as stoppable
# task_sleep_now_if_required()
tech_metadata = ffprobe_metadata(input_file)
try:
total_seconds = float(tech_metadata['format']['duration'])
except Exception:
total_seconds = 0.0
## Run the encoding
pass_log_file = CFG_BIBENCODE_FFMPEG_PASSLOGFILE_PREFIX % (
os.path.splitext(os.path.split(input_file)[1])[0],
str(uuid.uuid4()))
no_error = True
## For every encoding pass to do
for apass in range(0, passes):
nofpass = apass + 1
if no_error:
## Create Logfiles
log_file_name = _filename_log(output_file, nofpass)
try:
log_file_handle = open(log_file_name, 'w')
except IOError:
message_fnc("Error creating %s" % log_file_name)
update_fnc("Error creating logfile")
return 0
## Build command for FFMPEG
command = build_command(nofpass)
## Start encoding, result will define to continue or not to
no_error = encode(command)
## !!! Status Update
return no_error
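The `graphical` helper inside `update_status` renders a ten-segment progress bar for bibsched. A standalone sketch with the Python 2 integer division made explicit (`progress_bar` is an illustrative name):

```python
def progress_bar(percent, nofpass=1, passes=1):
    """Render the ten-segment bibsched-style bar produced by the
    graphical helper above, or a fallback when no estimate exists."""
    if 0 <= percent <= 100:
        filled = percent // 10  # explicit integer division (Python 2 used /)
        return ("[" + "#" * filled + " " * (10 - filled) +
                "][%d/%d] %s%%" % (nofpass, passes, str(percent).zfill(3)))
    return "[ no est. ][%d/%d] " % (nofpass, passes)
```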
def propose_resolutions(video_file, display_aspect=None, res_16_9=['1920x1080', '1280x720', '854x480', '640x360'], res_4_3=['640x480'], lq_fallback=True):
""" Returns a list of possible resolutions that would work with the given
video, based on its own resolution and aspect ratio
@param display_aspect: Sets the display aspect ratio for videos where
this might not be detectable
@param res_16_9: Possible resolutions to select from for 16:9 videos
@param res_4_3: Possible resolutions to select from for 4:3 videos
@param lq_fallback: Return the video's own resolution if none of the given resolutions fit
"""
def eq(a,b):
if abs(a-b) < 0.01:
return 1
else:
return 0
def get_smaler_or_equal_res(height, avail_res):
smaller_res = []
for res in avail_res:
vres = int(res.split('x')[1])
if vres <= height:
smaller_res.append(res)
return smaller_res
def get_res_for_weird_aspect(width, aspect, avail_res):
smaller_res = []
for res in avail_res:
hres, vres = res.split('x')
hres = int(hres)
vres = int(vres)
if hres <= width:
height = hres * (1.0 / aspect)
if height % 2 != 0:
height = height - 1
smaller_res.append(str(hres) + 'x' + str(int(height)))
return smaller_res
meta_dict = ffprobe_metadata(video_file)
for stream in meta_dict['streams']:
if stream['codec_type'] == 'video':
width = int(stream['width'])
height = int(stream['height'])
# If the display aspect ratio is in the meta, we can even override
# the ratio that was given to the function as a fallback
# But the information in the file could be wrong ...
# Which information is trustful?
if 'display_aspect_ratio' in stream:
display_aspect = stream['display_aspect_ratio']
break
# Calculate the aspect factors
if display_aspect is None:
# Assume square pixels
display_aspect = float(width) / float(height)
else:
asp_w, asp_h = display_aspect.split(':')
display_aspect = float(asp_w) / float(asp_h)
# Check if 16:9
if eq(display_aspect, 1.77):
possible_res = get_smaler_or_equal_res(height, res_16_9)
# Check if 4:3
elif eq(display_aspect, 1.33):
possible_res = get_smaler_or_equal_res(height, res_4_3)
# Weird aspect
else:
possible_res = get_res_for_weird_aspect(width, display_aspect, res_16_9)
# If the video is crap
if not possible_res and lq_fallback:
return [str(width) + 'x' + str(height)]
else:
return possible_res
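For the common 16:9 and 4:3 cases, the selection above boils down to filtering a candidate list by the source height. A minimal sketch of that filter (`resolutions_up_to` is an illustrative name):

```python
def resolutions_up_to(height, candidates):
    """Keep candidate 'WxH' strings whose height does not exceed the
    source height, preserving candidate order."""
    return [res for res in candidates if int(res.split('x')[1]) <= height]
```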
diff --git a/invenio/modules/encoder/extract.py b/invenio/modules/encoder/extract.py
index cfe3d1b9a..9729fce1e 100644
--- a/invenio/modules/encoder/extract.py
+++ b/invenio/modules/encoder/extract.py
@@ -1,249 +1,249 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibEncode frame extraction module.
"""
__revision__ = "$Id$"
from invenio.modules.encoder.config import (
CFG_BIBENCODE_FFMPEG_EXTRACT_COMMAND,
)
-from invenio.bibtask import (
+from invenio.legacy.bibsched.bibtask import (
task_update_progress,
write_message
)
from invenio.modules.encoder.utils import (
timecode_to_seconds,
seconds_to_timecode,
is_timecode,
is_seconds,
normalize_string,
getval,
chose
)
from invenio.modules.encoder.metadata import (
ffprobe_metadata
)
import subprocess
import os
from invenio.modules.encoder.profiles import get_extract_profile
from invenio.modules.encoder.encode import determine_resolution_preserving_aspect
import re
## rename size to resolution
def extract_frames(input_file, output_file=None, size=None, positions=None,
numberof=None, extension='jpg',
width=None, height=None, aspect=None, profile=None,
update_fnc=task_update_progress,
message_fnc=write_message):
""" Extracts frames from a given video using ffmpeg based on the given
parameters. Starts a subprocess. The status of the process is continously
written to the given messaging functions.
@param input_file: Full path to the input video.
@type input_file: String
@param output_file: Full path to the output file. In case of multiple outputs,
sequential numbers will be appended to the file's name
automatically. If this parameter is not given, the
output filename will be generated from the input file
The output filename may contain substitution strings.
Valid substrings for substitution are:
%(input)s for the input filename
%(timecode)s for the timecode
%(size)s for the frame size
%(number)d for sequential numbers
Everything else that could be a python substitution substring
should be escaped accordingly.
!!! Warning !!! FFmpeg will also try to substitute if there
are any '%' left. This will likely screw up the extraction.
@type output_file: string
@param size: The size of the frames. Format is WxH
@type size: string
@param positions: A list of positions within the video where the frames
should be shot. Percentage values between 0 and 100 or
HH:MM:SS.ss are accepted.
@type positions: string
@param numberof: In case you don't want to give positions but just a fixed
number of frames to extract.
@type numberof: int
@param extension: If no output filename is given, construct the name with
this extension
@type extension: string
@param width: The width of the extracted frame.
@type width: int
@param height: The height of the extracted frame
@type height: int
@param aspect: A float representing the aspect ratio of the video.
4:3 equals 1.33 and 16:9 equals 1.77. See also 'width'
@type aspect: float or "4:3" like string
@param profile: A profile to use. The priority is on the parameters directly
given to the function.
@type profile: string
@param update_fnc: A function called to display or log the encoding
status. This function must accept a string.
@type update_fnc: function
@param message_fnc: A function to log important messages or errors.
This function must accept a string.
@type message_fnc: function
@return: 1 if the extraction was successful, 0 if not
@rtype: bool
"""
#---------#
# PROFILE #
#---------#
## Takes parameters from the profile if they are not directly given
if profile:
profile = get_extract_profile(profile)
size = chose(size, 'size', profile)
positions = chose(positions, 'positions', profile)
numberof = chose(numberof, 'numberof', profile)
extension = chose(extension, 'extension', profile)
width = chose(width, 'width', profile)
height = chose(height, 'height', profile)
#---------------#
# Check and fix #
#---------------#
## If neither positions nor a number of shots are given
if not positions and not numberof:
raise ValueError("Either argument \'positions\' xor argument \'numberof\' must be given")
## If both are given
if positions and numberof:
raise ValueError("Arguments \'positions\' and \'numberof\' exclude each other")
## If just a number of shots to take is given by 'numberof'
if numberof and not positions:
## Parse the duration from the input
info = ffprobe_metadata(input_file)
if info is None:
message_fnc("An error occured while receiving the video log")
return 0
duration = float(info['format']['duration'])
if duration is None:
message_fnc("Could not extract by \'numberof\' because video duration is unknown.")
return 0
positions = []
for pos in range(numberof):
## Calculate the position for every shot and append it to the list
position = pos * (duration / numberof)
positions.append(position)
## If specific positions are given
elif positions and not numberof:
## Check if correct timecodes or seconds are given
i = 0
for pos in positions:
if not (is_seconds(pos) or is_timecode(pos)):
raise ValueError("The given position \'%s\' is neither a value in seconds nor a timecode!" % str(pos))
## if a timecode is given, convert it to seconds
if is_timecode(pos):
positions[i] = timecode_to_seconds(pos)
i += 1
## If no output filename is given, use input filename and append jpg
if output_file is None:
ipath = os.path.splitext(input_file)[0]
if not extension.startswith("."):
extension = "." + extension
output_file = ipath + extension
## If no explicit size for the frames is given
if not size:
size = determine_resolution_preserving_aspect(input_file, width, height, aspect)
#------------#
# Extraction #
#------------#
counter = 1
for position in positions:
#---------------------------#
# Generate output file name #
#---------------------------#
number_substituted = False
if '%(number)' in output_file:
number_substituted = True
## If the output filename should be substituted
try:
output_filename = output_file % {
'input': os.path.splitext(os.path.split(input_file)[1])[0],
'timecode': seconds_to_timecode(position),
'size': size,
'number': counter
}
except KeyError:
raise
## In case that more than one shot is taken and you don't want to substitute
if not number_substituted:
if len(positions) > 1:
path, ext = os.path.splitext(output_file)
output_filename = path + str(counter).zfill(len(str(len(positions)))) + ext
## If you don't want to substitute and only one file is selected,
## it will just take the output or input name without altering it
else:
output_filename = output_file
#-------------#
# Run process #
#-------------#
## Build the command for ffmpeg
command = (CFG_BIBENCODE_FFMPEG_EXTRACT_COMMAND % (
position, input_file, size, output_filename
)).split()
## Start subprocess and poll the output until it finishes
process = subprocess.Popen(command, stderr=subprocess.PIPE)
stderr = []
while process.poll() is None:
## We want to keep the last lines of output in case of an error
stderr += process.communicate()[1].splitlines()
stderr = stderr[-5:]
## If something went wrong, print the last lines of the log
if process.poll() != 0:
msg = ("Error while extracting frame %d of %d" % (counter, len(positions)))
message_fnc(msg)
update_fnc(msg)
## Print the end of the log
message_fnc("Last lines of the FFmpeg log:")
for line in stderr:
message_fnc(line)
return 0
else:
update_fnc("Frame %d of %d extracted" % (counter, len(positions)))
counter += 1
## Everything should be fine if this position is reached
message_fnc("Extraction of frames was successful")
return 1
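When only `numberof` is given, `extract_frames` spaces the shots evenly over the clip duration; the position arithmetic is just (`evenly_spaced_positions` is an illustrative name):

```python
def evenly_spaced_positions(duration, numberof):
    """Spread `numberof` shot positions (in seconds) evenly over a clip,
    starting at 0, as extract_frames does when only numberof is given."""
    return [pos * (duration / numberof) for pos in range(numberof)]
```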
diff --git a/invenio/modules/encoder/metadata.py b/invenio/modules/encoder/metadata.py
index 065203e3a..49cf58d46 100644
--- a/invenio/modules/encoder/metadata.py
+++ b/invenio/modules/encoder/metadata.py
@@ -1,373 +1,373 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibEncode metadata submodule.
Metadata insertion, extraction and processing for video files.
"""
__revision__ = "$Id$"
import subprocess
import re
from xml.dom import minidom
from invenio.utils.json import json, json_decode_file
-from invenio.bibtask import write_message
+from invenio.legacy.bibsched.bibtask import write_message
from invenio.modules.encoder.config import (
CFG_BIBENCODE_FFMPEG_METADATA_ARGUMENT,
CFG_BIBENCODE_FFMPEG_METADATA_SET_COMMAND,
CFG_BIBENCODE_PBCORE_MAPPINGS
)
from invenio.modules.encoder.utils import probe, getval, mediainfo, seconds_to_timecode
## Stores metadata for the process. Many different functions in BibEncode
## need access to video metadata regularly. Because we don't pass objects around,
## we need to call the functions of this submodule again and again. To avoid
## calling ffprobe and mediainfo every time, the metadata is stored in this cache.
_FFPROBE_METADATA_CACHE = {}
_MEDIAINFO_METADATA_CACHE = {}
def write_metadata(input_file, output_file, metadata):
"""Writes metadata to a copy of the given input file.
The metadata must be a dictionary that contains valid key-value pairs.
Valid keys are defined in CFG_BIBENCODE_FFMPEG_METADATA_TEMPLATE.
@param input_file: The original video path
@param output_file: The path to the copy with the new metadata
@param metadata: The metadata dictionary to write to the video
"""
meta_args = []
if type(metadata) is dict:
## build metadata arguments for ffmpeg
for key, value in metadata.iteritems():
if value is not None:
meta_args.append(CFG_BIBENCODE_FFMPEG_METADATA_ARGUMENT % (key, value))
else:
write_message("metadata arg no dict")
return 0
## build the command
command = (CFG_BIBENCODE_FFMPEG_METADATA_SET_COMMAND % (input_file, output_file)).split()
for meta_arg in meta_args:
command.insert(-1, '-metadata')
command.insert(-1, meta_arg)
write_message(command)
process = subprocess.Popen(command, stderr=subprocess.PIPE)
stderr = []
while process.poll() is None:
## We want to keep the last lines of output in case of an error
stderr += process.communicate()[1].splitlines()
stderr = stderr[-5:]
if process.poll() == -15:
write_message("terminated")
return 0
if process.poll() == 1:
## If there was an error during FFmpeg execution, write log
write_message("There was an error with FFmpeg writing metadata")
write_message("Below the last lines of the FFmpeg log:")
for line in stderr:
write_message(line)
return 0
if process.poll() == 0:
write_message("went fine")
return 1
def dump_metadata(input_file, output_file, meta_type="ffprobe"):
"""Dumps the metadata from a given video to the given file
The output will be in JSON or XML
@param input_file: Full path to the video
@param output_file: Full path to the JSON dump file
@param meta_type: Metadata style/library to use,
either ffprobe, mediainfo or pbcore
"""
metadata_dict = None
if not meta_type in ('ffprobe', 'mediainfo', 'pbcore'):
raise ValueError("Type must be ffprobe, pbcore or mediainfo")
if meta_type == 'ffprobe':
metadata_dict = ffprobe_metadata(input_file)
elif meta_type == 'mediainfo':
metadata_dict = mediainfo_metadata(input_file)
if metadata_dict is not None:
metadata_string = json.dumps(metadata_dict, sort_keys=True, indent=4)
file = open(output_file, "w")
file.write(metadata_string)
file.close()
## Dump PBCORE
else:
pbcore = pbcore_metadata(input_file)
file = open(output_file, "w")
file.write(pbcore)
file.close()
def ffprobe_metadata(input_file):
"""This function uses the parseable ffprobe output to
access all the metadata of a video file correctly
@param input_file: fullpath to the media file
@type input_file: string
@return: a dictionary containing the metadata
@rtype: dictionary
"""
global _FFPROBE_METADATA_CACHE
if input_file in _FFPROBE_METADATA_CACHE:
return _FFPROBE_METADATA_CACHE[input_file]
ffprobe_output = probe(input_file, True)
if ffprobe_output is None:
return None
meta_dict = {
'format': {},
'streams': []
}
format_start = re.compile("^\[FORMAT\]$")
format_end = re.compile("^\[\/FORMAT\]$")
stream_start = re.compile("^\[STREAM\]$")
stream_end = re.compile("^\[\/STREAM\]$")
lines = ffprobe_output.splitlines()
format_section = False
stream_section = False
for line in lines:
if format_start.match(line):
format_section = True
continue
if format_end.match(line):
format_section = False
continue
if stream_start.match(line):
meta_dict['streams'].append(dict())
stream_section = True
continue
if stream_end.match(line):
stream_section = False
continue
if format_section:
key, value = line.split("=", 1)
meta_dict['format'][key] = value
if stream_section:
key, value = line.split("=", 1)
meta_dict['streams'][-1][key] = value
_FFPROBE_METADATA_CACHE[input_file] = meta_dict
return meta_dict
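`ffprobe_metadata` above is a small state machine over ffprobe's `[FORMAT]`/`[STREAM]` section markers. A reduced sketch (with made-up sample output, not real ffprobe data) shows the shape of the resulting dictionary:

```python
def parse_sections(text):
    """Minimal state machine over ffprobe-style [FORMAT]/[STREAM] sections."""
    meta = {'format': {}, 'streams': []}
    section = None  # the dict currently being filled, if any
    for line in text.splitlines():
        if line == '[FORMAT]':
            section = meta['format']
        elif line == '[STREAM]':
            meta['streams'].append({})
            section = meta['streams'][-1]
        elif line in ('[/FORMAT]', '[/STREAM]'):
            section = None
        elif section is not None and '=' in line:
            key, value = line.split('=', 1)
            section[key] = value
    return meta

sample = "[FORMAT]\nduration=12.5\n[/FORMAT]\n[STREAM]\ncodec_type=video\n[/STREAM]"
print(parse_sections(sample))
```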
def mediainfo_metadata(input_file, aspect_override=None):
"""Uses the mediainfo library instead of ffprobe to access metadata
@param input_file: fullpath to the media file
@type input_file: string
@return: a list of dictionaries containing the metadata
@rtype: list
"""
global _MEDIAINFO_METADATA_CACHE
if input_file in _MEDIAINFO_METADATA_CACHE:
return _MEDIAINFO_METADATA_CACHE[input_file]
meta_list = []
mediainfo_output = mediainfo(input_file)
dom = minidom.parseString(mediainfo_output)
for track in dom.getElementsByTagName('track'):
track_dict = {}
last_seen_tag = ""
for node in track.childNodes:
try:
if last_seen_tag != node.tagName or node.tagName == "Display_aspect_ratio":
track_dict[node.tagName.encode('ascii').lower()] = " ".join(t.nodeValue for t in node.childNodes if t.nodeType == t.TEXT_NODE).encode('ascii')
last_seen_tag = node.tagName.encode('ascii')
except:
pass
if 'display_aspect_ratio' in track_dict and aspect_override:
track_dict['display_aspect_ratio'] = aspect_override
meta_list.append(track_dict)
_MEDIAINFO_METADATA_CACHE[input_file] = meta_list
return meta_list
def pbcore_metadata(input_file, pbcoreIdentifier=None, pbcoreTitle=None,
pbcoreDescription=None, instantiationIdentifier=None,
instantiationPhysical=None, instantiationLocation=None,
instantiationGenerations=None,instantiationExtension=None,
instantiationPart=None, instantiationAnnotation=None,
instantiationRights=None, instantiationRelation=None,
xmlns="pbcore", aspect_override=None
):
""" Transforms parsed metadata into a PBCore representation.
To fill all the PBCore fields, we need both ffprobe and mediainfo.
If only ffprobe is installed, it will not fail but will fill the fields only partially.
@param input_file: full path to the file to extract the metadata from
@type input_file: string
@return: pbcore xml metadata representation
@rtype: string
"""
def _follow_path(path, locals_u, meta_dict, probe_dict, stream_number=None):
"""
Tries to follow a given path and returns the value it represents.
The path is a string that must be like this:
local->variable_name
ffprobe->format->param
ffprobe->video->param
ffprobe->audio->param
ffprobe->stream->param
mediainfo->general->param
mediainfo->audio->param
mediainfo->video->param
mediainfo->track->param
@param path: Path to the value
@type: string
@param locals_u: Local variables
@type locals_u: dict
@param meta_dict: Mediainfo metadata
@type meta_dict: dict
@param probe_dict: FFprobe metadata
@type probe_dict: dict
@param stream_number: To follow a path to a specific stream
@type stream_number: int
@return: value of the element the path points to
@rtype: string
"""
path_segments = path.split("->")
## ffprobe
if path_segments[0] == 'ffprobe':
## format
if path_segments[1] == 'format':
return getval(probe_dict['format'], path_segments[2], 0)
## 1st video
elif path_segments[1] in ('video', 'audio'):
for stream in probe_dict['streams']:
if getval(stream, 'codec_type') == path_segments[1]:
return getval(stream, path_segments[2], 0)
## stream by number
elif path_segments[1] == 'stream':
return getval(probe_dict['streams'][stream_number],
path_segments[2], 0)
## mediainfo
elif path_segments[0] == 'mediainfo':
## general, video, audio
if path_segments[1] in ('general', 'video', 'audio'):
for track in meta_dict:
if getval(track, 'kind_of_stream').lower() == path_segments[1]:
return getval(track, path_segments[2], 0)
## stream by number
elif path_segments[1] == 'track':
## We rely on format being the first track in mediainfo
## And the order of streams in ffprobe and tracks in mediainfo being the same
return getval(meta_dict[stream_number+1], path_segments[2], 0)
## local variable
elif path_segments[0] == 'local':
return getval(locals_u, path_segments[1], 0)
## direct input
else:
return path_segments[0]
def _map_values(mapping, locals_u, meta_dict, probe_dict, stream_number=None):
""" Substitutes a mapping dictionary and returns the substituted value.
Each mapping entry must contain a 'tag', a 'mapping' and a 'call'
@param mapping: mapping dictionary to substitute
@type: dict
@param locals_u: Local variables
@type locals_u: dict
@param meta_dict: Mediainfo metadata
@type meta_dict: dict
@param probe_dict: FFprobe metadata
@type probe_dict: dict
@param stream_number: To follow a path to a specific stream
@type stream_number: int
@return: substituted mapping
@rtype: string
"""
items = []
for value in mapping:
mapping = value['mapping']
tag = value['tag']
call = getval(value, 'call')
micro_mappings = mapping.split(';;')
values = []
foundall = True
for micro_mapping in micro_mappings:
value = _follow_path(micro_mapping, locals_u, meta_dict, probe_dict, stream_number)
if value:
if call:
value = globals()[call](value)
values.append(value.strip())
else:
foundall &= False
try:
if values and foundall:
items.append(tag % "".join(values))
except:
pass
return items
## Get the metadata from ffprobe and mediainfo
meta_dict = mediainfo_metadata(input_file, aspect_override)
probe_dict = ffprobe_metadata(input_file)
# parse the mappings
pbcore_mappings = json_decode_file(CFG_BIBENCODE_PBCORE_MAPPINGS)
## INSTANTIATION ##
# According to the PBcore standard, this strict order MUST be followed
instantiation_mapping = pbcore_mappings['instantiation_mapping']
## ESSENCE TRACK ##
# According to the PBcore standard, this strict order MUST be followed
essencetrack_mapping = pbcore_mappings['essencetrack_mapping']
## The XML header for the PBcore document
header = (
"""<?xml version="1.0" encoding="UTF-8"?><pbcoreDescriptionDocument """
"""xmlns%(xmlns)s="http://www.pbcore.org/PBCore/PBCoreNamespace.html" """
"""xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" """
"""xsi:schemaLocation="http://www.pbcore.org/PBCore/PBCoreNamespace.html">"""
)
if pbcoreIdentifier:
pbcoreIdentifier ="""<pbcoreIdentifier>%s</pbcoreIdentifier>""" % pbcoreIdentifier
else:
pbcoreIdentifier = ""
if pbcoreTitle:
pbcoreTitle = """<pbcoreTitle>%s</pbcoreTitle>""" % pbcoreTitle
else:
pbcoreTitle = ""
tail = """</pbcoreDescriptionDocument>"""
## ESSENCE TRACKS ##
essencetracks = []
for stream_no in range(len(probe_dict['streams'])):
essencetracks.append(_map_values(essencetrack_mapping, locals(),
meta_dict, probe_dict, stream_no))
joinedtracks = []
for track in essencetracks:
track = "<instantiationEssenceTrack>" + "".join(track) + "</instantiationEssenceTrack>"
joinedtracks.append(track)
joinedtracks = "".join(joinedtracks)
## INSTANTIATION ##
instantiation_items = _map_values(instantiation_mapping, locals(),
meta_dict, probe_dict)
joinedinstantiation = "<pbcoreInstantiation>" + "".join(instantiation_items) + "</pbcoreInstantiation>"
joined = "%s%s%s%s%s" % (header, pbcoreIdentifier, pbcoreTitle,
joinedinstantiation, tail)
if xmlns:
joined = joined % {"xmlns" : ":%s" % xmlns}
joined = re.sub("<(\w[^>]+)>", "<%s:\g<1>>" % xmlns, joined)
joined = re.sub("<\/([^>]+)>", "</%s:\g<1>>" % xmlns, joined)
else:
joined = joined % {"xmlns" : ""}
return joined
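The two `re.sub` calls at the end of `pbcore_metadata` prefix every opening and closing tag with the chosen XML namespace. The same transformation in isolation (hypothetical tag names, not the real PBCore vocabulary):

```python
import re

def add_ns_prefix(xml, prefix):
    """Prefix every opening and closing tag with an XML namespace prefix."""
    xml = re.sub(r"<(\w[^>]+)>", "<%s:\\g<1>>" % prefix, xml)   # opening tags
    xml = re.sub(r"<\/([^>]+)>", "</%s:\\g<1>>" % prefix, xml)  # closing tags
    return xml

print(add_ns_prefix("<title>Demo</title>", "pbcore"))
# -> <pbcore:title>Demo</pbcore:title>
```

Note that the opening-tag pattern `\w[^>]+` requires at least two characters inside the brackets, so single-letter tags are left untouched, matching the behavior of the original regexes.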
diff --git a/invenio/modules/encoder/tasks.py b/invenio/modules/encoder/tasks.py
index 1162057a7..504eacd6c 100644
--- a/invenio/modules/encoder/tasks.py
+++ b/invenio/modules/encoder/tasks.py
@@ -1,630 +1,630 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibEncode module.
A multi-purpose module that wraps around FFMPEG.
It allows the execution of video transcoding, frame extraction,
metadata handling and more as BibTasks.
"""
__revision__ = "$Id$"
from pprint import pprint
import os
-from invenio.bibtask import (task_init,
+from invenio.legacy.bibsched.bibtask import (task_init,
write_message,
task_set_option,
task_get_option,
task_set_task_param,
)
from invenio.utils.json import json, json_decode_file
from . import (encode, extract, metadata, daemon, batch_engine)
from .config import (
CFG_BIBENCODE_FFMPEG_VALID_SIZES,
CFG_BIBENCODE_FFMPEG_VALID_ACODECS,
CFG_BIBENCODE_FFMPEG_VALID_VCODECS,
CFG_BIBENCODE_VALID_MODES,
CFG_BIBENCODE_FFMPEG_RE_VALID_SIZE,
CFG_BIBENCODE_PROFILES_ENCODING,
CFG_BIBENCODE_PROFILES_EXTRACT,
CFG_BIBENCODE_DAEMON_DIR_NEWJOBS,
CFG_BIBENCODE_DAEMON_DIR_OLDJOBS,
)
from .profiles import get_encoding_profiles, get_extract_profiles
from .utils import check_ffmpeg_configuration
def _topt(key, fallback=None):
""" Just a shortcut
"""
return task_get_option(key, fallback)
def task_submit_elaborate_specific_parameter(key, value, opts, args):
""" Given the string key, checks its meaning, possibly using the
value. Usually it fills some key in the options dict.
It must return True if it has elaborated the key, and False if it
does not know that key.
eg:
if key in ('-n', '--number'):
self.options['number'] = value
return True
return False
"""
## A dictionary used for mapping CLI parameters to task_option keys
parameter_mapping = {
'-p': 'profile_name',
'-i': 'input',
'--input': 'input',
'-o': 'output',
'--output': 'output',
'-m': 'mode',
'--mode': 'mode',
'--acodec': 'acodec',
'--vcodec': 'vcodec',
'--abitrate': 'abitrate',
'--vbitrate': 'vbitrate',
'--resolution': 'size',
'--passes': 'passes',
'--special': 'special',
'--specialfirst': 'specialfirst',
'--specialsecond': 'specialsecond',
'--width': 'width',
'--height': 'height',
'--aspect': 'aspect',
'--number': 'numberof',
'--positions': 'positions',
'-D': 'meta_dump',
'-W': 'meta_input',
'--dump': 'meta_dump',
'--write': 'meta_input',
'--newjobfolder': 'new_job_folder',
'--oldjobfolder': 'old_job_folder',
'--recid': 'recid',
'--collection': 'collection',
'--search': 'search'
}
## PASSES ##
## Transform 'passes' to integer
if key in ('--passes', ):
try:
value = int(value)
except ValueError:
write_message('Value of \'--passes\' must be an integer')
return False
## HEIGHT, WIDTH ##
if key in ('--height', '--width'):
try:
value = int(value)
except ValueError:
write_message('Value of \'--height\' or \'--width\''
' must be an integer')
return False
## META MODE ##
## Transform meta mode values to boolean
if key in ('-D', '--dump'):
if not value in ("ffprobe", "mediainfo", "pbcore"):
write_message("Unknown dumping format, must be 'ffprobe', 'mediainfo' or 'pbcore'")
return False
if key in ('--substitute', ):
value = True
## Transform the 'positions' parameter into a list
if key in ('--positions',):
try:
parsed = json.loads(value)
if type(parsed) is not type(list()):
write_message(
'Value of \'--positions\' must be a json list'
)
return False
else:
value = parsed
except ValueError:
write_message(
'Value of \'--positions\' must be a json list'
)
return False
## NUMBEROF ##
## Transform 'number' to integer
if key in ('--number'):
try:
value = int(value)
except ValueError:
write_message('Value of \'--number\' must be an integer')
return False
## ASPECT ##
if key in ('--aspect'):
try:
xasp, yasp = str(value).split(':')
xasp = float(xasp)
yasp = float(yasp)
value = xasp / yasp
except:
write_message('Value of \'--aspect\' must be in \'4:3\' format')
return False
## RECID ##
if key in ('--recid'):
try:
value = int(value)
except ValueError:
write_message('Value of \'--recid\' must be an integer')
return False
## GENERAL MAPPING ##
## For all general or other parameters just use the mapping dictionary
if key in parameter_mapping:
task_set_option(parameter_mapping[key], value)
return True
return False
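The parameter elaboration above routes each CLI flag through a mapping dictionary, coercing values (integers, JSON lists, aspect ratios) before storing them. A hypothetical reduction of that pattern, with only a couple of flags for illustration:

```python
def elaborate_param(key, value, options):
    """Map a CLI flag to an option key, coercing the value first (sketch)."""
    mapping = {'-m': 'mode', '--mode': 'mode', '--passes': 'passes'}
    if key == '--passes':
        try:
            value = int(value)  # coerce before storing
        except ValueError:
            return False  # reject non-integer pass counts
    if key in mapping:
        options[mapping[key]] = value
        return True
    return False  # unknown flag

opts = {}
print(elaborate_param('--passes', '2', opts), opts)
# -> True {'passes': 2}
```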
def task_run_core():
"""Runs the task by fetching arguments from the BibSched task queue.
This is what BibSched will be invoking via daemon call.
Return 1 in case of success and 0 in case of failure."""
#---------------#
# Encoding Mode #
#---------------#
if _topt('mode') == 'encode':
return encode.encode_video(
input_file=_topt('input'),
output_file=_topt('output'),
acodec=_topt('acodec'),
vcodec=_topt('vcodec'),
abitrate=_topt('abitrate'),
vbitrate=_topt('vbitrate'),
resolution=_topt('size'),
passes=_topt('passes'),
special=_topt('special'),
specialfirst=_topt('specialfirst'),
specialsecond=_topt('specialsecond'),
width=_topt('width'),
height=_topt('height'),
aspect=_topt('aspect'),
profile=_topt('profile')
)
#-----------------#
# Extraction Mode #
#-----------------#
elif _topt('mode') == 'extract':
return extract.extract_frames(
input_file=_topt('input'),
output_file=_topt('output'),
size=_topt('size'),
positions=_topt('positions'),
numberof=_topt('numberof'),
width=_topt('width'),
height=_topt('height'),
aspect=_topt('aspect'),
profile=_topt('profile')
)
#---------------#
# Metadata Mode #
#---------------#
elif _topt('mode') == 'meta':
if _topt('meta_dump') is not None:
metadata.dump_metadata(
input_file=_topt('input'),
output_file=_topt('output'),
meta_type=_topt('meta_dump')
)
return True
elif _topt('meta_input') is not None:
if type(_topt('meta_input')) is not type(dict()):
the_metadata = metadata.json_decode_file(
filename=_topt('meta_input'))
task_set_option('meta_input', the_metadata)
return metadata.write_metadata(
input_file=_topt('input'),
output_file=_topt('output'),
metadata=_topt('meta_input')
)
#------------#
# Batch Mode #
#------------#
elif _topt('mode') == 'batch':
if _topt('collection'):
return batch_engine.create_update_jobs_by_collection(
batch_template_file=_topt('input'),
collection=_topt('collection'),
job_directory=_topt('new_job_dir',
CFG_BIBENCODE_DAEMON_DIR_NEWJOBS))
elif _topt('search'):
return batch_engine.create_update_jobs_by_search(
pattern=_topt('search'),
batch_template_file=_topt('input'),
job_directory=_topt('new_job_dir',
CFG_BIBENCODE_DAEMON_DIR_NEWJOBS))
else:
return batch_engine.process_batch_job(_topt('input'))
#-------------#
# Daemon Mode #
#-------------#
elif _topt('mode') == 'daemon':
return daemon.watch_directory(
_topt('new_job_dir', CFG_BIBENCODE_DAEMON_DIR_NEWJOBS),
_topt('old_job_dir', CFG_BIBENCODE_DAEMON_DIR_OLDJOBS)
)
def task_submit_check_options():
""" Checks the tasks arguments for validity
"""
#----------------#
# General Checks #
#----------------#
## FFMPEG CONFIGURATION ##
## The status of ffmpeg should be checked before a task is submitted
## There is a minimum configuration that ffmpeg must be compiled with
## See bibencode_utils and bibencode_config
config = check_ffmpeg_configuration()
if config:
## Prints missing configuration
string = ''
for item in config:
string += ('\t' + item + '\n')
write_message(
"FFmpeg options are missing. Please recompile and add:\n" + string
)
return False
## MODE ##
## Check if the mode is valid
if _topt('mode') is None:
write_message('You have to specify a mode using \'-m MODE\'')
return False
if _topt('mode') not in CFG_BIBENCODE_VALID_MODES:
write_message('%s is not a valid mode. Use one of %s'
% (_topt('mode'), CFG_BIBENCODE_VALID_MODES))
return False
## INPUT ##
## Check if the input file is given and if it exists
## You should always use an absolute path to the file
if _topt('mode') in ('encode', 'extract', 'meta', 'batch'):
if _topt('input') is None:
write_message('You must specify an input file using \'-i FILE\'')
return False
else:
if not os.path.exists(_topt('input')):
print("The file %s does not exist" % _topt('input'))
return False
## OUTPUT ##
## Check if the output file is given
## You should always use an absolute path to the file
if _topt('mode') in ('encode', 'extract', 'meta'):
if _topt('output') is None:
write_message('No output file is given. Please specify with'
' \'-o NAME\''
)
return False
#---------------#
# Encoding Mode #
#---------------#
if _topt('mode') == 'encode':
## PROFILE ## Check for a valid profile if this is given
if _topt('profile_name') is not None:
if _topt('profile_name') not in get_encoding_profiles():
write_message('%s not found in %s' %
(_topt('profile_name'),
CFG_BIBENCODE_PROFILES_ENCODING)
)
return False
## If the profile exists
else:
pass
## AUDIOCODEC ##
## Checks if the audiocodec is one of the predefined
if _topt('acodec') is not None:
if _topt('acodec') not in CFG_BIBENCODE_FFMPEG_VALID_ACODECS:
write_message(
'%s is not a valid audiocodec.\nAvailable codecs: %s'
% (_topt('acodec'), CFG_BIBENCODE_FFMPEG_VALID_ACODECS)
)
return False
## VIDEOCODEC ## Checks if the videocodec is one of the predefined
if _topt('vcodec') is not None:
if _topt('vcodec') not in CFG_BIBENCODE_FFMPEG_VALID_VCODECS:
write_message(
'%s is not a valid videocodec.\nAvailable codecs: %s'
% (_topt('vcodec'), CFG_BIBENCODE_FFMPEG_VALID_VCODECS)
)
return False
## SIZE ##
## Checks if the size is either WxH or an FFMPEG preset
if _topt('size') is not None:
if not CFG_BIBENCODE_FFMPEG_RE_VALID_SIZE.match(_topt('size')):
if _topt('size') not in CFG_BIBENCODE_FFMPEG_VALID_SIZES:
write_message(
'%s is not a valid frame size.\nEither use the'
' \'WxH\' notation or one of these values:\n%s'
% (_topt('size'), CFG_BIBENCODE_FFMPEG_VALID_SIZES)
)
return False
## Check if both a size and vertical or horizontal resolution
if (_topt('width') or _topt('height')) and _topt('size'):
write_message('Options \'width\' and \'height\' can not be '
'combined with \'resolution\'')
return False
## PASSES ##
## If a number of passes is given, it should be either 1 or 2.
## You could do an infinite number of passes with ffmpeg,
## But it will almost never make a difference above 2 passes.
## So, we currently only support 2 passes.
if _topt('passes') is not None:
if _topt('passes') not in (1, 2):
write_message('The number of passes must be either 1 or 2')
return False
else:
task_set_option('passes', 1)
## BITRATE ##
## Check if the given bitrate is either 1000 sth. or 1000k sth.
if _topt('abitrate') is not None:
pass
if _topt('vbitrate') is not None:
pass
#-----------------#
# Extraction Mode #
#-----------------#
elif _topt('mode') == 'extract':
## PROFILE ##
## If a profile is given, check its validity
if _topt('profile_name') is not None:
if _topt('profile_name') not in get_extract_profiles():
write_message('%s not found in %s' %
(_topt('profile_name'),
CFG_BIBENCODE_PROFILES_EXTRACT)
)
return False
## If the profile exists
else:
pass
## You cannot give both a number and specific positions
## !!! Think about allowing both -> First extract by number,
## !!! then additionally the specific positions
if (
((_topt('numberof') is not None) and
(_topt('positions') is not None))
or
((_topt('numberof') is None) and
(_topt('positions') is None))
):
write_message('Please specify either a number of frames to '
'take or specific positions')
return False
## SIZE ##
## Checks if the size is either WxH or an FFMPEG specific value
if _topt('size') is not None:
if not CFG_BIBENCODE_FFMPEG_RE_VALID_SIZE.match(_topt('size')):
if _topt('size') not in CFG_BIBENCODE_FFMPEG_VALID_SIZES:
write_message(
'%s is not a valid frame size.\nEither use the'
' \'WxH\' notation or one of these values:\n%s'
% (_topt('size'), CFG_BIBENCODE_FFMPEG_VALID_SIZES)
)
return False
#---------------#
# Metadata Mode #
#---------------#
elif _topt('mode') == 'meta':
## You have to give exactly one meta suboption
if not _xor(_topt('meta_input'),
_topt('meta_dump')):
write_message("You can either dump or write metadata")
return False
## METADATA INPUT ##
if _topt('meta_input') is not None:
## Check if this is either a filename (that should exist)
## or if this a jsonic metadata notation
if os.path.exists(_topt('meta_input')):
pass
else:
try:
metadict = json.loads(_topt('meta_input'))
task_set_option('meta_input', metadict)
except ValueError:
write_message('The value %s of the \'--meta\' parameter is '
'neither a valid filename nor a jsonic dict'
% _topt('meta_input'))
return False
#------------#
# Batch Mode #
#------------#
elif _topt('mode') == 'batch':
if _topt('collection') and _topt('search'):
write_message('You can either use \'search\' or \'collection\'')
return False
elif _topt('collection'):
template = json_decode_file(_topt('input'))
print('\n')
print("#---------------------------------------------#")
print("# YOU ARE ABOUT TO UPDATE A WHOLE COLLECTION #")
print("#---------------------------------------------#")
print('\n')
print('The selected template file contains:')
pprint(template)
print('\n')
elif _topt('search'):
template = json_decode_file(_topt('input'))
message = "# YOU ARE ABOUT TO UPDATE RECORDS MATCHING '%s' #" % _topt('search')
print('\n')
print("#" + "-"*(len(message)-2) + "#")
print(message)
print("#" + "-"*(len(message)-2) + "#")
print('\n')
print('The selected template file contains:')
pprint(template)
print('\n')
#-------------#
# Daemon Mode #
#-------------#
elif _topt('mode') == 'daemon':
task_set_task_param('task_specific_name', 'daemon')
## You can either give none or both folders, but not only one
if _xor(_topt('new_job_folder'), _topt('old_job_folder')):
write_message('When specifying folders for the daemon mode, you '
'have to specify both the folder for the new jobs '
'and the old ones')
return False
## If every check went fine
return True
def main():
"""Main function that constructs the bibtask."""
task_init(authorization_action='runbibencode',
authorization_msg="Bibencode Task Submission",
help_specific_usage=(
"""
General options:
-m, --mode= Selects the mode for BibEncode
Modes: 'meta', 'encode', 'extract', 'daemon', 'batch'
-i, --input= Input file
-o, --output= Output file
Options for mode 'meta':
-D, --dump= Dumps metadata from a video to a file
Options: "ffprobe", "mediainfo", "pbcore"
-W, --write= Write metadata to a copy of the file
Either a filename or a serialized JSON object.
Options for mode 'encode'
-p Profile to use for encoding
--acodec= Audiocodec for the transcoded video
--vcodec= Videocodec for the transcoded video
--abitrate= Bitrate of the audio stream
--vbitrate= Bitrate of the video stream
--resolution= Resolution of the transcoded video
--passes= Number of passes
--special= Pure FFmpeg options that will be appended to the command
--specialfirst= Pure FFmpeg options for the first pass
--specialsecond= Pure FFmpeg options for the second pass
--width= Horizontal resolution
--height= Vertical resolution
--aspect= Aspect ratio fallback if undetectable
Options for mode 'extract':
-p Profile to use for frame extraction
--resolution= Resolution of the extracted frame(s)
--number= Number of frames to extract
--positions= Specific positions inside the video to extract from
Python list notation
Either in seconds like '10' or '10.5'
Or as a timecode like '00:00:10.5'
Example:'[10, 10.5, 00:00:12.5, 20, 00:08:45:11.26]'
-o, --output= Output filename can be substituted by bibencode:
%(input)s for the input filename
%(timecode)s for the timecode
%(size)s for the frame size
%(number)d for sequential numbers
--width= Horizontal resolution
--height= Vertical resolution
--aspect= Aspect ratio fallback if undetectable
Options for mode 'batch':
--collection= Updates the whole collection acc. to a batch template
--search= Updates all records matching the search query
Options for mode 'daemon':
--newjobdir= Optional folder to look for new job descriptions
--oldjobdir= Optional folder to move the job desc. of done jobs
"""
),
version=__revision__,
specific_params=("m:i:o:p:W:D:",
[
"mode=",
"input=",
"output=",
"write=",
"dump=",
"acodec=",
"vcodec=",
"abitrate=",
"vbitrate=",
"resolution=",
"passes=",
"special=",
"specialfirst=",
"specialsecond=",
"height=",
"width=",
"number=",
"positions=",
"substitute",
"newjobdir=",
"oldjobdir=",
"recid=",
"aspect=",
"collection=",
"search="
]),
task_submit_elaborate_specific_parameter_fnc= \
task_submit_elaborate_specific_parameter,
task_submit_check_options_fnc=task_submit_check_options,
task_run_fnc=task_run_core)
def _xor(*xvars):
""" XOR Helper
"""
xsum = bool(False)
for xvar in xvars:
xsum = xsum ^ bool(xvar)
return xsum
if __name__ == '__main__':
main()
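The `_xor` helper above folds `^` over the boolean value of each argument, so it is true when an odd number of arguments are truthy; for the two-argument calls in `task_submit_check_options` that means "exactly one". A standalone sketch of the same idea:

```python
def xor_all(*xvars):
    """True when an odd number of arguments are truthy (n-ary XOR)."""
    xsum = False
    for xvar in xvars:
        xsum ^= bool(xvar)
    return xsum

# Exactly-one-of check for mutually exclusive CLI options:
print(xor_all("meta.json", None))  # one truthy argument  -> True
print(xor_all(None, None))         # no truthy arguments  -> False
```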
diff --git a/invenio/modules/formatter/config.py b/invenio/modules/formatter/config.py
index f6824f66f..693995f93 100644
--- a/invenio/modules/formatter/config.py
+++ b/invenio/modules/formatter/config.py
@@ -1,89 +1,89 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""BibFormat configuration parameters."""
__revision__ = "$Id$"
import os
import pkg_resources
from invenio.config import CFG_ETCDIR, CFG_PYLIBDIR
# True if old php format written in EL must be used by Invenio.
# False if new python format must be used. If set to 'False' but
# new format cannot be found, old format will be used.
CFG_BIBFORMAT_USE_OLD_BIBFORMAT = False
# Paths to main formats directories
CFG_BIBFORMAT_TEMPLATES_DIR = "format_templates"
CFG_BIBFORMAT_TEMPLATES_PATH = os.path.join(CFG_ETCDIR, 'bibformat', CFG_BIBFORMAT_TEMPLATES_DIR)
CFG_BIBFORMAT_JINJA_TEMPLATE_PATH = os.path.join(CFG_ETCDIR, 'templates', CFG_BIBFORMAT_TEMPLATES_DIR)
CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH = None # defaults to autodiscovery
CFG_BIBFORMAT_ELEMENTS_PATH = pkg_resources.resource_filename('invenio.modules.formatter', 'format_elements')
-CFG_BIBFORMAT_OUTPUTS_PATH = os.path.join(CFG_ETCDIR, 'bibformat', 'output_formats')
+CFG_BIBFORMAT_OUTPUTS_PATH = pkg_resources.resource_filename('invenio.modules.formatter', 'output_formats')
# CFG_BIBFORMAT_HIDDEN_TAGS -- list of MARC tags that
# are not shown to users not having cataloging authorizations.
CFG_BIBFORMAT_HIDDEN_TAGS = [595,]
# File extensions of formats
CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION = "bft"
CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION = "tpl"
CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION = "bfo"
assert CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION != CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION, \
"CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION and CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION must be different"
assert len(CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION) == 3, \
"CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION must be 3 characters long"
assert len(CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION) == 3, \
"CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION must be 3 characters long"
assert len(CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION) == 3, \
"CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION must be 3 characters long"
# Exceptions: errors
class InvenioBibFormatError(Exception):
"""A generic error for BibFormat."""
def __init__(self, message):
"""Initialisation."""
self.message = message
def __str__(self):
"""String representation."""
return repr(self.message)
# Exceptions: warnings
class InvenioBibFormatWarning(Exception):
"""A generic warning for BibFormat."""
def __init__(self, message):
"""Initialisation."""
self.message = message
def __str__(self):
"""String representation."""
return repr(self.message)
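The config module above validates its extension constants at import time with `assert`, so a misconfiguration fails fast instead of surfacing later as missing templates. The same guard in isolation, using hypothetical constants that mirror the real ones:

```python
TEMPLATE_EXT = "bft"   # legacy format templates
JINJA_EXT = "tpl"      # Jinja2 templates
OUTPUT_EXT = "bfo"     # output formats

# Fail fast at import time if the extensions could ever collide.
assert TEMPLATE_EXT != JINJA_EXT, "template and Jinja extensions must differ"
for ext in (TEMPLATE_EXT, JINJA_EXT, OUTPUT_EXT):
    assert len(ext) == 3, "%s must be 3 characters long" % ext
print("extension configuration OK")
```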
diff --git a/invenio/modules/formatter/engine.py b/invenio/modules/formatter/engine.py
index 88792661a..57cd9c1d5 100644
--- a/invenio/modules/formatter/engine.py
+++ b/invenio/modules/formatter/engine.py
@@ -1,2243 +1,2245 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Formats a single XML MARC record using the specified format.
There is no API for the engine. Instead use module L{bibformat}.
You can have a look at the various escaping modes available in
X{BibFormatObject} in function L{escape_field}.
Still, it is sometimes useful for debugging purposes to use the
L{BibFormatObject} class directly, e.g.:
>>> from invenio.modules.formatter.engine import BibFormatObject
>>> bfo = BibFormatObject(102)
>>> bfo.field('245__a')
The order Rodentia in South America
>>> from invenio.modules.formatter.format_elements import bfe_title
>>> bfe_title.format_element(bfo)
The order Rodentia in South America
@see: bibformat.py, bibformat_utils.py
"""
__revision__ = "$Id$"
import re
import sys
import os
import inspect
import traceback
import zlib
import cgi
import types
from flask import has_app_context
from operator import itemgetter
from werkzeug.utils import cached_property
from invenio.base.globals import cfg
from invenio.base.utils import (autodiscover_template_context_functions,
autodiscover_format_elements)
from invenio.config import \
CFG_PATH_PHP, \
CFG_BINDIR, \
CFG_SITE_LANG
from invenio.ext.logging import \
register_exception
from invenio.legacy.bibrecord import \
create_record, \
record_get_field_instances, \
record_get_field_value, \
record_get_field_values, \
record_xml_output
from invenio.modules.formatter.engines.xslt import format
from invenio.legacy.dbquery import run_sql
from invenio.base.i18n import \
language_list_long, \
wash_language, \
gettext_set_language
from . import api as bibformat_dblayer
from .config import \
CFG_BIBFORMAT_TEMPLATES_DIR, \
CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION, \
CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION, \
CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION, \
CFG_BIBFORMAT_OUTPUTS_PATH, \
InvenioBibFormatError
from invenio.modules.formatter.utils import \
record_get_xml, \
parse_tag
from invenio.utils.html import \
HTMLWasher, \
CFG_HTML_BUFFER_ALLOWED_TAG_WHITELIST, \
CFG_HTML_BUFFER_ALLOWED_ATTRIBUTE_WHITELIST
from invenio.modules.knowledge.api import get_kbr_values
from invenio.ext.template import render_template_to_string
from HTMLParser import HTMLParseError
from invenio.utils.shell import escape_shell_arg
if CFG_PATH_PHP: #Remove when call_old_bibformat is removed
from xml.dom import minidom
import tempfile
# Cache for data we have already read and parsed
format_templates_cache = {}
format_elements_cache = {}
format_outputs_cache = {}
html_field = '<!--HTML-->' # String indicating that field should be
# treated as HTML (and therefore no escaping of
# HTML tags should occur).
# Appears in some field values.
washer = HTMLWasher() # Used to remove dangerous tags from HTML
# sources
# Regular expression for finding <lang>...</lang> tag in format templates
pattern_lang = re.compile(r'''
<lang #<lang tag (no matter case)
\s* #any number of white spaces
> #closing <lang> start tag
(?P<langs>.*?) #anything up to the end tag (non-greedy)
(</lang\s*>) #end tag
''', re.IGNORECASE | re.DOTALL | re.VERBOSE)
# Builds regular expression for finding each known language in <lang> tags
ln_pattern_text = r"<("
for lang in language_list_long(enabled_langs_only=False):
ln_pattern_text += lang[0] +r"|"
ln_pattern_text = ln_pattern_text.rstrip(r"|")
ln_pattern_text += r")>(.*?)</\1>"
ln_pattern = re.compile(ln_pattern_text, re.IGNORECASE | re.DOTALL)
# Regular expression for finding text to be translated
translation_pattern = re.compile(r'_\((?P<word>.*?)\)_', \
re.IGNORECASE | re.DOTALL | re.VERBOSE)
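The `_(...)_` markers matched by `translation_pattern` are later replaced by a gettext lookup in `format_with_format_template`. A standalone sketch of that substitution (the `catalogue` dict is a made-up stand-in for the real translator returned by `gettext_set_language`):

```python
import re

# Same pattern as translation_pattern above.
translation_pattern = re.compile(r'_\((?P<word>.*?)\)_',
                                 re.IGNORECASE | re.DOTALL | re.VERBOSE)

# Made-up stand-in for the gettext translator obtained via
# gettext_set_language(bfo.lang) in the real engine.
catalogue = {'Abstract': 'Resume'}

def translate(match):
    word = match.group('word')
    return catalogue.get(word, word)

print(translation_pattern.sub(translate, '<b>_(Abstract)_:</b>'))  # <b>Resume:</b>
```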
# Regular expression for finding <name> tag in format templates
pattern_format_template_name = re.compile(r'''
<name #<name tag (no matter case)
\s* #any number of white spaces
> #closing <name> start tag
(?P<name>.*?) #name value. any char that is not end tag
(</name\s*>)(\n)? #end tag
''', re.IGNORECASE | re.DOTALL | re.VERBOSE)
# Regular expression for finding <description> tag in format templates
pattern_format_template_desc = re.compile(r'''
<description #<description tag (no matter case)
\s* #any number of white spaces
> #closing <description> start tag
(?P<desc>.*?) #description value. any char that is not end tag
</description\s*>(\n)? #end tag
''', re.IGNORECASE | re.DOTALL | re.VERBOSE)
# Regular expression for finding <BFE_ > tags in format templates
pattern_tag = re.compile(r'''
<BFE_ #every special tag starts with <BFE_ (no matter case)
(?P<function_name>[^/\s]+) #any char but a space or slash
\s* #any number of spaces
(?P<params>(\s* #params here
(?P<param>([^=\s])*)\s* #param name: any chars that are not whitespace or '='. Followed by space(s)
=\s* #equality: = followed by any number of spaces
(?P<sep>[\'"]) #one of the separators
(?P<value>.*?) #param value: any chars up to the next identical separator (non-greedy)
(?P=sep) #same separator as starting one
)*) #many params
\s* #any number of spaces
(/)?> #end of the tag
''', re.IGNORECASE | re.DOTALL | re.VERBOSE)
# Regular expression for finding params inside <BFE_ > tags in format templates
pattern_function_params = re.compile('''
(?P<param>([^=\s])*)\s* # Param name: any chars that are not whitespace or '='. Followed by space(s)
=\s* # Equality: = followed by any number of spaces
(?P<sep>[\'"]) # One of the separators
(?P<value>.*?) # Param value: any chars up to the next identical separator (non-greedy)
(?P=sep) # Same separator as starting one
''', re.VERBOSE | re.DOTALL )
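For illustration, here is what the two regexes above extract from a typical special tag. This is a standalone sketch using a non-verbose spelling of the same patterns; the snippet and element name are made up:

```python
import re

# Standalone reproduction of pattern_tag / pattern_function_params above.
pattern_tag = re.compile(
    r'<BFE_(?P<function_name>[^/\s]+)\s*'
    r'(?P<params>(\s*(?P<param>[^=\s]*)\s*=\s*(?P<sep>[\'"])(?P<value>.*?)(?P=sep))*)'
    r'\s*(/)?>',
    re.IGNORECASE | re.DOTALL)
pattern_function_params = re.compile(
    r'(?P<param>[^=\s]*)\s*=\s*(?P<sep>[\'"])(?P<value>.*?)(?P=sep)',
    re.DOTALL)

snippet = '<BFE_TITLE prefix="<h1>" suffix="</h1>" />'
match = pattern_tag.search(snippet)
# Collect each param="value" pair found inside the tag.
params = dict((m.group('param'), m.group('value'))
              for m in pattern_function_params.finditer(match.group('params')))
print(match.group('function_name'))  # TITLE
print(params)                        # {'prefix': '<h1>', 'suffix': '</h1>'}
```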
# Regular expression for finding format elements "params" attributes
# (defined by @param)
pattern_format_element_params = re.compile('''
@param\s* # Begins with AT param keyword followed by space(s)
(?P<name>[^\s=]*):\s* # A single keyword and a colon, then space(s)
#(=\s*(?P<sep>[\'"]) # Equality, space(s) and then one of the separators
#(?P<default>.*?) # Default value: any chars that is not a separator like previous one
#(?P=sep) # Same separator as starting one
#)?\s* # Default value for param is optional. Followed by space(s)
(?P<desc>.*) # Any text that is not end of line (thanks to MULTILINE parameter)
''', re.VERBOSE | re.MULTILINE)
# Regular expression for finding format elements "see also" attribute
# (defined by @see)
pattern_format_element_seealso = re.compile('''@see:\s*(?P<see>.*)''',
re.VERBOSE | re.MULTILINE)
#Regular expression for finding 2 expressions in quotes, separated by
#comma (as in template("1st","2nd") )
#Used when parsing output formats
## pattern_parse_tuple_in_quotes = re.compile('''
## (?P<sep1>[\'"])
## (?P<val1>.*)
## (?P=sep1)
## \s*,\s*
## (?P<sep2>[\'"])
## (?P<val2>.*)
## (?P=sep2)
## ''', re.VERBOSE | re.MULTILINE)
sub_non_alnum = re.compile('[^0-9a-zA-Z]+')
fix_tag_name = lambda s: sub_non_alnum.sub('_', s.lower())
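`fix_tag_name` above normalizes arbitrary tag names into identifiers usable as template context function names; for example:

```python
import re

# Same normalization as fix_tag_name above: lowercase, then collapse any
# run of non-alphanumeric characters into a single underscore.
sub_non_alnum = re.compile('[^0-9a-zA-Z]+')
fix_tag_name = lambda s: sub_non_alnum.sub('_', s.lower())

print(fix_tag_name('main title'))    # main_title
print(fix_tag_name('ISBN (print)'))  # isbn_print_
```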
from invenio.utils.memoise import memoize
class LazyTemplateContextFunctionsCache(object):
"""Loads bibformat elements using plugin builder and caches results."""
@cached_property
def template_context_functions(self):
"""Returns template context functions"""
modules = autodiscover_template_context_functions()
elem = {}
for m in modules:
register_func = getattr(m, 'template_context_function', None)
if register_func and isinstance(register_func, types.FunctionType):
elem[m.__name__.split('.')[-1]] = register_func
return elem
@memoize
def bibformat_elements(self, packages=None):
"""Returns bibformat elements."""
if cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] is not None:
packages = [cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH']]
modules = autodiscover_format_elements(packages=packages, silent=True)
elem = {}
for m in modules:
+ if m is None:
+ continue
name = m.__name__.split('.')[-1]
register_func = getattr(m, 'format_element',
getattr(m, 'format', None))
escape_values = getattr(m, 'escape_values', None)
if register_func and isinstance(register_func, types.FunctionType):
register_func._escape_values = escape_values
elem[name] = register_func
return elem
#@cached_property
#def bibformat_elements(self):
# return self._bibformat_elements()
@cached_property
def functions(self):
def insert(name):
def _bfe_element(bfo, **kwargs):
# convert to utf-8 for legacy app
kwargs = dict((k, v.encode('utf-8') if isinstance(v, unicode) else v)
for k, v in kwargs.iteritems())
format_element = get_format_element(name)
(out, dummy) = eval_format_element(format_element,
bfo,
kwargs)
# returns unicode for jinja2
return out.decode('utf-8')
return _bfe_element
# Old bibformat templates
tfn_from_files = dict((name.lower(), insert(name.lower()))
for name in self.bibformat_elements().keys())
# Update with new template context functions
tfn_from_files.update(self.template_context_functions)
bfe_from_tags = {}
if has_app_context():
from invenio.ext.sqlalchemy import db
from invenio.modules.search.models import Tag
# get functions from tag table
bfe_from_tags = dict(('bfe_'+fix_tag_name(name),
insert(fix_tag_name(name)))
for name in map(itemgetter(0),
db.session.query(Tag.name).all()))
# overwrite functions from tag table with functions from files
bfe_from_tags.update(tfn_from_files)
return bfe_from_tags
TEMPLATE_CONTEXT_FUNCTIONS_CACHE = LazyTemplateContextFunctionsCache()
def call_old_bibformat(recID, of="HD", on_the_fly=False, verbose=0):
"""
FIXME: REMOVE FUNCTION WHEN MIGRATION IS DONE
Calls BibFormat for the record RECID in the desired output format 'of'.
Note: this function always tries to return HTML, so when
bibformat returns XML with embedded HTML format inside the tag
FMT $g, as is suitable for prestoring output formats, we
perform un-XML-izing here in order to return the HTML body only.
@param recID: record ID to format
@param of: output format to be used for formatting
@param on_the_fly: if False, try to return an already preformatted version of the record in the database
@param verbose: verbosity
@return: a formatted output using old BibFormat
"""
out = ""
res = []
if not on_the_fly:
# look for formatted record existence:
query = "SELECT value, last_updated FROM bibfmt WHERE "\
"id_bibrec='%s' AND format='%s'" % (recID, of)
res = run_sql(query, None, 1)
if res:
# record 'recID' is formatted in 'of', so print it
if verbose == 9:
last_updated = res[0][1]
out += """\n<br/><span class="quicknote">
Found preformatted output for record %i (cache updated on %s).
</span>""" % (recID, last_updated)
decompress = zlib.decompress
return "%s" % decompress(res[0][0])
else:
# record 'recID' is not formatted in 'of',
# so try to call BibFormat on the fly or use default format:
if verbose == 9:
out += """\n<br/><span class="quicknote">
Formatting record %i on-the-fly with old BibFormat.
</span><br/>""" % recID
# Retrieve MARCXML
# Build it on-the-fly only if 'call_old_bibformat' was called
# with format=xm and on_the_fly=True
xm_record = record_get_xml(recID, 'xm',
on_the_fly=(on_the_fly and of == 'xm'))
## import platform
## # Some problem have been found using either popen() or os.system().
## # Here is a temporary workaround until the issue is solved.
## if platform.python_compiler().find('Red Hat') > -1:
## # use os.system
(result_code, result_path) = tempfile.mkstemp()
command = "( %s/bibformat otype=%s ) > %s" % \
(CFG_BINDIR, escape_shell_arg(of), result_path)
(xm_code, xm_path) = tempfile.mkstemp()
xm_file = open(xm_path, "w")
xm_file.write(xm_record)
xm_file.close()
command = command + " <" + xm_path
os.system(command)
result_file = open(result_path,"r")
bibformat_output = result_file.read()
result_file.close()
os.close(result_code)
os.remove(result_path)
os.close(xm_code)
os.remove(xm_path)
## else:
## # use popen
## pipe_input, pipe_output, pipe_error = os.popen3(["%s/bibformat" % CFG_BINDIR,
## "otype=%s" % format],
## 'rw')
## pipe_input.write(xm_record)
## pipe_input.flush()
## pipe_input.close()
## bibformat_output = pipe_output.read()
## pipe_output.close()
## pipe_error.close()
if bibformat_output.startswith("<record>"):
dom = minidom.parseString(bibformat_output)
for e in dom.getElementsByTagName('subfield'):
if e.getAttribute('code') == 'g':
for t in e.childNodes:
out += t.data.encode('utf-8')
else:
out += bibformat_output
return out
def format_record(recID, of, ln=CFG_SITE_LANG, verbose=0,
search_pattern=None, xml_record=None, user_info=None, qid=""):
"""
Formats a record given output format. Main entry function of
bibformat engine.
Returns a formatted version of the record in the specified
language, search pattern, and with the specified output format.
The function will define which format template must be applied.
You can either specify a record ID to format, or give its XML
representation. If 'xml_record' is not None, it is used instead
of recID.
'user_info' allows granting access to some functionalities on a
page depending on the user's privileges. 'user_info' is the same
object as the one returned by 'webuser.collect_user_info(req)'.
@param recID: the ID of record to format
@param of: an output format code (or short identifier for the output format)
@param ln: the language to use to format the record
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings, stop if error in format elements
9: errors and warnings, stop if error (debug mode ))
@param search_pattern: list of strings representing the user request in web interface
@param xml_record: an xml string representing the record to format
@param user_info: the information of the user who will view the formatted page
@return: formatted record
"""
if search_pattern is None:
search_pattern = []
out = ""
ln = wash_language(ln)
_ = gettext_set_language(ln)
# Temporary workflow (during migration of formats):
# Call new BibFormat
# But if format not found for new BibFormat, then call old BibFormat
#Create a BibFormatObject containing the record and formatting context
bfo = BibFormatObject(recID, ln, search_pattern, xml_record, user_info, of)
if of.lower() != 'xm' and \
(not bfo.get_record() or len(bfo.get_record()) <= 1):
# Record only has recid: do not format, except
# for the xm format
return ""
#Find out which format template to use based on record and output format.
template = decide_format_template(bfo, of)
if verbose == 9 and template is not None:
out += """\n<br/><span class="quicknote">
Using %s template for record %i.
</span>""" % (template, recID)
############### FIXME: REMOVE WHEN MIGRATION IS DONE ###############
path = "%s%s%s" % (cfg['CFG_BIBFORMAT_TEMPLATES_PATH'], os.sep, template)
if template is None or not (
os.access(path, os.R_OK) or
template.endswith("." + CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION)):
# template not found in new BibFormat. Call old one
if verbose == 9:
if template is None:
out += """\n<br/><span class="quicknote">
No template found for output format %s and record %i.
(Check invenio.err log file for more details)
</span>""" % (of, recID)
else:
out += """\n<br/><span class="quicknote">
Template %s could not be read.
</span>""" % (template)
if CFG_PATH_PHP and os.path.isfile(os.path.join(CFG_BINDIR, 'bibformat')):
if verbose == 9:
out += """\n<br/><span class="quicknote">
Using old BibFormat for record %s.
</span>""" % recID
return out + call_old_bibformat(recID, of=of, on_the_fly=True,
verbose=verbose)
############################# END ##################################
try:
raise InvenioBibFormatError(_('No template could be found for output format %s.') % of)
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
if verbose > 5:
return out + str(exc.message)
return out
# Format with template
out_ = format_with_format_template(template, bfo, verbose, qid=qid)
out += out_
return out
def decide_format_template(bfo, of):
"""
Returns the format template name that should be used for formatting
given output format and L{BibFormatObject}.
Looks at the output format's rules, and takes the first matching one.
If no rule matches, returns None.
Matching ignores letter case and leading/trailing spaces in both the
rule value and the record value.
@param bfo: a L{BibFormatObject}
@param of: the code of the output format to use
@return: name of a format template
"""
output_format = get_output_format(of)
for rule in output_format['rules']:
if rule['field'].startswith('00'):
# Rule uses controlfield
values = [bfo.control_field(rule['field']).strip()] #Remove spaces
else:
# Rule uses datafield
values = bfo.fields(rule['field'])
# loop over multiple occurrences, but take the first match
if len(values) > 0:
for value in values:
value = value.strip() #Remove spaces
pattern = rule['value'].strip() #Remove spaces
match_obj = re.match(pattern, value, re.IGNORECASE)
if match_obj is not None and \
match_obj.end() == len(value):
return rule['template']
template = output_format['default']
if template != '':
return template
else:
return None
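The rule check in `decide_format_template` combines `re.match` with an end-of-string test, i.e. the rule value must consume the entire stripped field value, case-insensitively. A standalone sketch of that predicate (`rule_matches` is a hypothetical helper name, not part of the engine):

```python
import re

def rule_matches(pattern, value):
    # Anchored match as in decide_format_template: the regex must
    # consume the entire stripped field value, ignoring case.
    value = value.strip()
    pattern = pattern.strip()
    match_obj = re.match(pattern, value, re.IGNORECASE)
    return match_obj is not None and match_obj.end() == len(value)

print(rule_matches('thesis', 'THESIS'))         # True
print(rule_matches('thesis', 'thesis 2004'))    # False: trailing text remains
print(rule_matches('thesis.*', 'thesis 2004'))  # True
```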
def format_with_format_template(format_template_filename, bfo,
verbose=0, format_template_code=None, qid=""):
""" Format a record given a
format template.
Returns a formatted version of the record represented by bfo,
in the language specified in bfo, and with the specified format template.
If format_template_code is provided, the template will not be loaded from
format_template_filename (but format_template_filename will still be used to
determine if bft or xsl transformation applies). This allows previewing format
code without having to save the file on disk.
@param format_template_filename: the filename of a format template
@param bfo: the object containing parameters for the current formatting
@param format_template_code: if not empty, use code as template instead of reading format_template_filename (used for previews)
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: formatted text
"""
_ = gettext_set_language(bfo.lang)
def translate(match):
"""
Translate matching values
"""
word = match.group("word")
translated_word = _(word)
return translated_word
if format_template_code is not None:
format_content = str(format_template_code)
elif not format_template_filename.endswith("." + CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION):
format_content = get_format_template(format_template_filename)['code']
if format_template_filename is None or \
format_template_filename.endswith("." + CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION):
# .bft
filtered_format = filter_languages(format_content, bfo.lang)
localized_format = translation_pattern.sub(translate, filtered_format)
evaluated_format = eval_format_template_elements(localized_format,
bfo,
verbose)
elif format_template_filename.endswith("." + CFG_BIBFORMAT_FORMAT_JINJA_TEMPLATE_EXTENSION):
evaluated_format = '<!-- empty -->'
#try:
from functools import wraps
from invenio.legacy.bibfield import \
create_record as bibfield_create_record, \
get_record as bibfield_get_record
from invenio.legacy.search_engine import print_record
from flask.ext.login import current_user
from invenio.base.helpers import unicodifier
def _format_record(recid, of='hb', user_info=current_user, *args, **kwargs):
return print_record(recid, format=of, user_info=user_info, *args, **kwargs)
# Fixes unicode problems in Jinja2 templates.
def encode_utf8(f):
@wraps(f)
def wrapper(*args, **kwds):
return unicodifier(f(*args, **kwds))
return wrapper
if bfo.recID:
record = bibfield_get_record(bfo.recID)
else:
record = bibfield_create_record(bfo.xml_record, master_format='marc')
bfo.recID = bfo.recID if bfo.recID else 0
record.__getitem__ = encode_utf8(record.__getitem__)
record.get = encode_utf8(record.get)
evaluated_format = render_template_to_string(
'format/record/'+format_template_filename,
recid=bfo.recID,
record=record,
format_record=_format_record,
qid=qid,
bfo=bfo, **TEMPLATE_CONTEXT_FUNCTIONS_CACHE.functions).encode('utf-8')
#except Exception:
# register_exception()
else:
#.xsl
if bfo.xml_record:
# bfo was initialized with a custom MARCXML
xml_record = '<?xml version="1.0" encoding="UTF-8"?>\n' + \
record_xml_output(bfo.record)
else:
# Fetch MARCXML. On-the-fly xm if we are now formatting in xm
xml_record = '<?xml version="1.0" encoding="UTF-8"?>\n' + \
record_get_xml(bfo.recID, 'xm', on_the_fly=False)
# Transform MARCXML using stylesheet
evaluated_format = format(xml_record, template_source=format_content)
try:
evaluated_format = evaluated_format.decode('utf8').encode('utf8')
except:
try:
evaluated_format = evaluated_format.encode('utf8')
except:
evaluated_format = '<!-- Error -->'.encode('utf8')
return evaluated_format
def eval_format_template_elements(format_template, bfo, verbose=0):
"""
Evaluates the format elements of the given template and replaces each element with its value.
Prepares the format template content so that we can directly replace the MARC codes by their values.
This implies:
1. Look for special tags
2. replace special tags by their evaluation
@param format_template: the format template code
@param bfo: the object containing parameters for the current formatting
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors, 7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: tuple (result, errors)
"""
_ = gettext_set_language(bfo.lang)
# First define insert_element_code(match), used in re.sub() function
def insert_element_code(match):
"""
Analyses 'match', interprets the corresponding code, and returns the result of the evaluation.
Called by substitution in 'eval_format_template_elements(...)'
@param match: a match object corresponding to the special tag that must be interpreted
"""
function_name = match.group("function_name")
try:
format_element = get_format_element(function_name, verbose)
except Exception, e:
format_element = None
if verbose >= 5:
return '<b><span style="color: rgb(255, 0, 0);">' + \
cgi.escape(str(e)).replace('\n', '<br/>') + \
'</span>'
if format_element is None:
try:
raise InvenioBibFormatError(_('Could not find format element named %s.') % function_name)
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
if verbose >= 5:
return '<b><span style="color: rgb(255, 0, 0);">' + \
str(exc.message)+'</span></b>'
else:
params = {}
# Look for function parameters given in format template code
all_params = match.group('params')
if all_params is not None:
function_params_iterator = pattern_function_params.finditer(all_params)
for param_match in function_params_iterator:
name = param_match.group('param')
value = param_match.group('value')
params[name] = value
# Evaluate element with params and return (Do not return errors)
(result, dummy) = eval_format_element(format_element,
bfo,
params,
verbose)
return result
# Substitute special tags in the format by our own text.
# Special tags have the form <BFE_format_element_name [param="value"]* />
format = pattern_tag.sub(insert_element_code, format_template)
return format
def eval_format_element(format_element, bfo, parameters=None, verbose=0):
"""
Returns the result of the evaluation of the given format element
name, with given L{BibFormatObject} and parameters. Also returns
the errors of the evaluation.
@param format_element: a format element structure as returned by get_format_element
@param bfo: a L{BibFormatObject} used for formatting
@param parameters: a dict of parameters to be used for formatting. Key is parameter and value is value of parameter
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: tuple (result, errors)
"""
if parameters is None:
parameters = {}
errors = []
#Load special values given as parameters
prefix = parameters.get('prefix', "")
suffix = parameters.get('suffix', "")
default_value = parameters.get('default', "")
escape = parameters.get('escape', "")
output_text = ''
_ = gettext_set_language(bfo.lang)
# 3 possible cases:
# a) format element file is found: we execute it
# b) format element file is not found, but exist in tag table (e.g. bfe_isbn)
# c) format element is totally unknown. Do nothing or report error
if format_element is not None and format_element['type'] == "python":
# a) We found an element with the tag name, of type "python"
# Prepare a dict 'params' to pass as parameter to 'format'
# function of element
params = {}
# Look for parameters defined in format element
# Fill them with specified default values and values
# given as parameters.
# Also remember if the element overrides the 'escape'
# parameter
format_element_overrides_escape = False
for param in format_element['attrs']['params']:
name = param['name']
default = param['default']
params[name] = parameters.get(name, default)
if name == 'escape':
format_element_overrides_escape = True
# Add BibFormatObject
params['bfo'] = bfo
# Execute function with given parameters and return result.
function = format_element['code']
_ = gettext_set_language(bfo.lang)
try:
output_text = apply(function, (), params)
except Exception, e:
name = format_element['attrs']['name']
try:
raise InvenioBibFormatError(_('Error when evaluating format element %s with parameters %s.') % (name, str(params)))
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
errors.append(exc.message)
if verbose >= 5:
tb = sys.exc_info()[2]
stack = traceback.format_exception(Exception, e, tb, limit=None)
output_text = '<b><span style="color: rgb(255, 0, 0);">'+ \
str(exc.message) + "".join(stack) +'</span></b> '
# None can be returned when evaluating function
if output_text is None:
output_text = ""
else:
output_text = str(output_text)
# Escaping:
# (1) By default, everything is escaped in mode 1
# (2) If evaluated element has 'escape_values()' function, use
# its returned value as escape mode, and override (1)
# (3) If template has a defined parameter 'escape' (in allowed
# values), use it, and override (1) and (2). If this
# 'escape' parameter is overridden by the format element
# (defined in the 'format' function of the element), leave
# the escaping job to this element
# (1)
escape_mode = 1
# (2)
escape_function = format_element['escape_function']
if escape_function is not None:
try:
escape_mode = apply(escape_function, (), {'bfo': bfo})
except Exception, e:
try:
raise InvenioBibFormatError(_('Escape mode for format element %s could not be retrieved. Using default mode instead.') % name)
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
errors.append(exc.message)
if verbose >= 5:
tb = sys.exc_info()[2]
output_text += '<b><span style="color: rgb(255, 0, 0);">'+ \
str(exc.message) +'</span></b> '
# (3)
if escape in ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']:
escape_mode = int(escape)
# If escape is equal to 1, then escape all
# HTML reserved chars.
if escape_mode > 0 and not format_element_overrides_escape:
output_text = escape_field(output_text, mode=escape_mode)
# Add prefix and suffix if they have been given as parameters and if
# the evaluation of element is not empty
if output_text.strip() != "":
output_text = prefix + output_text + suffix
# Add the default value if output_text is empty
if output_text == "":
output_text = default_value
return (output_text, errors)
elif format_element is not None and format_element['type'] == "field":
# b) We have not found an element in files that has the tag
# name. Then look for it in the table "tag"
#
# <BFE_LABEL_IN_TAG prefix = "" suffix = "" separator = ""
# nbMax="" escape="0"/>
#
# Load special values given as parameters
separator = parameters.get('separator', "")
nbMax = parameters.get('nbMax', "")
escape = parameters.get('escape', "1") # By default, escape here
# Get the fields tags that have to be printed
tags = format_element['attrs']['tags']
output_text = []
# Get values corresponding to tags
for tag in tags:
p_tag = parse_tag(tag)
values = record_get_field_values(bfo.get_record(),
p_tag[0],
p_tag[1],
p_tag[2],
p_tag[3])
if len(values)>0 and isinstance(values[0], dict):
#flatten dict to its values only
values_list = map(lambda x: x.values(), values)
#output_text.extend(values)
for values in values_list:
output_text.extend(values)
else:
output_text.extend(values)
if nbMax != "":
try:
nbMax = int(nbMax)
output_text = output_text[:nbMax]
except ValueError:
name = format_element['attrs']['name']
try:
raise InvenioBibFormatError(_('"nbMax" parameter for %s must be an "int".') % name)
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
errors.append(exc.message)
if verbose >= 5:
output_text.append(exc.message)
# Add prefix and suffix if they have been given as parameters and if
# the evaluation of element is not empty.
# If evaluation is empty string, return default value if it exists.
# Else return empty string
if ("".join(output_text)).strip() != "":
# If escape is equal to 1, then escape all
# HTML reserved chars.
if escape == '1':
output_text = cgi.escape(separator.join(output_text))
else:
output_text = separator.join(output_text)
output_text = prefix + output_text + suffix
else:
#Return default value
output_text = default_value
return (output_text, errors)
else:
# c) Element is unknown
try:
raise InvenioBibFormatError(_('Could not find format element named %s.') % format_element)
except InvenioBibFormatError, exc:
register_exception(req=bfo.req)
errors.append(exc.message)
if verbose < 5:
return ("", errors)
elif verbose >= 5:
if verbose >= 9:
sys.exit(exc.message)
return ('<b><span style="color: rgb(255, 0, 0);">' + \
str(exc.message)+'</span></b>', errors)
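The three-level escape precedence described in the comments of `eval_format_element` (default mode 1, element-level escape function, template-level 'escape' parameter) can be sketched in isolation; `resolve_escape_mode` is a hypothetical helper, not part of the engine:

```python
def resolve_escape_mode(template_escape_param, element_escape_function, bfo=None):
    # (1) By default, everything is escaped in mode 1.
    mode = 1
    # (2) The element's escape_values() function, if any, overrides (1).
    if element_escape_function is not None:
        mode = element_escape_function(bfo)
    # (3) A valid 'escape' parameter in the template overrides (1) and (2).
    if template_escape_param in [str(i) for i in range(10)]:
        mode = int(template_escape_param)
    return mode

print(resolve_escape_mode('', None))            # 1
print(resolve_escape_mode('', lambda bfo: 0))   # 0
print(resolve_escape_mode('4', lambda bfo: 0))  # 4
```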
def filter_languages(format_template, ln='en'):
"""
Filters the language tags that do not correspond to the specified language.
@param format_template: the format template code
@param ln: the language that is NOT filtered out from the template
@return: the format template with unnecessary languages filtered out
"""
# First define search_lang_tag(match) and clean_language_tag(match), used
# in re.sub() function
def search_lang_tag(match):
"""
Searches for the <lang>...</lang> tag and removes inner localized tags
such as <en>, <fr>, that are not current_lang.
If current_lang cannot be found inside <lang> ... </lang>, falls back to 'CFG_SITE_LANG'
@param match: a match object corresponding to the special tag that must be interpreted
"""
current_lang = ln
def clean_language_tag(match):
"""
Return tag text content if tag language of match is output language.
Called by substitution in 'filter_languages(...)'
@param match: a match object corresponding to the special tag that must be interpreted
"""
if match.group(1) == current_lang:
return match.group(2)
else:
return ""
# End of clean_language_tag
lang_tag_content = match.group("langs")
# Try to find tag with current lang. If it does not exist,
# then current_lang becomes CFG_SITE_LANG until the end of this
# replace
pattern_current_lang = re.compile(r"<("+current_lang+ \
r")\s*>(.*?)(</"+current_lang+r"\s*>)", re.IGNORECASE | re.DOTALL)
if re.search(pattern_current_lang, lang_tag_content) is None:
current_lang = CFG_SITE_LANG
cleaned_lang_tag = ln_pattern.sub(clean_language_tag, lang_tag_content)
return cleaned_lang_tag
# End of search_lang_tag
filtered_format_template = pattern_lang.sub(search_lang_tag, format_template)
return filtered_format_template
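The real `ln_pattern` is built from `language_list_long`; the following standalone sketch hard-codes two languages to show both the filtering and the CFG_SITE_LANG fallback:

```python
import re

CFG_SITE_LANG = 'en'
# Two-language versions of pattern_lang and ln_pattern above.
pattern_lang = re.compile(r'<lang\s*>(?P<langs>.*?)(</lang\s*>)',
                          re.IGNORECASE | re.DOTALL)
ln_pattern = re.compile(r'<(en|fr)>(.*?)</\1>', re.IGNORECASE | re.DOTALL)

def filter_languages(format_template, ln='en'):
    def search_lang_tag(match):
        current_lang = ln
        def clean_language_tag(m):
            # Keep content only for the currently selected language.
            return m.group(2) if m.group(1) == current_lang else ""
        content = match.group("langs")
        pattern_current_lang = re.compile(
            r"<(" + current_lang + r")\s*>(.*?)(</" + current_lang + r"\s*>)",
            re.IGNORECASE | re.DOTALL)
        if pattern_current_lang.search(content) is None:
            current_lang = CFG_SITE_LANG  # fall back to the site language
        return ln_pattern.sub(clean_language_tag, content)
    return pattern_lang.sub(search_lang_tag, format_template)

tpl = 'Title: <lang><en>Report</en><fr>Rapport</fr></lang>'
print(filter_languages(tpl, 'fr'))  # Title: Rapport
print(filter_languages(tpl, 'de'))  # Title: Report (fallback)
```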
def get_format_template(filename, with_attributes=False):
"""
Returns the structured content of the given format template.
If 'with_attributes' is true, also returns the name and description. Else 'attrs' is not
returned as a key in the dictionary (it might be, if it has already been loaded previously)::
{'code':"<b>Some template code</b>"
'attrs': {'name': "a name", 'description': "a description"}
}
@param filename: the filename of a format template
@param with_attributes: if True, fetch the attributes (name and description) of the format
@return: structured content of the format template
"""
_ = gettext_set_language(CFG_SITE_LANG)
# Get from cache whenever possible
global format_templates_cache
if not filename.endswith("."+CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION) and \
not filename.endswith(".xsl"):
return None
if format_templates_cache.has_key(filename):
# If we do not need attributes, or the template exists in
# cache with attributes, then return the cached copy.
# Else reload with attributes
if not with_attributes or \
format_templates_cache[filename].has_key('attrs'):
return format_templates_cache[filename]
format_template = {'code':""}
try:
path = "%s%s%s" % (cfg['CFG_BIBFORMAT_TEMPLATES_PATH'], os.sep, filename)
format_file = open(path)
format_content = format_file.read()
format_file.close()
# Load format template code
# Remove name and description
if filename.endswith("."+CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION):
code_and_description = pattern_format_template_name.sub("",
format_content, 1)
code = pattern_format_template_desc.sub("", code_and_description, 1)
else:
code = format_content
format_template['code'] = code
except Exception, e:
try:
raise InvenioBibFormatError(_('Could not read format template named %s. %s.') % (filename, str(e)))
except InvenioBibFormatError, exc:
register_exception()
# Save attributes if necessary
if with_attributes:
format_template['attrs'] = get_format_template_attrs(filename)
# Cache and return
format_templates_cache[filename] = format_template
return format_template
def get_format_templates(with_attributes=False):
"""
Returns the list of all format templates, as a dictionary with filenames as keys.
If 'with_attributes' is True, also returns the name and description. Else 'attrs' is not
returned as a key in each dictionary (it might be, if it has already been loaded previously)::
[{'code':"<b>Some template code</b>"
'attrs': {'name': "a name", 'description': "a description"}
},
...
]
@param with_attributes: if True, fetch the attributes (names and description) for formats
@return: the list of format templates (with code and info)
"""
format_templates = {}
files = os.listdir(cfg['CFG_BIBFORMAT_TEMPLATES_PATH'])
for filename in files:
if filename.endswith("."+CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION) or \
filename.endswith(".xsl"):
format_templates[filename] = get_format_template(filename,
with_attributes)
return format_templates
def get_format_template_attrs(filename):
"""
Returns the attributes of the format template with given filename
The attributes are {'name', 'description'}
Caution: the function does not check that path exists or
that the format element is valid.
@param filename: the name of a format template
@return: a structure with detailed information about given format template
"""
_ = gettext_set_language(CFG_SITE_LANG)
attrs = {}
attrs['name'] = ""
attrs['description'] = ""
try:
template_file = open("%s%s%s" % (cfg['CFG_BIBFORMAT_TEMPLATES_PATH'],
os.sep,
filename))
code = template_file.read()
template_file.close()
match = None
if filename.endswith(".xsl"):
# .xsl
attrs['name'] = filename[:-4]
else:
# .bft
match = pattern_format_template_name.search(code)
if match is not None:
attrs['name'] = match.group('name')
else:
attrs['name'] = filename
match = pattern_format_template_desc.search(code)
if match is not None:
attrs['description'] = match.group('desc').rstrip('.')
except Exception, e:
try:
raise InvenioBibFormatError(_('Could not read format template named %s. %s.') % (filename, str(e)))
except InvenioBibFormatError, exc:
register_exception()
attrs['name'] = filename
return attrs
def get_format_element(element_name, verbose=0, with_built_in_params=False):
"""
Returns the format element structured content.
Return None if element cannot be loaded (file not found, not readable or
invalid)
The returned structure is::
{'attrs': {some attributes in dict. See get_format_element_attrs_from_*}
'code': the_function_code,
'type': "field" or "python", depending on whether the element is defined in a file or in the 'tag' table,
'escape_function': the function to call to know if element output must be escaped}
@param element_name: the name of the format element to load
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@param with_built_in_params: if True, load the parameters built in all elements
@return: a dictionary with format element attributes
"""
_ = gettext_set_language(CFG_SITE_LANG)
# Get from cache whenever possible
global format_elements_cache
errors = []
# Resolve filename and prepare 'name' as key for the cache
filename = resolve_format_element_filename(element_name)
if filename is not None:
name = filename.upper()
else:
name = element_name.upper()
if format_elements_cache.has_key(name):
element = format_elements_cache[name]
if not with_built_in_params or \
(with_built_in_params and \
element['attrs'].has_key('builtin_params')):
return element
if filename is None:
# Element is maybe in tag table
if bibformat_dblayer.tag_exists_for_name(element_name):
format_element = {'attrs': get_format_element_attrs_from_table( \
element_name,
with_built_in_params),
'code':None,
'escape_function':None,
'type':"field"}
# Cache and return
format_elements_cache[name] = format_element
return format_element
else:
try:
raise InvenioBibFormatError(_('Format element %s could not be found.') % element_name)
except InvenioBibFormatError, exc:
register_exception()
if verbose >= 5:
sys.stderr.write(exc.message)
return None
else:
format_element = {}
module_name = filename
if module_name.endswith(".py"):
module_name = module_name[:-3]
# Load function 'format_element()' inside element
try:
packages = cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH']
packages = [packages] if packages is not None else None
function_format = TEMPLATE_CONTEXT_FUNCTIONS_CACHE.\
bibformat_elements(packages)[module_name]
format_element['code'] = function_format
except KeyError:
try:
raise InvenioBibFormatError(_('Format element %s has no function named "format".') % element_name)
except InvenioBibFormatError, exc:
register_exception()
errors.append(exc.message)
if verbose >= 5:
sys.stderr.write(exc.message)
if errors:
if verbose >= 7:
raise Exception, exc.message
return None
# Load function 'escape_values()' inside element
format_element['escape_function'] = function_format._escape_values
# Prepare, cache and return
format_element['attrs'] = get_format_element_attrs_from_function( \
function_format,
element_name,
with_built_in_params)
format_element['type'] = "python"
format_elements_cache[name] = format_element
return format_element
def get_format_elements(with_built_in_params=False):
"""
Returns the list of format elements attributes as a dictionary structure.
Elements declared in files have priority over elements declared in the 'tag' table.
The returned object has this format::
{element_name1: {'attrs': {'description':..., 'seealso':...
'params':[{'name':..., 'default':..., 'description':...}, ...]
'builtin_params':[{'name':..., 'default':..., 'description':...}, ...]
},
'code': code_of_the_element
},
element_name2: {...},
...}
Returns only elements that could be loaded (no error in the code)
@return: a dict of format elements with name as key, and a dict as attributes
@param with_built_in_params: if True, load the parameters built in all elements
"""
format_elements = {}
mappings = bibformat_dblayer.get_all_name_tag_mappings()
for name in mappings:
format_elements[name.upper().replace(" ", "_").strip()] = get_format_element(name, with_built_in_params=with_built_in_params)
files = os.listdir(cfg['CFG_BIBFORMAT_ELEMENTS_PATH'])
for filename in files:
filename_test = filename.upper().replace(" ", "_")
if filename_test.endswith(".PY") and filename.upper() != "__INIT__.PY":
if filename_test.startswith("BFE_"):
filename_test = filename_test[4:]
element_name = filename_test[:-3]
element = get_format_element(element_name,
with_built_in_params=with_built_in_params)
if element is not None:
format_elements[element_name] = element
return format_elements
def get_format_element_attrs_from_function(function, element_name,
with_built_in_params=False):
"""
Returns the attributes of the function given as parameter.
It looks for standard parameters of the function, default
values and comments in the docstring.
The attributes are::
{'name' : "name of element" #basically the name of 'name' parameter
'description': "a string description of the element",
'seealso' : ["element_1.py", "element_2.py", ...] #a list of related elements
'params': [{'name':"param_name", #a list of parameters for this element (except 'bfo')
'default':"default value",
'description': "a description"}, ...],
'builtin_params': {name: {'name':"param_name",#the parameters builtin for all elem of this kind
'default':"default value",
'description': "a description"}, ...},
}
@param function: the formatting function of a format element
@param element_name: the name of the element
@param with_built_in_params: if True, load the parameters built in all elements
@return: a structure with detailed information of a function
"""
attrs = {}
attrs['description'] = ""
attrs['name'] = element_name.replace(" ", "_").upper()
attrs['seealso'] = []
docstring = function.__doc__
if isinstance(docstring, str):
# Look for function description in docstring
#match = pattern_format_element_desc.search(docstring)
description = docstring.split("@param")[0]
description = description.split("@see:")[0]
attrs['description'] = description.strip().rstrip('.')
# Look for @see: in docstring
match = pattern_format_element_seealso.search(docstring)
if match is not None:
elements = match.group('see').rstrip('.').split(",")
for element in elements:
attrs['seealso'].append(element.strip())
params = {}
# Look for parameters in function definition
(args, varargs, varkw, defaults) = inspect.getargspec(function)
# Prepare args and defaults_list such that we can have a mapping
# from args to defaults
args.reverse()
if defaults is not None:
defaults_list = list(defaults)
defaults_list.reverse()
else:
defaults_list = []
for arg, default in map(None, args, defaults_list):
if arg == "bfo":
#Don't keep this as parameter. It is hidden to users, and
#exists in all elements of this kind
continue
param = {}
param['name'] = arg
if default is None:
#In case no check is made inside element, we prefer to
#print "" (nothing) than None in output
param['default'] = ""
else:
param['default'] = default
param['description'] = "(no description provided)"
params[arg] = param
if isinstance(docstring, str):
# Look for AT param descriptions in docstring.
# Add description to existing parameters in params dict
params_iterator = pattern_format_element_params.finditer(docstring)
for match in params_iterator:
name = match.group('name')
if params.has_key(name):
params[name]['description'] = match.group('desc').rstrip('.')
attrs['params'] = params.values()
# Load built-in parameters if necessary
if with_built_in_params:
builtin_params = []
# Add 'prefix' parameter
param_prefix = {}
param_prefix['name'] = "prefix"
param_prefix['default'] = ""
param_prefix['description'] = """A prefix printed only if the
record has a value for this element"""
builtin_params.append(param_prefix)
# Add 'suffix' parameter
param_suffix = {}
param_suffix['name'] = "suffix"
param_suffix['default'] = ""
param_suffix['description'] = """A suffix printed only if the
record has a value for this element"""
builtin_params.append(param_suffix)
# Add 'default' parameter
param_default = {}
param_default['name'] = "default"
param_default['default'] = ""
param_default['description'] = """A default value printed if the
record has no value for this element"""
builtin_params.append(param_default)
# Add 'escape' parameter
param_escape = {}
param_escape['name'] = "escape"
param_escape['default'] = ""
param_escape['description'] = """0 keeps value as it is. Refer to main
documentation for escaping modes
1 to 7"""
builtin_params.append(param_escape)
attrs['builtin_params'] = builtin_params
return attrs
def get_format_element_attrs_from_table(element_name,
with_built_in_params=False):
"""
Returns the attributes of the format element with given name in 'tag' table.
Returns None if element_name does not exist in tag table.
The attributes are::
{'name' : "name of element" #basically the name of 'element_name' parameter
'description': "a string description of the element",
'seealso' : [] #a list of related elements. Always empty in this case
'params': [], #a list of parameters for this element. Always empty in this case
'builtin_params': [{'name':"param_name", #the parameters builtin for all elem of this kind
'default':"default value",
'description': "a description"}, ...],
'tags':["950.1", "203.a"] #the list of tags printed by this element
}
@param element_name: the name of the element in the database
@param with_built_in_params: if True, load the parameters built in all elements
@return: a structure with detailed information of an element found in DB
"""
attrs = {}
tags = bibformat_dblayer.get_tags_from_name(element_name)
field_label = "field"
if len(tags)>1:
field_label = "fields"
attrs['description'] = "Prints %s %s of the record" % (field_label,
", ".join(tags))
attrs['name'] = element_name.replace(" ", "_").upper()
attrs['seealso'] = []
attrs['params'] = []
attrs['tags'] = tags
# Load built-in parameters if necessary
if with_built_in_params:
builtin_params = []
# Add 'prefix' parameter
param_prefix = {}
param_prefix['name'] = "prefix"
param_prefix['default'] = ""
param_prefix['description'] = """A prefix printed only if the
record has a value for this element"""
builtin_params.append(param_prefix)
# Add 'suffix' parameter
param_suffix = {}
param_suffix['name'] = "suffix"
param_suffix['default'] = ""
param_suffix['description'] = """A suffix printed only if the
record has a value for this element"""
builtin_params.append(param_suffix)
# Add 'separator' parameter
param_separator = {}
param_separator['name'] = "separator"
param_separator['default'] = " "
param_separator['description'] = """A separator between elements of
the field"""
builtin_params.append(param_separator)
# Add 'nbMax' parameter
param_nbMax = {}
param_nbMax['name'] = "nbMax"
param_nbMax['default'] = ""
param_nbMax['description'] = """The maximum number of values to
print for this element. No limit if not
specified"""
builtin_params.append(param_nbMax)
# Add 'default' parameter
param_default = {}
param_default['name'] = "default"
param_default['default'] = ""
param_default['description'] = """A default value printed if the
record has no value for this element"""
builtin_params.append(param_default)
# Add 'escape' parameter
param_escape = {}
param_escape['name'] = "escape"
param_escape['default'] = ""
param_escape['description'] = """If set to 1, replaces special
characters '&', '<' and '>' of this
element by SGML entities"""
builtin_params.append(param_escape)
attrs['builtin_params'] = builtin_params
return attrs
def get_output_format(code, with_attributes=False, verbose=0):
"""
Returns the structured content of the given output format.
If 'with_attributes' is True, also returns the names and description of the output format,
else 'attrs' is not returned in the dict (it might be, if it has already been loaded previously).
If the output format corresponding to 'code' is not found, returns an empty structure.
See get_output_format_attrs() to learn more about the attributes::
{'rules': [ {'field': "980__a",
'value': "PREPRINT",
'template': "filename_a.bft",
},
{...}
],
'attrs': {'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}}
'description': "a description"
'code': "fnm1",
'content_type': "application/ms-excel",
'visibility': 1
}
'default':"filename_b.bft"
}
@param code: the code of an output_format
@param with_attributes: if True, fetch the attributes (names and description) for format
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: structured content of the output format
"""
_ = gettext_set_language(CFG_SITE_LANG)
output_format = {'rules':[], 'default':""}
filename = resolve_output_format_filename(code, verbose)
if filename is None:
try:
raise InvenioBibFormatError(_('Output format with code %s could not be found.') % code)
except InvenioBibFormatError, exc:
register_exception()
if with_attributes: #Create empty attrs if asked for attributes
output_format['attrs'] = get_output_format_attrs(code, verbose)
return output_format
# Get from cache whenever possible
global format_outputs_cache
if format_outputs_cache.has_key(filename):
# If we must return with attributes but the cache has no
# attributes, then load the attributes
if with_attributes and not \
format_outputs_cache[filename].has_key('attrs'):
format_outputs_cache[filename]['attrs'] = get_output_format_attrs(code, verbose)
return format_outputs_cache[filename]
try:
if with_attributes:
output_format['attrs'] = get_output_format_attrs(code, verbose)
path = "%s%s%s" % (CFG_BIBFORMAT_OUTPUTS_PATH, os.sep, filename )
format_file = open(path)
current_tag = ''
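# The parsing below assumes the usual output format file syntax;
# an illustrative sample (not an actual shipped file) would be:
#   tag 980__a:
#   PREPRINT --- PREPRINT.bft
#   THESIS --- THESIS.bft
#   default: HB.bft
# Lines ending with ':' define the current tag, '---' lines map a
# field value to a template, and the 'default:' line sets the
# fallback template.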
for line in format_file:
line = line.strip()
if line == "":
# Ignore blank lines
continue
if line.endswith(":"):
# Retrieve tag
# Remove : spaces and eol at the end of line
clean_line = line.rstrip(": \n\r")
# The tag starts at second position
current_tag = "".join(clean_line.split()[1:]).strip()
elif line.find('---') != -1:
words = line.split('---')
template = words[-1].strip()
condition = ''.join(words[:-1])
value = ""
output_format['rules'].append({'field': current_tag,
'value': condition,
'template': template,
})
elif line.find(':') != -1:
# Default case
default = line.split(':')[1].strip()
output_format['default'] = default
except Exception, e:
try:
raise InvenioBibFormatError(_('Output format %s could not be read. %s.') % (filename, str(e)))
except InvenioBibFormatError, exc:
register_exception()
# Cache and return
format_outputs_cache[filename] = output_format
return output_format
def get_output_format_attrs(code, verbose=0):
"""
Returns the attributes of an output format.
The attributes contain 'code', which is the short identifier of the output format
(to be given as parameter in format_record function to specify the output format),
'description', a description of the output format, 'visibility' the visibility of
the format in the output format list on public pages and 'names', the localized names
of the output format. If 'content_type' is specified then the search_engine will
send a file with this content type and with result of formatting as content to the user.
The 'names' dict always contains 'generic', 'ln' (for long name) and 'sn' (for short name)
keys. 'generic' is the default name for output format. 'ln' and 'sn' contain long and short
localized names of the output format. Only the languages for which a localization exist
are used::
{'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}}
'description': "a description"
'code': "fnm1",
'content_type': "application/ms-excel",
'visibility': 1
}
@param code: the short identifier of the format
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: structured content of output format attributes
"""
if code.endswith("."+CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION):
code = code[:-(len(CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION) + 1)]
attrs = {'names':{'generic':"",
'ln':{},
'sn':{}},
'description':'',
'code':code.upper(),
'content_type':"",
'visibility':1}
filename = resolve_output_format_filename(code, verbose)
if filename is None:
return attrs
attrs['names'] = bibformat_dblayer.get_output_format_names(code)
attrs['description'] = bibformat_dblayer.get_output_format_description(code)
attrs['content_type'] = bibformat_dblayer.get_output_format_content_type(code)
attrs['visibility'] = bibformat_dblayer.get_output_format_visibility(code)
return attrs
def get_output_formats(with_attributes=False):
"""
Returns the list of all output formats, as a dictionary with their filenames as keys.
If 'with_attributes' is True, also returns the names and description of the output formats,
else 'attrs' is not returned in the dicts (it might be, if it has already been loaded previously).
See get_output_format_attrs() to learn more about the attributes::
{'filename_1.bfo': {'rules': [ {'field': "980__a",
'value': "PREPRINT",
'template': "filename_a.bft",
},
{...}
],
'attrs': {'names': {'generic':"a name", 'sn':{'en': "a name", 'fr':"un nom"}, 'ln':{'en':"a long name"}}
'description': "a description"
'code': "fnm1"
}
'default':"filename_b.bft"
},
'filename_2.bfo': {...},
...
}
@param with_attributes: if returned output formats contain detailed info, or not
@type with_attributes: boolean
@return: the list of output formats
"""
output_formats = {}
files = os.listdir(CFG_BIBFORMAT_OUTPUTS_PATH)
for filename in files:
if filename.endswith("."+CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION):
code = "".join(filename.split(".")[:-1])
output_formats[filename] = get_output_format(code, with_attributes)
return output_formats
def resolve_format_element_filename(element_name):
"""
Returns the filename of the element corresponding to element_name.
This is necessary since format template code calls
elements ignoring case; e.g. <BFE_AUTHOR> is the
same as <BFE_author>.
It is also recommended that format element filenames are
prefixed with bfe_ . We need to look for these too.
The element name may be given with or without the "BFE_" prefix.
@param element_name: a name for a format element
@return: the corresponding filename, with right case
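For example, assuming a file bfe_title.py exists in
CFG_BIBFORMAT_ELEMENTS_PATH:
>>> resolve_format_element_filename("BFE_TITLE")
'bfe_title.py'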
"""
if not element_name.endswith(".py"):
name = element_name.replace(" ", "_").upper() +".PY"
else:
name = element_name.replace(" ", "_").upper()
files = os.listdir(cfg['CFG_BIBFORMAT_ELEMENTS_PATH'])
for filename in files:
test_filename = filename.replace(" ", "_").upper()
if test_filename == name or \
test_filename == "BFE_" + name or \
"BFE_" + test_filename == name:
return filename
# No element with that name found
# Do not log error, as it might be a normal execution case:
# element can be in database
return None
def resolve_output_format_filename(code, verbose=0):
"""
Returns the filename of the output format corresponding to code.
This is necessary since output format names are not case sensitive
but most file systems are.
@param code: the code for an output format
@param verbose: the level of verbosity from 0 to 9 (0: silent,
5: errors,
7: errors and warnings,
9: errors and warnings, stop if error (debug mode ))
@return: the corresponding filename, with right case, or None if not found
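For example, assuming a file HX.bfo exists in CFG_BIBFORMAT_OUTPUTS_PATH
(and 'bfo' as CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION):
>>> resolve_output_format_filename("hx")
'HX.bfo'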
"""
_ = gettext_set_language(CFG_SITE_LANG)
#Remove non alphanumeric chars (except . and _)
code = re.sub(r"[^.0-9a-zA-Z_]", "", code)
if not code.endswith("."+CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION):
code = re.sub(r"\W", "", code)
code += "."+CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION
files = os.listdir(CFG_BIBFORMAT_OUTPUTS_PATH)
for filename in files:
if filename.upper() == code.upper():
return filename
# No output format with that name found
try:
raise InvenioBibFormatError(_('Could not find output format named %s.') % code)
except InvenioBibFormatError, exc:
register_exception()
if verbose >= 5:
sys.stderr.write(exc.message)
if verbose >= 9:
sys.exit(exc.message)
return None
def get_fresh_format_template_filename(name):
"""
Returns a new filename and name for a template with given name.
Used when writing a new template to a file, so that the name
has no spaces and is unique in the template directory.
Returns (unique_filename, modified_name)
@param name: name for a format template
@return: the corresponding filename, and modified name if necessary
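For example, assuming 'bft' as the template extension and no
existing file with that name (underscores and other non-alphanumeric
chars are stripped from the filename):
>>> get_fresh_format_template_filename("My Layout")
('MyLayout.bft', 'My Layout')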
"""
#name = re.sub(r"\W", "", name) #Remove non alphanumeric chars
name = name.replace(" ", "_")
filename = name
# Remove non alphanumeric chars (except .)
filename = re.sub(r"[^.0-9a-zA-Z]", "", filename)
path = cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] + os.sep + filename \
+ "." + CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION
index = 1
while os.path.exists(path):
index += 1
filename = name + str(index)
path = cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] + os.sep + filename \
+ "." + CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION
if index > 1:
returned_name = (name + str(index)).replace("_", " ")
else:
returned_name = name.replace("_", " ")
return (filename + "." + CFG_BIBFORMAT_FORMAT_TEMPLATE_EXTENSION,
returned_name) #filename.replace("_", " "))
def get_fresh_output_format_filename(code):
"""
Returns a new filename for the output format with given code.
Used when writing a new output format to a file, so that the code
has no spaces and is unique in the output format directory. The filename
also needs to be at most 6 chars long, as the convention is that
filename == output format code (+ .extension).
We return an uppercase code.
Returns (unique_filename, modified_code)
@param code: the code of an output format
@return: the corresponding filename, and modified code if necessary
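For example, assuming 'bfo' as the output format extension and no
existing file with that name (the code is uppercased and truncated
to 6 chars):
>>> get_fresh_output_format_filename("my format")
('MY_FOR.bfo', 'MY_FOR')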
"""
_ = gettext_set_language(CFG_SITE_LANG)
#code = re.sub(r"\W", "", code) #Remove non alphanumeric chars
code = code.upper().replace(" ", "_")
# Remove non alphanumeric chars (except . and _)
code = re.sub(r"[^.0-9a-zA-Z_]", "", code)
if len(code) > 6:
code = code[:6]
filename = code
path = CFG_BIBFORMAT_OUTPUTS_PATH + os.sep + filename \
+ "." + CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION
index = 2
while os.path.exists(path):
filename = code + str(index)
if len(filename) > 6:
filename = code[:-(len(str(index)))]+str(index)
index += 1
path = CFG_BIBFORMAT_OUTPUTS_PATH + os.sep + filename \
+ "." + CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION
# We should not try more than 99999... Well I don't see how we
# could get there.. Sanity check.
if index >= 99999:
try:
raise InvenioBibFormatError(_('Could not find a fresh name for output format %s.') % code)
except InvenioBibFormatError, exc:
register_exception()
sys.exit("Output format cannot be named as %s"%code)
return (filename + "." + CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION, filename)
def clear_caches():
"""
Clear the caches (Output Format, Format Templates and Format Elements)
@return: None
"""
global format_templates_cache, format_elements_cache, format_outputs_cache
format_templates_cache = {}
format_elements_cache = {}
format_outputs_cache = {}
class BibFormatObject:
"""
An object that encapsulates a record and associated methods, and that is given
as parameter to the 'format' function of all format elements.
The object is made specifically for a given formatting, i.e. it includes
for example the language for the formatting.
The object provides basic accessors to the record. For full access, one can get
the record with get_record() and then use BibRecord methods on the returned object.
"""
# The record
record = None
# The language in which the formatting has to be done
lang = CFG_SITE_LANG
# A list of strings describing the context in which the record has
# to be formatted.
# It represents the words of the user request in the web interface search
search_pattern = []
# The id of the record
recID = 0
# The information about the user, as returned by
# 'webuser.collect_user_info(req)'
user_info = None
# The format in which the record is being formatted
output_format = ''
req = None # DEPRECATED: use bfo.user_info instead. Used by WebJournal.
def __init__(self, recID, ln=CFG_SITE_LANG, search_pattern=None,
xml_record=None, user_info=None, output_format=''):
"""
Creates a new bibformat object, with given record.
You can either specify a record ID to format, or give its XML representation.
If 'xml_record' is not None, use 'xml_record' instead of recID for the record.
'user_info' allows granting access to some functionalities on
a page depending on the user's privileges. It is a dictionary
in the following form::
user_info = {
'remote_ip' : '',
'remote_host' : '',
'referer' : '',
'uri' : '',
'agent' : '',
'uid' : -1,
'nickname' : '',
'email' : '',
'group' : [],
'guest' : '1'
}
@param recID: the id of a record
@param ln: the language in which the record has to be formatted
@param search_pattern: list of strings representing the request used by the user in the web interface
@param xml_record: an XML string of the record to format
@param user_info: the information of the user who will view the formatted page
@param output_format: the output_format used for formatting this record
"""
self.xml_record = None # *Must* remain empty if recid is given
if xml_record is not None:
# If record is given as parameter
self.xml_record = xml_record
self.record = create_record(xml_record)[0]
recID = record_get_field_value(self.record, "001")
self.lang = wash_language(ln)
if search_pattern is None:
search_pattern = []
self.search_pattern = search_pattern
self.recID = recID
self.output_format = output_format
self.user_info = user_info
if self.user_info is None:
from invenio.ext.login.legacy_user import UserInfo
self.user_info = UserInfo(None)
def get_record(self):
"""
Returns the record structure of this L{BibFormatObject} instance
@return: the record structure as defined by BibRecord library
"""
from invenio.legacy.search_engine import get_record
# Create record if necessary
if self.record is None:
# on-the-fly creation if current output is xm
self.record = get_record(self.recID)
return self.record
def control_field(self, tag, escape=0):
"""
Returns the value of control field given by tag in record
@param tag: the marc code of a field
@param escape: 1 if returned value should be escaped. Else 0.
@return: value of field tag in record
"""
if self.get_record() is None:
#Case where BibRecord could not parse object
return ''
p_tag = parse_tag(tag)
field_value = record_get_field_value(self.get_record(),
p_tag[0],
p_tag[1],
p_tag[2],
p_tag[3])
if escape == 0:
return field_value
else:
return escape_field(field_value, escape)
def field(self, tag, escape=0):
"""
Returns the value of the field corresponding to tag in the
current record.
If the value does not exist, returns an empty string. Else
returns the same as bfo.fields(..)[0] (see docstring below).
The 'escape' parameter allows escaping special characters
of the field. The value of escape can be:
0. no escaping
1. escape all HTML characters
2. remove unsafe HTML tags (Eg. keep <br />)
3. Mix of mode 1 and 2. If value of field starts with
<!-- HTML -->, then use mode 2. Else use mode 1.
4. Remove all HTML tags
5. Same as 2, with more tags allowed (like <img>)
6. Same as 3, with more tags allowed (like <img>)
7. Mix of mode 0 and mode 1. If field_value
starts with <!--HTML-->, then use mode 0.
Else use mode 1.
8. Same as mode 1, but also escape double-quotes
9. Same as mode 4, but also escape double-quotes
@param tag: the marc code of a field
@param escape: 1 if returned value should be escaped. Else 0. (see above for other modes)
@return: value of field tag in record
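For example, assuming the record contains a field '245__a' with
value 'Black & white':
>>> bfo.field('245__a', escape=1)
'Black &amp; white'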
"""
list_of_fields = self.fields(tag)
if len(list_of_fields) > 0:
# Escaping below
if escape == 0:
return list_of_fields[0]
else:
return escape_field(list_of_fields[0], escape)
else:
return ""
def fields(self, tag, escape=0, repeatable_subfields_p=False):
"""
Returns the list of values corresponding to "tag".
If tag has an undefined subcode (such as 999C5),
the function returns a list of dictionaries, whose keys
are the subcodes and the values are the values of tag.subcode.
If the tag has a subcode, simply returns the list of values
corresponding to tag.
Eg. for given MARC::
999C5 $a value_1a $b value_1b
999C5 $b value_2b
999C5 $b value_3b $b value_3b_bis
>>> bfo.fields('999C5b')
>>> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
>>> bfo.fields('999C5')
>>> [{'a':'value_1a', 'b':'value_1b'},
{'b':'value_2b'},
{'b':'value_3b'}]
By default the function returns only one value for each
subfield (that is it considers that repeatable subfields are
not allowed). That is why in the above example 'value_3b_bis' is
not shown for bfo.fields('999C5'). (Note that it is not
defined which of value_3b or value_3b_bis is returned). This
is to simplify the use of the function, as most of the time
subfields are not repeatable (in that way we get a string
instead of a list). You can allow repeatable subfields by
setting 'repeatable_subfields_p' parameter to True. In
this mode, the above example would return:
>>> bfo.fields('999C5b', repeatable_subfields_p=True)
>>> ['value_1b', 'value_2b', 'value_3b', 'value_3b_bis']
>>> bfo.fields('999C5', repeatable_subfields_p=True)
>>> [{'a':['value_1a'], 'b':['value_1b']},
{'b':['value_2b']},
{'b':['value_3b', 'value_3b_bis']}]
NOTICE THAT THE RETURNED STRUCTURE IS DIFFERENT. Also note
that whatever the value of 'repeatable_subfields_p' is,
bfo.fields('999C5b') always shows all values, even repeatable
ones. This is because the parameter has no impact on the
returned structure (it is always a list).
The 'escape' parameter allows escaping special characters
in the fields. The value of escape can be:
0. No escaping
1. Escape all HTML characters
2. Remove unsafe HTML tags (Eg. keep <br />)
3. Mix of mode 1 and 2. If value of field starts with
<!-- HTML -->, then use mode 2. Else use mode 1.
4. Remove all HTML tags
5. Same as 2, with more tags allowed (like <img>)
6. Same as 3, with more tags allowed (like <img>)
7. Mix of mode 0 and mode 1. If field_value
starts with <!--HTML-->, then use mode 0.
Else use mode 1.
8. Same as mode 1, but also escape double-quotes
9. Same as mode 4, but also escape double-quotes
@param tag: the marc code of a field
@param escape: 1 if returned values should be escaped. Else 0.
@param repeatable_subfields_p: if True, returns the lists of subfield values in the dictionaries
@return: values of field tag in record
"""
if self.get_record() is None:
# Case where BibRecord could not parse object
return []
p_tag = parse_tag(tag)
if p_tag[3] != "":
# Subcode has been defined. Simply returns list of values
values = record_get_field_values(self.get_record(),
p_tag[0],
p_tag[1],
p_tag[2],
p_tag[3])
if escape == 0:
return values
else:
return [escape_field(value, escape) for value in values]
else:
# Subcode is undefined. Returns list of dicts.
# However it might be the case of a control field.
instances = record_get_field_instances(self.get_record(),
p_tag[0],
p_tag[1],
p_tag[2])
if repeatable_subfields_p:
list_of_instances = []
for instance in instances:
instance_dict = {}
for subfield in instance[0]:
if not instance_dict.has_key(subfield[0]):
instance_dict[subfield[0]] = []
if escape == 0:
instance_dict[subfield[0]].append(subfield[1])
else:
instance_dict[subfield[0]].append(escape_field(subfield[1], escape))
list_of_instances.append(instance_dict)
return list_of_instances
else:
if escape == 0:
return [dict(instance[0]) for instance in instances]
else:
return [dict([ (subfield[0], escape_field(subfield[1], escape)) \
for subfield in instance[0] ]) \
for instance in instances]
def kb(self, kb, string, default=""):
"""
Returns the value of the "string" in the knowledge base "kb".
If kb does not exist or string does not exist in kb,
returns 'default' string or empty string if not specified.
@param kb: a knowledge base name
@param string: the string we want to translate
@param default: a default value returned if 'string' not found in 'kb'
@return: a string value corresponding to translated input with given kb
"""
if not string:
return default
val = get_kbr_values(kb, searchkey=string, searchtype='e')
try:
return val[0][0]
except:
return default
def escape_field(value, mode=0):
"""
Utility function used to escape the value of a field in a given mode.
- mode 0: no escaping
- mode 1: escaping all HTML/XML characters (escaped chars are shown as escaped)
- mode 2: escaping unsafe HTML tags to avoid XSS, but
keep basic one (such as <br />)
Escaped tags are removed.
- mode 3: mix of mode 1 and mode 2. If field_value starts with <!--HTML-->,
then use mode 2. Else use mode 1.
- mode 4: escaping all HTML/XML tags (escaped tags are removed)
- mode 5: same as 2, but allows more tags, like <img>
- mode 6: same as 3, but allows more tags, like <img>
- mode 7: mix of mode 0 and mode 1. If field_value starts with <!--HTML-->,
then use mode 0. Else use mode 1.
- mode 8: same as mode 1, but also escape double-quotes
- mode 9: same as mode 4, but also escape double-quotes
@param value: value to escape
@param mode: escaping mode to use
@return: an escaped version of X{value} according to chosen X{mode}
"""
if mode == 1:
return cgi.escape(value)
elif mode == 8:
return cgi.escape(value, True)
elif mode in [2, 5]:
allowed_attribute_whitelist = CFG_HTML_BUFFER_ALLOWED_ATTRIBUTE_WHITELIST
allowed_tag_whitelist = CFG_HTML_BUFFER_ALLOWED_TAG_WHITELIST + \
('class',)
if mode == 5:
allowed_attribute_whitelist += ('src', 'alt',
'width', 'height',
'style', 'summary',
'border', 'cellspacing',
'cellpadding')
allowed_tag_whitelist += ('img', 'table', 'td',
'tr', 'th', 'span', 'caption')
try:
return washer.wash(value,
allowed_attribute_whitelist=\
allowed_attribute_whitelist,
allowed_tag_whitelist= \
allowed_tag_whitelist
)
except HTMLParseError:
# Parsing failed
return cgi.escape(value)
elif mode in [3, 6]:
if value.lstrip(' \n').startswith(html_field):
allowed_attribute_whitelist = CFG_HTML_BUFFER_ALLOWED_ATTRIBUTE_WHITELIST
allowed_tag_whitelist = CFG_HTML_BUFFER_ALLOWED_TAG_WHITELIST + \
('class',)
if mode == 6:
allowed_attribute_whitelist += ('src', 'alt',
'width', 'height',
'style', 'summary',
'border', 'cellspacing',
'cellpadding')
allowed_tag_whitelist += ('img', 'table', 'td',
'tr', 'th', 'span', 'caption')
try:
return washer.wash(value,
allowed_attribute_whitelist=\
allowed_attribute_whitelist,
allowed_tag_whitelist=\
allowed_tag_whitelist
)
except HTMLParseError:
# Parsing failed
return cgi.escape(value)
else:
return cgi.escape(value)
elif mode in [4, 9]:
try:
out = washer.wash(value,
allowed_attribute_whitelist=[],
allowed_tag_whitelist=[]
)
if mode == 9:
out = out.replace('"', '&quot;')
return out
except HTMLParseError:
# Parsing failed
if mode == 4:
return cgi.escape(value)
else:
return cgi.escape(value, True)
elif mode == 7:
if value.lstrip(' \n').startswith(html_field):
return value
else:
return cgi.escape(value)
else:
return value
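# A quick stand-alone illustration of the double-quote handling that
# separates mode 1 from mode 8 (hypothetical input; cgi.escape leaves
# double quotes alone unless its second argument is True):
#   escape_field('<a href="x">', mode=1)  ->  '&lt;a href="x"&gt;'
#   escape_field('<a href="x">', mode=8)  ->  '&lt;a href=&quot;x&quot;&gt;'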
def bf_profile():
"""
Runs a benchmark
@return: None
"""
for i in range(1, 51):
format_record(i, "HD", ln=CFG_SITE_LANG, verbose=9, search_pattern=[])
return
if __name__ == "__main__":
import profile
import pstats
#bf_profile()
profile.run('bf_profile()', "bibformat_profile")
p = pstats.Stats("bibformat_profile")
p.strip_dirs().sort_stats("cumulative").print_stats()
diff --git a/invenio/modules/formatter/format_elements/bfe_edit_record.py b/invenio/modules/formatter/format_elements/bfe_edit_record.py
index 7d16d0857..0ecafdfae 100644
--- a/invenio/modules/formatter/format_elements/bfe_edit_record.py
+++ b/invenio/modules/formatter/format_elements/bfe_edit_record.py
@@ -1,56 +1,56 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Prints a link to BibEdit
"""
__revision__ = "$Id$"
from invenio.utils.url import create_html_link
from invenio.base.i18n import gettext_set_language
from invenio.config import CFG_SITE_URL, CFG_SITE_RECORD
-from invenio.bibedit_utils import user_can_edit_record_collection
+from invenio.legacy.bibedit.utils import user_can_edit_record_collection
def format_element(bfo, style):
"""
Prints a link to BibEdit, if authorization is granted
@param style: the CSS style to be applied to the link.
"""
_ = gettext_set_language(bfo.lang)
out = ""
user_info = bfo.user_info
if user_can_edit_record_collection(user_info, bfo.recID):
linkattrd = {}
if style != '':
linkattrd['style'] = style
out += create_html_link(CFG_SITE_URL +
'/%s/edit/?ln=%s#state=edit&recid=%s' % (CFG_SITE_RECORD, bfo.lang, str(bfo.recID)),
{},
link_label=_("Edit This Record"),
linkattrd=linkattrd)
return out
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
diff --git a/invenio/modules/formatter/format_elements/bfe_fulltext.py b/invenio/modules/formatter/format_elements/bfe_fulltext.py
index 4928ca7f4..69e0d0053 100644
--- a/invenio/modules/formatter/format_elements/bfe_fulltext.py
+++ b/invenio/modules/formatter/format_elements/bfe_fulltext.py
@@ -1,350 +1,350 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Prints links to fulltext
"""
__revision__ = "$Id$"
import re
-from invenio.bibdocfile import BibRecDocs, file_strip_ext, normalize_format, compose_format
+from invenio.legacy.bibdocfile.api import BibRecDocs, file_strip_ext, normalize_format, compose_format
from invenio.base.i18n import gettext_set_language
from invenio.config import CFG_SITE_URL, CFG_CERN_SITE, CFG_SITE_RECORD, \
CFG_BIBFORMAT_HIDDEN_FILE_FORMATS
-from invenio.bibdocfile_config import CFG_BIBDOCFILE_ICON_SUBFORMAT_RE
+from invenio.legacy.bibdocfile.config import CFG_BIBDOCFILE_ICON_SUBFORMAT_RE
from cgi import escape, parse_qs
from urlparse import urlparse
from os.path import basename
import urllib
_CFG_NORMALIZED_BIBFORMAT_HIDDEN_FILE_FORMATS = set(normalize_format(fmt) for fmt in CFG_BIBFORMAT_HIDDEN_FILE_FORMATS)
cern_arxiv_categories = ["astro-ph", "chao-dyn", "cond-mat", "gr-qc",
"hep-ex", "hep-lat", "hep-ph", "hep-th", "math-ph",
"math", "nucl-ex", "nucl-th", "out", "physics",
"quant-ph", "q-alg", "cs", "adap-org", "comp-gas",
"chem-ph", "cs", "math", "neuro-sys", "patt-sol",
"solv-int", "acc-phys", "alg-geom", "ao-sci",
"atom-ph", "cmp-lg", "dg-ga", "funct-an", "mtrl-th",
"plasm-ph", "q-alg", "supr-con"]
def format_element(bfo, style, separator='; ', show_icons='no', focus_on_main_file='no', show_subformat_icons='no'):
"""
This is the default format for formatting fulltext links.
When possible, it returns only the main file(s) (plus a link to
additional files if needed). If no distinction is made at
submission time between main and additional files, it returns
all the files.
@param separator: the separator between urls.
@param style: CSS class of the link
@param show_icons: if 'yes', print icons for fulltexts
@param focus_on_main_file: if 'yes' and a doctype 'Main' is found,
prominently display this doctype. In that case other doctypes are
summarized with a link to the Files tab, named "Additional files"
@param show_subformat_icons: shall we display subformats considered as icons?
"""
_ = gettext_set_language(bfo.lang)
out = ''
# Retrieve files
(parsed_urls, old_versions, additionals) = get_files(bfo, \
distinguish_main_and_additional_files=focus_on_main_file.lower() == 'yes',
include_subformat_icons=show_subformat_icons == 'yes')
main_urls = parsed_urls['main_urls']
others_urls = parsed_urls['others_urls']
if parsed_urls.has_key('cern_urls'):
cern_urls = parsed_urls['cern_urls']
# Prepare style and icon
if style != "":
style = 'class="'+style+'"'
if show_icons.lower() == 'yes':
file_icon = '<img style="border:none" src="%s/img/file-icon-text-12x16.gif" alt="%s"/>' % (CFG_SITE_URL, _("Download fulltext"))
else:
file_icon = ''
# Build urls list.
# Escape special chars for <a> tag value.
additional_str = ''
if additionals:
additional_str = ' <small>(<a '+style+' href="'+CFG_SITE_URL+'/%s/' % CFG_SITE_RECORD + str(bfo.recID)+'/files/">%s</a>)</small>' % _("additional files")
versions_str = ''
#if old_versions:
#versions_str = ' <small>(<a '+style+' href="'+CFG_SITE_URL+'/CFG_SITE_RECORD/'+str(bfo.recID)+'/files/">%s</a>)</small>' % _("older versions")
if main_urls:
out = []
main_urls_keys = sort_alphanumerically(main_urls.keys())
for descr in main_urls_keys:
urls = main_urls[descr]
if re.match(r'^\d+\s', descr) and urls[0][2] == 'png':
# FIXME: we have probably hit a Plot (as link
# description looks like '0001 This is Caption'), so
# do not take it. This test is not ideal, we should
# rather study doc type, and base ourselves on
# Main/Additional/Plot etc.
continue
out += ['<li class="nav-header"><strong>%s:</strong></li>' % descr]
urls_dict = {}
for url, name, url_format in urls:
if name not in urls_dict:
urls_dict[name] = [(url, url_format)]
else:
urls_dict[name].append((url, url_format))
for name, urls_and_format in urls_dict.items():
if len(urls_dict) > 1:
print_name = "<em>%s</em>" % name
url_list = ['<li class="nav-header">' + print_name + "</li>"]
else:
url_list = []
for url, url_format in urls_and_format:
if CFG_CERN_SITE and url_format == 'ps.gz' and len(urls_and_format) > 1:
## We skip old PS.GZ files
continue
url_list.append('<li><a %(style)s href="%(url)s">%(file_icon)s %(url_format)s</a></li>' % {
'style': style,
'url': escape(url, True),
'file_icon': file_icon,
'url_format': escape(url_format.upper())
})
out += url_list
return '<ul class="dropdown-menu pull-right">' + "\n".join(out) + '</ul>'
if main_urls:
main_urls_keys = sort_alphanumerically(main_urls.keys())
for descr in main_urls_keys:
urls = main_urls[descr]
if re.match(r'^\d+\s', descr) and urls[0][2] == 'png':
# FIXME: we have probably hit a Plot (as link
# description looks like '0001 This is Caption'), so
# do not take it. This test is not ideal, we should
# rather study doc type, and base ourselves on
# Main/Additional/Plot etc.
continue
out += "<strong>%s:</strong> " % descr
urls_dict = {}
for url, name, url_format in urls:
if name not in urls_dict:
urls_dict[name] = [(url, url_format)]
else:
urls_dict[name].append((url, url_format))
for name, urls_and_format in urls_dict.items():
if len(urls_dict) > 1:
print_name = "<em>%s</em> - " % name
url_list = [print_name]
else:
url_list = []
for url, url_format in urls_and_format:
if CFG_CERN_SITE and url_format == 'ps.gz' and len(urls_and_format) > 1:
## We skip old PS.GZ files
continue
url_list.append('<a %(style)s href="%(url)s">%(file_icon)s%(url_format)s</a>' % {
'style': style,
'url': escape(url, True),
'file_icon': file_icon,
'url_format': escape(url_format.upper())
})
out += " ".join(url_list) + additional_str + versions_str + separator
if CFG_CERN_SITE and cern_urls:
link_word = len(cern_urls) == 1 and _('%(x_sitename)s link') or _('%(x_sitename)s links')
out += '<strong>%s</strong>: ' % (link_word % {'x_sitename': 'CERN'})
url_list = []
for url, descr in cern_urls:
url_list.append('<a '+style+' href="'+escape(url)+'">'+ \
file_icon + escape(str(descr))+'</a>')
out += separator.join(url_list)
if others_urls:
external_link = len(others_urls) == 1 and _('external link') or _('external links')
out += '<strong>%s</strong>: ' % external_link.capitalize()
url_list = []
for url, descr in others_urls:
url_list.append('<a '+style+' href="'+escape(url)+'">'+ \
file_icon + escape(str(descr))+'</a>')
out += separator.join(url_list) + '<br />'
if out.endswith('<br />'):
out = out[:-len('<br />')]
# When exported to text (eg. in WebAlert emails) we do not want to
# display the link to the fulltext:
if out:
out = '<!--START_NOT_FOR_TEXT-->' + out + '<!--END_NOT_FOR_TEXT-->'
return out
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
def get_files(bfo, distinguish_main_and_additional_files=True, include_subformat_icons=False):
"""
Returns the files available for the given record.
Returned structure is a tuple (parsed_urls, old_versions, additionals):
- parsed_urls: contains categorized URLS (see details below)
- old_versions: set to True if we can have access to old versions
- additionals: set to True if we have other documents than the 'main' document
Parameter 'include_subformat_icons' decides whether subformats
considered as icons should be returned
'parsed_urls' is a dictionary in the form::
{'main_urls' : {'Main' : [('http://CFG_SITE_URL/CFG_SITE_RECORD/1/files/aFile.pdf', 'aFile', 'PDF'),
('http://CFG_SITE_URL/CFG_SITE_RECORD/1/files/aFile.gif', 'aFile', 'GIF')],
'Additional': [('http://CFG_SITE_URL/CFG_SITE_RECORD/1/files/bFile.pdf', 'bFile', 'PDF')]},
'others_urls': [('http://externalurl.com/aFile.pdf', 'Fulltext'), # url(8564_u), description(8564_z/y)
('http://externalurl.com/bFile.pdf', 'Fulltext')],
'cern_urls' : [('http://cern.ch/aFile.pdf', 'Fulltext'), # url(8564_u), description(8564_z/y)
('http://cern.ch/bFile.pdf', 'Fulltext')],
}
Some notes about returned structure:
- key 'cern_urls' is only available on CERN site
- keys in main_url dictionaries are defined by the BibDoc.
- older versions are not part of the parsed urls
- returns only main files when possible, that is when doctypes
make a distinction between 'Main' files and other
files. Otherwise returns all the files as main. This is only
enabled if distinguish_main_and_additional_files is set to True
"""
_ = gettext_set_language(bfo.lang)
urls = bfo.fields("8564_")
bibarchive = BibRecDocs(bfo.recID)
old_versions = False # We can provide link to older files. Will be
# set to True if older files are found.
additionals = False # We have additional files. Will be set to
# True if additional files are found.
# Prepare object to return
parsed_urls = {'main_urls':{}, # Urls hosted by Invenio (bibdocs)
'others_urls':[] # External urls
}
if CFG_CERN_SITE:
parsed_urls['cern_urls'] = [] # cern.ch urls
# Doctypes can be of any type, but when there is one file marked as
# 'Main', we consider that there is a distinction between "main"
# and "additional" files. Otherwise they will all be considered
# equally as main files
distinct_main_and_additional_files = False
if len(bibarchive.list_bibdocs(doctype='Main')) > 0 and \
distinguish_main_and_additional_files:
distinct_main_and_additional_files = True
# Parse URLs
for complete_url in urls:
if complete_url.has_key('u'):
url = complete_url['u']
(dummy, host, path, dummy, params, dummy) = urlparse(url)
subformat = complete_url.get('x', '')
filename = urllib.unquote(basename(path))
name = file_strip_ext(filename)
url_format = filename[len(name):]
if url_format.startswith('.'):
url_format = url_format[1:]
if compose_format(url_format, subformat) in _CFG_NORMALIZED_BIBFORMAT_HIDDEN_FILE_FORMATS:
## This format should be hidden.
continue
descr = _("Fulltext")
if complete_url.has_key('y'):
descr = complete_url['y']
if descr == 'Fulltext':
descr = _("Fulltext")
if not url.startswith(CFG_SITE_URL): # Not a bibdoc?
if not descr: # For a non-bibdoc let's have a description
# Display the URL in full:
descr = url
if CFG_CERN_SITE and 'cern.ch' in host and \
('/setlink?' in url or \
'cms' in host or \
'documents.cern.ch' in url or \
'doc.cern.ch' in url or \
'preprints.cern.ch' in url):
url_params_dict = dict([part.split('=') for part in params.split('&') if len(part.split('=')) == 2])
if url_params_dict.has_key('categ') and \
(url_params_dict['categ'].split('.', 1)[0] in cern_arxiv_categories) and \
url_params_dict.has_key('id'):
# Old arXiv links, used to be handled by
# setlink. Provide direct links to arXiv
for file_format, label in [('pdf', "PDF")]:#,
#('ps', "PS"),
#('e-print', "Source (generally TeX or LaTeX)"),
#('abs', "Abstract")]:
url = "http://arxiv.org/%(format)s/%(category)s/%(id)s" % \
{'format': file_format,
'category': url_params_dict['categ'],
'id': url_params_dict['id']}
parsed_urls['others_urls'].append((url, "%s/%s %s" % \
(url_params_dict['categ'],
url_params_dict['id'],
label)))
else:
parsed_urls['others_urls'].append((url, descr)) # external url
else: # It's a bibdoc!
assigned = False
for doc in bibarchive.list_bibdocs():
if int(doc.get_latest_version()) > 1:
old_versions = True
if True in [f.get_full_name().startswith(filename) \
for f in doc.list_all_files()]:
assigned = True
if not include_subformat_icons and \
CFG_BIBDOCFILE_ICON_SUBFORMAT_RE.match(subformat):
# This is an icon and we want to skip it
continue
if not doc.get_doctype(bfo.recID) == 'Main' and \
distinct_main_and_additional_files == True:
# In that case we record that there are
# additional files, but don't add them to
# returned structure.
additionals = True
else:
if not descr:
descr = _('Fulltext')
if not parsed_urls['main_urls'].has_key(descr):
parsed_urls['main_urls'][descr] = []
params_dict = parse_qs(params)
if 'subformat' in params_dict:
url_format += ' (%s)' % params_dict['subformat'][0]
parsed_urls['main_urls'][descr].append((url, name, url_format))
if not assigned: # Url is not a bibdoc :-S
if not descr:
descr = filename
parsed_urls['others_urls'].append((url, descr)) # Let's put it in a general other url
return (parsed_urls, old_versions, additionals)
_RE_SPLIT = re.compile(r"\d+|\D+")
def sort_alphanumerically(elements):
elements = [([not token.isdigit() and token or int(token) for token in _RE_SPLIT.findall(element)], element) for element in elements]
elements.sort()
return [element[1] for element in elements]
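# Example (hypothetical input): sort_alphanumerically splits each element
# into digit and non-digit tokens so that numeric parts compare as numbers
# rather than as strings:
#   sort_alphanumerically(['v10', 'v2', 'v1'])  ->  ['v1', 'v2', 'v10']
# whereas a plain sort would give ['v1', 'v10', 'v2'].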
diff --git a/invenio/modules/formatter/format_elements/bfe_meta.py b/invenio/modules/formatter/format_elements/bfe_meta.py
index 2c79113b7..884b5a4bd 100644
--- a/invenio/modules/formatter/format_elements/bfe_meta.py
+++ b/invenio/modules/formatter/format_elements/bfe_meta.py
@@ -1,117 +1,117 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - meta"""
__revision__ = "$Id$"
import cgi
from invenio.modules.formatter.format_elements.bfe_server_info import format_element as server_info
from invenio.modules.formatter.format_elements.bfe_client_info import format_element as client_info
from invenio.utils.html import create_tag
-from invenio.bibindex_engine import get_field_tags
+from invenio.legacy.bibindex.engine import get_field_tags
from invenio.config import CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR, CFG_WEBSEARCH_ENABLE_OPENGRAPH
def format_element(bfo, name, tag_name='', tag='', kb='', kb_default_output='', var='', protocol='googlescholar'):
"""Prints a custom field in a way suitable to be used in HTML META
tags. In particular conforms to Google Scholar harvesting protocol as
defined http://scholar.google.com/intl/en/scholar/inclusion.html and
Open Graph http://ogp.me/
@param tag_name: the name, from tag table, of the field to be exported
looks initially for names prefixed by "meta-"<tag_name>
then looks for exact name, then falls through to "tag"
@param tag: the MARC tag to be exported (only if not defined by tag_name)
@param name: name to be displayed in the meta headers, labelling this value.
@param kb: a knowledge base through which to process the retrieved value if necessary.
@param kb_default_output: when a '<code>kb</code>' is specified and no match for the value is found, what shall we
return? Either return the given parameter or specify "{value}" to return the retrieved
value before processing through kb.
@param var: the name of a variable to output instead of field from metadata.
Allowed values are those supported by bfe_server_info and
bfe_client_info. Overrides <code>name</code> and <code>tag_name</code>
@param protocol: the protocol this tag is aimed at. Can be used to switch on/off support for a given "protocol". Can take values among 'googlescholar', 'opengraph'
@see: bfe_server_info.py, bfe_client_info.py
"""
if protocol == 'googlescholar' and not CFG_WEBSEARCH_ENABLE_GOOGLESCHOLAR:
return ""
elif protocol == 'opengraph' and not CFG_WEBSEARCH_ENABLE_OPENGRAPH:
return ""
tags = []
if var:
# delegate to bfe_server_info or bfe_client_info:
value = server_info(bfo, var)
if value.startswith("Unknown variable: "):
# Oops variable was not defined there
value = client_info(bfo, var)
return not value.startswith("Unknown variable: ") and \
create_metatag(name=name, content=cgi.escape(value, True)) \
or ""
elif tag_name:
# First check for special meta named tags
tags = get_field_tags("meta-" + tag_name)
if not tags:
# then check for regular tags
tags = get_field_tags(tag_name)
if not tags and tag:
# fall back to explicit marc tag
tags = [tag]
if not tags:
return ''
out = []
values = [bfo.fields(marctag, escape=9) for marctag in tags]
for value in values:
if isinstance(value, list):
for val in value:
if isinstance(val, dict):
out.extend(val.values())
else:
out.append(val)
elif isinstance(value, dict):
out.extend(value.values())
else:
out.append(value)
out = dict(zip(out, len(out)*[''])).keys() # Remove duplicates
if name == 'citation_date':
for idx in range(len(out)):
out[idx] = out[idx].replace('-', '/')
if kb:
if kb_default_output == "{value}":
out = [bfo.kb(kb, value, value) for value in out]
else:
out = [bfo.kb(kb, value, kb_default_output) for value in out]
return '\n'.join([create_metatag(name=name, content=value) for value in out])
def create_metatag(name, content):
"""
Wraps create_tag
"""
if name.startswith("og:"):
return create_tag('meta', property=name, content=content)
else:
return create_tag('meta', name=name, content=content)
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
diff --git a/invenio/modules/formatter/format_elements/bfe_meta_opengraph_image.py b/invenio/modules/formatter/format_elements/bfe_meta_opengraph_image.py
index 7f1e773c5..ab273460c 100644
--- a/invenio/modules/formatter/format_elements/bfe_meta_opengraph_image.py
+++ b/invenio/modules/formatter/format_elements/bfe_meta_opengraph_image.py
@@ -1,119 +1,119 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - return an image for the record"""
from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_CERN_SITE
-from invenio.bibdocfile import BibRecDocs, get_superformat_from_format
+from invenio.legacy.bibdocfile.api import BibRecDocs, get_superformat_from_format
from invenio.config import CFG_WEBSEARCH_ENABLE_OPENGRAPH
def format_element(bfo, max_photos='', one_icon_per_bibdoc='yes', twitter_card_type='photo', use_webjournal_featured_image='no'):
"""Return an image of the record, suitable for the Open Graph protocol.
Will look for any icon stored with the record, and will fallback to any
image file attached to the record. Returns nothing when no image is found.
Some optional structured properties are not included, in order to optimize both page generation time
and page size.
@param max_photos: the maximum number of photos to display
@param one_icon_per_bibdoc: shall we keep just one icon per bibdoc in the output (not repetition of same preview in multiple sizes)?
@param twitter_card_type: the type of Twitter card: 'photo' (single photo) or 'gallery'. Fall back to 'photo' if not enough pictures for a 'gallery'.
@param use_webjournal_featured_image: if 'yes', use the "featured image" (as defined in bfe_webjournal_articles_overview) as image for the Twitter Card
"""
if not CFG_WEBSEARCH_ENABLE_OPENGRAPH:
return ""
bibarchive = BibRecDocs(bfo.recID)
bibdocs = bibarchive.list_bibdocs()
tags = []
images = []
if max_photos.isdigit():
max_photos = int(max_photos)
else:
max_photos = len(bibdocs)
for doc in bibdocs[:max_photos]:
found_icons = []
found_image_url = ''
found_image_size = 0
for docfile in doc.list_latest_files(list_hidden=False):
if docfile.is_icon():
found_icons.append((docfile.get_size(), docfile.get_url()))
elif get_superformat_from_format(docfile.get_format()).lower() in [".jpg", ".gif", ".jpeg", ".png"]:
found_image_url = docfile.get_url()
found_image_size = docfile.get_size()
found_icons.sort()
# We might have found several icons for the same file: keep
# the middle-sized one
if found_icons:
if one_icon_per_bibdoc.lower() == 'yes':
found_icons = [found_icons[len(found_icons)/2]]
for icon_size, icon_url in found_icons:
images.append((icon_url, icon_url.replace(CFG_SITE_URL, CFG_SITE_SECURE_URL), icon_size))
# Link to main file too (?)
if found_image_url:
images.append((found_image_url, found_image_url.replace(CFG_SITE_URL, CFG_SITE_SECURE_URL), found_image_size))
if CFG_CERN_SITE:
# Add some more pictures from metadata
dummy_size = 500*1024 # we don't want to check the image size, we just make one up (see Twitter Card limit)
additional_images = [(image_url, image_url.replace("http://mediaarchive.cern.ch/", "https://mediastream.cern.ch"), dummy_size) for image_url in bfo.fields("8567_u") if image_url.split('.')[-1] in ('jpg', 'png', 'jpeg', 'gif') and 'A5' in image_url]
images.extend(additional_images)
tags = ['<meta property="og:image" content="%s" />%s' % (image_url, image_url != image_secure_url and '\n<meta property="og:image:secure_url" content="%s" />' % image_secure_url or "") for image_url, image_secure_url, image_size in images]
# Twitter Card
if use_webjournal_featured_image.lower() == 'yes':
# First look for the preferred image, if available. Note that
# it might be a remote one.
try:
from invenio.bibformat_elements import bfe_webjournal_articles_overview
image_url = bfe_webjournal_articles_overview._get_feature_image(bfo)
image_secure_url = image_url.replace('http:', 'https:')
image_size = 500 * 1024 # TODO: check for real image size
if image_url.strip():
images.insert(0, (image_url, image_secure_url, image_size))
except:
pass
# Filter out images that would not be compatible
twitter_compatible_images = [image_url for image_url, image_secure_url, image_size in images if \
image_size < 1024*1024][:4] #Max 1MB according to Twitter Card APIs, max 4 photos
twitter_card_tags = []
if len(twitter_compatible_images) == 4 and twitter_card_type == 'gallery':
twitter_card_tags = ['<meta name="twitter:image%i" content="%s" />' % \
(pos, image_url) \
for pos, image_url in enumerate(twitter_compatible_images)]
elif twitter_compatible_images:
twitter_card_tags = ['<meta name="twitter:image" content="%s" />' % twitter_compatible_images[0]]
tags = twitter_card_tags + tags
return "\n".join(tags)
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
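The tag-building comprehension in this element pairs each `og:image` with an `og:image:secure_url` only when the two URLs differ. A standalone sketch of that logic (hypothetical helper name, not part of the element):

```python
def build_og_image_tags(images):
    """Build Open Graph image meta tags from (url, secure_url, size) tuples.

    og:image:secure_url is only emitted when the secure URL actually
    differs from the plain one, matching the element's behaviour.
    """
    tags = []
    for url, secure_url, _size in images:
        tag = '<meta property="og:image" content="%s" />' % url
        if url != secure_url:
            tag += '\n<meta property="og:image:secure_url" content="%s" />' % secure_url
        tags.append(tag)
    return "\n".join(tags)
```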
diff --git a/invenio/modules/formatter/format_elements/bfe_meta_opengraph_video.py b/invenio/modules/formatter/format_elements/bfe_meta_opengraph_video.py
index 5e7b066f4..1418f1112 100644
--- a/invenio/modules/formatter/format_elements/bfe_meta_opengraph_video.py
+++ b/invenio/modules/formatter/format_elements/bfe_meta_opengraph_video.py
@@ -1,109 +1,109 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - return the video of a record"""
import cgi
from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_CERN_SITE
-from invenio.bibdocfile import BibRecDocs, get_superformat_from_format
+from invenio.legacy.bibdocfile.api import BibRecDocs, get_superformat_from_format
from invenio.config import CFG_WEBSEARCH_ENABLE_OPENGRAPH
def format_element(bfo):
"""
Return the video of the record, suitable for the Open Graph protocol.
"""
if not CFG_WEBSEARCH_ENABLE_OPENGRAPH:
return ""
bibarchive = BibRecDocs(bfo.recID)
bibdocs = bibarchive.list_bibdocs()
additional_tags = ""
tags = []
videos = []
images = []
for doc in bibdocs:
found_icons = []
found_image_url = ''
for docfile in doc.list_latest_files():
if docfile.is_icon():
found_icons.append((docfile.get_size(), docfile.get_url()))
elif get_superformat_from_format(docfile.get_format()).lower() in [".mp4", '.webm', '.ogv']:
found_image_url = docfile.get_url()
found_icons.sort()
for icon_size, icon_url in found_icons:
images.append((icon_url, icon_url.replace(CFG_SITE_URL, CFG_SITE_SECURE_URL)))
if found_image_url:
videos.append((found_image_url, found_image_url.replace(CFG_SITE_URL, CFG_SITE_SECURE_URL)))
if CFG_CERN_SITE:
mp4_urls = [url.replace('http://mediaarchive.cern.ch', 'https://mediastream.cern.ch') \
for url in bfo.fields('8567_u') if url.endswith('.mp4')]
img_urls = [url.replace('http://mediaarchive.cern.ch', 'https://mediastream.cern.ch') \
for url in bfo.fields('8567_u') if url.endswith('.jpg') or url.endswith('.png')]
if mp4_urls:
mp4_url = mp4_urls[0]
if "4/3" in bfo.field("300__b"):
width = "640"
height = "480"
else:
width = "640"
height = "360"
additional_tags += '''
<meta property="og:video" content="%(CFG_CERN_PLAYER_URL)s?file=%(mp4_url_relative)s&streamer=%(CFG_STREAMER_URL)s&provider=rtmp&stretching=exactfit&image=%(image_url)s" />
<meta property="og:video:height" content="%(height)s" />
<meta property="og:video:width" content="%(width)s" />
<meta property="og:video:type" content="application/x-shockwave-flash" />
<meta property="og:video" content="%(mp4_url)s" />
<meta property="og:video:type" content="video/mp4" />
<meta property="og:image" content="%(image_url)s" />
<meta name="twitter:player:height" content="%(height)s" />
<meta name="twitter:player:width" content="%(width)s" />
<link rel="image_src" href="%(image_url)s" />
<link rel="video_src" href="%(CFG_CERN_PLAYER_URL)s?file=%(mp4_url_relative)s&streamer=%(CFG_STREAMER_URL)s&provider=rtmp&stretching=exactfit&image=%(image_url)s"/>
''' % {'CFG_CERN_PLAYER_URL': "https://cds.cern.ch/mediaplayer.swf",
'CFG_STREAMER_URL': "rtmp://wowza.cern.ch:1935/vod",
'width': width,
'height': height,
'image_url': img_urls and img_urls[0] or '',
'mp4_url': mp4_url.replace('http://mediaarchive.cern.ch', 'https://mediastream.cern.ch'),
'mp4_url_relative': '/' + '/'.join(mp4_url.split('/')[4:])}
try:
from invenio.media_utils import generate_embedding_url
embed_url = generate_embedding_url(bfo.field('037__a'))
additional_tags += '''<meta name="twitter:player" content="%s"/>''' % cgi.escape(embed_url, quote=True).replace('http://', 'https://', 1)
except:
pass
tags = ['<meta property="og:image" content="%s" />%s' % (image_url, image_url != image_secure_url and '\n<meta property="og:image:secure_url" content="%s" />' % image_secure_url or "") for image_url, image_secure_url in images]
tags.extend(['<meta property="og:video" content="%s" />%s' % (video_url, video_url != video_secure_url and '\n<meta property="og:video:secure_url" content="%s" />' % video_secure_url or "") for video_url, video_secure_url in videos])
return "\n".join(tags) + additional_tags
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
diff --git a/invenio/modules/formatter/format_elements/bfe_photos.py b/invenio/modules/formatter/format_elements/bfe_photos.py
index e52703364..eb5be1c35 100644
--- a/invenio/modules/formatter/format_elements/bfe_photos.py
+++ b/invenio/modules/formatter/format_elements/bfe_photos.py
@@ -1,124 +1,124 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Print photos of the record (if bibdoc file)
"""
import cgi
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.utils.url import create_html_link
def format_element(bfo, separator=" ", style='', img_style='', text_style='font-size:small',
print_links='yes', max_photos='', show_comment='yes',
img_max_width='250px', display_all_version_links='yes'):
"""
Lists the photos of a record. Display the icon version, linked to
its original version.
This element works for photos appended to a record as BibDoc
files, for which a preview icon has been generated. If there are
several formats for one photo, use the first one found.
@param separator: separator between each photo
@param print_links: if 'yes', print links to the original photo
@param style: style attributes of the whole image block. Eg: "padding:2px;border:1px"
@param img_style: style attributes of the images. Eg: "width:50px;border:none"
@param text_style: style attributes of the text. Eg: "font-size:small"
@param max_photos: the maximum number of photos to display
@param show_comment: if 'yes', display the comment of each photo
@param display_all_version_links: if 'yes', print links to additional (sub)formats
"""
photos = []
bibarchive = BibRecDocs(bfo.recID)
bibdocs = bibarchive.list_bibdocs()
if max_photos.isdigit():
max_photos = int(max_photos)
else:
max_photos = len(bibdocs)
for doc in bibdocs[:max_photos]:
found_icons = []
found_url = ''
for docfile in doc.list_latest_files():
if docfile.is_icon():
found_icons.append((docfile.get_size(), docfile.get_url()))
else:
found_url = docfile.get_url()
found_icons.sort()
if found_icons:
additional_links = ''
name = bibarchive.get_docname(doc.id)
comment = doc.list_latest_files()[0].get_comment()
preview_url = None
if len(found_icons) > 1:
preview_url = found_icons[1][1]
additional_urls = [(docfile.get_size(), docfile.get_url(), \
docfile.get_superformat(), docfile.get_subformat()) \
for docfile in doc.list_latest_files() if not docfile.is_icon()]
additional_urls.sort()
additional_links = [create_html_link(url, urlargd={}, \
linkattrd={'style': 'font-size:x-small'}, \
link_label="%s %s (%s)" % (format.strip('.').upper(), subformat, format_size(size))) \
for (size, url, format, subformat) in additional_urls]
img = '<img src="%(icon_url)s" alt="%(name)s" style="max-width:%(img_max_width)s;_width:%(img_max_width)s;%(img_style)s" />' % \
{'icon_url': cgi.escape(found_icons[0][1], True),
'name': cgi.escape(name, True),
'img_style': img_style,
'img_max_width': img_max_width}
if print_links.lower() == 'yes':
img = '<a href="%s">%s</a>' % (cgi.escape(preview_url or found_url, True), img)
if display_all_version_links.lower() == 'yes' and additional_links:
img += '<br />' + '&nbsp;'.join(additional_links) + '<br />'
if show_comment.lower() == 'yes' and comment:
img += '<div style="margin-auto;text-align:center;%(text_style)s">%(comment)s</div>' % \
{'comment': comment.replace('\n', '<br/>'),
'text_style': text_style}
img = '<div style="vertical-align: middle;text-align:center;display:inline-block;display: -moz-inline-stack;zoom: 1;*display: inline;max-width:%(img_max_width)s;_width:%(img_max_width)s;text-align:center;%(style)s">%(img)s</div>' % \
{'img_max_width': img_max_width,
'style': style,
'img': img}
photos.append(img)
return '<div>' + separator.join(photos) + '</div>'
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
def format_size(size):
"""
Get human-readable string for the given size in Bytes
"""
if size < 1024:
return "%d byte%s" % (size, size != 1 and 's' or '')
elif size < 1024 * 1024:
return "%.1f KB" % (size / 1024.0)
elif size < 1024 * 1024 * 1024:
return "%.1f MB" % (size / (1024.0 * 1024))
else:
return "%.1f GB" % (size / (1024.0 * 1024 * 1024))
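The unit thresholds in `format_size` can be exercised in isolation; this standalone restatement uses float division so fractional values survive under Python 2 integer semantics:

```python
def format_size(size):
    """Return a human-readable string for a size given in bytes."""
    if size < 1024:
        return "%d byte%s" % (size, size != 1 and 's' or '')
    elif size < 1024 * 1024:
        return "%.1f KB" % (size / 1024.0)
    elif size < 1024 * 1024 * 1024:
        return "%.1f MB" % (size / (1024.0 * 1024))
    else:
        return "%.1f GB" % (size / (1024.0 * 1024 * 1024))
```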
diff --git a/invenio/modules/formatter/format_elements/bfe_plots.py b/invenio/modules/formatter/format_elements/bfe_plots.py
index ea72c62fa..d802877ed 100644
--- a/invenio/modules/formatter/format_elements/bfe_plots.py
+++ b/invenio/modules/formatter/format_elements/bfe_plots.py
@@ -1,91 +1,91 @@
# -*- coding: utf-8 -*-
##
## $Id: bfe_CERN_plots.py,v 1.3 2009/03/17 10:55:15 jerome Exp $
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Display image of the plot if we are in selected plots collection
"""
__revision__ = "$Id: bfe_CERN_plots.py,v 1.3 2009/03/17 10:55:15 jerome Exp $"
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.utils.url import create_html_link
from invenio.config import CFG_SITE_URL, CFG_SITE_RECORD
def format_element(bfo, width="", caption="yes", max="-1"):
"""
Display image of the plot if we are in selected plots collections
@param width: the width of the returned image (Eg: '100px')
@param caption: display the captions or not?
@param max: the maximum number of plots to display (-1 is all plots)
"""
## To achieve this, we take the pngs associated with this document
img_files = []
max = int(max)
bibarchive = BibRecDocs(bfo.recID)
if width != "":
width = 'width="%s"' % width
for doc in bibarchive.list_bibdocs():
for _file in doc.list_latest_files():
if _file.get_type() == "Plot":
try:
caption_text = _file.get_description()[5:]
index = int(_file.get_description()[:5])
img_location = _file.get_url()
except:
# FIXME: we have hit probably a plot context file,
# so ignore this document; but it would be safer
# to check subformat type, so that we don't mask
# other eventual errors here.
continue
img = '<img src="%s" title="%s" %s/>' % \
(img_location, caption_text, width)
link = create_html_link(urlbase='%s/%s/%s/plots#%d' %
(CFG_SITE_URL, CFG_SITE_RECORD, bfo.recID,\
index),
urlargd={},
link_label=img)
img_files.append((index, link))
img_files = sorted(img_files, key=lambda x: x[0])
if max > 0:
img_files = img_files[:max]
for index in range(len(img_files)):
img_files[index] = img_files[index][1]
if len(img_files) == 0:
return ''
return '<div style="overflow-x:scroll;width=100%;white-space:nowrap">' +\
" ".join(img_files) + '</div>'
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
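`bfe_plots` collects `(index, link)` pairs, sorts them by plot index and caps the list at `max` (with -1 meaning "all"). The same ordering logic, sketched on its own (hypothetical function name):

```python
def order_plot_links(entries, max_count=-1):
    """Sort (index, link) pairs by plot index and keep at most
    max_count links; -1 keeps everything, as in bfe_plots."""
    entries = sorted(entries, key=lambda x: x[0])
    if max_count > 0:
        entries = entries[:max_count]
    return [link for _index, link in entries]
```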
diff --git a/invenio/modules/formatter/format_elements/bfe_plots_thumb.py b/invenio/modules/formatter/format_elements/bfe_plots_thumb.py
index dd128745b..3eb376526 100644
--- a/invenio/modules/formatter/format_elements/bfe_plots_thumb.py
+++ b/invenio/modules/formatter/format_elements/bfe_plots_thumb.py
@@ -1,71 +1,71 @@
# -*- coding: utf-8 -*-
##
## $Id: bfe_CERN_plots.py,v 1.3 2009/03/17 10:55:15 jerome Exp $
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Display image of the plot if we are in selected plots collection
"""
__revision__ = "$Id: bfe_CERN_plots.py,v 1.3 2009/03/17 10:55:15 jerome Exp $"
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.utils.url import create_html_link
from invenio.config import CFG_SITE_URL
def format_element(bfo):
"""
Display image of the thumbnail plot if we are in selected plots collections
"""
## To achieve this, we take the Thumb file associated with this document
bibarchive = BibRecDocs(bfo.recID)
img_files = []
for doc in bibarchive.list_bibdocs():
for _file in doc.list_latest_files():
if _file.get_type() == "Plot":
caption_text = _file.get_description()[5:]
index = int(_file.get_description()[:5])
img_location = _file.get_url()
if img_location == "":
continue
img = '<img src="%s" width="100px"/>' % \
(img_location)
img_files.append((index, img_location)) # FIXME: was link here
if _file.get_type() == "Thumb":
img_location = _file.get_url()
img = '<img src="%s" width="100px"/>' % \
(img_location)
return '<div align="left">' + img + '</div>'
# then we use the default: the last plot with an image
img_files = sorted(img_files, key=lambda x: x[0])
if img_files:
return '<div align="left">' + img_files[-1][1] + '</div>'
else:
return ''
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
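Both plot elements assume a description layout where the first five characters are a zero-padded plot index and the remainder is the caption (the `[:5]`/`[5:]` slicing above). A minimal sketch of that convention:

```python
def split_plot_description(description):
    """Split a plot description whose first five characters encode a
    zero-padded plot index, with the caption following, as assumed by
    the [:5]/[5:] slicing in bfe_plots and bfe_plots_thumb."""
    return int(description[:5]), description[5:]
```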
diff --git a/invenio/modules/formatter/format_elements/bfe_video_platform_downloads.py b/invenio/modules/formatter/format_elements/bfe_video_platform_downloads.py
index 1e7447699..95ac6affb 100644
--- a/invenio/modules/formatter/format_elements/bfe_video_platform_downloads.py
+++ b/invenio/modules/formatter/format_elements/bfe_video_platform_downloads.py
@@ -1,99 +1,99 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element
* Part of the Video Platform Prototype
* Generates a list of links to download videos directly
* The list includes the codec/container, subformat/resolution and file size
"""
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
html_skeleton_popup = """<!-- DOWNLOAD POPUP -->
<div id="video_download_popup_box">
%(elements)s
</div>
"""
html_skeleton_element = """<!-- DOWNLOAD POPUP ELEMENT -->
<div class="video_download_popup_element">
<a href="%(video_url)s">
<div class="video_download_popup_element_codec">
%(video_codec)s
</div>
<div class="video_download_popup_element_filesize">
%(video_size)s
</div>
<div class="video_download_popup_element_resolution">
%(video_resolution)s
</div>
</a>
</div>
"""
def format_element(bfo):
"""Format Element Function"""
return create_download_popup(bfo)
def create_download_popup(bfo):
"""Create the complete download popup"""
elements = []
recdoc = BibRecDocs(bfo.recID)
bibdocs = recdoc.list_bibdocs()
## Go through all the BibDocs and search for video related signatures
for bibdoc in bibdocs:
bibdocfiles = bibdoc.list_all_files()
for bibdocfile in bibdocfiles:
## When a video signature is found, add it as an element
if bibdocfile.get_superformat() in ('.mp4', '.webm', '.ogv',
'.mov', '.wmv', '.avi',
'.mpeg', '.flv', '.mkv'):
url = bibdocfile.get_url()
codec = bibdocfile.get_superformat()[1:]
resolution = bibdocfile.get_subformat()
size = bibdocfile.get_size()
elements.append(create_download_element(url, codec,
size, resolution))
if elements:
return html_skeleton_popup % {
'elements': "\n".join(elements)
}
else:
return ""
def create_download_element(url, codec, size, resolution):
"""Creates an HTML element based on the element skeleton"""
return html_skeleton_element % {
'video_url': url + "&download=1",
'video_codec': codec.upper(),
'video_size': human_size(size),
'video_resolution': resolution
}
def human_size(byte_size):
for x in ['bytes','KB','MB','GB','TB']:
if byte_size < 1024.0:
return "%3.1f %s" % (byte_size, x)
byte_size /= 1024.0
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
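The `human_size` loop above repeatedly divides by 1024 until the value fits under the next unit. A self-contained restatement (the PB fallback line is an addition for sizes beyond TB, where the original would fall off the loop and return None):

```python
def human_size(byte_size):
    """Step through units, dividing by 1024 until the value fits."""
    for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if byte_size < 1024.0:
            return "%3.1f %s" % (byte_size, unit)
        byte_size /= 1024.0
    return "%3.1f PB" % byte_size  # beyond TB; not handled by the original
```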
diff --git a/invenio/modules/formatter/format_elements/bfe_video_platform_sources.py b/invenio/modules/formatter/format_elements/bfe_video_platform_sources.py
index 1c688620f..cc749b88e 100644
--- a/invenio/modules/formatter/format_elements/bfe_video_platform_sources.py
+++ b/invenio/modules/formatter/format_elements/bfe_video_platform_sources.py
@@ -1,111 +1,111 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element
* Part of the video platform prototype
* Creates a <select> element with <option> elements
containing various information about video sources. The options are later
evaluated by javascript and the video source is dynamically injected in the
HTML5 video element.
* Based on bfe_video_selector.py
"""
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
def format_element(bfo):
""" Format element function to create the select and option elements
with HTML5 data attributes that store all the necessary metadata to
construct video sources with JavaScript."""
videos = {
'360p': {'width': 640, 'height': 360, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None},
'480p': {'width': 854,'height': 480, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None,},
'720p': {'width': 1280, 'height': 720, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None},
'1080p': {'width': 1920, 'height': 1080, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None}
}
recdoc = BibRecDocs(bfo.recID)
bibdocs = recdoc.list_bibdocs()
## Go through all the BibDocs and search for video related signatures
for bibdoc in bibdocs:
bibdocfiles = bibdoc.list_all_files()
for bibdocfile in bibdocfiles:
## When a video signature is found, add the url to the videos dictionary
if bibdocfile.get_superformat() in ('.mp4', '.webm', '.ogv') and bibdocfile.get_subformat() in ('360p', '480p', '720p', '1080p'):
src = bibdocfile.get_url()
codec = bibdocfile.get_superformat()[1:]
size = bibdocfile.get_subformat()
videos[size][codec] = src
## When a poster signature is found, add the url to the videos dictionary
elif bibdocfile.get_comment() in ('SUGGESTIONTHUMB', 'BIGTHUMB', 'POSTER', 'SMALLTHUMB') and bibdocfile.get_subformat() in ('360p', '480p', '720p', '1080p'):
src = bibdocfile.get_url()
size = bibdocfile.get_subformat()
videos[size]['poster'] = src
## Build video select options for every video size format that was found
select_options = []
for key, options in videos.iteritems():
## If we have at least one url, the format is available
if options['mp4'] or options['webm'] or options['ogv']:
## create an option element
option_element = create_option_element(url_webm=options['webm'], url_ogv=options['ogv'], url_mp4=options['mp4'],
url_poster=options['poster'], width=options['width'], height=options['height'],
subformat=key)
select_options.append(option_element)
select_element = create_select_element(select_options)
return select_element
def create_select_element(options):
""" Creates the HTML select element that carries the video format options
"""
text = """<select id="mejs-resolution">
%s
</select>
""" % '\n'.join(options)
return text
def create_option_element(width, height, subformat, url_webm=None, url_ogv=None, url_mp4=None, url_poster=None,):
""" Creates an HTML option element that carries all video information
"""
if url_webm:
webm = """data-src-webm="%s" data-type-webm='video/webm; codecs="vp8, vorbis"'""" % url_webm
else:
webm = ""
if url_ogv:
ogv = """data-src-ogg="%s" data-type-ogv='video/ogv; codecs="theora, vorbis"'""" % url_ogv
else:
ogv = ""
if url_mp4:
mp4 = """data-src-mp4="%s" data-type-mp4='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'""" % url_mp4
else:
mp4 = ""
text = """<option %(webm)s %(ogv)s %(mp4)s data-poster="%(url_poster)s" data-video-width="%(width)spx" data-video-height="%(height)spx">%(subformat)s</option>""" % {
'webm': webm,
'ogv': ogv,
'mp4': mp4,
'url_poster': url_poster,
'width': width,
'height': height,
'subformat': subformat
}
return text
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
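The scan over BibDoc files above builds a resolution → codec → URL mapping before rendering the `<option>` elements. The grouping step, sketched independently of BibDocFile (hypothetical helper taking plain tuples):

```python
def index_video_sources(files):
    """Group (superformat, subformat, url) triples into a
    resolution -> codec -> url mapping, as the element builds it."""
    videos = dict((res, {'mp4': None, 'webm': None, 'ogv': None})
                  for res in ('360p', '480p', '720p', '1080p'))
    for superformat, subformat, url in files:
        codec = superformat[1:]  # strip the leading dot, e.g. '.mp4' -> 'mp4'
        if codec in ('mp4', 'webm', 'ogv') and subformat in videos:
            videos[subformat][codec] = url
    return videos
```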
diff --git a/invenio/modules/formatter/format_elements/bfe_video_platform_suggestions.py b/invenio/modules/formatter/format_elements/bfe_video_platform_suggestions.py
index c07b2cbf4..3cddd7b64 100644
--- a/invenio/modules/formatter/format_elements/bfe_video_platform_suggestions.py
+++ b/invenio/modules/formatter/format_elements/bfe_video_platform_suggestions.py
@@ -1,163 +1,163 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element
* Part of the video platform prototype
* Creates a list of video suggestions
* Based on word similarity ranking
* Must be done in a collection that holds video records with thumbnails, title and author
"""
from invenio.config import CFG_SITE_URL
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
from invenio.intbitset import intbitset
from invenio.legacy.search_engine import perform_request_search
from invenio.legacy.bibrank.record_sorter import rank_records
from invenio.legacy.bibrecord import get_fieldvalues
from invenio.modules.encoder.utils import timecode_to_seconds
import random
html_skeleton_suggestion = """
<!-- VIDEO SUGGESTION -->
<div class="video_suggestion_box">
<div class="video_suggestion_thumbnail">
<a href="%(video_record_url)s">
<img src="%(video_thumb_url)s" alt="%(video_thumb_alt)s"/>
</a>
<div class="video_suggestion_duration">
%(video_duration)s
</div>
</div>
<div class="video_suggestion_title">
%(video_title)s
</div>
<div class="video_suggestion_author">
by %(video_authors)s
</div>
</div>
"""
def format_element(bfo, collection="Videos", threshold="75", maximum="3", shuffle="True"):
""" Creates video suggestions based on ranking algorithms
@param collection: Collection to take the suggestions from
@param threshold: Value between 0 and 100. Only records ranked higher than the value are presented.
@param maximum: Maximum suggestions to show
@param shuffle: True or False, should the suggestions be shuffled?
"""
if threshold.isdigit():
threshold = int(threshold)
else:
raise ValueError("The given threshold is not a digit")
if maximum.isdigit():
maximum = int(maximum)
else:
raise ValueError("The given maximum is not a digit")
if shuffle == "True":
shuffle = True
else:
shuffle = False
suggestions = []
recid = bfo.control_field('001')
similar_records = find_similar_videos(recid, collection, threshold, maximum, shuffle)
for sim_recid in similar_records:
thumbnail = get_video_thumbnail(sim_recid)
title = get_video_title(sim_recid)
authors = get_video_authors(sim_recid)
url = get_video_record_url(sim_recid)
duration = get_video_duration(sim_recid)
suggestion = html_skeleton_suggestion % {
'video_record_url': url,
'video_thumb_url': thumbnail[0],
'video_thumb_alt': thumbnail[1],
'video_duration': duration,
'video_title': title,
'video_authors': authors,
}
suggestions.append(suggestion)
return "\n".join(suggestions)
def find_similar_videos(recid, collection="Videos", threshold=75, maximum=3, shuffle=True):
""" Returns a list of similar video records
"""
similar_records = []
collection_recids = intbitset(perform_request_search(cc=collection))
ranking = rank_records('wrd', 0, collection_recids, ['recid:' + str(recid)])
## ([6, 7], [81, 100], '(', ')', '')
for list_pos, rank in enumerate(ranking[1]):
if rank >= threshold:
similar_records.append(ranking[0][list_pos])
if shuffle:
if maximum > len(similar_records):
maximum = len(similar_records)
return random.sample(similar_records, maximum)
else:
return similar_records[:maximum]
def get_video_thumbnail(recid):
""" Returns the URL and ALT text for a video thumbnail of a given record
"""
comments = get_fieldvalues(recid, '8564_z')
descriptions = get_fieldvalues(recid, '8564_y')
urls = get_fieldvalues(recid, '8564_u')
for pos, comment in enumerate(comments):
if comment in ('SUGGESTIONTHUMB', 'BIGTHUMB', 'THUMB', 'SMALLTHUMB', 'POSTER'):
return (urls[pos], descriptions[pos])
return ("", "")
def get_video_title(recid):
""" Return the Title of a video record
"""
return get_fieldvalues(recid, '245__a')[0]
def get_video_authors(recid):
""" Return the Authors of a video record
"""
return ", ".join(get_fieldvalues(recid, '100__a'))
def get_video_record_url(recid):
""" Return the URL of a video record
"""
return CFG_SITE_URL + "/record/" + str(recid)
def get_video_duration(recid):
""" Return the duration of a video
"""
duration = get_fieldvalues(recid, '950__d')
if duration:
duration = duration[0]
duration = timecode_to_seconds(duration)
return human_readable_time(duration)
else:
return ""
def human_readable_time(seconds):
""" Creates a human readable duration representation
"""
for x in ['s','m','h']:
if seconds < 60.0:
return "%.0f %s" % (seconds, x)
seconds /= 60.0
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
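`find_similar_videos` keeps only records ranked at or above the threshold, then either samples up to `maximum` of them at random or takes the first ones in rank order. That selection logic, sketched without the search-engine dependencies (hypothetical helper taking parallel lists of recids and scores):

```python
import random

def pick_suggestions(recids, scores, threshold=75, maximum=3, shuffle=True):
    """Keep records scored at or above the threshold, then either
    sample up to `maximum` at random or take the first ones, as
    find_similar_videos does."""
    similar = [recid for recid, score in zip(recids, scores)
               if score >= threshold]
    if shuffle:
        return random.sample(similar, min(maximum, len(similar)))
    return similar[:maximum]
```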
diff --git a/invenio/modules/formatter/format_elements/bfe_video_selector.py b/invenio/modules/formatter/format_elements/bfe_video_selector.py
index 86e17ba0f..8dd57cdf5 100644
--- a/invenio/modules/formatter/format_elements/bfe_video_selector.py
+++ b/invenio/modules/formatter/format_elements/bfe_video_selector.py
@@ -1,108 +1,108 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Creates a <select> element with <option> elements
containing various information about video sources. The options are later
evaluated by javascript and the video source is dynamically injected in the
HTML5 video element.
"""
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
def format_element(bfo):
""" Format element function to create the select and option elements
with HTML5 data attributes that store all the necessary metadata to
construct video sources with JavaScript."""
videos = {
'360p': {'width': 640, 'height': 360, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None},
'480p': {'width': 854,'height': 480, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None,},
'720p': {'width': 1280, 'height': 720, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None},
'1080p': {'width': 1920, 'height': 1080, 'poster': None, 'mp4': None, 'webm': None, 'ogv': None}
}
recdoc = BibRecDocs(bfo.recID)
bibdocs = recdoc.list_bibdocs()
## Go through all the BibDocs and search for video related signatures
for bibdoc in bibdocs:
bibdocfiles = bibdoc.list_all_files()
for bibdocfile in bibdocfiles:
## When a video signature is found, add the url to the videos dictionary
if bibdocfile.get_superformat() in ('.mp4', '.webm', '.ogv') and bibdocfile.get_subformat() in ('360p', '480p', '720p', '1080p'):
src = bibdocfile.get_url()
codec = bibdocfile.get_superformat()[1:]
size = bibdocfile.get_subformat()
videos[size][codec] = src
## When a poster signature is found, add the url to the videos dictionary
elif bibdocfile.get_comment() in ('POSTER',) and bibdocfile.get_subformat() in ('360p', '480p', '720p', '1080p'):
src = bibdocfile.get_url()
size = bibdocfile.get_subformat()
videos[size]['poster'] = src
## Build video select options for every video size format that was found
select_options = []
for key, options in videos.iteritems():
## If we have at least one url, the format is available
if options['mp4'] or options['webm'] or options['ogv']:
## Create an option element
option_element = create_option_element(url_webm=options['webm'], url_ogv=options['ogv'], url_mp4=options['mp4'],
url_poster=options['poster'], width=options['width'], height=options['height'],
subformat=key)
select_options.append(option_element)
select_element = create_select_element(select_options)
return select_element
def create_select_element(options):
""" Creates the HTML select element that carries the video format options
"""
text = """<select id="mejs-resolution" style="display: none">
%s
</select>
""" % '\n'.join(options)
return text
def create_option_element(width, height, subformat, url_webm=None, url_ogv=None, url_mp4=None, url_poster=None):
""" Creates an HTML option element that carries all video information
"""
if url_webm:
webm = """data-src-webm="%s" data-type-webm='video/webm; codecs="vp8, vorbis"'""" % url_webm
else:
webm = ""
if url_ogv:
ogv = """data-src-ogg="%s" data-type-ogv='video/ogv; codecs="theora, vorbis"'""" % url_ogv
else:
ogv = ""
if url_mp4:
mp4 = """data-src-mp4="%s" data-type-mp4='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'""" % url_mp4
else:
mp4 = ""
text = """<option %(webm)s %(ogv)s %(mp4)s data-poster="%(url_poster)s" data-video-width="%(width)spx" data-video-height="%(height)spx">%(subformat)s</option>""" % {
'webm': webm,
'ogv': ogv,
'mp4': mp4,
'url_poster': url_poster,
'width': width,
'height': height,
'subformat': subformat
}
return text
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
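The grouping logic in `format_element` above can be sketched independently of the BibDocFile API. This is a minimal illustration only: the `files` tuples are hypothetical stand-ins for the objects returned by `BibRecDocs`/`list_all_files`, and the URLs are made up.

```python
# Hypothetical stand-in for the BibDocFile listing: (superformat, subformat,
# url) tuples; the real element obtains these via BibRecDocs/BibDocFile.
files = [
    ('.mp4', '480p', 'http://example.org/record/33/video.mp4;480p'),
    ('.webm', '480p', 'http://example.org/record/33/video.webm;480p'),
    ('.ogv', '720p', 'http://example.org/record/33/video.ogv;720p'),
]

# One slot per known size/codec combination, as in format_element above
videos = {size: {'mp4': None, 'webm': None, 'ogv': None}
          for size in ('360p', '480p', '720p', '1080p')}

for superformat, subformat, url in files:
    if superformat in ('.mp4', '.webm', '.ogv') and subformat in videos:
        videos[subformat][superformat[1:]] = url

# A size yields an <option> only if at least one codec URL was found
available = [size for size, opts in videos.items() if any(opts.values())]
```

Only `480p` and `720p` would produce options here; `360p` and `1080p` stay empty and are skipped.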
diff --git a/invenio/modules/formatter/format_elements/bfe_video_sources.py b/invenio/modules/formatter/format_elements/bfe_video_sources.py
index 6b8b22fcc..aee18a6ab 100644
--- a/invenio/modules/formatter/format_elements/bfe_video_sources.py
+++ b/invenio/modules/formatter/format_elements/bfe_video_sources.py
@@ -1,57 +1,57 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibFormat element - Creates <source> elements for html5 videos
"""
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
def format_element(bfo, subformat="480p"):
""" Creates HTML5 source elements for the given subformat.
MP4, WebM and OGV are currently supported as video sources.
The function will scan the bibdocfiles attached to the record for
videos with these formats and the given subformat.
@param subformat: BibDocFile subformat to create the sources from (e.g. 480p)
"""
video_sources = []
recdoc = BibRecDocs(bfo.recID)
bibdocs = recdoc.list_bibdocs()
for bibdoc in bibdocs:
bibdocfiles = bibdoc.list_all_files()
for bibdocfile in bibdocfiles:
if bibdocfile.get_superformat() in ('.mp4', '.webm', '.ogv') and bibdocfile.get_subformat() == subformat:
src = bibdocfile.get_url()
ftype = bibdocfile.get_superformat()[1:]
if ftype == 'mp4':
codecs = 'avc1.42E01E, mp4a.40.2'
elif ftype == 'webm':
codecs = 'vp8, vorbis'
elif ftype == 'ogv':
codecs = 'theora, vorbis'
source = '<source src=\"%s\" type=\'video/%s; codecs=\"%s\"\' />' % (src, ftype, codecs)
video_sources.append(source)
return "\n".join(video_sources)
def escape_values(bfo):
"""
Called by BibFormat in order to check if output of this element
should be escaped.
"""
return 0
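The container-to-codecs mapping hard-coded in the `if`/`elif` chain above can be expressed as a lookup table. A small sketch, with `CODECS` and `make_source` as illustrative names rather than anything in the Invenio API:

```python
# Codec strings per container, as hard-coded in format_element above
CODECS = {
    'mp4': 'avc1.42E01E, mp4a.40.2',
    'webm': 'vp8, vorbis',
    'ogv': 'theora, vorbis',
}

def make_source(src, ftype):
    """Build one HTML5 <source> element for a video URL and container type."""
    return '<source src="%s" type=\'video/%s; codecs="%s"\' />' % (
        src, ftype, CODECS[ftype])

source = make_source('http://example.org/v.webm', 'webm')
```

A table keeps the codec strings in one place if a new container (e.g. another WebM profile) is ever added.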
diff --git a/invenio/modules/formatter/output_formats/__init__.py b/invenio/modules/formatter/output_formats/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/modules/formatter/testsuite/output_formats/__init__.py b/invenio/modules/formatter/testsuite/output_formats/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/modules/formatter/testsuite/test_formatter_engine.py b/invenio/modules/formatter/testsuite/test_formatter_engine.py
index b563df3f6..593e07cca 100644
--- a/invenio/modules/formatter/testsuite/test_formatter_engine.py
+++ b/invenio/modules/formatter/testsuite/test_formatter_engine.py
@@ -1,847 +1,848 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Test cases for the BibFormat engine. Also test
some utility functions in the bibformat_utils module"""
__revision__ = "$Id$"
# pylint: disable=C0301
import os
import sys
from invenio.base.globals import cfg
from invenio.base.wrappers import lazy_import
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
bibformat = lazy_import('invenio.modules.formatter')
bibformat_engine = lazy_import('invenio.modules.formatter.engine')
bibformat_utils = lazy_import('invenio.modules.formatter.utils')
bibformat_config = lazy_import('invenio.modules.formatter.config')
bibformatadminlib = lazy_import('invenio.legacy.bibformat.adminlib')
format_templates = lazy_import('invenio.modules.formatter.testsuite.format_templates')
format_elements = lazy_import('invenio.modules.formatter.testsuite.format_elements')
+output_formats = lazy_import('invenio.modules.formatter.testsuite.output_formats')
CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH = 'invenio.modules.formatter.testsuite'
class FormatTemplateTest(InvenioTestCase):
""" bibformat - tests on format templates"""
def setUp(self):
self.old_templates_path = cfg['CFG_BIBFORMAT_TEMPLATES_PATH']
cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] = format_templates.__path__[0]
def tearDown(self):
cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] = self.old_templates_path
def test_get_format_template(self):
"""bibformat - format template parsing and returned structure"""
#Test correct parsing and structure
template_1 = bibformat_engine.get_format_template("Test1.bft", with_attributes=True)
self.assert_(template_1 is not None)
self.assertEqual(template_1['code'], "test\n<name>this value should stay as it is</name>\n<description>this one too</description>\n")
self.assertEqual(template_1['attrs']['name'], "name_test")
self.assertEqual(template_1['attrs']['description'], "desc_test")
#Test correct parsing and structure of file without description or name
template_2 = bibformat_engine.get_format_template("Test_2.bft", with_attributes=True)
self.assert_(template_2 is not None)
self.assertEqual(template_2['code'], "test")
self.assertEqual(template_2['attrs']['name'], "Test_2.bft")
self.assertEqual(template_2['attrs']['description'], "")
#Test non-existing template
unknown_template = bibformat_engine.get_format_template("test_no_template.test", with_attributes=True)
self.assertEqual(unknown_template, None)
def test_get_format_templates(self):
""" bibformat - loading multiple format templates"""
templates = bibformat_engine.get_format_templates(with_attributes=True)
#test correct loading
self.assert_("Test1.bft" in templates.keys())
self.assert_("Test_2.bft" in templates.keys())
self.assert_("Test3.bft" in templates.keys())
self.assert_("Test_no_template.test" not in templates.keys())
#Test correct parsing and structure
self.assertEqual(templates['Test1.bft']['code'], "test\n<name>this value should stay as it is</name>\n<description>this one too</description>\n")
self.assertEqual(templates['Test1.bft']['attrs']['name'], "name_test")
self.assertEqual(templates['Test1.bft']['attrs']['description'], "desc_test")
def test_get_format_template_attrs(self):
""" bibformat - correct parsing of attributes in format template"""
attrs = bibformat_engine.get_format_template_attrs("Test1.bft")
self.assertEqual(attrs['name'], "name_test")
self.assertEqual(attrs['description'], "desc_test")
def test_get_fresh_format_template_filename(self):
""" bibformat - getting fresh filename for format template"""
filename_and_name_1 = bibformat_engine.get_fresh_format_template_filename("Test")
self.assert_(len(filename_and_name_1) >= 2)
self.assertEqual(filename_and_name_1[0], "Test.bft")
filename_and_name_2 = bibformat_engine.get_fresh_format_template_filename("Test1")
self.assert_(len(filename_and_name_2) >= 2)
self.assert_(filename_and_name_2[0] != "Test1.bft")
path = cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] + os.sep + filename_and_name_2[0]
self.assert_(not os.path.exists(path))
class FormatElementTest(InvenioTestCase):
""" bibformat - tests on format templates"""
def setUp(self):
# pylint: disable=C0103
"""bibformat - setting python path to test elements"""
sys.path.append('%s' % cfg['CFG_TMPDIR'])
self.old_elements_path = cfg['CFG_BIBFORMAT_ELEMENTS_PATH']
cfg['CFG_BIBFORMAT_ELEMENTS_PATH'] = format_elements.__path__[0]
self.old_import_path = cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH']
cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] = CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH
def tearDown(self):
sys.path.pop()
cfg['CFG_BIBFORMAT_ELEMENTS_PATH'] = self.old_elements_path
cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] = self.old_import_path
def test_resolve_format_element_filename(self):
"""bibformat - resolving format elements filename """
#Test elements filename starting without bfe_, with underscore instead of space
filenames = ["test 1", "test 1.py", "bfe_test 1", "bfe_test 1.py", "BFE_test 1",
"BFE_TEST 1", "BFE_TEST 1.py", "BFE_TeST 1.py", "BFE_TeST 1",
"BfE_TeST 1.py", "BfE_TeST 1","test_1", "test_1.py", "bfe_test_1",
"bfe_test_1.py", "BFE_test_1",
"BFE_TEST_1", "BFE_TEST_1.py", "BFE_Test_1.py", "BFE_TeST_1",
"BfE_TeST_1.py", "BfE_TeST_1"]
for i in range(len(filenames)-2):
filename_1 = bibformat_engine.resolve_format_element_filename(filenames[i])
self.assert_(filename_1 is not None)
filename_2 = bibformat_engine.resolve_format_element_filename(filenames[i+1])
self.assertEqual(filename_1, filename_2)
#Test elements filename starting with bfe_, and with underscores instead of spaces
filenames = ["test 2", "test 2.py", "bfe_test 2", "bfe_test 2.py", "BFE_test 2",
"BFE_TEST 2", "BFE_TEST 2.py", "BFE_TeST 2.py", "BFE_TeST 2",
"BfE_TeST 2.py", "BfE_TeST 2","test_2", "test_2.py", "bfe_test_2",
"bfe_test_2.py", "BFE_test_2",
"BFE_TEST_2", "BFE_TEST_2.py", "BFE_TeST_2.py", "BFE_TeST_2",
"BfE_TeST_2.py", "BfE_TeST_2"]
for i in range(len(filenames)-2):
filename_1 = bibformat_engine.resolve_format_element_filename(filenames[i])
self.assert_(filename_1 is not None)
filename_2 = bibformat_engine.resolve_format_element_filename(filenames[i+1])
self.assertEqual(filename_1, filename_2)
#Test non existing element
non_existing_element = bibformat_engine.resolve_format_element_filename("BFE_NON_EXISTING_ELEMENT")
self.assertEqual(non_existing_element, None)
def test_get_format_element(self):
"""bibformat - format elements parsing and returned structure"""
#Test loading with different kinds of names, for an element with spaces in its name, without bfe_
element_1 = bibformat_engine.get_format_element("test 1", with_built_in_params=True)
self.assert_(element_1 is not None)
element_1_bis = bibformat_engine.get_format_element("bfe_tEst_1.py", with_built_in_params=True)
self.assertEqual(element_1, element_1_bis)
#Test loading with different kinds of names, for an element without spaces in its name, with bfe_
element_2 = bibformat_engine.get_format_element("test 2", with_built_in_params=True)
self.assert_(element_2 is not None)
element_2_bis = bibformat_engine.get_format_element("bfe_tEst_2.py", with_built_in_params=True)
self.assertEqual(element_2, element_2_bis)
#Test loading incorrect elements
element_3 = bibformat_engine.get_format_element("test 3", with_built_in_params=True)
self.assertEqual(element_3, None)
element_4 = bibformat_engine.get_format_element("test 4", with_built_in_params=True)
self.assertEqual(element_4, None)
unknown_element = bibformat_engine.get_format_element("TEST_NO_ELEMENT", with_built_in_params=True)
self.assertEqual(unknown_element, None)
#Test element without docstring
element_5 = bibformat_engine.get_format_element("test_5", with_built_in_params=True)
self.assert_(element_5 is not None)
self.assertEqual(element_5['attrs']['description'], '')
self.assert_({'name':"param1",
'description':"(no description provided)",
'default':""} in element_5['attrs']['params'] )
self.assertEqual(element_5['attrs']['seealso'], [])
#Test correct parsing:
#Test type of element
self.assertEqual(element_1['type'], "python")
#Test name = element filename, with underscore instead of spaces,
#without BFE_ and uppercase
self.assertEqual(element_1['attrs']['name'], "TEST_1")
#Test description parsing
self.assertEqual(element_1['attrs']['description'], "Prints test")
#Test @see: parsing
self.assertEqual(element_1['attrs']['seealso'], ["element2.py", "unknown_element.py"])
#Test @param parsing
self.assert_({'name':"param1",
'description':"desc 1",
'default':""} in element_1['attrs']['params'] )
self.assert_({'name':"param2",
'description':"desc 2",
'default':"default value"} in element_1['attrs']['params'] )
#Test non existing element
non_existing_element = bibformat_engine.get_format_element("BFE_NON_EXISTING_ELEMENT")
self.assertEqual(non_existing_element, None)
def test_get_format_element_attrs_from_function(self):
""" bibformat - correct parsing of attributes in 'format' docstring"""
element_1 = bibformat_engine.get_format_element("test 1", with_built_in_params=True)
function = element_1['code']
attrs = bibformat_engine.get_format_element_attrs_from_function(function,
element_1['attrs']['name'],
with_built_in_params=True)
self.assertEqual(attrs['name'], "TEST_1")
#Test description parsing
self.assertEqual(attrs['description'], "Prints test")
#Test @see: parsing
self.assertEqual(attrs['seealso'], ["element2.py", "unknown_element.py"])
def test_get_format_elements(self):
"""bibformat - multiple format elements parsing and returned structure"""
elements = bibformat_engine.get_format_elements()
self.assert_(isinstance(elements, dict))
self.assertEqual(elements['TEST_1']['attrs']['name'], "TEST_1")
self.assertEqual(elements['TEST_2']['attrs']['name'], "TEST_2")
self.assert_("TEST_3" not in elements.keys())
self.assert_("TEST_4" not in elements.keys())
def test_get_tags_used_by_element(self):
"""bibformat - identification of tag usage inside element"""
cfg['CFG_BIBFORMAT_ELEMENTS_PATH'] = self.old_elements_path
cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] = self.old_import_path
tags = bibformatadminlib.get_tags_used_by_element('bfe_abstract.py')
self.failUnless(len(tags) == 4,
'Could not correctly identify tags used in bfe_abstract.py')
class OutputFormatTest(InvenioTestCase):
""" bibformat - tests on output formats"""
def setUp(self):
self.old_outputs_path = bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH
- bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = cfg['CFG_TMPDIR']
+ bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = output_formats.__path__[0]
def tearDown(self):
bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = self.old_outputs_path
def test_get_output_format(self):
""" bibformat - output format parsing and returned structure """
filename_1 = bibformat_engine.resolve_output_format_filename("test1")
output_1 = bibformat_engine.get_output_format(filename_1, with_attributes=True)
self.assertEqual(output_1['attrs']['names']['generic'], "")
self.assert_(isinstance(output_1['attrs']['names']['ln'], dict))
self.assert_(isinstance(output_1['attrs']['names']['sn'], dict))
self.assertEqual(output_1['attrs']['code'], "TEST1")
self.assert_(len(output_1['attrs']['code']) <= 6)
self.assertEqual(len(output_1['rules']), 4)
self.assertEqual(output_1['rules'][0]['field'], '980.a')
self.assertEqual(output_1['rules'][0]['template'], 'Picture_HTML_detailed.bft')
self.assertEqual(output_1['rules'][0]['value'], 'PICTURE ')
self.assertEqual(output_1['rules'][1]['field'], '980.a')
self.assertEqual(output_1['rules'][1]['template'], 'Article.bft')
self.assertEqual(output_1['rules'][1]['value'], 'ARTICLE')
self.assertEqual(output_1['rules'][2]['field'], '980__a')
self.assertEqual(output_1['rules'][2]['template'], 'Thesis_detailed.bft')
self.assertEqual(output_1['rules'][2]['value'], 'THESIS ')
self.assertEqual(output_1['rules'][3]['field'], '980__a')
self.assertEqual(output_1['rules'][3]['template'], 'Pub.bft')
self.assertEqual(output_1['rules'][3]['value'], 'PUBLICATION ')
filename_2 = bibformat_engine.resolve_output_format_filename("TEST2")
output_2 = bibformat_engine.get_output_format(filename_2, with_attributes=True)
self.assertEqual(output_2['attrs']['names']['generic'], "")
self.assert_(isinstance(output_2['attrs']['names']['ln'], dict))
self.assert_(isinstance(output_2['attrs']['names']['sn'], dict))
self.assertEqual(output_2['attrs']['code'], "TEST2")
self.assert_(len(output_2['attrs']['code']) <= 6)
self.assertEqual(output_2['rules'], [])
unknown_output = bibformat_engine.get_output_format("unknow", with_attributes=True)
self.assertEqual(unknown_output, {'rules':[],
'default':"",
'attrs':{'names':{'generic':"", 'ln':{}, 'sn':{}},
'description':'',
'code':"UNKNOW",
'visibility': 1,
'content_type':""}})
def test_get_output_formats(self):
""" bibformat - loading multiple output formats """
outputs = bibformat_engine.get_output_formats(with_attributes=True)
self.assert_(isinstance(outputs, dict))
self.assert_("TEST1.bfo" in outputs.keys())
self.assert_("TEST2.bfo" in outputs.keys())
self.assert_("unknow.bfo" not in outputs.keys())
#Test correct parsing
output_1 = outputs["TEST1.bfo"]
self.assertEqual(output_1['attrs']['names']['generic'], "")
self.assert_(isinstance(output_1['attrs']['names']['ln'], dict))
self.assert_(isinstance(output_1['attrs']['names']['sn'], dict))
self.assertEqual(output_1['attrs']['code'], "TEST1")
self.assert_(len(output_1['attrs']['code']) <= 6)
def test_get_output_format_attrs(self):
""" bibformat - correct parsing of attributes in output format"""
attrs= bibformat_engine.get_output_format_attrs("TEST1")
self.assertEqual(attrs['names']['generic'], "")
self.assert_(isinstance(attrs['names']['ln'], dict))
self.assert_(isinstance(attrs['names']['sn'], dict))
self.assertEqual(attrs['code'], "TEST1")
self.assert_(len(attrs['code']) <= 6)
def test_resolve_output_format(self):
""" bibformat - resolving output format filename"""
filenames = ["test1", "test1.bfo", "TEST1", "TeST1", "TEST1.bfo", "<b>test1"]
for i in range(len(filenames)-2):
filename_1 = bibformat_engine.resolve_output_format_filename(filenames[i])
self.assert_(filename_1 is not None)
filename_2 = bibformat_engine.resolve_output_format_filename(filenames[i+1])
self.assertEqual(filename_1, filename_2)
def test_get_fresh_output_format_filename(self):
""" bibformat - getting fresh filename for output format"""
filename_and_name_1 = bibformat_engine.get_fresh_output_format_filename("test")
self.assert_(len(filename_and_name_1) >= 2)
self.assertEqual(filename_and_name_1[0], "TEST.bfo")
filename_and_name_1_bis = bibformat_engine.get_fresh_output_format_filename("<test>")
self.assert_(len(filename_and_name_1_bis) >= 2)
self.assertEqual(filename_and_name_1_bis[0], "TEST.bfo")
filename_and_name_2 = bibformat_engine.get_fresh_output_format_filename("test1")
self.assert_(len(filename_and_name_2) >= 2)
self.assert_(filename_and_name_2[0] != "TEST1.bfo")
path = bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH + os.sep + filename_and_name_2[0]
self.assert_(not os.path.exists(path))
filename_and_name_3 = bibformat_engine.get_fresh_output_format_filename("test1testlong")
self.assert_(len(filename_and_name_3) >= 2)
self.assert_(filename_and_name_3[0] != "TEST1TESTLONG.bft")
self.assert_(len(filename_and_name_3[0]) <= 6 + 1 + len(bibformat_config.CFG_BIBFORMAT_FORMAT_OUTPUT_EXTENSION))
path = bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH + os.sep + filename_and_name_3[0]
self.assert_(not os.path.exists(path))
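The fresh-filename behaviour these assertions check (sanitized uppercase code, at most six characters before the extension, non-colliding on disk) can be sketched as follows. `fresh_filename` and `max_code` are illustrative names, not the engine's API, and the suffixing scheme is an assumption about one way to avoid collisions:

```python
import os
import tempfile

def fresh_filename(name, directory, ext='.bfo', max_code=6):
    # Keep only alphanumerics, uppercase, truncate to the code limit,
    # then append a numeric suffix until the name no longer collides.
    base = ''.join(c for c in name.upper() if c.isalnum())[:max_code]
    candidate = base + ext
    i = 1
    while os.path.exists(os.path.join(directory, candidate)):
        suffix = str(i)
        candidate = base[:max_code - len(suffix)] + suffix + ext
        i += 1
    return candidate

tmpdir = tempfile.mkdtemp()
open(os.path.join(tmpdir, 'TEST1.bfo'), 'w').close()
name1 = fresh_filename('<test>', tmpdir)   # markup stripped
name2 = fresh_filename('test1', tmpdir)    # collides, gets a suffix
```

Truncating before suffixing keeps the code within the six-character limit even after disambiguation.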
class PatternTest(InvenioTestCase):
""" bibformat - tests on re patterns"""
def test_pattern_lang(self):
""" bibformat - correctness of pattern 'pattern_lang'"""
text = ''' <h1>Here is my test text</h1>
<p align="center">
<lang><en><b>Some words</b></en><fr>Quelques mots</fr><de>Einige Wörter</de> garbage </lang>
Here ends the middle of my test text
<lang><en><b>English</b></en><fr><b>Français</b></fr><de><b>Deutsch</b></de></lang>
<b>Here ends my test text</b></p>'''
result = bibformat_engine.pattern_lang.search(text)
self.assertEqual(result.group("langs"), "<en><b>Some words</b></en><fr>Quelques mots</fr><de>Einige Wörter</de> garbage ")
text = ''' <h1>Here is my test text</h1>
<BFE_test param="
<lang><en><b>Some words</b></en><fr>Quelques mots</fr><de>Einige Wörter</de> garbage </lang>" />
'''
result = bibformat_engine.pattern_lang.search(text)
self.assertEqual(result.group("langs"), "<en><b>Some words</b></en><fr>Quelques mots</fr><de>Einige Wörter</de> garbage ")
def test_ln_pattern(self):
""" bibformat - correctness of pattern 'ln_pattern'"""
text = "<en><b>Some words</b></en><fr>Quelques mots</fr><de>Einige Wörter</de> garbage "
result = bibformat_engine.ln_pattern.search(text)
self.assertEqual(result.group(1), "en")
self.assertEqual(result.group(2), "<b>Some words</b>")
def test_pattern_format_template_name(self):
""" bibformat - correctness of pattern 'pattern_format_template_name'"""
text = '''
garbage
<name><b>a name</b></name>
<description>a <b>description</b> on
2 lines </description>
<h1>the content of the template</h1>
content
'''
result = bibformat_engine.pattern_format_template_name.search(text)
self.assertEqual(result.group('name'), "<b>a name</b>")
def test_pattern_format_template_desc(self):
""" bibformat - correctness of pattern 'pattern_format_template_desc'"""
text = '''
garbage
<name><b>a name</b></name>
<description>a <b>description</b> on
2 lines </description>
<h1>the content of the template</h1>
content
'''
result = bibformat_engine.pattern_format_template_desc.search(text)
self.assertEqual(result.group('desc'), '''a <b>description</b> on
2 lines ''')
def test_pattern_tag(self):
""" bibformat - correctness of pattern 'pattern_tag'"""
text = '''
garbage but part of content
<name><b>a name</b></name>
<description>a <b>description</b> on
2 lines </description>
<h1>the content of the template</h1>
<BFE_tiTLE param1="<b>value1</b>"
param2=""/>
my content is so nice!
<BFE_title param1="value1"/>
<BFE_title param1="value1"/>
'''
result = bibformat_engine.pattern_tag.search(text)
self.assertEqual(result.group('function_name'), "tiTLE")
self.assertEqual(result.group('params').strip(), '''param1="<b>value1</b>"
param2=""''')
def test_pattern_function_params(self):
""" bibformat - correctness of pattern 'test_pattern_function_params'"""
text = '''
param1="" param2="value2"
param3="<b>value3</b>" garbage
'''
names = ["param1", "param2", "param3"]
values = ["", "value2", "<b>value3</b>"]
results = bibformat_engine.pattern_format_element_params.finditer(text) #TODO
param_i = 0
for match in results:
self.assertEqual(match.group('param'), names[param_i])
self.assertEqual(match.group('value'), values [param_i])
param_i += 1
def test_pattern_format_element_params(self):
""" bibformat - correctness of pattern 'pattern_format_element_params'"""
text = '''
a description for my element
some text
@param param1: desc1
@param param2: desc2
@see: seethis, seethat
'''
names = ["param1", "param2"]
descriptions = ["desc1", "desc2"]
results = bibformat_engine.pattern_format_element_params.finditer(text) #TODO
param_i = 0
for match in results:
self.assertEqual(match.group('name'), names[param_i])
self.assertEqual(match.group('desc'), descriptions[param_i])
param_i += 1
def test_pattern_format_element_seealso(self):
""" bibformat - correctness of pattern 'pattern_format_element_seealso' """
text = '''
a description for my element
some text
@param param1: desc1
@param param2: desc2
@see: seethis, seethat
'''
result = bibformat_engine.pattern_format_element_seealso.search(text)
self.assertEqual(result.group('see').strip(), 'seethis, seethat')
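The two regexes these pattern tests exercise can be approximated with simplified stand-ins. These are not bibformat_engine's actual definitions, only a sketch of the shape the tests imply: an outer pattern capturing everything between `<lang>` tags, and an inner one matching per-language blocks via a backreference.

```python
import re

# Simplified stand-ins for bibformat_engine.pattern_lang / ln_pattern
pattern_lang = re.compile(r'<lang>(?P<langs>.*?)</lang>', re.DOTALL)
ln_pattern = re.compile(r'<(?P<lang>\w{2})>(?P<text>.*?)</(?P=lang)>',
                        re.DOTALL)

text = '<lang><en><b>Some words</b></en><fr>Quelques mots</fr></lang>'
langs = pattern_lang.search(text).group('langs')
translations = {m.group('lang'): m.group('text')
                for m in ln_pattern.finditer(langs)}
```

The backreference `(?P=lang)` ensures nested markup such as `<b>...</b>` inside a language block does not terminate the match early.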
class EscapingAndWashingTest(InvenioTestCase):
""" bibformat - test escaping and washing metadata"""
def test_escaping(self):
""" bibformat - tests escaping HTML characters"""
text = "Is 5 < 6 ? For sure! And what about True && False == True?"
result = bibformat_engine.escape_field(text, mode=0)
self.assertEqual(result, text)
result = bibformat_engine.escape_field(text, mode=1)
self.assertEqual(result, 'Is 5 &lt; 6 ? For sure! And what about True &amp;&amp; False == True?')
def test_washing(self):
""" bibformat - test washing HTML tags"""
text = '''Hi dude, <br>, <strong>please login</strong>:<br/>
<a onclick="http://www.mycrappywebsite.com" href="login.html">login here</a></a><SCRIPT>alert("XSS");</SCRIPT>'''
# Keep only basic tags
result = bibformat_engine.escape_field(text, mode=2)
self.assert_('script' not in result.lower())
self.assert_('onclick' not in result.lower())
self.assert_('mycrappywebsite' not in result.lower())
self.assert_('<br>' in result.lower())
self.assert_('<br/>' in result.lower().replace(' ', ''))
# Keep only basic tags only if value starts with <!--HTML-->
# directive. Otherwise escape (which is the case here)
result = bibformat_engine.escape_field(text, mode=3)
self.assert_('<script' not in result.lower())
self.assert_('<' not in result.lower())
result = bibformat_engine.escape_field(text, mode=5)
self.assert_('<script' not in result.lower())
self.assert_('<br' in result.lower())
# Remove all HTML tags
result = bibformat_engine.escape_field(text, mode=4)
self.assert_('script' not in result.lower())
self.assert_('onclick' not in result.lower())
self.assert_('mycrappywebsite' not in result.lower())
self.assert_('strong' not in result.lower())
self.assert_('<br>' not in result.lower())
self.assert_('<br/>' not in result.lower().replace(' ', ''))
self.assert_('login here' in result.lower())
# Keep basic tags + some others (like <img>)
result = bibformat_engine.escape_field(text, mode=5)
self.assert_('script' not in result.lower())
self.assert_('onclick' not in result.lower())
self.assert_('mycrappywebsite' not in result.lower())
self.assert_('<br' in result.lower())
self.assert_('login here' in result.lower())
text2 = text + ' <img src="loginicon" alt="login icon"/>'
result = bibformat_engine.escape_field(text2, mode=5)
self.assert_('<img' in result.lower())
self.assert_('src=' in result.lower())
self.assert_('alt="login icon"' in result.lower())
# Keep some tags only if value starts with <!--HTML-->
# directive. Otherwise escape (which is the case here)
result = bibformat_engine.escape_field(text, mode=6)
self.assert_('<script' not in result.lower())
self.assert_('<' not in result.lower())
result = bibformat_engine.escape_field('<!--HTML-->'+text, mode=6)
self.assert_('<script' not in result.lower())
self.assert_('<br>' in result.lower())
self.assert_('mycrappywebsite' not in result.lower())
# When the value cannot be parsed by our not so smart parser,
# just escape everything
text3 = """Ok, let't try with something unparsable < hehe <a onclick="http://www.mycrappywebsite.com" href="login.html">login</a>"""
result = bibformat_engine.escape_field(text3, mode=2)
self.assert_('mycrappywebsite' not in result.lower() or \
'<a' not in result.lower())
result = bibformat_engine.escape_field(text3, mode=3)
self.assert_('<a' not in result.lower())
result = bibformat_engine.escape_field(text3, mode=5)
self.assert_('mycrappywebsite' not in result.lower() or \
'<a' not in result.lower())
result = bibformat_engine.escape_field(text3, mode=6)
self.assert_('<a' not in result.lower())
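Modes 0 and 1 of `escape_field` amount to plain HTML escaping and can be sketched with the standard library; the tag-whitelisting modes (2-6) need a real HTML washer and are not reproduced here. `escape_field_sketch` is an illustrative name, not the engine's function:

```python
import html

def escape_field_sketch(value, mode=0):
    # Mode 0 returns the value untouched; mode 1 escapes HTML-special
    # characters. The whitelist-based washing modes 2-6 are omitted.
    if mode == 1:
        return html.escape(value, quote=False)
    return value

text = "Is 5 < 6 ? For sure! And what about True && False == True?"
escaped = escape_field_sketch(text, mode=1)
```

`quote=False` matches the observed behaviour: `<`, `>` and `&` are escaped while quotes are left alone.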
class MiscTest(InvenioTestCase):
""" bibformat - tests on various functions"""
def test_parse_tag(self):
""" bibformat - result of parsing tags"""
tags_and_parsed_tags = ['245COc', ['245', 'C', 'O', 'c'],
'245C_c', ['245', 'C', '' , 'c'],
'245__c', ['245', '' , '' , 'c'],
'245__$$c', ['245', '' , '' , 'c'],
'245__$c', ['245', '' , '' , 'c'],
'245 $c', ['245', '' , '' , 'c'],
'245 $$c', ['245', '' , '' , 'c'],
'245__.c', ['245', '' , '' , 'c'],
'245 .c', ['245', '' , '' , 'c'],
'245C_$c', ['245', 'C', '' , 'c'],
'245CO$$c', ['245', 'C', 'O', 'c'],
'245CO.c', ['245', 'C', 'O', 'c'],
'245$c', ['245', '' , '' , 'c'],
'245.c', ['245', '' , '' , 'c'],
'245$$c', ['245', '' , '' , 'c'],
'245__%', ['245', '' , '' , '%'],
'245__$$%', ['245', '' , '' , '%'],
'245__$%', ['245', '' , '' , '%'],
'245 $%', ['245', '' , '' , '%'],
'245 $$%', ['245', '' , '' , '%'],
'245$%', ['245', '' , '' , '%'],
'245.%', ['245', '' , '' , '%'],
'245_O.%', ['245', '' , 'O', '%'],
'245.%', ['245', '' , '' , '%'],
'245$$%', ['245', '' , '' , '%'],
'2%5$$a', ['2%5', '' , '' , 'a'],
'2%%%%a', ['2%%', '%', '%', 'a'],
'2%%__a', ['2%%', '' , '' , 'a'],
'2%%a', ['2%%', '' , '' , 'a']]
for i in range(0, len(tags_and_parsed_tags), 2):
parsed_tag = bibformat_utils.parse_tag(tags_and_parsed_tags[i])
self.assertEqual(parsed_tag, tags_and_parsed_tags[i+1])
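The tag spellings in the table above can be handled by a single regex: a three-character field (digits or `%`), two optional indicators where `_` or a space means "empty", an optional separator (`$`, `$$` or `.`), and a subfield code. This is a simplified sketch of the behaviour `parse_tag` exhibits, not its actual implementation; `TAG_RE` and `parse_tag_sketch` are illustrative names.

```python
import re

# Field, two optional indicators, optional separator, required subfield code
TAG_RE = re.compile(
    r'^(?P<field>[0-9%]{3})'
    r'(?P<ind1>[0-9A-Za-z%_ ]?)'
    r'(?P<ind2>[0-9A-Za-z%_ ]?)'
    r'(?:\$\$?|\.| *)'
    r'(?P<code>[0-9A-Za-z%])$'
)

def parse_tag_sketch(tag):
    m = TAG_RE.match(tag)
    if m is None:
        return None
    empty = lambda ind: '' if ind in ('_', ' ') else ind
    return [m.group('field'), empty(m.group('ind1')),
            empty(m.group('ind2')), m.group('code')]
```

Regex backtracking resolves the ambiguous short forms: in `'2%%a'` the optional indicator groups give up their match so the required code group can consume the `a`.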
class FormatTest(InvenioTestCase):
""" bibformat - generic tests on function that do the formatting. Main functions"""
def setUp(self):
# pylint: disable=C0103
""" bibformat - prepare BibRecord objects"""
sys.path.append('%s' % cfg['CFG_TMPDIR'])
self.xml_text_1 = '''
<record>
<controlfield tag="001">33</controlfield>
<datafield tag="980" ind1="" ind2="">
<subfield code="a">thesis</subfield>
</datafield>
<datafield tag="950" ind1="" ind2="">
<subfield code="b">Doe1, John</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe2, John</subfield>
<subfield code="b">editor</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="1">
<subfield code="a">On the foo and bar1</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="2">
<subfield code="a">On the foo and bar2</subfield>
</datafield>
<datafield tag="088" ind1="" ind2="">
<subfield code="a">99999</subfield>
</datafield>
</record>
'''
#rec_1 = bibrecord.create_record(self.xml_text_1)
self.bfo_1 = bibformat_engine.BibFormatObject(recID=None,
ln='fr',
xml_record=self.xml_text_1)
self.xml_text_2 = '''
<record>
<controlfield tag="001">33</controlfield>
<datafield tag="980" ind1="" ind2="">
<subfield code="b">thesis </subfield>
</datafield>
<datafield tag="950" ind1="" ind2="">
<subfield code="b">Doe1, John</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe2, John</subfield>
<subfield code="b">editor</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="1">
<subfield code="b">On the foo and bar1</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="2">
<subfield code="b">On the foo and bar2</subfield>
</datafield>
</record>
'''
#self.rec_2 = bibrecord.create_record(xml_text_2)
self.bfo_2 = bibformat_engine.BibFormatObject(recID=None,
ln='fr',
xml_record=self.xml_text_2)
self.xml_text_3 = '''
<record>
<controlfield tag="001">33</controlfield>
<datafield tag="041" ind1="" ind2="">
<subfield code="a">eng</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe1, John</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe2, John</subfield>
<subfield code="b">editor</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="1">
<subfield code="a">On the foo and bar1</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="2">
<subfield code="a">On the foo and bar2</subfield>
</datafield>
<datafield tag="980" ind1="" ind2="">
<subfield code="a">article</subfield>
</datafield>
</record>
'''
#self.rec_3 = bibrecord.create_record(xml_text_3)
self.bfo_3 = bibformat_engine.BibFormatObject(recID=None,
ln='fr',
xml_record=self.xml_text_3)
self.empty_record_xml = '''
<record>
<controlfield tag="001">555</controlfield>
</record>'''
self.old_outputs_path = bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH
- bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = cfg['CFG_TMPDIR']
+ bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = output_formats.__path__[0]
self.old_elements_path = cfg['CFG_BIBFORMAT_ELEMENTS_PATH']
cfg['CFG_BIBFORMAT_ELEMENTS_PATH'] = format_elements.__path__[0]
self.old_import_path = cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH']
cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] = CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH
self.old_templates_path = cfg['CFG_BIBFORMAT_TEMPLATES_PATH']
cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] = format_templates.__path__[0]
def tearDown(self):
sys.path.pop()
bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = self.old_outputs_path
cfg['CFG_BIBFORMAT_ELEMENTS_PATH'] = self.old_elements_path
cfg['CFG_BIBFORMAT_ELEMENTS_IMPORT_PATH'] = self.old_import_path
cfg['CFG_BIBFORMAT_TEMPLATES_PATH'] = self.old_templates_path
def test_decide_format_template(self):
""" bibformat - choice made by function decide_format_template"""
result = bibformat_engine.decide_format_template(self.bfo_1, "test1")
self.assertEqual(result, "Thesis_detailed.bft")
result = bibformat_engine.decide_format_template(self.bfo_3, "test3")
self.assertEqual(result, "Test3.bft")
#Only default matches
result = bibformat_engine.decide_format_template(self.bfo_2, "test1")
self.assertEqual(result, "Default_HTML_detailed.bft")
#No match at all for record
result = bibformat_engine.decide_format_template(self.bfo_2, "test2")
self.assertEqual(result, None)
#Non existing output format
result = bibformat_engine.decide_format_template(self.bfo_2, "UNKNOW")
self.assertEqual(result, None)
def test_format_record(self):
""" bibformat - correct formatting"""
#use output format that has no match TEST DISABLED DURING MIGRATION
#result = bibformat_engine.format_record(recID=None, of="test2", xml_record=self.xml_text_2)
#self.assertEqual(result.replace("\n", ""),"")
#use output format that link to unknown template
result = bibformat_engine.format_record(recID=None, of="test3", xml_record=self.xml_text_2)
self.assertEqual(result.replace("\n", ""),"")
#Unknown output format TEST DISABLED DURING MIGRATION
#result = bibformat_engine.format_record(recID=None, of="unkno", xml_record=self.xml_text_3)
#self.assertEqual(result.replace("\n", ""),"")
#Default formatting
result = bibformat_engine.format_record(recID=None, ln='fr', of="test3", xml_record=self.xml_text_3)
self.assertEqual(result,'''<h1>hi</h1> this is my template\ntest<bfe_non_existing_element must disappear/><test_1 non prefixed element must stay as any normal tag/>tfrgarbage\n<br/>test me!&lt;b&gt;ok&lt;/b&gt;a default valueeditor\n<br/>test me!<b>ok</b>a default valueeditor\n<br/>test me!&lt;b&gt;ok&lt;/b&gt;a default valueeditor\n''')
def test_empty_formatting(self):
"""bibformat - formatting empty record"""
result = bibformat_engine.format_record(recID=0,
of='hb',
verbose=9,
xml_record=self.empty_record_xml)
self.assertEqual(result, '')
# FIXME: The commented test below currently fails, since xm
# format is generated from the database
## result = bibformat_engine.format_record(recID=0,
## of='xm',
## verbose=9,
## xml_record=self.empty_record_xml)
## self.assertEqual(result, self.empty_record_xml)
def test_format_with_format_template(self):
""" bibformat - correct formatting with given template"""
bibformat_engine.CFG_BIBFORMAT_OUTPUTS_PATH = self.old_outputs_path
template = bibformat_engine.get_format_template("Test3.bft")
result = bibformat_engine.format_with_format_template(format_template_filename = None,
bfo=self.bfo_1,
verbose=0,
format_template_code=template['code'])
self.assertEqual(result,'''<h1>hi</h1> this is my template\ntest<bfe_non_existing_element must disappear/><test_1 non prefixed element must stay as any normal tag/>tfrgarbage\n<br/>test me!&lt;b&gt;ok&lt;/b&gt;a default valueeditor\n<br/>test me!<b>ok</b>a default valueeditor\n<br/>test me!&lt;b&gt;ok&lt;/b&gt;a default valueeditor\n99999''')
class MarcFilteringTest(InvenioTestCase):
""" bibformat - MARC tag filtering tests"""
def setUp(self):
"""bibformat - prepare MARC filtering tests"""
self.xml_text_4 = '''
<record>
<controlfield tag="001">33</controlfield>
<datafield tag="041" ind1="" ind2="">
<subfield code="a">eng</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe1, John</subfield>
</datafield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Doe2, John</subfield>
<subfield code="b">editor</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="1">
<subfield code="a">On the foo and bar1</subfield>
</datafield>
<datafield tag="245" ind1="" ind2="2">
<subfield code="a">On the foo and bar2</subfield>
</datafield>
<datafield tag="595" ind1="" ind2="2">
<subfield code="a">Confidential comment</subfield>
</datafield>
<datafield tag="980" ind1="" ind2="">
<subfield code="a">article</subfield>
</datafield>
</record>
'''
def test_filtering(self):
"""bibformat - filter hidden fields"""
newxml = bibformat.filter_hidden_fields(self.xml_text_4, user_info=None, filter_tags=['595',], force_filtering=True)
numhfields = newxml.count("595")
self.assertEqual(numhfields, 0)
newxml = bibformat.filter_hidden_fields(self.xml_text_4, user_info=None, filter_tags=['595',], force_filtering=False)
numhfields = newxml.count("595")
self.assertEqual(numhfields, 1)
TEST_SUITE = make_test_suite(FormatTemplateTest,
OutputFormatTest,
FormatElementTest,
PatternTest,
MiscTest,
FormatTest,
EscapingAndWashingTest,
MarcFilteringTest)
if __name__ == '__main__':
run_test_suite(TEST_SUITE)
diff --git a/invenio/modules/formatter/utils.py b/invenio/modules/formatter/utils.py
index 0addfb0f5..fedbcb497 100644
--- a/invenio/modules/formatter/utils.py
+++ b/invenio/modules/formatter/utils.py
@@ -1,802 +1,802 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Utilities for special formatting of records.
API functions: highlight, get_contextual_content, encode_for_xml
Used mainly by BibFormat elements.
"""
__revision__ = "$Id$"
import re
import zlib
from invenio.config import \
CFG_OAI_ID_FIELD, \
CFG_WEBSEARCH_FULLTEXT_SNIPPETS, \
CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS, \
CFG_INSPIRE_SITE, \
CFG_WEBSEARCH_FULLTEXT_SNIPPETS_GENERATOR
from invenio.legacy.dbquery import run_sql
from invenio.utils.url import string_to_numeric_char_reference
from invenio.utils.text import encode_for_xml
from invenio.utils.shell import run_shell_command
from invenio.legacy.bibrecord import get_fieldvalues
def highlight_matches(text, compiled_pattern, \
prefix_tag='<strong>', suffix_tag="</strong>"):
"""
Highlight the words in 'text' that match 'compiled_pattern'.
@param text: the text in which we want to "highlight" parts
@param compiled_pattern: the parts to highlight
@type compiled_pattern: a compiled regular expression
@param prefix_tag: prefix to insert before each matching part
@param suffix_tag: suffix to insert after each matching part
@return: a version of input X{text} with words matching X{compiled_pattern} surrounded by X{prefix_tag} and X{suffix_tag}
"""
#Add 'prefix_tag' and 'suffix_tag' before and after each match
#FIXME: decide whether non-English accented chars should be de-accented
def replace_highlight(match):
""" replace match.group() by prefix_tag + match.group() + suffix_tag"""
return prefix_tag + match.group() + suffix_tag
#Replace and return keywords with prefix+keyword+suffix
return compiled_pattern.sub(replace_highlight, text)
def highlight(text, keywords=None, \
prefix_tag='<strong>', suffix_tag="</strong>", whole_word_matches=False):
"""
Returns text with all words highlighted with given tags (this
function places 'prefix_tag' and 'suffix_tag' before and after
words from 'keywords' in 'text').
For example, set prefix_tag='<b style="color: black; background-color: rgb(255, 255, 102);">' and suffix_tag="</b>"
@param text: the text to modify
@param keywords: a list of strings
@param prefix_tag: prefix to insert before each matching part
@param suffix_tag: suffix to insert after each matching part
@param whole_word_matches: whether to match whole words only
@return: highlighted text
"""
if not keywords:
return text
escaped_keywords = []
for k in keywords:
escaped_keywords.append(re.escape(k))
#Build a pattern of the kind keyword1 | keyword2 | keyword3
if whole_word_matches:
pattern = '|'.join(['\\b' + key + '\\b' for key in escaped_keywords])
else:
pattern = '|'.join(escaped_keywords)
compiled_pattern = re.compile(pattern, re.IGNORECASE)
#Replace and return keywords with prefix+keyword+suffix
return highlight_matches(text, compiled_pattern, \
prefix_tag, suffix_tag)
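For illustration, the highlighting logic above can be condensed into a standalone sketch (the name `highlight_sketch` is hypothetical; the module's real function is `highlight` above):

```python
import re

def highlight_sketch(text, keywords, prefix="<strong>", suffix="</strong>",
                     whole_word_matches=False):
    """Minimal standalone sketch of the highlight() logic above."""
    if not keywords:
        return text
    escaped = [re.escape(k) for k in keywords]
    # Build a pattern of the kind keyword1|keyword2|keyword3
    if whole_word_matches:
        pattern = '|'.join(r'\b' + k + r'\b' for k in escaped)
    else:
        pattern = '|'.join(escaped)
    compiled = re.compile(pattern, re.IGNORECASE)
    # Wrap every case-insensitive match with the prefix/suffix tags
    return compiled.sub(lambda m: prefix + m.group() + suffix, text)

print(highlight_sketch("Foo and bar", ["foo"]))
# -> <strong>Foo</strong> and bar
```

Note that `re.escape` makes keywords containing regex metacharacters safe, and the `\b` anchors implement the whole-word mode.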
def get_contextual_content(text, keywords, max_lines=2):
"""
Returns some lines from a text, chosen contextually to the given
keywords.
@param text: the text from which we want to get contextual content
@param keywords: a list of keyword strings ("the context")
@param max_lines: the maximum number of lines to return from the record
@return: a string
"""
def grade_line(text_line, keywords):
"""
Grades a line according to keywords.
grade = number of keywords in the line
"""
grade = 0
for keyword in keywords:
grade += text_line.upper().count(keyword.upper())
return grade
#Grade each line according to the keywords
lines = text.split('.')
#print 'lines: ',lines
weights = [grade_line(line, keywords) for line in lines]
#print 'line weights: ', weights
def grade_region(lines_weight):
"""
Grades a region. A region is a set of consecutive lines.
grade = sum of weights of the lines composing the region
"""
grade = 0
for weight in lines_weight:
grade += weight
return grade
if max_lines > 1:
region_weights = []
for index_weight in range(len(weights)- max_lines + 1):
region_weights.append(grade_region(weights[index_weight:(index_weight+max_lines)]))
weights = region_weights
#print 'region weights: ',weights
#Returns line with maximal weight, and (max_lines - 1) following lines.
index_with_highest_weight = 0
highest_weight = 0
i = 0
for weight in weights:
if weight > highest_weight:
index_with_highest_weight = i
highest_weight = weight
i += 1
#print 'highest weight', highest_weight
if index_with_highest_weight+max_lines > len(lines):
return lines[index_with_highest_weight:]
else:
return lines[index_with_highest_weight:index_with_highest_weight+max_lines]
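The scoring idea above can be sketched in isolation (the name `best_region` is hypothetical): grade each line, then scan windows of consecutive lines and keep the best one.

```python
def best_region(text, keywords, max_lines=2):
    """Sketch of get_contextual_content()'s scoring: split the text on '.',
    grade each line by keyword occurrences, and return the window of
    max_lines consecutive lines with the highest total grade."""
    lines = text.split('.')
    weights = [sum(line.upper().count(k.upper()) for k in keywords)
               for line in lines]
    best_start, best_weight = 0, -1
    for i in range(max(1, len(weights) - max_lines + 1)):
        w = sum(weights[i:i + max_lines])
        if w > best_weight:
            best_start, best_weight = i, w
    return lines[best_start:best_start + max_lines]

print(best_region("no hit. cats here. cats and cats. dogs", ["cats"], max_lines=1))
```

The window scan keeps the first window in case of ties, matching the `>` comparison used in the function above.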
def record_get_xml(recID, format='xm', decompress=zlib.decompress,
on_the_fly=False):
"""
Returns an XML string of the record given by recID.
The function builds the XML directly from the database,
without using the standard formatting process.
'format' allows one to define the flavour of XML:
- 'xm' for standard XML
- 'marcxml' for MARC XML
- 'oai_dc' for OAI Dublin Core
- 'xd' for XML Dublin Core
If the record does not exist, returns an empty string.
If the record is deleted, returns an empty MARCXML (with recid
controlfield, OAI ID fields and 980__c=DELETED)
@param recID: the id of the record to retrieve
@param format: the format to use
@param on_the_fly: if False, try to fetch a pre-created version from the database
@param decompress: the library to use to decompress cache from DB
@return: the xml string of the record
"""
from invenio.legacy.search_engine import record_exists
def get_creation_date(recID, fmt="%Y-%m-%d"):
"Returns the creation date of the record 'recID'."
out = ""
res = run_sql("SELECT DATE_FORMAT(creation_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1)
if res:
out = res[0][0]
return out
def get_modification_date(recID, fmt="%Y-%m-%d"):
"Returns the date of last modification for the record 'recID'."
out = ""
res = run_sql("SELECT DATE_FORMAT(modification_date,%s) FROM bibrec WHERE id=%s", (fmt, recID), 1)
if res:
out = res[0][0]
return out
#_ = gettext_set_language(ln)
out = ""
# sanity check:
record_exist_p = record_exists(recID)
if record_exist_p == 0: # doesn't exist
return out
# print record opening tags, if needed:
if format == "marcxml" or format == "oai_dc":
out += " <record>\n"
out += " <header>\n"
for identifier in get_fieldvalues(recID, CFG_OAI_ID_FIELD):
out += " <identifier>%s</identifier>\n" % identifier
out += " <datestamp>%s</datestamp>\n" % get_modification_date(recID)
out += " </header>\n"
out += " <metadata>\n"
if format.startswith("xm") or format == "marcxml":
res = None
if on_the_fly == False:
# look for cached format existence:
query = """SELECT value FROM bibfmt WHERE
id_bibrec='%s' AND format='%s'""" % (recID, format)
res = run_sql(query, None, 1)
if res and record_exist_p == 1:
# record 'recID' is formatted in 'format', so print it
out += "%s" % decompress(res[0][0])
else:
# record 'recID' is not formatted in 'format' -- they are
# not in "bibfmt" table; so fetch all the data from
# "bibXXx" tables:
if format == "marcxml":
out += """ <record xmlns="http://www.loc.gov/MARC21/slim">\n"""
out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID)
elif format.startswith("xm"):
out += """ <record>\n"""
out += " <controlfield tag=\"001\">%d</controlfield>\n" % int(recID)
if record_exist_p == -1:
# deleted record, so display only OAI ID and 980:
oai_ids = get_fieldvalues(recID, CFG_OAI_ID_FIELD)
if oai_ids:
out += "<datafield tag=\"%s\" ind1=\"%s\" ind2=\"%s\"><subfield code=\"%s\">%s</subfield></datafield>\n" % \
(CFG_OAI_ID_FIELD[0:3],
CFG_OAI_ID_FIELD[3:4],
CFG_OAI_ID_FIELD[4:5],
CFG_OAI_ID_FIELD[5:6],
oai_ids[0])
out += "<datafield tag=\"980\" ind1=\" \" ind2=\" \"><subfield code=\"c\">DELETED</subfield></datafield>\n"
from invenio.legacy.search_engine import get_merged_recid
merged_recid = get_merged_recid(recID)
if merged_recid: # record was deleted but merged to other record, so display this information:
out += "<datafield tag=\"970\" ind1=\" \" ind2=\" \"><subfield code=\"d\">%d</subfield></datafield>\n" % merged_recid
else:
# controlfields
query = "SELECT b.tag,b.value,bb.field_number FROM bib00x AS b, bibrec_bib00x AS bb "\
"WHERE bb.id_bibrec='%s' AND b.id=bb.id_bibxxx AND b.tag LIKE '00%%' "\
"ORDER BY bb.field_number, b.tag ASC" % recID
res = run_sql(query)
for row in res:
field, value = row[0], row[1]
value = encode_for_xml(value)
out += """ <controlfield tag="%s">%s</controlfield>\n""" % \
(encode_for_xml(field[0:3]), value)
# datafields
i = 1 # Do not process bib00x and bibrec_bib00x, as
# they are controlfields. So start at bib01x and
# bibrec_bib01x (and set i = 0 at the end of
# the first loop iteration)
for digit1 in range(0, 10):
for digit2 in range(i, 10):
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\
"WHERE bb.id_bibrec='%s' AND b.id=bb.id_bibxxx AND b.tag LIKE '%s%%' "\
"ORDER BY bb.field_number, b.tag ASC" % (bx,
bibx,
recID,
str(digit1)+str(digit2))
res = run_sql(query)
field_number_old = -999
field_old = ""
for row in res:
field, value, field_number = row[0], row[1], row[2]
ind1, ind2 = field[3], field[4]
if ind1 == "_" or ind1 == "":
ind1 = " "
if ind2 == "_" or ind2 == "":
ind2 = " "
# print field tag
if field_number != field_number_old or \
field[:-1] != field_old[:-1]:
if field_number_old != -999:
out += """ </datafield>\n"""
out += """ <datafield tag="%s" ind1="%s" ind2="%s">\n""" % \
(encode_for_xml(field[0:3]),
encode_for_xml(ind1),
encode_for_xml(ind2))
field_number_old = field_number
field_old = field
# print subfield value
value = encode_for_xml(value)
out += """ <subfield code="%s">%s</subfield>\n""" % \
(encode_for_xml(field[-1:]), value)
# all fields/subfields printed in this run, so close the tag:
if field_number_old != -999:
out += """ </datafield>\n"""
i = 0 # Next loop should start looking at bibX0x and bibrec_bibX0x
# we are at the end of printing the record:
out += " </record>\n"
elif format == "xd" or format == "oai_dc":
# XML Dublin Core format, possibly OAI -- select only some bibXXx fields:
out += """ <dc xmlns="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://purl.org/dc/elements/1.1/
http://www.openarchives.org/OAI/1.1/dc.xsd">\n"""
if record_exist_p == -1:
out += ""
else:
for f in get_fieldvalues(recID, "041__a"):
out += " <language>%s</language>\n" % f
for f in get_fieldvalues(recID, "100__a"):
out += " <creator>%s</creator>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "700__a"):
out += " <creator>%s</creator>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "245__a"):
out += " <title>%s</title>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "65017a"):
out += " <subject>%s</subject>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "8564_u"):
out += " <identifier>%s</identifier>\n" % encode_for_xml(f)
for f in get_fieldvalues(recID, "520__a"):
out += " <description>%s</description>\n" % encode_for_xml(f)
out += " <date>%s</date>\n" % get_creation_date(recID)
out += " </dc>\n"
# print record closing tags, if needed:
if format == "marcxml" or format == "oai_dc":
out += " </metadata>\n"
out += " </record>\n"
return out
def parse_tag(tag):
"""
Parse a MARC code and decompose it into a list with:
0-tag 1-indicator1 2-indicator2 3-subfield
The first 3 chars always correspond to the tag.
The indicators are optional; however, they must either both be indicated or both be omitted.
If indicators are omitted or indicated with underscore '_', they mean "no indicator".
An indicator marked as whitespace is also equivalent to "no indicator".
The subfield is optional. It can optionally be preceded by a dot '.', or by '$$' or '$'.
Any of the chars can be replaced by the wildcard %.
THE FUNCTION DOES NOT CHECK THE WELL-FORMEDNESS OF 'tag'.
Empty (space) chars are ignored.
For example:
>> parse_tag('245COc') = ['245', 'C', 'O', 'c']
>> parse_tag('245C_c') = ['245', 'C', '', 'c']
>> parse_tag('245__c') = ['245', '', '', 'c']
>> parse_tag('245__$$c') = ['245', '', '', 'c']
>> parse_tag('245__$c') = ['245', '', '', 'c']
>> parse_tag('245 $c') = ['245', '', '', 'c']
>> parse_tag('245 $$c') = ['245', '', '', 'c']
>> parse_tag('245__.c') = ['245', '', '', 'c']
>> parse_tag('245 .c') = ['245', '', '', 'c']
>> parse_tag('245C_$c') = ['245', 'C', '', 'c']
>> parse_tag('245CO$$c') = ['245', 'C', 'O', 'c']
>> parse_tag('245C_.c') = ['245', 'C', '', 'c']
>> parse_tag('245$c') = ['245', '', '', 'c']
>> parse_tag('245.c') = ['245', '', '', 'c']
>> parse_tag('245$$c') = ['245', '', '', 'c']
>> parse_tag('245__%') = ['245', '', '', '%']
>> parse_tag('245__$$%') = ['245', '', '', '%']
>> parse_tag('245__$%') = ['245', '', '', '%']
>> parse_tag('245 $%') = ['245', '', '', '%']
>> parse_tag('245 $$%') = ['245', '', '', '%']
>> parse_tag('245$%') = ['245', '', '', '%']
>> parse_tag('245.%') = ['245', '', '', '%']
>> parse_tag('245$$%') = ['245', '', '', '%']
>> parse_tag('2%5$$a') = ['2%5', '', '', 'a']
@param tag: tag to parse
@return: a canonical form of the input X{tag}
"""
p_tag = ['', '', '', ''] # tag, ind1, ind2, code
tag = tag.replace(" ", "") # Remove empty characters
tag = tag.replace("$", "") # Remove $ characters
tag = tag.replace(".", "") # Remove . characters
#tag = tag.replace("_", "") # Remove _ characters
p_tag[0] = tag[0:3] # tag
if len(tag) == 4:
p_tag[3] = tag[3] # subfield
elif len(tag) == 5:
ind1 = tag[3] # indicator 1
if ind1 != "_":
p_tag[1] = ind1
ind2 = tag[4] # indicator 2
if ind2 != "_":
p_tag[2] = ind2
elif len(tag) == 6:
p_tag[3] = tag[5] # subfield
ind1 = tag[3] # indicator 1
if ind1 != "_":
p_tag[1] = ind1
ind2 = tag[4] # indicator 2
if ind2 != "_":
p_tag[2] = ind2
return p_tag
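The normalization steps above can be mirrored in a compact standalone sketch (hypothetical name `parse_tag_sketch`): strip the separator characters, then interpret the remainder by length.

```python
def parse_tag_sketch(tag):
    """Standalone sketch of parse_tag(): strip separators, then split the
    remaining chars into tag / indicators / subfield by length."""
    p = ['', '', '', '']  # tag, ind1, ind2, subfield code
    tag = tag.replace(' ', '').replace('$', '').replace('.', '')
    p[0] = tag[0:3]
    if len(tag) == 4:            # e.g. '245c': subfield only
        p[3] = tag[3]
    elif len(tag) == 5:          # e.g. '245CO': indicators only
        p[1] = tag[3] if tag[3] != '_' else ''
        p[2] = tag[4] if tag[4] != '_' else ''
    elif len(tag) == 6:          # e.g. '245COc': indicators + subfield
        p[3] = tag[5]
        p[1] = tag[3] if tag[3] != '_' else ''
        p[2] = tag[4] if tag[4] != '_' else ''
    return p

print(parse_tag_sketch('245CO.c'))
# -> ['245', 'C', 'O', 'c']
```

Note that the wildcard '%' survives the stripping, which is why it ends up in the subfield slot, as the unit tests earlier in this diff expect.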
def get_all_fieldvalues(recID, tags_in):
"""
Returns list of values that belong to fields in tags_in for record
with given recID.
Note that when a partial 'tags_in' is specified (eg. '100__'), the
subfields of all corresponding datafields are returned all 'mixed'
together. Eg. with::
123 100__ $a Ellis, J $u CERN
123 100__ $a Smith, K
>>> get_all_fieldvalues(123, '100__')
['Ellis, J', 'CERN', 'Smith, K']
@param recID: record ID to consider
@param tags_in: list of tags to retrieve
@return: a list of values corresponding to X{tags_in} found in X{recID}
"""
out = []
if type(tags_in) is not list:
tags_in = [tags_in, ]
dict_of_tags_out = {}
if not tags_in:
for i in range(0, 10):
for j in range(0, 10):
dict_of_tags_out["%d%d%%" % (i, j)] = '%'
else:
for tag in tags_in:
if len(tag) == 0:
for i in range(0, 10):
for j in range(0, 10):
dict_of_tags_out["%d%d%%" % (i, j)] = '%'
elif len(tag) == 1:
for j in range(0, 10):
dict_of_tags_out["%s%d%%" % (tag, j)] = '%'
elif len(tag) <= 5:
dict_of_tags_out["%s%%" % tag] = '%'
else:
dict_of_tags_out[tag[0:5]] = tag[5:6]
tags_out = dict_of_tags_out.keys()
tags_out.sort()
# search all bibXXx tables as needed:
for tag in tags_out:
digits = tag[0:2]
try:
intdigits = int(digits)
if intdigits < 0 or intdigits > 99:
raise ValueError
except ValueError:
# invalid tag value asked for
continue
bx = "bib%sx" % digits
bibx = "bibrec_bib%sx" % digits
query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\
"WHERE bb.id_bibrec=%%s AND b.id=bb.id_bibxxx AND b.tag LIKE %%s"\
"ORDER BY bb.field_number, b.tag ASC" % (bx, bibx)
res = run_sql(query, (recID, str(tag)+dict_of_tags_out[tag]))
# go through fields:
for row in res:
field, value, field_number = row[0], row[1], row[2]
out.append(value)
return out
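The tag-expansion step above (mapping possibly partial MARC tags to SQL LIKE patterns plus a subfield wildcard) can be sketched on its own (hypothetical name `expand_tags`):

```python
def expand_tags(tags_in):
    """Sketch of get_all_fieldvalues()'s tag expansion: map each (possibly
    partial) MARC tag to a SQL LIKE pattern plus a subfield wildcard."""
    out = {}
    for tag in tags_in:
        if len(tag) == 0:        # no tag: match every bibXXx table
            for i in range(10):
                for j in range(10):
                    out["%d%d%%" % (i, j)] = '%'
        elif len(tag) == 1:      # single digit: match the ten bibNXx tables
            for j in range(10):
                out["%s%d%%" % (tag, j)] = '%'
        elif len(tag) <= 5:      # partial tag: any subfield
            out["%s%%" % tag] = '%'
        else:                    # full tag + subfield code
            out[tag[0:5]] = tag[5:6]
    return out

print(expand_tags(['100__a']))
# -> {'100__': 'a'}
```

Each key later selects rows via `b.tag LIKE key` and the value restricts the subfield code, as in the query built above.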
re_bold_latex = re.compile('\$?\\\\textbf\{(?P<content>.*?)\}\$?')
re_emph_latex = re.compile('\$?\\\\emph\{(?P<content>.*?)\}\$?')
re_generic_start_latex = re.compile('\$?\\\\begin\{(?P<content>.*?)\}\$?')
re_generic_end_latex = re.compile('\$?\\\\end\{(?P<content>.*?)\}\$?')
re_verbatim_env_latex = re.compile('\\\\begin\{verbatim.*?\}(?P<content>.*?)\\\\end\{verbatim.*?\}')
def latex_to_html(text):
"""
Do some basic interpretation of LaTeX input. Gives some nice
results when used in combination with MathJax.
@param text: input "LaTeX" markup to interpret
@return: a representation of input LaTeX more suitable for HTML
"""
# Process verbatim environment first
def make_verbatim(match_obj):
"""Replace all possible special chars by HTML character
entities, so that they are not interpreted by further commands"""
return '<br/><pre class="tex2math_ignore">' + \
string_to_numeric_char_reference(match_obj.group('content')) + \
'</pre><br/>'
text = re_verbatim_env_latex.sub(make_verbatim, text)
# Remove trailing "line breaks"
text = text.strip('\\\\')
# Process special characters
text = text.replace("\\%", "%")
text = text.replace("\\#", "#")
text = text.replace("\\$", "$")
text = text.replace("\\&", "&amp;")
text = text.replace("\\{", "{")
text = text.replace("\\}", "}")
text = text.replace("\\_", "_")
text = text.replace("\\^{} ", "^")
text = text.replace("\\~{} ", "~")
text = text.replace("\\textregistered", "&#0174;")
text = text.replace("\\copyright", "&#0169;")
text = text.replace("\\texttrademark", "&#0153; ")
# Remove commented lines and join lines
text = '\\\\'.join([line for line in text.split('\\\\') \
if not line.lstrip().startswith('%')])
# Line breaks
text = text.replace('\\\\', '<br/>')
# Non-breakable spaces
text = text.replace('~', '&nbsp;')
# Styled text
def make_bold(match_obj):
"Make the found pattern bold"
# FIXME: check if it is valid to have this inside a formula
return '<b>' + match_obj.group('content') + '</b>'
text = re_bold_latex.sub(make_bold, text)
def make_emph(match_obj):
"Make the found pattern emphasized"
# FIXME: for the moment, remove as it could cause problem in
# the case it is used in a formula. To be check if it is valid.
return ' ' + match_obj.group('content') + ''
text = re_emph_latex.sub(make_emph, text)
# Lists
text = text.replace('\\begin{enumerate}', '<ol>')
text = text.replace('\\end{enumerate}', '</ol>')
text = text.replace('\\begin{itemize}', '<ul>')
text = text.replace('\\end{itemize}', '</ul>')
text = text.replace('\\item', '<li>')
# Remove remaining non-processed tags
text = re_generic_start_latex.sub('', text)
text = re_generic_end_latex.sub('', text)
return text
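A tiny self-contained sketch of the same interpretation idea follows; its coverage is deliberately much smaller than the function above and is illustrative only:

```python
import re

def latex_to_html_sketch(text):
    """Sketch of latex_to_html(): unescape specials, drop commented lines,
    honour '\\\\' line breaks, and map \\textbf{} and itemize to HTML."""
    # Unescape common special characters
    for src, dst in [("\\%", "%"), ("\\#", "#"), ("\\&", "&amp;"),
                     ("\\{", "{"), ("\\}", "}"), ("\\_", "_")]:
        text = text.replace(src, dst)
    # Drop %-commented lines, then turn '\\' line breaks into <br/>
    text = '\\\\'.join(line for line in text.split('\\\\')
                       if not line.lstrip().startswith('%'))
    text = text.replace('\\\\', '<br/>')
    # \textbf{...} (optionally inside $...$) becomes <b>...</b>
    text = re.sub(r'\$?\\textbf\{(.*?)\}\$?', r'<b>\1</b>', text)
    # Simple list environment
    text = text.replace('\\begin{itemize}', '<ul>')
    text = text.replace('\\end{itemize}', '</ul>')
    text = text.replace('\\item', '<li>')
    return text
```

As in the full function, this is plain string rewriting, not a LaTeX parser; nesting and math environments are left to MathJax on the client side.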
def get_pdf_snippets(recID, patterns, user_info):
"""
Extract text snippets around 'patterns' from the newest PDF file of 'recID'
The search is case-insensitive.
The snippets are meant to look like in the results of the popular search
engine: using " ... " between snippets.
For empty patterns it returns ""
@param recID: record ID to consider
@param patterns: list of patterns to look for
@param user_info: user information, used to check fulltext access authorization
@return: snippet
"""
- from invenio.bibdocfile import BibRecDocs, check_bibdoc_authorization
+ from invenio.legacy.bibdocfile.api import BibRecDocs, check_bibdoc_authorization
text_path = ""
text_path_courtesy = ""
for bd in BibRecDocs(recID).list_bibdocs():
# Show excluded fulltext in snippets on Inspire, otherwise depending on authorization
if bd.get_text() and (CFG_INSPIRE_SITE or not check_bibdoc_authorization(user_info, bd.get_status())[0]):
text_path = bd.get_text_path()
text_path_courtesy = bd.get_status()
if CFG_INSPIRE_SITE and not text_path_courtesy:
# get courtesy from doctype, since docstatus was empty:
text_path_courtesy = bd.get_type()
if text_path_courtesy == 'INSPIRE-PUBLIC':
# but ignore 'INSPIRE-PUBLIC' doctype
text_path_courtesy = ''
break # stop at the first good PDF textable file
nb_chars = CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS.get('', 0)
max_snippets = CFG_WEBSEARCH_FULLTEXT_SNIPPETS.get('', 0)
if CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS.has_key(text_path_courtesy):
nb_chars=CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS[text_path_courtesy]
if CFG_WEBSEARCH_FULLTEXT_SNIPPETS.has_key(text_path_courtesy):
max_snippets=CFG_WEBSEARCH_FULLTEXT_SNIPPETS[text_path_courtesy]
if text_path and nb_chars and max_snippets:
out = ''
if CFG_WEBSEARCH_FULLTEXT_SNIPPETS_GENERATOR == 'native':
out = get_text_snippets(text_path, patterns, nb_chars, max_snippets)
if not out:
# no hit, so check stemmed versions:
- from invenio.bibindex_engine_stemmer import stem
+ from invenio.legacy.bibindex.engine_stemmer import stem
stemmed_patterns = [stem(p, 'en') for p in patterns]
out = get_text_snippets(text_path, stemmed_patterns, nb_chars, max_snippets)
elif CFG_WEBSEARCH_FULLTEXT_SNIPPETS_GENERATOR == 'SOLR':
- from invenio.solrutils_bibindex_searcher import solr_get_snippet
+ from invenio.legacy.miscutil.solrutils_bibindex_searcher import solr_get_snippet
out = solr_get_snippet(patterns, recID, nb_chars, max_snippets)
if out:
out_courtesy = ""
if CFG_INSPIRE_SITE and text_path_courtesy:
out_courtesy = '<strong>Snippets courtesy of ' + text_path_courtesy + '</strong><br>'
return '%s%s' % (out_courtesy, out)
else:
return ""
else:
return ""
def get_text_snippets(textfile_path, patterns, nb_chars, max_snippets):
"""
Extract text snippets around 'patterns' from the file found at
'textfile_path'. The snippets are meant to look similar to results of
popular Internet search engines: using " ... " between snippets.
For empty patterns it returns ""
"""
"""
TODO: - distinguish the beginning of sentences and make the snippets
start there
- optimize finding patterns - first search for patterns apperaing next
to each other, secondly look for each patten not for first
occurances of any pattern
"""
if len(patterns) == 0:
return ""
max_lines = nb_chars / 40 + 2 # rule of thumb in order to catch nb_chars
# Produce the big snippets from which the real snippets will be cut out
cmd = "grep -i -C%s -m%s"
cmdargs = [str(max_lines), str(max_snippets)]
for p in patterns:
cmd += " -e %s"
cmdargs.append(" " + p)
cmd += " %s"
cmdargs.append(textfile_path)
(dummy1, output, dummy2) = run_shell_command(cmd, cmdargs)
# a fact to keep in mind with this call to grep is that if patterns appear
# in two contiguous lines, they will not be separated by '--' and therefore
# treated as one 'big snippet'
result = []
big_snippets = output.split("--")
# cut the snippets to match the nb_words_around parameter precisely:
for s in big_snippets:
small_snippet = cut_out_snippet(s, patterns, nb_chars)
result.append(small_snippet)
# combine snippets
out = ""
count = 0
for snippet in result:
if snippet and count < max_snippets:
if out:
out += "..."
out += highlight(snippet, patterns, whole_word_matches=True)
return out
def words_start_with_patterns(words, patterns):
"""
Check whether the first word's beginning matches any of the patterns.
The second argument is an array of patterns to match.
@return: a tuple (matched, extra_term_count), where extra_term_count is
the number of additional words consumed by a matched phrase pattern
"""
ret = False
for p in patterns:
# Phrase handling
if ' ' in p:
phrase = p
phrase_terms = p.split()
additional_term_count = len(phrase_terms) - 1
possible_match = ' '.join(words[:additional_term_count + 1])
if possible_match.lower() == phrase.lower():
return True, additional_term_count
else:
lower_case = words[0].lower()
if lower_case.startswith(str(p).lower()):
ret = True
break
return ret, 0
def cut_out_snippet(text, patterns, nb_chars):
"""
Cut out one snippet in such a way that it includes at most nb_chars or
a few more chars until the end of last word.
The snippet can include:
- one pattern and "symmetrical" context
- several patterns as long as they fit into the nb_chars limit (context
is always "symmetrical")
"""
# TODO: cut at begin or end of sentence
words = text.split()
snippet, start, finish = cut_out_snippet_core_creation(words, patterns, nb_chars)
return cut_out_snippet_wrap(snippet, words, start, finish, nb_chars)
def cut_out_snippet_core_creation(words, patterns, nb_chars):
""" Stage 1:
Creating the snipper core starts and finishes with a matched pattern
The idea is to find a pattern occurance, then go on creating a suffix until
the next pattern is found. Then the suffix is added to the snippet
unless the loop brakes before due to suffix being to long.
"""
snippet = ""
suffix = ""
i = 0
start = -1 # start is an index of the first matched pattern
finish = -1 # is an index of the last matched pattern
# in this loop, the snippet core always starts and finishes with a matched pattern
while i < len(words) and len(snippet) + len(suffix) < nb_chars:
word_matched_p, additional_term_count = words_start_with_patterns(words[i:], patterns)
# the first pattern has not been matched yet
if len(snippet) == 0:
# first occurrence of a pattern
if word_matched_p:
start = i
suffix = ""
if not additional_term_count:
snippet = words[i]
finish = i
else:
snippet = ' '.join(words[i:i + additional_term_count + 1])
finish = i + additional_term_count
else:
if word_matched_p:
if not additional_term_count:
# there is enough room for this pattern in the snippet because,
# up to the previous word, the snippet was shorter than nb_chars
snippet += suffix + " " + words[i] # suffix starts with a space
finish = i
else:
snippet += suffix + " " + ' '.join(words[i:i + additional_term_count + 1]) # suffix starts with a space
finish = i + additional_term_count
suffix = ""
else:
suffix += " " + words[i]
i += 1 + additional_term_count
return snippet, start, finish
def cut_out_snippet_wrap(snippet, words, start, finish, nb_chars):
""" Stage 2: Wrap the snippet core symetrically up to the nb_chars
if snippet is non-empty, then start and finish will be set before
"""
front = True
while 0 < len(snippet) < nb_chars:
if front and start == 0:
front = False
else:
if not front and finish == len(words) - 1:
front = True
if start == 0 and finish == len(words) - 1:
break
if front:
snippet = words[start - 1] + " " + snippet
start -= 1
front = False
else:
snippet += " " + words[finish + 1]
finish += 1
front = True
return snippet
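The two-stage idea above (build a core that starts and ends on a match, then pad context alternately at the front and the back until the character budget is spent) can be illustrated with a minimal, self-contained sketch. The function name and the simplified single-match handling here are illustrative only, not part of the Invenio API:

```python
def symmetric_snippet(words, match_index, nb_chars):
    """Grow a snippet around words[match_index] up to roughly nb_chars,
    alternating between prepending and appending one word at a time."""
    snippet = words[match_index]
    start = finish = match_index
    front = True
    while len(snippet) < nb_chars:
        if front and start > 0:
            start -= 1
            snippet = words[start] + " " + snippet
        elif not front and finish < len(words) - 1:
            finish += 1
            snippet = snippet + " " + words[finish]
        elif start == 0 and finish == len(words) - 1:
            break  # both sides exhausted
        front = not front
    return snippet

words = "the quick brown fox jumps over the lazy dog".split()
print(symmetric_snippet(words, 4, 25))  # -> "quick brown fox jumps over the"
```

Like `cut_out_snippet_wrap`, the loop simply toggles sides each iteration, so an exhausted side is skipped and growth continues on the other side until the budget or both boundaries are reached.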
diff --git a/invenio/modules/linkbacks/testsuite/test_linkbacks.py b/invenio/modules/linkbacks/testsuite/test_linkbacks.py
index dcb9628de..ca3d28ff8 100644
--- a/invenio/modules/linkbacks/testsuite/test_linkbacks.py
+++ b/invenio/modules/linkbacks/testsuite/test_linkbacks.py
@@ -1,151 +1,151 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebLinkback - Unit Test Suite"""
import datetime
from invenio.base.wrappers import lazy_import
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
split_in_days = lazy_import('invenio.weblinkback:split_in_days')
-from invenio.weblinkback_config import CFG_WEBLINKBACK_TYPE, CFG_WEBLINKBACK_STATUS
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_TYPE, CFG_WEBLINKBACK_STATUS
class TestSplitLinkbacksInInsertionDayGroups(InvenioTestCase):
"""Test for splitting linkbacks in insertion day groups"""
def setUp(self):
# [(linkback_id, origin_url, recid, additional_properties, linkback_type, linkback_status, insert_time)]
self.test_data = ((23L, 'URL', 42, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 21, 0, 1, 40)),
(22L, 'URL', 41, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 20, 0, 0, 51)),
(21L, 'URL', 42, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 20, 0, 0, 42)),
(18L, 'URL', 42, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 20, 0, 0, 41)),
(16L, 'URL', 41, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 16, 22, 44, 41)),
(15L, 'URL', 41, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 16, 22, 43, 19)),
(14L, 'URL', 42, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 14, 22, 43, 18)),
(12L, 'URL', 41, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 13, 22, 43, 14)),
(11L, 'URL', 42, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 13, 22, 32, 43)),
(10L, 'URL', 41, None, CFG_WEBLINKBACK_TYPE['TRACKBACK'], CFG_WEBLINKBACK_STATUS['APPROVED'], datetime.datetime(2011, 10, 10, 21, 28, 48)))
def test_no_linkbacks(self):
"""weblinkback - no linkbacks (edge case test)"""
result = split_in_days(())
self.assertEqual(0, len(result))
def test_one_linkback(self):
"""weblinkback - one linkback (edge case test)"""
test_data = self.test_data[0:1]
result = split_in_days(test_data)
self.assertEqual(1, len(result))
self.assertEqual(1, len(result[0]))
self.assertEqual(self.test_data[0], result[0][0])
def test_all_same_day(self):
"""weblinkback - all linkbacks of the same day (edge case test)"""
test_data = self.test_data[1:4]
result = split_in_days(test_data)
self.assertEqual(1, len(result))
self.assertEqual(3, len(result[0]))
self.assertEqual(self.test_data[1], result[0][0])
self.assertEqual(self.test_data[2], result[0][1])
self.assertEqual(self.test_data[3], result[0][2])
def test_multiple_days(self):
"""weblinkback - linkbacks of different days"""
test_data = self.test_data[0:11]
result = split_in_days(test_data)
# Group count
self.assertEqual(6, len(result))
# First group
self.assertEqual(1, len(result[0]))
self.assertEqual(self.test_data[0], result[0][0])
# Second group
self.assertEqual(3, len(result[1]))
self.assertEqual(self.test_data[1], result[1][0])
self.assertEqual(self.test_data[2], result[1][1])
self.assertEqual(self.test_data[3], result[1][2])
# Third group
self.assertEqual(2, len(result[2]))
self.assertEqual(self.test_data[4], result[2][0])
self.assertEqual(self.test_data[5], result[2][1])
# Fourth group
self.assertEqual(1, len(result[3]))
self.assertEqual(self.test_data[6], result[3][0])
# Fifth group
self.assertEqual(2, len(result[4]))
self.assertEqual(self.test_data[7], result[4][0])
self.assertEqual(self.test_data[8], result[4][1])
# Sixth group
self.assertEqual(1, len(result[5]))
self.assertEqual(self.test_data[9], result[5][0])
def test_multiple_days_reversed(self):
"""weblinkback - linkbacks of different days in reversed order"""
# Reverse test data
test_data = list(self.test_data[0:11])
test_data.reverse()
test_data_reversed = tuple(test_data)
result = split_in_days(test_data_reversed)
# Group count
self.assertEqual(6, len(result))
# First group
self.assertEqual(1, len(result[0]))
self.assertEqual(self.test_data[9], result[0][0])
# Second group
self.assertEqual(2, len(result[1]))
self.assertEqual(self.test_data[8], result[1][0])
self.assertEqual(self.test_data[7], result[1][1])
# Third group
self.assertEqual(1, len(result[2]))
self.assertEqual(self.test_data[6], result[2][0])
# Fourth group
self.assertEqual(2, len(result[3]))
self.assertEqual(self.test_data[5], result[3][0])
self.assertEqual(self.test_data[4], result[3][1])
# Fifth group
self.assertEqual(3, len(result[4]))
self.assertEqual(self.test_data[3], result[4][0])
self.assertEqual(self.test_data[2], result[4][1])
self.assertEqual(self.test_data[1], result[4][2])
# Sixth group
self.assertEqual(1, len(result[5]))
self.assertEqual(self.test_data[0], result[5][0])
TEST_SUITE = make_test_suite(TestSplitLinkbacksInInsertionDayGroups)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
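The grouping these tests exercise can be sketched compactly: walk the linkback tuples in order and start a new group whenever the insert time (the last tuple element) falls on a different calendar day than the previous entry. This is a minimal illustration of the expected behavior, not the actual `split_in_days` implementation that the suite lazy-imports:

```python
import datetime

def group_by_insertion_day(linkbacks):
    """Group consecutive linkback tuples by the calendar day of their
    insert time (assumed to be the last element of each tuple)."""
    groups = []
    last_day = None
    for lb in linkbacks:
        day = lb[-1].date()
        if day != last_day:
            groups.append([])
            last_day = day
        groups[-1].append(lb)
    return groups

rows = [
    (23, 'URL', 42, datetime.datetime(2011, 10, 21, 0, 1, 40)),
    (22, 'URL', 41, datetime.datetime(2011, 10, 20, 0, 0, 51)),
    (21, 'URL', 42, datetime.datetime(2011, 10, 20, 0, 0, 42)),
]
print([len(g) for g in group_by_insertion_day(rows)])  # -> [1, 2]
```

Because only *consecutive* same-day entries are merged, reversing the input reverses the group order without changing the group sizes, which is exactly what `test_multiple_days_reversed` asserts.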
diff --git a/invenio/modules/linkbacks/views.py b/invenio/modules/linkbacks/views.py
index c423404d3..2138cd976 100644
--- a/invenio/modules/linkbacks/views.py
+++ b/invenio/modules/linkbacks/views.py
@@ -1,73 +1,73 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSearch Flask Blueprint"""
from flask import Blueprint, render_template, Response
from invenio.base.decorators import wash_arguments
from invenio.ext.sqlalchemy import db
#from invenio.base.i18n import _
from flask.ext.login import current_user
from .models import LnkENTRY
from invenio.config import CFG_SITE_RECORD, \
CFG_WEBLINKBACK_TRACKBACK_ENABLED
-from invenio.weblinkback_config import CFG_WEBLINKBACK_TYPE, \
+from invenio.legacy.weblinkback.config import CFG_WEBLINKBACK_TYPE, \
CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME, \
CFG_WEBLINKBACK_STATUS, \
CFG_WEBLINKBACK_ORDER_BY_INSERTION_TIME, \
CFG_WEBLINKBACK_LIST_TYPE, \
CFG_WEBLINKBACK_TRACKBACK_SUBSCRIPTION_ERROR_MESSAGE, \
CFG_WEBLINKBACK_PAGE_TITLE_STATUS, \
CFG_WEBLINKBACK_BROKEN_COUNT
blueprint = Blueprint('weblinkback', __name__, url_prefix="/"+CFG_SITE_RECORD,
template_folder='templates', static_folder='static')
from invenio.modules.records.views import request_record
@blueprint.route('/<int:recid>/linkbacks2', methods=['GET', 'POST'])
@request_record
def index(recid):
linkbacks = LnkENTRY.query.filter(db.and_(
LnkENTRY.id_bibrec == recid,
LnkENTRY.status == CFG_WEBLINKBACK_STATUS['APPROVED']
)).all()
return render_template('linkbacks/index.html',
linkbacks=linkbacks)
@blueprint.route('/<int:recid>/sendtrackback', methods=['GET', 'POST'])
@request_record
@wash_arguments({'url': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'title': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'excerpt': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'blog_name': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'id': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME),
'source': (unicode, CFG_WEBLINKBACK_SUBSCRIPTION_DEFAULT_ARGUMENT_NAME)})
def sendtrackback(recid, url, title, excerpt, blog_name, id, source):
- from invenio.weblinkback import perform_sendtrackback, perform_sendtrackback_disabled
+ from invenio.legacy.weblinkback.api import perform_sendtrackback, perform_sendtrackback_disabled
mime_type = 'text/xml; charset=utf-8'
if CFG_WEBLINKBACK_TRACKBACK_ENABLED:
xml_response, status = perform_sendtrackback(recid, url, title, excerpt, blog_name, id, source, current_user)
else:
xml_response, status = perform_sendtrackback_disabled()
return Response(response=xml_response, status=status, mimetype=mime_type)
diff --git a/invenio/modules/records/views.py b/invenio/modules/records/views.py
index d087482f0..a43acc77b 100644
--- a/invenio/modules/records/views.py
+++ b/invenio/modules/records/views.py
@@ -1,217 +1,217 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSearch Flask Blueprint"""
from functools import wraps
from flask import g, render_template, request, flash, redirect, url_for, \
current_app, abort, Blueprint
from flask.ext.login import current_user
from invenio.base.decorators import wash_arguments
from invenio.base.globals import cfg
from invenio.config import CFG_SITE_RECORD
from invenio.ext.template.context_processor import \
register_template_context_processor
from invenio.modules.search.models import Collection
from invenio.modules.search.signals import record_viewed
from invenio.modules.record_editor.models import Bibrec
from invenio.base.i18n import _
from invenio.utils import apache
from invenio.ext.breadcrumb import default_breadcrumb_root
blueprint = Blueprint('record', __name__, url_prefix="/"+CFG_SITE_RECORD,
template_folder='templates', static_folder='static')
default_breadcrumb_root(blueprint, '.')
def request_record(f):
@wraps(f)
def decorated(recid, *args, **kwargs):
from invenio.modules.access.mailcookie import \
mail_cookie_create_authorize_action
from invenio.modules.access.local_config import VIEWRESTRCOLL
from invenio.legacy.search_engine import guess_primary_collection_of_a_record, \
check_user_can_view_record
- from invenio.websearchadminlib import get_detailed_page_tabs,\
+ from invenio.legacy.websearch.adminlib import get_detailed_page_tabs,\
get_detailed_page_tabs_counts
# ensure recid to be integer
recid = int(recid)
g.collection = collection = Collection.query.filter(
Collection.name == guess_primary_collection_of_a_record(recid)).\
one()
(auth_code, auth_msg) = check_user_can_view_record(current_user, recid)
# only superadmins can use verbose parameter for obtaining debug information
if not current_user.is_super_admin and 'verbose' in kwargs:
kwargs['verbose'] = 0
if auth_code and current_user.is_guest:
cookie = mail_cookie_create_authorize_action(VIEWRESTRCOLL, {
'collection': g.collection.name})
url_args = {'action': cookie, 'ln': g.ln, 'referer': request.url}
flash(_("Authorization failure"), 'error')
return redirect(url_for('webaccount.login', **url_args))
elif auth_code:
flash(auth_msg, 'error')
abort(apache.HTTP_UNAUTHORIZED)
from invenio.legacy.bibfield import get_record
from invenio.legacy.search_engine import record_exists, get_merged_recid
# check whether the current record has been deleted
# and merged, in which case the deleted record
# is redirected to the new one
record_status = record_exists(recid)
merged_recid = get_merged_recid(recid)
if record_status == -1 and merged_recid:
return redirect(url_for('record.metadata', recid=merged_recid))
elif record_status == -1:
abort(apache.HTTP_GONE) # The record is gone!
g.bibrec = Bibrec.query.get(recid)
record = get_record(recid)
title = record.get('title.title', '')
# b = [(_('Home'), '')] + collection.breadcrumbs()[1:]
# b += [(title, 'record.metadata', dict(recid=recid))]
# current_app.config['breadcrumbs_map'][request.endpoint] = b
g.record_tab_keys = []
tabs = []
counts = get_detailed_page_tabs_counts(recid)
for k, v in get_detailed_page_tabs(collection.id, recid,
g.ln).iteritems():
t = {}
b = 'record'
if k == '':
k = 'metadata'
if k == 'comments' or k == 'reviews':
b = 'comments'
if k == 'linkbacks':
b = 'weblinkback'
k = 'index'
t['key'] = b + '.' + k
t['count'] = counts.get(k.capitalize(), -1)
t.update(v)
tabs.append(t)
if v['visible']:
g.record_tab_keys.append(b+'.'+k)
if cfg.get('CFG_WEBLINKBACK_TRACKBACK_ENABLED'):
@register_template_context_processor
def trackback_context():
- from invenio.weblinkback_templates import get_trackback_auto_discovery_tag
+ from invenio.legacy.weblinkback.templates import get_trackback_auto_discovery_tag
return dict(headerLinkbackTrackbackLink=get_trackback_auto_discovery_tag(recid))
def _format_record(recid, of='hd', user_info=current_user, *args, **kwargs):
from invenio.legacy.search_engine import print_record
return print_record(recid, format=of, user_info=user_info, *args, **kwargs)
@register_template_context_processor
def record_context():
from invenio.modules.comments.api import get_mini_reviews
return dict(recid=recid,
record=record,
tabs=tabs,
title=title,
get_mini_reviews=lambda *args, **kwargs:
get_mini_reviews(*args, **kwargs).decode('utf8'),
collection=collection,
format_record=_format_record
)
return f(recid, *args, **kwargs)
return decorated
@blueprint.route('/<int:recid>/metadata', methods=['GET', 'POST'])
@blueprint.route('/<int:recid>/', methods=['GET', 'POST'])
@blueprint.route('/<int:recid>', methods=['GET', 'POST'])
@wash_arguments({'of': (unicode, 'hd')})
@request_record
def metadata(recid, of='hd'):
from invenio.legacy.bibrank.downloads_similarity import register_page_view_event
from invenio.modules.formatter import get_output_format_content_type
register_page_view_event(recid, current_user.get_id(), str(request.remote_addr))
if get_output_format_content_type(of) != 'text/html':
return redirect('/%s/%d/export/%s' % (CFG_SITE_RECORD, recid, of))
# Send the signal 'document viewed'
record_viewed.send(
current_app._get_current_object(),
recid=recid,
id_user=current_user.get_id(),
request=request)
return render_template('records/metadata.html', of=of)
@blueprint.route('/<int:recid>/references', methods=['GET', 'POST'])
@request_record
def references(recid):
return render_template('records/references.html')
@blueprint.route('/<int:recid>/files', methods=['GET', 'POST'])
@request_record
def files(recid):
return render_template('records/files.html')
@blueprint.route('/<int:recid>/citations', methods=['GET', 'POST'])
@request_record
def citations(recid):
from invenio.legacy.bibrank.citation_searcher import calculate_cited_by_list,\
get_self_cited_by, calculate_co_cited_with_list
citations = dict(
citinglist=calculate_cited_by_list(recid),
selfcited=get_self_cited_by(recid),
co_cited=calculate_co_cited_with_list(recid)
)
return render_template('records/citations.html',
citations=citations)
@blueprint.route('/<int:recid>/keywords', methods=['GET', 'POST'])
@request_record
def keywords(recid):
from invenio.legacy.bibclassify.webinterface import record_get_keywords
found, keywords, record = record_get_keywords(recid)
return render_template('records/keywords.html',
found=found,
keywords=keywords)
@blueprint.route('/<int:recid>/usage', methods=['GET', 'POST'])
@request_record
def usage(recid):
from invenio.legacy.bibrank.downloads_similarity import calculate_reading_similarity_list
from invenio.legacy.bibrank.downloads_grapher import create_download_history_graph_and_box
viewsimilarity = calculate_reading_similarity_list(recid, "pageviews")
downloadsimilarity = calculate_reading_similarity_list(recid, "downloads")
downloadgraph = create_download_history_graph_and_box(recid)
return render_template('records/usage.html',
viewsimilarity=viewsimilarity,
downloadsimilarity=downloadsimilarity,
downloadgraph=downloadgraph)
diff --git a/invenio/modules/redirector/redirect_methods/goto_plugin_cern_hr_documents.py b/invenio/modules/redirector/redirect_methods/goto_plugin_cern_hr_documents.py
index 019f7d085..251f58daa 100644
--- a/invenio/modules/redirector/redirect_methods/goto_plugin_cern_hr_documents.py
+++ b/invenio/modules/redirector/redirect_methods/goto_plugin_cern_hr_documents.py
@@ -1,155 +1,155 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
This implements a redirection for CERN HR Documents in the CERN Document
Server. It's useful as a reference on how goto plugins could be implemented.
"""
import time
import re
from invenio.legacy.search_engine import perform_request_search
from invenio.legacy.bibrecord import get_fieldvalues
-from invenio.bibdocfile import BibRecDocs
+from invenio.legacy.bibdocfile.api import BibRecDocs
def make_cern_ssr_docname(lang, edition, modif=0):
if modif:
return "CERN_SSR_%(lang)s_ed%(edition)02d_modif%(modif)02d" % {
'lang': lang,
'edition': edition,
'modif': modif
}
else:
return "CERN_SSR_%(lang)s_ed%(edition)02d" % {
'lang': lang,
'edition': edition,
}
_RE_REVISION = re.compile(r"rev(\d\d)")
def _get_revision(docname):
"""
Return the revision in a docname. E.g.:
CERN_Circ_Op_en_02_rev01_Implementation measures.pdf -> 1
CERN_Circ_Op_en_02_rev02_Implementation measures.PDF -> 2
"""
g = _RE_REVISION.search(docname)
if g:
return int(g.group(1))
return 0
def _register_document(documents, docname, key):
"""
Register the docname under key in the documents mapping, but only if its
revision is higher than that of the docname already associated with the key.
"""
if key in documents:
if _get_revision(docname) > _get_revision(documents[key]):
documents[key] = docname
else:
documents[key] = docname
def goto(type, document='', number=0, lang='en', modif=0):
today = time.strftime('%Y-%m-%d')
if type == 'SSR':
## We want the CERN Staff Rules and Regulations
recids = perform_request_search(cc='Staff Rules and Regulations', f="925__a:1996-01-01->%s 925__b:%s->9999-99-99" % (today, today))
recid = recids[-1]
reportnumber = get_fieldvalues(recid, '037__a')[0]
edition = int(reportnumber[-2:]) ## e.g. CERN-STAFF-RULES-ED08
return BibRecDocs(recid).get_bibdoc(make_cern_ssr_docname(lang, edition, modif)).get_file('.pdf').get_url()
elif type == "OPER-CIRC":
recids = perform_request_search(cc="Operational Circulars", p="reportnumber=\"CERN-OPER-CIRC-%s-*\"" % number, sf="925__a")
recid = recids[-1]
documents = {}
bibrecdocs = BibRecDocs(recid)
for docname in bibrecdocs.get_bibdoc_names():
ldocname = docname.lower()
if 'implementation' in ldocname:
_register_document(documents, docname, 'implementation_en')
elif 'application' in ldocname:
_register_document(documents, docname, 'implementation_fr')
elif 'archiving' in ldocname:
_register_document(documents, docname, 'archiving_en')
elif 'archivage' in ldocname:
_register_document(documents, docname, 'archiving_fr')
elif 'annexe' in ldocname or 'annexes_fr' in ldocname:
_register_document(documents, docname, 'annex_fr')
elif 'annexes_en' in ldocname or 'annex' in ldocname:
_register_document(documents, docname, 'annex_en')
elif '_en_' in ldocname or '_eng_' in ldocname or '_angl_' in ldocname:
_register_document(documents, docname, 'en')
elif '_fr_' in ldocname:
_register_document(documents, docname, 'fr')
return bibrecdocs.get_bibdoc(documents[document]).get_file('.pdf').get_url()
elif type == 'ADMIN-CIRC':
recids = perform_request_search(cc="Administrative Circulars", p="reportnumber=\"CERN-ADMIN-CIRC-%s-*\"" % number, sf="925__a")
recid = recids[-1]
documents = {}
bibrecdocs = BibRecDocs(recid)
for docname in bibrecdocs.get_bibdoc_names():
ldocname = docname.lower()
if 'implementation' in ldocname:
_register_document(documents, docname, 'implementation-en')
elif 'application' in ldocname:
_register_document(documents, docname, 'implementation-fr')
elif 'archiving' in ldocname:
_register_document(documents, docname, 'archiving-en')
elif 'archivage' in ldocname:
_register_document(documents, docname, 'archiving-fr')
elif 'annexe' in ldocname or 'annexes_fr' in ldocname:
_register_document(documents, docname, 'annex-fr')
elif 'annexes_en' in ldocname or 'annex' in ldocname:
_register_document(documents, docname, 'annex-en')
elif '_en_' in ldocname or '_eng_' in ldocname or '_angl_' in ldocname:
_register_document(documents, docname, 'en')
elif '_fr_' in ldocname:
_register_document(documents, docname, 'fr')
return bibrecdocs.get_bibdoc(documents[document]).get_file('.pdf').get_url()
def register_hr_redirections():
"""
Run this only once
"""
from invenio.modules.redirector.api import register_redirection
plugin = 'goto_plugin_cern_hr_documents'
## Staff rules and regulations
for modif in range(1, 20):
for lang in ('en', 'fr'):
register_redirection('hr-srr-modif%02d-%s' % (modif, lang), plugin, parameters={'type': 'SSR', 'lang': lang, 'modif': modif})
for lang in ('en', 'fr'):
register_redirection('hr-srr-%s' % lang, plugin, parameters={'type': 'SSR', 'lang': lang, 'modif': 0})
## Operational Circulars
for number in range(1, 10):
for lang in ('en', 'fr'):
register_redirection('hr-oper-circ-%s-%s' % (number, lang), plugin, parameters={'type': 'OPER-CIRC', 'document': lang, 'number': number})
for number, special_document in ((2, 'implementation'), (2, 'annex'), (3, 'archiving'), (3, 'annex')):
for lang in ('en', 'fr'):
register_redirection('hr-circ-%s-%s-%s' % (number, special_document, lang), plugin, parameters={'type': 'OPER-CIRC', 'document': '%s-%s' % (special_document, lang), 'number': number})
## Administrative Circulars:
for number in range(1, 32):
for lang in ('en', 'fr'):
register_redirection('hr-admin-circ-%s-%s' % (number, lang), plugin, parameters={'type': 'ADMIN-CIRC', 'document': lang, 'number': number})
if __name__ == "__main__":
register_hr_redirections()
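The `_get_revision`/`_register_document` pair above implements a "keep the highest revNN" policy per logical document key. A compact, self-contained sketch of that policy (names and sample docnames are illustrative, not real CDS documents):

```python
import re

_REV = re.compile(r"rev(\d\d)")

def revision_of(docname):
    """Return the NN from a rev<NN> marker in docname, or 0 if absent."""
    m = _REV.search(docname)
    return int(m.group(1)) if m else 0

# Keep, per key, only the docname with the highest revision marker.
docs = {}
for name, key in [
    ("CERN_Circ_Op_en_02_rev01_Implementation.pdf", "implementation_en"),
    ("CERN_Circ_Op_en_02_rev02_Implementation.pdf", "implementation_en"),
]:
    if key not in docs or revision_of(name) > revision_of(docs[key]):
        docs[key] = name

print(docs["implementation_en"])  # -> CERN_Circ_Op_en_02_rev02_Implementation.pdf
```

Docnames without a `rev` marker default to revision 0, so any explicitly revised document wins over an unmarked one for the same key.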
diff --git a/invenio/modules/redirector/redirect_methods/goto_plugin_latest_record.py b/invenio/modules/redirector/redirect_methods/goto_plugin_latest_record.py
index 851ef8e7f..304e12aa8 100644
--- a/invenio/modules/redirector/redirect_methods/goto_plugin_latest_record.py
+++ b/invenio/modules/redirector/redirect_methods/goto_plugin_latest_record.py
@@ -1,59 +1,59 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Demonstrative PURL implementing a redirection to the very last record
(of a collection)
"""
from invenio.config import CFG_SITE_NAME, CFG_SITE_RECORD
from invenio.legacy.search_engine import perform_request_search
-from invenio.bibdocfile import BibRecDocs, InvenioBibDocFileError
+from invenio.legacy.bibdocfile.api import BibRecDocs, InvenioBibDocFileError
def goto(cc=CFG_SITE_NAME, p='', f='', sf='', so='d', docname='', format=''):
"""
Redirect the user to the latest record in the given collection,
optionally within the specified pattern and field. If docname
and format are specified, redirect the user to the corresponding
docname and format. If docname is not specified but there is
only a single bibdoc attached to the record, redirect to that
one.
"""
recids = perform_request_search(cc=cc, p=p, f=f, sf=sf, so=so)
if recids:
## Take the last recid, i.e. the most recent record
recid = recids[-1]
url = '/%s/%s' % (CFG_SITE_RECORD, recid)
if format:
bibrecdocs = BibRecDocs(recid)
if not docname:
if len(bibrecdocs.get_bibdoc_names()) == 1:
docname = bibrecdocs.get_bibdoc_names()[0]
else:
return url
try:
bibdoc = BibRecDocs(recid).get_bibdoc(docname)
except InvenioBibDocFileError:
return url
try:
bibdocfile = bibdoc.get_file(format=format)
return bibdocfile.get_url()
except InvenioBibDocFileError:
return url
return url
diff --git a/invenio/modules/search/testsuite/__init__.py b/invenio/modules/search/testsuite/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/testsuite/test_utils_memoise.py b/invenio/modules/search/testsuite/test_views.py
similarity index 54%
copy from invenio/testsuite/test_utils_memoise.py
copy to invenio/modules/search/testsuite/test_views.py
index 941ff050e..b242bb9ad 100644
--- a/invenio/testsuite/test_utils_memoise.py
+++ b/invenio/modules/search/testsuite/test_views.py
@@ -1,40 +1,46 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-"""
-Unit tests for the memoise facility.
-"""
+"""Unit tests for search views."""
-from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
+from flask import url_for, current_app
+from invenio.testsuite import InvenioTestCase, make_test_suite, \
+ run_test_suite
-class MemoiseTest(InvenioTestCase):
- """Unit test cases for Memoise."""
+class SearchViewTest(InvenioTestCase):
+ """ Test search view functions. """
- def test_memoise_fib(self):
- """memoiseutils - test fib() memoisation"""
- from invenio.utils.memoise import Memoise
- from invenio.bibtaskex import fib
- fib_memoised = Memoise(fib)
- self.assertEqual(fib(17), fib_memoised(17))
+ def test_home_collection_page_availability(self):
+ response = self.client.get(url_for('search.index'))
+ self.assert200(response)
-TEST_SUITE = make_test_suite(MemoiseTest, )
+ response = self.client.get(url_for(
+ 'search.collection', name=current_app.config['CFG_SITE_NAME']))
+ self.assert200(response)
+
+ def test_search_page_availability(self):
+ response = self.client.get(url_for('search.search'))
+ self.assert200(response)
+
+
+TEST_SUITE = make_test_suite(SearchViewTest)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
diff --git a/invenio/modules/search/views/search.py b/invenio/modules/search/views/search.py
index 60f1d0a7e..9d94ee96e 100644
--- a/invenio/modules/search/views/search.py
+++ b/invenio/modules/search/views/search.py
@@ -1,525 +1,523 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSearch Flask Blueprint"""
import json
import string
import functools
import cStringIO
from math import ceil
from flask import make_response, g, request, flash, jsonify, \
redirect, url_for, current_app, abort, session, Blueprint
from flask.ext.login import current_user
from .. import receivers
from ..cache import get_search_query_id, get_collection_name_from_cache
from ..facet_builders import get_current_user_records_that_can_be_displayed, \
faceted_results_filter, FacetLoader
from ..forms import EasySearchForm
from ..models import Collection
+from ..washers import wash_search_urlargd
from invenio.ext.menu import register_menu
from invenio.base.signals import websearch_before_browse, websearch_before_search
from invenio.modules.index import models as BibIndex
from invenio.modules.formatter import format_record
from invenio.base.i18n import _
from invenio.base.decorators import wash_arguments, templated
from invenio.ext.breadcrumb import \
register_breadcrumb, breadcrumbs, default_breadcrumb_root
from invenio.ext.template.context_processor import \
register_template_context_processor
from invenio.utils.pagination import Pagination
blueprint = Blueprint('search', __name__, url_prefix="",
template_folder='../templates',
static_url_path='', # static url path has to be empty
# if url_prefix is empty
static_folder='../static')
default_breadcrumb_root(blueprint, '.')
FACETS = FacetLoader()
def collection_name_from_request():
collection = request.values.get('cc')
if collection is None and len(request.values.getlist('c')) == 1:
collection = request.values.get('c')
return collection
def min_length(length, code=406):
    def checker(value):
        if len(value) < length:
            abort(code)
        return value
    return checker
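The `min_length` validator above follows a closure-based checker pattern: the outer function captures the configuration, and the inner function validates a single value. A minimal standalone sketch of the same pattern, where `ValidationError` stands in for Flask's `abort()` and the names are illustrative, not part of Invenio:

```python
# Standalone sketch of the closure-based validator pattern used by min_length.
class ValidationError(Exception):
    pass

def min_length(length, code=406):
    def checker(value):
        # Reject values shorter than the configured minimum length.
        if len(value) < length:
            raise ValidationError(code)
        return value
    return checker
```

A checker built with `min_length(3)` passes `"abc"` through unchanged and raises for shorter input.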
def check_collection(method=None, name_getter=collection_name_from_request,
default_collection=False):
"""Check collection existence and authorization for current user."""
if method is None:
return functools.partial(check_collection, name_getter=name_getter,
default_collection=default_collection)
@functools.wraps(method)
def decorated(*args, **kwargs):
uid = current_user.get_id()
name = name_getter()
if name:
collection = Collection.query.filter(Collection.name == name).first_or_404()
elif default_collection:
collection = Collection.query.get_or_404(1)
else:
return abort(404)
if collection.is_restricted:
from invenio.modules.access.engine import acc_authorize_action
from invenio.modules.access.local_config import VIEWRESTRCOLL
(auth_code, auth_msg) = acc_authorize_action(uid, VIEWRESTRCOLL,
collection=collection.name)
if auth_code:
flash(_('This collection is restricted.'), 'error')
if auth_code and current_user.is_guest:
return redirect(url_for('webaccount.login',
referer=request.url))
elif auth_code:
return abort(401)
return method(collection, *args, **kwargs)
return decorated
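`check_collection` supports both bare use (`@check_collection`) and parametrised use (`@check_collection(default_collection=True)`) by returning a `functools.partial` of itself when called without a function. A minimal standalone sketch of that optional-argument decorator pattern, with illustrative names:

```python
import functools

def check(method=None, default=False):
    if method is None:
        # Called with keyword arguments only: return a decorator
        # that is still waiting for the function to wrap.
        return functools.partial(check, default=default)
    @functools.wraps(method)
    def decorated(*args, **kwargs):
        # Pass the configured value on to the wrapped function.
        return method(default, *args, **kwargs)
    return decorated

@check
def f(flag):
    return flag

@check(default=True)
def g(flag):
    return flag
```

`f()` receives the default `False`, while `g()` receives the configured `True`, mirroring how `check_collection` injects the resolved collection as the first argument.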
def response_formated_records(recids, collection, of, **kwargs):
from invenio.modules.formatter import get_output_format_content_type, print_records
response = make_response(print_records(recids, collection=collection,
of=of, **kwargs))
response.mimetype = get_output_format_content_type(of)
return response
@blueprint.route('/index.py', methods=['GET', 'POST'])
@blueprint.route('/', methods=['GET', 'POST'])
@templated('search/index.html')
@register_menu(blueprint, 'main.search', _('Search'), order=1)
@register_breadcrumb(blueprint, 'breadcrumbs', _('Home'))
def index():
""" Renders homepage. """
# legacy app support
c = request.values.get('c')
if c == current_app.config['CFG_SITE_NAME']:
return redirect(url_for('.index', ln=g.ln))
elif c is not None:
return redirect(url_for('.collection', name=c, ln=g.ln))
collection = Collection.query.get_or_404(1)
@register_template_context_processor
def index_context():
return dict(
easy_search_form=EasySearchForm(csrf_enabled=False),
format_record=format_record,
)
return dict(collection=collection)
@blueprint.route('/collection/<name>', methods=['GET', 'POST'])
@templated('search/collection.html')
def collection(name):
collection = Collection.query.filter(Collection.name == name).first_or_404()
@register_template_context_processor
def index_context():
return dict(
format_record=format_record,
easy_search_form=EasySearchForm(csrf_enabled=False),
breadcrumbs=breadcrumbs + collection.breadcrumbs(ln=g.ln)[1:])
return dict(collection=collection)
class SearchUrlargs(object):
DEFAULT_URLARGS = {
'p': {'title': 'Search', 'store': None},
'cc': {'title': 'Collection', 'store': None},
'c': {'title': 'Collection', 'store': None},
'rg': {'title': 'Records in Groups',
'store': 'websearch_group_records'},
'sf': {'title': 'Sort Field', 'store': None},
'so': {'title': 'Sort Option', 'store': 'websearch_sort_option'},
'rm': {'title': 'Rank Method', 'store': 'websearch_rank_method'}
}
def __init__(self, session=None, user=None, **kwargs):
self.session = session
self.user = user
self._url_args = kwargs
@property
def args(self):
out = self.user_args
out.update(self.url_args)
return out
@property
def user_storable_args(self):
return dict(map(lambda (k, v): (v['store'], k),
filter(lambda (k, v): v['store'],
self.DEFAULT_URLARGS.iteritems())))
@property
def url_args(self):
return filter(lambda (k, v): k in self.DEFAULT_URLARGS.keys(),
self._url_args.iteritems())
@property
def user_args(self):
if not self.user:
return {}
user_storable_args = self.user_storable_args
args_keys = user_storable_args.keys()
if self.user.settings is None:
self.user.settings = dict()
return dict(map(lambda (k, v): (user_storable_args[k], v),
filter(lambda (k, v): k in args_keys,
self.user.settings.iteritems())))
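The `args` property of `SearchUrlargs` merges the user's stored preferences with the explicit URL arguments, with the URL arguments taking precedence. A small sketch of that precedence rule (function name is illustrative):

```python
# Sketch of the argument-precedence rule implemented by SearchUrlargs.args:
# explicit URL arguments override stored user preferences.
def merge_args(user_args, url_args):
    out = dict(user_args)   # start from stored user preferences
    out.update(url_args)    # explicit URL arguments win
    return out
```

For example, a stored `rg` of 10 is overridden by `rg=25` in the URL, while untouched preferences survive the merge.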
def _create_neareset_term_box(argd_orig):
try:
p = argd_orig.pop('p', '')#.encode('utf-8')
f = argd_orig.pop('f', '')#.encode('utf-8')
        if 'rg' in argd_orig and 'rg' not in request.values:
del argd_orig['rg']
if f == '' and ':' in p:
fx, px = p.split(':', 1)
from invenio.legacy.search_engine import get_field_name
if get_field_name(fx) != "":
f, p = fx, px
from invenio.legacy.search_engine import create_nearest_terms_box
return create_nearest_terms_box(argd_orig,
p=p, f=f.lower(), ln=g.ln, intro_text_p=True)
except:
return '<!-- not found -->'
def sort_and_rank_records(recids, so=None, rm=None, p=''):
output = recids.tolist()
if so:
output.reverse()
elif rm:
from invenio.legacy.bibrank.record_sorter import rank_records
ranked = rank_records(rm, 0, output, p.split())
if ranked[0]:
output = ranked[0]
output.reverse()
else:
output = output.tolist()
else:
output.reverse()
return output
def crumb_builder(url):
def _crumb_builder(collection):
qargs = request.args.to_dict()
qargs['cc'] = collection.name
#return (collection.name_ln, url, qargs)
return dict(text=collection.name_ln, url=url_for(url, **qargs))
return _crumb_builder
def collection_breadcrumbs(collection, endpoint=None):
b = []
if endpoint is None:
endpoint = request.endpoint
if collection.id > 1:
qargs = request.args.to_dict()
k = 'cc' if 'cc' in qargs else 'c'
del qargs[k]
b = [(_('Home'), endpoint, qargs)] + collection.breadcrumbs(
builder=crumb_builder(endpoint), ln=g.ln)[1:]
return b
@blueprint.route('/browse', methods=['GET', 'POST'])
@register_breadcrumb(blueprint, '.browse', _('Browse results'))
@templated('search/browse.html')
@wash_arguments({'p': (unicode, ''),
'f': (unicode, None),
'of': (unicode, 'hb'),
'so': (unicode, None),
'rm': (unicode, None),
'rg': (int, 10),
'jrec': (int, 1)})
@check_collection(default_collection=True)
def browse(collection, p, f, of, so, rm, rg, jrec):
from invenio.legacy.search_engine import browse_pattern_phrases
- from invenio.legacy.websearch.webinterface import wash_search_urlargd
argd = argd_orig = wash_search_urlargd(request.args)
colls = [collection.name] + request.args.getlist('c')
if f is None and ':' in p[1:]:
f, p = string.split(p, ":", 1)
argd['f'] = f
argd['p'] = p
websearch_before_browse.send(collection, **argd)
records = map(lambda (r, h): (r.decode('utf-8'), h),
browse_pattern_phrases(req=request.get_legacy_request(),
colls=colls, p=p, f=f, rg=rg, ln=g.ln))
@register_template_context_processor
def index_context():
return dict(collection=collection,
create_nearest_terms_box=lambda: _create_neareset_term_box(argd_orig),
pagination=Pagination(int(ceil(jrec / float(rg))), rg, len(records)),
rg=rg, p=p, f=f,
easy_search_form=EasySearchForm(csrf_enabled=False),
breadcrumbs=breadcrumbs+collection_breadcrumbs(collection)
)
return dict(records=records)
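The `Pagination` constructed above derives the current page from `jrec` (the 1-based index of the first displayed record) and `rg` (records per group). A small sketch of that computation, with an illustrative function name:

```python
from math import ceil

# Current page for a 1-based first-record index jrec and page size rg,
# mirroring the int(ceil(jrec / float(rg))) expression used above.
def current_page(jrec, rg):
    return int(ceil(jrec / float(rg)))
```

So `jrec=1` with `rg=10` is page 1, `jrec=11` is page 2, and `jrec=10` still falls on page 1.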
websearch_before_browse.connect(receivers.websearch_before_browse_handler)
@blueprint.route('/rss', methods=['GET'])
# FIXME caching issue of response object
@wash_arguments({'p': (unicode, ''),
'jrec': (int, 1),
'so': (unicode, None),
'rm': (unicode, None)})
@check_collection(default_collection=True)
def rss(collection, p, jrec, so, rm):
from invenio.legacy.search_engine import perform_request_search
of = 'xr'
- from invenio.legacy.websearch.webinterface import wash_search_urlargd
argd = argd_orig = wash_search_urlargd(request.args)
argd['of'] = 'id'
# update search arguments with the search user preferences
if 'rg' not in request.values and current_user.get('rg'):
argd['rg'] = current_user.get('rg')
rg = int(argd['rg'])
qid = get_search_query_id(**argd)
recids = perform_request_search(req=request.get_legacy_request(), **argd)
if so or rm:
recids.reverse()
ctx = dict(records=len(get_current_user_records_that_can_be_displayed(qid)),
qid=qid, rg=rg)
return response_formated_records(recids, collection, of, **ctx)
@blueprint.route('/search', methods=['GET', 'POST'])
@register_breadcrumb(blueprint, '.browse', _('Search results'))
@wash_arguments({'p': (unicode, ''),
'of': (unicode, 'hb'),
'so': (unicode, None),
'rm': (unicode, None)})
@check_collection(default_collection=True)
def search(collection, p, of, so, rm):
"""
Renders search pages.
"""
from invenio.legacy.search_engine import perform_request_search
if 'action_browse' in request.args \
or request.args.get('action', '') == 'browse':
return browse()
if 'c' in request.args and len(request.args) == 1 \
and len(request.args.getlist('c')) == 1:
return redirect(url_for('.collection', name=request.args.get('c')))
- from invenio.legacy.websearch.webinterface import wash_search_urlargd
argd = argd_orig = wash_search_urlargd(request.args)
argd['of'] = 'id'
# update search arguments with the search user preferences
if 'rg' not in request.values and current_user.get('rg'):
argd['rg'] = current_user.get('rg')
rg = int(argd['rg'])
collection_breadcrumbs(collection)
qid = get_search_query_id(**argd)
recids = perform_request_search(req=request.get_legacy_request(), **argd)
#if so or rm:
if len(of)>0 and of[0] == 'h':
recids.reverse()
# back-to-search related code
if request and not isinstance(request.get_legacy_request(), cStringIO.OutputType):
# store the last search results page
session['websearch-last-query'] = request.get_legacy_request().unparsed_uri
if len(recids) > current_app.config['CFG_WEBSEARCH_PREV_NEXT_HIT_LIMIT']:
last_query_hits = None
else:
last_query_hits = recids
# store list of results if user wants to display hits
# in a single list, or store list of collections of records
# if user displays hits split by collections:
session["websearch-last-query-hits"] = last_query_hits
ctx = dict(facets=FACETS.config(collection=collection, qid=qid),
records=len(get_current_user_records_that_can_be_displayed(qid)),
qid=qid, rg=rg,
create_nearest_terms_box=lambda: _create_neareset_term_box(argd_orig),
easy_search_form=EasySearchForm(csrf_enabled=False))
return response_formated_records(recids, collection, of, **ctx)
@blueprint.route('/facet/<name>/<qid>', methods=['GET', 'POST'])
def facet(name, qid):
"""
    Creates a list of fields for the specified facet.
@param name: facet identifier
@param qid: query identifier
@return: jsonified facet list sorted by number of records
"""
try:
out = FACETS[name].get_facets_for_query(
qid, limit=request.args.get('limit', 20))
except KeyError:
abort(406)
if request.is_xhr:
return jsonify(facet=out)
else:
response = make_response('<html><body>%s</body></html>' % str(out))
response.mimetype = 'text/html'
return response
@blueprint.route('/results/<qid>', methods=['GET', 'POST'])
@wash_arguments({'p': (unicode, ''),
'of': (unicode, 'hb'),
'so': (unicode, None),
'rm': (unicode, None)})
def results(qid, p, of, so, rm):
"""
    Generates results for a cached query using the POSTed filter.
    @param qid: query identifier
"""
try:
recIDsHitSet = get_current_user_records_that_can_be_displayed(qid)
except KeyError:
return 'KeyError'
except:
return _('Please reload the page')
try:
filter_data = json.loads(request.values.get('filter', '[]'))
except:
return _('Invalid filter data')
@check_collection(
name_getter=functools.partial(get_collection_name_from_cache, qid))
def make_results(collection):
recids = faceted_results_filter(recIDsHitSet, filter_data, FACETS.elements)
recids = sort_and_rank_records(recids, so=so, rm=rm, p=p)
return response_formated_records(
recids, collection, of,
create_nearest_terms_box=_create_neareset_term_box, qid=qid)
return make_results()
@blueprint.route('/list/<any(exactauthor, keyword, affiliation, reportnumber, collaboration):field>', methods=['GET', 'POST'])
@wash_arguments({'q': (min_length(3), '')})
def autocomplete(field, q):
"""
Autocompletes data from indexes.
    It uses the POSTed argument `q`, which has to be at least 3
    characters long in order to return any results.
@param field: index name
@param q: query string for index term
@return: list of values matching query.
"""
- from invenio.bibindex_engine import get_index_id_from_index_name
+ from invenio.legacy.bibindex.engine import get_index_id_from_index_name
IdxPHRASE = BibIndex.__getattribute__('IdxPHRASE%02dF' %
get_index_id_from_index_name(field))
results = IdxPHRASE.query.filter(IdxPHRASE.term.contains(q)).limit(20).all()
results = map(lambda r: r.term, results)
return jsonify(results=results)
@blueprint.route('/search/dispatch', methods=['GET', 'POST'])
def dispatch():
""" Redirects request to appropriate methods from search page. """
action = request.values.get('action')
if action not in ['addtobasket', 'export']:
abort(406)
if action == 'export':
return redirect(url_for('.export', **request.values.to_dict(flat=False)))
if action == 'addtobasket':
recids = request.values.getlist('recid', type=int)
lang = (request.values.get('ln') or 'en')
new_url = '/yourbaskets/add?ln={ln}&'.format(ln=lang)
new_url += '&'.join(['recid=' + str(r) for r in recids])
return redirect(new_url)
# ERROR: parser of GET arguments in 'next' does not parse lists
# only the first element of a list is passed to webbasket.add
# (however, this url works in 'master' with the same webbasket module)
flash("Not implemented action " + action, 'error')
return redirect(request.referrer)
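For the `addtobasket` action, `dispatch()` builds the redirect URL by hand with one `recid` query argument per record ID. A standalone sketch of that URL construction (function name is illustrative):

```python
# Sketch of the redirect URL built by dispatch() for the addtobasket action:
# a fixed prefix with the language, then one recid argument per record id.
def basket_url(recids, ln='en'):
    url = '/yourbaskets/add?ln={ln}&'.format(ln=ln)
    return url + '&'.join('recid=' + str(r) for r in recids)
```

This works around the limitation noted in the comment above: the `next` argument parser would keep only the first element of a `recid` list.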
@blueprint.route('/export', methods=['GET', 'POST'])
@wash_arguments({'of': (unicode, 'xm')})
@check_collection(default_collection=True)
def export(collection, of):
"""
Exports requested records to defined output format.
    It uses the following request values:
* of (string): output format
* recid ([int]): list of record IDs
"""
# Get list of integers with record IDs.
recids = request.values.getlist('recid', type=int)
return response_formated_records(recids, collection, of)
diff --git a/invenio/modules/search/washers.py b/invenio/modules/search/washers.py
new file mode 100644
index 000000000..0369d46d1
--- /dev/null
+++ b/invenio/modules/search/washers.py
@@ -0,0 +1,105 @@
+## This file is part of Invenio.
+## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
+##
+## Invenio is free software; you can redistribute it and/or
+## modify it under the terms of the GNU General Public License as
+## published by the Free Software Foundation; either version 2 of the
+## License, or (at your option) any later version.
+##
+## Invenio is distributed in the hope that it will be useful, but
+## WITHOUT ANY WARRANTY; without even the implied warranty of
+## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+## General Public License for more details.
+##
+## You should have received a copy of the GNU General Public License
+## along with Invenio; if not, write to the Free Software Foundation, Inc.,
+## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
+
+"""
+ invenio.modules.search.washers
+ ------------------------------
+
+ Implements search washers.
+"""
+from invenio.base.globals import cfg
+from invenio.utils.datastructures import LazyDict
+
+
+def get_search_results_default_urlargd():
+ """Returns default config for search results arguments."""
+ return {
+ 'cc': (str, cfg['CFG_SITE_NAME']),
+ 'c': (list, []),
+ 'p': (str, ""), 'f': (str, ""),
+ 'rg': (int, cfg['CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS']),
+ 'sf': (str, ""),
+ 'so': (str, "d"),
+ 'sp': (str, ""),
+ 'rm': (str, ""),
+ 'of': (str, "hb"),
+ 'ot': (list, []),
+        'em': (str, ""),
+ 'aas': (int, cfg['CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE']),
+ 'as': (int, cfg['CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE']),
+ 'p1': (str, ""), 'f1': (str, ""), 'm1': (str, ""), 'op1':(str, ""),
+ 'p2': (str, ""), 'f2': (str, ""), 'm2': (str, ""), 'op2':(str, ""),
+ 'p3': (str, ""), 'f3': (str, ""), 'm3': (str, ""),
+ 'sc': (int, 0),
+ 'jrec': (int, 0),
+ 'recid': (int, -1), 'recidb': (int, -1), 'sysno': (str, ""),
+ 'id': (int, -1), 'idb': (int, -1), 'sysnb': (str, ""),
+ 'action': (str, "search"),
+ 'action_search': (str, ""),
+ 'action_browse': (str, ""),
+ 'd1': (str, ""),
+ 'd1y': (int, 0), 'd1m': (int, 0), 'd1d': (int, 0),
+ 'd2': (str, ""),
+ 'd2y': (int, 0), 'd2m': (int, 0), 'd2d': (int, 0),
+ 'dt': (str, ""),
+ 'ap': (int, 1),
+ 'verbose': (int, 0),
+ 'ec': (list, []),
+ 'wl': (int, cfg['CFG_WEBSEARCH_WILDCARD_LIMIT']),
+ }
+
+search_results_default_urlargd = LazyDict(get_search_results_default_urlargd)
+
+
+def wash_search_urlargd(form):
+ """
+ Create canonical search arguments from those passed via web form.
+ """
+ from invenio.ext.legacy.handler import wash_urlargd
+ argd = wash_urlargd(form, search_results_default_urlargd)
+    if 'as' in argd:
+ argd['aas'] = argd['as']
+ del argd['as']
+ if argd.get('aas', cfg['CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE']) \
+ not in cfg['CFG_WEBSEARCH_ENABLED_SEARCH_INTERFACES']:
+ argd['aas'] = cfg['CFG_WEBSEARCH_DEFAULT_SEARCH_INTERFACE']
+
+ # Sometimes, users pass ot=245,700 instead of
+ # ot=245&ot=700. Normalize that.
+ ots = []
+ for ot in argd['ot']:
+ ots += ot.split(',')
+ argd['ot'] = ots
+
+ # We can either get the mode of function as
+ # action=<browse|search>, or by setting action_browse or
+ # action_search.
+ if argd['action_browse']:
+ argd['action'] = 'browse'
+ elif argd['action_search']:
+ argd['action'] = 'search'
+ else:
+ if argd['action'] not in ('browse', 'search'):
+ argd['action'] = 'search'
+
+ del argd['action_browse']
+ del argd['action_search']
+
+ if argd['em'] != "":
+ argd['em'] = argd['em'].split(",")
+
+ return argd
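Two of the normalisations performed by `wash_search_urlargd` can be sketched in isolation (function names are illustrative): flattening comma-separated `ot` values and resolving the browse/search action flags.

```python
# Users may pass ot=245,700 instead of ot=245&ot=700; flatten both forms.
def normalize_ot(ot_values):
    ots = []
    for ot in ot_values:
        ots += ot.split(',')
    return ots

# The mode is either given as action=<browse|search>, or implied by the
# action_browse / action_search flags; anything else falls back to 'search'.
def resolve_action(action_browse, action_search, action):
    if action_browse:
        return 'browse'
    if action_search:
        return 'search'
    return action if action in ('browse', 'search') else 'search'
```

Both helpers are pure functions of the raw form values, which is what makes the washer easy to reason about and test.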
diff --git a/invenio/modules/sequencegenerator/cnum.py b/invenio/modules/sequencegenerator/cnum.py
index 8ff0a2bb5..bd5e1ea56 100644
--- a/invenio/modules/sequencegenerator/cnum.py
+++ b/invenio/modules/sequencegenerator/cnum.py
@@ -1,125 +1,125 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from invenio.sequtils import SequenceGenerator
-from invenio.bibedit_utils import get_bibrecord
+from invenio.legacy.bibedit.utils import get_bibrecord
from invenio.legacy.bibrecord import record_get_field_value, create_record
from invenio.legacy.search_engine import perform_request_search
from invenio.legacy.dbquery import run_sql
class ConferenceNoStartDateError(Exception):
pass
class CnumSeq(SequenceGenerator):
"""
cnum sequence generator
"""
seq_name = 'cnum'
def _get_record_cnums(self, value):
"""
Get all the values that start with the base cnum
@param value: base cnum
@type value: string
@return: values starting by the base cnum
@rtype: tuple
"""
return run_sql("""SELECT seq_value FROM seqSTORE WHERE seq_value
LIKE %s AND seq_name=%s""",
(value + "%", self.seq_name))
def _next_value(self, recid=None, xml_record=None):
"""
Returns the next cnum for the given recid
@param recid: id of the record where the cnum will be generated
@type recid: int
@param xml_record: record in xml format
@type xml_record: string
        @return: next cnum for the given recid. Format is Cyy-mm-dd[.n]
@rtype: string
@raises ConferenceNoStartDateError: No date information found in the
given recid
"""
if recid is None and xml_record is not None:
bibrecord = create_record(xml_record)[0]
else:
bibrecord = get_bibrecord(recid)
start_date = record_get_field_value(bibrecord,
tag="111",
ind1="",
ind2="",
code="x")
if not start_date:
raise ConferenceNoStartDateError
base_cnum = "C" + start_date[2:]
record_cnums = self._get_record_cnums(base_cnum)
if not record_cnums:
new_cnum = base_cnum
elif len(record_cnums) == 1:
new_cnum = base_cnum + '.' + '1'
else:
# Get the max current revision, cnums are in format Cyy-mm-dd,
# Cyy-mm-dd.1, Cyy-mm-dd.2
highest_revision = max([int(rev[0].split('.')[1]) for rev in record_cnums[1:]])
new_cnum = base_cnum + '.' + str(highest_revision + 1)
return new_cnum
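The numbering scheme implemented by `_next_value` can be sketched as a pure function: the base cnum is `"C"` plus the start date with the century digits dropped, and clashes get `.1`, `.2`, ... suffixes. The function name below is illustrative; the real method reads the date from MARC field 111__x and the existing cnums from the seqSTORE table.

```python
# Sketch of the cnum numbering scheme from CnumSeq._next_value, given the
# record's start date and the list of cnums already sharing the base.
def next_cnum(start_date, existing):
    base = "C" + start_date[2:]      # e.g. 2005-06-13 -> C05-06-13
    if not existing:
        return base
    if len(existing) == 1:
        return base + ".1"
    # Existing cnums after the first carry .1, .2, ... suffixes;
    # continue from the highest revision seen so far.
    highest = max(int(c.split('.')[1]) for c in existing[1:])
    return base + "." + str(highest + 1)
```

For a start date of 2005-06-13 the first record gets `C05-06-13`, the next `C05-06-13.1`, and so on.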
# Helper functions to populate cnums from existing database records
def _cnum_exists(cnum):
"""
    Checks the existence of a given cnum in the seqSTORE table
"""
return run_sql("""select seq_value from seqSTORE where seq_value=%s and seq_name='cnum'""", (cnum, ))
def _insert_cnum(cnum):
"""
Inserts a new cnum in table seqSTORE
"""
return run_sql("INSERT INTO seqSTORE (seq_name, seq_value) VALUES (%s, %s)", ("cnum", cnum))
def populate_cnums():
"""
Populates table seqSTORE with the cnums present in CONFERENCE records
"""
# First get all records from conference collection
conf_records = perform_request_search(cc="Conferences", p="111__g:C*", rg=0)
for recid in conf_records:
cnum = record_get_field_value(get_bibrecord(recid), tag="111", ind1="", ind2="", code="g")
if cnum:
if not _cnum_exists(cnum):
_insert_cnum(cnum)
print "cnum %s from record %s inserted" % (cnum, recid)
diff --git a/invenio/modules/upgrader/upgrades/invenio_release_1_1_0.py b/invenio/modules/upgrader/upgrades/invenio_release_1_1_0.py
index 13c22a967..fbc0bf916 100644
--- a/invenio/modules/upgrader/upgrades/invenio_release_1_1_0.py
+++ b/invenio/modules/upgrader/upgrades/invenio_release_1_1_0.py
@@ -1,690 +1,690 @@
#-*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio 1.1.0 upgrade
DISCLAIMER: This upgrade is special, and should not be used as an example of an
upgrade. Read more below.
This upgrade is special and much more complex than future upgrades will normally
be. Since this is the first upgrade, it does not depend on other
upgrades. Furthermore the current state of the database is unknown and thus
the upgrade will try to guess the current state, as well as be quite tolerant
towards errors.
To create a new upgrade recipe just run:
inveniocfg --upgrade-create-standard-recipe=invenio,~/src/invenio/modules/miscutil/lib/upgrades/
"""
import warnings
import logging
from invenio.legacy.dbquery import run_sql
DB_VERSION = None
depends_on = []
def info():
return "Invenio 1.0.x to 1.1.0 upgrade"
def estimate():
""" Return estimate of upgrade time in seconds """
return 10
def do_upgrade():
""" Perform upgrade """
tables = _get_tables()
session_tbl = _get_table_info('session')
if (DB_VERSION == '1.0.0' or DB_VERSION == 'master') and \
'session_expiry' not in session_tbl['indexes']:
_run_sql_ignore("ALTER TABLE session ADD KEY session_expiry " \
"(session_expiry)")
# Create tables
_create_table(tables, "upgrade", """
CREATE TABLE IF NOT EXISTS upgrade (
upgrade varchar(255) NOT NULL,
applied DATETIME NOT NULL,
PRIMARY KEY (upgrade)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxWORD17F", """
CREATE TABLE IF NOT EXISTS idxWORD17F (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(50) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxWORD17R", """
CREATE TABLE IF NOT EXISTS idxWORD17R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxWORD18F", """
CREATE TABLE IF NOT EXISTS idxWORD18F (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(50) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxWORD18R", """
CREATE TABLE IF NOT EXISTS idxWORD18R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPAIR17F", """
CREATE TABLE IF NOT EXISTS idxPAIR17F (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(100) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPAIR17R", """
CREATE TABLE IF NOT EXISTS idxPAIR17R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPAIR18F", """
CREATE TABLE IF NOT EXISTS idxPAIR18F (
id mediumint(9) unsigned NOT NULL auto_increment,
term varchar(100) default NULL,
hitlist longblob,
PRIMARY KEY (id),
UNIQUE KEY term (term)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPAIR18R", """
CREATE TABLE IF NOT EXISTS idxPAIR18R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPHRASE17F", """
CREATE TABLE IF NOT EXISTS idxPHRASE17F (
id mediumint(9) unsigned NOT NULL auto_increment,
term text default NULL,
hitlist longblob,
PRIMARY KEY (id),
KEY term (term(50))
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPHRASE17R", """
CREATE TABLE IF NOT EXISTS idxPHRASE17R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPHRASE18F", """
CREATE TABLE IF NOT EXISTS idxPHRASE18F (
id mediumint(9) unsigned NOT NULL auto_increment,
term text default NULL,
hitlist longblob,
PRIMARY KEY (id),
KEY term (term(50))
) ENGINE=MyISAM;
""")
_create_table(tables, "idxPHRASE18R", """
CREATE TABLE IF NOT EXISTS idxPHRASE18R (
id_bibrec mediumint(9) unsigned NOT NULL,
termlist longblob,
type enum('CURRENT','FUTURE','TEMPORARY') NOT NULL default 'CURRENT',
PRIMARY KEY (id_bibrec,type)
) ENGINE=MyISAM;
""")
_create_table(tables, "bibdocfsinfo", """
CREATE TABLE IF NOT EXISTS bibdocfsinfo (
id_bibdoc mediumint(9) unsigned NOT NULL,
version tinyint(4) unsigned NOT NULL,
format varchar(50) NOT NULL,
last_version boolean NOT NULL,
cd datetime NOT NULL,
md datetime NOT NULL,
checksum char(32) NOT NULL,
filesize bigint(15) unsigned NOT NULL,
mime varchar(100) NOT NULL,
master_format varchar(50) NULL default NULL,
PRIMARY KEY (id_bibdoc, version, format),
KEY (last_version),
KEY (format),
KEY (cd),
KEY (md),
KEY (filesize),
KEY (mime)
) ENGINE=MyISAM;
""")
_create_table(tables, "userEXT", """
CREATE TABLE IF NOT EXISTS userEXT (
id varbinary(255) NOT NULL,
method varchar(50) NOT NULL,
id_user int(15) unsigned NOT NULL,
PRIMARY KEY (id, method),
UNIQUE KEY (id_user, method)
) ENGINE=MyISAM;
""")
_create_table(tables, "cmtCOLLAPSED", """
CREATE TABLE IF NOT EXISTS cmtCOLLAPSED (
id_bibrec int(15) unsigned NOT NULL default '0',
id_cmtRECORDCOMMENT int(15) unsigned NULL,
id_user int(15) unsigned NOT NULL,
PRIMARY KEY (id_user, id_bibrec, id_cmtRECORDCOMMENT)
) ENGINE=MyISAM;
""")
_create_table(tables, "aidPERSONIDPAPERS", """
CREATE TABLE IF NOT EXISTS `aidPERSONIDPAPERS` (
`personid` BIGINT( 16 ) UNSIGNED NOT NULL ,
`bibref_table` ENUM( '100', '700' ) NOT NULL ,
`bibref_value` MEDIUMINT( 8 ) UNSIGNED NOT NULL ,
`bibrec` MEDIUMINT( 8 ) UNSIGNED NOT NULL ,
`name` VARCHAR( 256 ) NOT NULL ,
`flag` SMALLINT( 2 ) NOT NULL DEFAULT '0' ,
`lcul` SMALLINT( 2 ) NOT NULL DEFAULT '0' ,
`last_updated` TIMESTAMP ON UPDATE CURRENT_TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
INDEX `personid-b` (`personid`) ,
INDEX `reftable-b` (`bibref_table`) ,
INDEX `refvalue-b` (`bibref_value`) ,
INDEX `rec-b` (`bibrec`) ,
INDEX `name-b` (`name`) ,
INDEX `pn-b` (`personid`, `name`) ,
INDEX `timestamp-b` (`last_updated`) ,
INDEX `flag-b` (`flag`) ,
INDEX `ptvrf-b` (`personid`, `bibref_table`, `bibref_value`, `bibrec`, `flag`)
) ENGINE=MYISAM;
""")
_create_table(tables, "aidRESULTS", """
CREATE TABLE IF NOT EXISTS `aidRESULTS` (
`personid` VARCHAR( 256 ) NOT NULL ,
`bibref_table` ENUM( '100', '700' ) NOT NULL ,
`bibref_value` MEDIUMINT( 8 ) UNSIGNED NOT NULL ,
`bibrec` MEDIUMINT( 8 ) UNSIGNED NOT NULL ,
INDEX `personid-b` (`personid`) ,
INDEX `reftable-b` (`bibref_table`) ,
INDEX `refvalue-b` (`bibref_value`) ,
INDEX `rec-b` (`bibrec`)
) ENGINE=MYISAM;
""")
_create_table(tables, "aidPERSONIDDATA", """
CREATE TABLE IF NOT EXISTS `aidPERSONIDDATA` (
`personid` BIGINT( 16 ) UNSIGNED NOT NULL ,
`tag` VARCHAR( 64 ) NOT NULL ,
`data` VARCHAR( 256 ) NOT NULL ,
`opt1` MEDIUMINT( 8 ) NULL DEFAULT NULL ,
`opt2` MEDIUMINT( 8 ) NULL DEFAULT NULL ,
`opt3` VARCHAR( 256 ) NULL DEFAULT NULL ,
INDEX `personid-b` (`personid`) ,
INDEX `tag-b` (`tag`) ,
INDEX `data-b` (`data`) ,
INDEX `opt1` (`opt1`)
) ENGINE=MYISAM;
""")
_create_table(tables, "aidUSERINPUTLOG", """
CREATE TABLE IF NOT EXISTS `aidUSERINPUTLOG` (
`id` bigint(15) NOT NULL AUTO_INCREMENT,
`transactionid` bigint(15) NOT NULL,
`timestamp` datetime NOT NULL,
`userid` int,
`userinfo` varchar(255) NOT NULL,
`personid` bigint(15) NOT NULL,
`action` varchar(50) NOT NULL,
`tag` varchar(50) NOT NULL,
`value` varchar(200) NOT NULL,
`comment` text,
PRIMARY KEY (`id`),
INDEX `transactionid-b` (`transactionid`),
INDEX `timestamp-b` (`timestamp`),
INDEX `userinfo-b` (`userinfo`),
INDEX `userid-b` (`userid`),
INDEX `personid-b` (`personid`),
INDEX `action-b` (`action`),
INDEX `tag-b` (`tag`),
INDEX `value-b` (`value`)
) ENGINE=MyISAM;
""")
_create_table(tables, "aidCACHE", """
CREATE TABLE IF NOT EXISTS `aidCACHE` (
`id` int(15) NOT NULL auto_increment,
`object_name` varchar(120) NOT NULL,
`object_key` varchar(120) NOT NULL,
`object_value` text,
`last_updated` datetime NOT NULL,
PRIMARY KEY (`id`),
INDEX `name-b` (`object_name`),
INDEX `key-b` (`object_key`),
INDEX `last_updated-b` (`last_updated`)
) ENGINE=MyISAM;
""")
_create_table(tables, "xtrJOB", """
CREATE TABLE IF NOT EXISTS `xtrJOB` (
`id` tinyint(4) NOT NULL AUTO_INCREMENT,
`name` varchar(30) NOT NULL,
`last_updated` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM;
""")
_create_table(tables, "bsrMETHOD", """
CREATE TABLE IF NOT EXISTS bsrMETHOD (
id mediumint(8) unsigned NOT NULL auto_increment,
name varchar(20) NOT NULL,
definition varchar(255) NOT NULL,
washer varchar(255) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY (name)
) ENGINE=MyISAM;
""")
_create_table(tables, "bsrMETHODNAME", """
CREATE TABLE IF NOT EXISTS bsrMETHODNAME (
id_bsrMETHOD mediumint(8) unsigned NOT NULL,
ln char(5) NOT NULL default '',
type char(3) NOT NULL default 'sn',
value varchar(255) NOT NULL,
PRIMARY KEY (id_bsrMETHOD, ln, type)
) ENGINE=MyISAM;
""")
_create_table(tables, "bsrMETHODDATA", """
CREATE TABLE IF NOT EXISTS bsrMETHODDATA (
id_bsrMETHOD mediumint(8) unsigned NOT NULL,
data_dict longblob,
data_dict_ordered longblob,
data_list_sorted longblob,
last_updated datetime,
PRIMARY KEY (id_bsrMETHOD)
) ENGINE=MyISAM;
""")
_create_table(tables, "bsrMETHODDATABUCKET", """
CREATE TABLE IF NOT EXISTS bsrMETHODDATABUCKET (
id_bsrMETHOD mediumint(8) unsigned NOT NULL,
bucket_no tinyint(2) NOT NULL,
bucket_data longblob,
bucket_last_value varchar(255),
last_updated datetime,
PRIMARY KEY (id_bsrMETHOD, bucket_no)
) ENGINE=MyISAM;
""")
_create_table(tables, "collection_bsrMETHOD", """
CREATE TABLE IF NOT EXISTS collection_bsrMETHOD (
id_collection mediumint(9) unsigned NOT NULL,
id_bsrMETHOD mediumint(9) unsigned NOT NULL,
score tinyint(4) unsigned NOT NULL default '0',
PRIMARY KEY (id_collection, id_bsrMETHOD)
) ENGINE=MyISAM;
""")
_create_table(tables, "seqSTORE", """
CREATE TABLE IF NOT EXISTS seqSTORE (
id int(15) NOT NULL auto_increment,
seq_name varchar(15),
seq_value varchar(20),
PRIMARY KEY (id),
UNIQUE KEY seq_name_value (seq_name, seq_value)
) ENGINE=MyISAM;
""")
_create_table(tables, "webapikey", """
CREATE TABLE IF NOT EXISTS webapikey (
id varchar(150) NOT NULL,
secret varchar(150) NOT NULL,
id_user int(15) NOT NULL,
status varchar(25) NOT NULL default 'OK',
description varchar(255) default NULL,
PRIMARY KEY (id),
KEY (id_user),
KEY (status)
) ENGINE=MyISAM;
""")
_create_table(tables, "wapCACHE", """
CREATE TABLE IF NOT EXISTS `wapCACHE` (
`object_name` varchar(120) NOT NULL,
`object_key` varchar(120) NOT NULL,
`object_value` longtext,
`object_status` varchar(120),
`last_updated` datetime NOT NULL,
PRIMARY KEY (`object_name`,`object_key`),
INDEX `last_updated-b` (`last_updated`),
INDEX `status-b` (`object_status`)
) ENGINE=MyISAM;
""")
# Insert and alter table queries
_run_sql_ignore("INSERT INTO sbmALLFUNCDESCR VALUES ('Set_Embargo','Set an embargo on all the documents of a given record.');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Set_Embargo','date_file');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Set_Embargo','date_format');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('User_is_Record_Owner_or_Curator','curator_role');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('User_is_Record_Owner_or_Curator','curator_flag');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Move_Photos_to_Storage','iconformat');")
_run_sql_ignore("INSERT INTO format (name, code, description, content_type, visibility) VALUES ('Podcast', 'xp', 'Sample format suitable for multimedia feeds, such as podcasts', 'application/rss+xml', 0);")
_run_sql_ignore("ALTER TABLE accMAILCOOKIE ADD INDEX expiration (expiration);")
_run_sql_ignore("UPDATE sbmFUNDESC SET function='Move_CKEditor_Files_to_Storage' WHERE function='Move_FCKeditor_Files_to_Storage';")
_run_sql_ignore("UPDATE sbmALLFUNCDESCR SET function='Move_CKEditor_Files_to_Storage', description='Transfer files attached to the record with the CKEditor' WHERE function='Move_FCKeditor_Files_to_Storage';")
_run_sql_ignore("UPDATE sbmFUNCTIONS SET function='Move_CKEditor_Files_to_Storage' WHERE function='Move_FCKeditor_Files_to_Storage';")
_run_sql_ignore("ALTER TABLE schTASK CHANGE proc proc varchar(255) NOT NULL;")
_run_sql_ignore("ALTER TABLE schTASK ADD sequenceid int(8) NULL default NULL;")
_run_sql_ignore("ALTER TABLE schTASK ADD INDEX sequenceid (sequenceid);")
_run_sql_ignore("ALTER TABLE hstTASK CHANGE proc proc varchar(255) NOT NULL;")
_run_sql_ignore("ALTER TABLE hstTASK ADD sequenceid int(8) NULL default NULL;")
_run_sql_ignore("ALTER TABLE hstTASK ADD INDEX sequenceid (sequenceid);")
_run_sql_ignore("ALTER TABLE session CHANGE session_object session_object longblob;")
_run_sql_ignore("ALTER TABLE session CHANGE session_expiry session_expiry datetime NOT NULL default '0000-00-00 00:00:00';")
_run_sql_ignore("ALTER TABLE oaiREPOSITORY CHANGE setSpec setSpec varchar(255) NOT NULL default 'GLOBAL_SET';")
_run_sql_ignore("UPDATE oaiREPOSITORY SET setSpec='GLOBAL_SET' WHERE setSpec='';")
_run_sql_ignore("ALTER TABLE user_query_basket ADD COLUMN alert_desc TEXT DEFAULT NULL AFTER alert_name;")
_run_sql_ignore("INSERT INTO sbmALLFUNCDESCR VALUES ('Link_Records','Link two records together via MARC');")
_run_sql_ignore("INSERT INTO sbmALLFUNCDESCR VALUES ('Video_Processing',NULL);")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Link_Records','edsrn');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Link_Records','edsrn2');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Link_Records','directRelationship');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Link_Records','reverseRelationship');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Link_Records','keep_original_edsrn2');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Video_Processing','aspect');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Video_Processing','batch_template');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Video_Processing','title');")
_run_sql_ignore("INSERT INTO sbmALLFUNCDESCR VALUES ('Set_RN_From_Sysno', 'Set the value of global rn variable to the report number identified by sysno (recid)');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','edsrn');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','rep_tags');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Set_RN_From_Sysno','record_search_pattern');")
_run_sql_ignore("UPDATE externalcollection SET name='INSPIRE' where name='SPIRES HEP';")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Report_Number_Generation','initialvalue');")
_run_sql_ignore("INSERT INTO sbmALLFUNCDESCR VALUES ('Notify_URL','Access URL, possibly to post content');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','url');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','data');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','admin_emails');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','content_type');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','attempt_times');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','attempt_sleeptime');")
_run_sql_ignore("INSERT INTO sbmFUNDESC VALUES ('Notify_URL','user');")
_run_sql_ignore("ALTER TABLE bibfmt DROP COLUMN id;")
_run_sql_ignore("ALTER TABLE bibfmt ADD PRIMARY KEY (id_bibrec, format);")
_run_sql_ignore("ALTER TABLE bibfmt DROP KEY id_bibrec;")
_run_sql_ignore("ALTER TABLE bibfmt ADD KEY last_updated (last_updated);")
_run_sql_ignore("ALTER TABLE user_query_basket ADD COLUMN alert_recipient TEXT DEFAULT NULL AFTER alert_desc;")
_run_sql_ignore("ALTER TABLE format ADD COLUMN last_updated datetime NOT NULL default '0000-00-00' AFTER visibility;")
- _run_sql_ignore("REPLACE INTO sbmFIELDDESC VALUES ('Upload_Files',NULL,'','R',NULL,NULL,NULL,NULL,NULL,'\"\"\"\r\nThis is an example of element that creates a file upload interface.\r\nClone it, customize it and integrate it into your submission. Then add function \r\n\\'Move_Uploaded_Files_to_Storage\\' to your submission functions list, in order for files \r\nuploaded with this interface to be attached to the record. More information in \r\nthe WebSubmit admin guide.\r\n\"\"\"\r\nfrom invenio.bibdocfile_managedocfiles import create_file_upload_interface\r\nfrom invenio.websubmit_functions.Shared_Functions import ParamFromFile\r\n\r\nindir = ParamFromFile(os.path.join(curdir, \\'indir\\'))\r\ndoctype = ParamFromFile(os.path.join(curdir, \\'doctype\\'))\r\naccess = ParamFromFile(os.path.join(curdir, \\'access\\'))\r\ntry:\r\n sysno = int(ParamFromFile(os.path.join(curdir, \\'SN\\')).strip())\r\nexcept:\r\n sysno = -1\r\nln = ParamFromFile(os.path.join(curdir, \\'ln\\'))\r\n\r\n\"\"\"\r\nRun the following to get the list of parameters of function \\'create_file_upload_interface\\':\r\necho -e \\'from invenio.bibdocfile_managedocfiles import create_file_upload_interface as f\\nprint f.__doc__\\' | python\r\n\"\"\"\r\ntext = create_file_upload_interface(recid=sysno,\r\n print_outside_form_tag=False,\r\n include_headers=True,\r\n ln=ln,\r\n doctypes_and_desc=[(\\'main\\',\\'Main document\\'),\r\n (\\'additional\\',\\'Figure, schema, etc.\\')],\r\n can_revise_doctypes=[\\'*\\'],\r\n can_describe_doctypes=[\\'main\\'],\r\n can_delete_doctypes=[\\'additional\\'],\r\n can_rename_doctypes=[\\'main\\'],\r\n sbm_indir=indir, sbm_doctype=doctype, sbm_access=access)[1]\r\n','0000-00-00','0000-00-00',NULL,NULL,0);")
+ _run_sql_ignore("REPLACE INTO sbmFIELDDESC VALUES ('Upload_Files',NULL,'','R',NULL,NULL,NULL,NULL,NULL,'\"\"\"\r\nThis is an example of element that creates a file upload interface.\r\nClone it, customize it and integrate it into your submission. Then add function \r\n\\'Move_Uploaded_Files_to_Storage\\' to your submission functions list, in order for files \r\nuploaded with this interface to be attached to the record. More information in \r\nthe WebSubmit admin guide.\r\n\"\"\"\r\nfrom invenio.legacy.bibdocfile.managedocfiles import create_file_upload_interface\r\nfrom invenio.websubmit_functions.Shared_Functions import ParamFromFile\r\n\r\nindir = ParamFromFile(os.path.join(curdir, \\'indir\\'))\r\ndoctype = ParamFromFile(os.path.join(curdir, \\'doctype\\'))\r\naccess = ParamFromFile(os.path.join(curdir, \\'access\\'))\r\ntry:\r\n sysno = int(ParamFromFile(os.path.join(curdir, \\'SN\\')).strip())\r\nexcept:\r\n sysno = -1\r\nln = ParamFromFile(os.path.join(curdir, \\'ln\\'))\r\n\r\n\"\"\"\r\nRun the following to get the list of parameters of function \\'create_file_upload_interface\\':\r\necho -e \\'from invenio.legacy.bibdocfile.managedocfiles import create_file_upload_interface as f\\nprint f.__doc__\\' | python\r\n\"\"\"\r\ntext = create_file_upload_interface(recid=sysno,\r\n print_outside_form_tag=False,\r\n include_headers=True,\r\n ln=ln,\r\n doctypes_and_desc=[(\\'main\\',\\'Main document\\'),\r\n (\\'additional\\',\\'Figure, schema, etc.\\')],\r\n can_revise_doctypes=[\\'*\\'],\r\n can_describe_doctypes=[\\'main\\'],\r\n can_delete_doctypes=[\\'additional\\'],\r\n can_rename_doctypes=[\\'main\\'],\r\n sbm_indir=indir, sbm_doctype=doctype, sbm_access=access)[1]\r\n','0000-00-00','0000-00-00',NULL,NULL,0);")
def pre_upgrade():
"""
Run pre-upgrade check conditions to ensure the upgrade can be applied
without problems.
"""
logger = logging.getLogger('invenio_upgrader')
global DB_VERSION # Needed because we assign to it
DB_VERSION = _invenio_schema_version_guesser()
if DB_VERSION == 'unknown':
raise RuntimeError("Your Invenio database schema version could not be"
" determined. Please upgrade to Invenio v1.0.0 first.")
if DB_VERSION == 'pre-0.99.0':
raise RuntimeError("Upgrading from Invenio versions prior to 0.99 is"
" not supported. Please upgrade to 0.99.0 first.")
if DB_VERSION == '0.99.0':
raise RuntimeError("It seems like you are running Invenio version "
"0.99.0. Please run the upgrade in the following special way:\n"
"make install\ninveniocfg --update-all\n"
"make update-v0.99.0-tables\nmake update-v0.99.6-tables\n"
"inveniocfg --upgrade\n\nNote: Most warnings printed during "
"inveniocfg --upgrade can safely be ignored when upgrading from"
" Invenio 0.99.0.")
if DB_VERSION in ['0.99.x', '0.99.x-1.0.0']:
raise RuntimeError("It seems like you are running Invenio version "
"v0.99.1-v0.99.x. Please run the upgrade in the following special"
" way:\nmake install\ninveniocfg --update-all\n"
"make update-v0.99.6-tables\n"
"inveniocfg --upgrade\n\nNote: Most warnings printed during "
"inveniocfg --upgrade can safely be ignored when upgrading from"
" Invenio v0.99.1-0.99.x.")
if DB_VERSION == 'master':
warnings.warn("Invenio database schema is on a development version"
" between 1.0.x and 1.1.0")
# Run import here, since we are on 1.0-1.1 we know the import will work
from invenio.utils.text import wait_for_user
try:
wait_for_user("\nUPGRADING TO 1.1.0 FROM A DEVELOPMENT VERSION"
" WILL LEAD TO MANY WARNINGS! Please thoroughly"
" test the upgrade on non-production systems first,"
" and pay close attention to warnings.\n")
except SystemExit:
raise RuntimeError("Upgrade aborted by user.")
else:
logger.info("Invenio version v%s detected." % DB_VERSION)
# ==============
# Helper methods
# ==============
def _create_table(tables, tblname, ddl_stmt):
""" Create table if it does not already exsits """
if tblname not in tables:
run_sql(ddl_stmt)
else:
res = run_sql('SHOW CREATE TABLE %s' % tblname)
your_ddl = res[0][1]
warnings.warn("Table '%s' already exists but was not supposed to."
" Please manually compare the CREATE-statment used to create"
" the table in your database:\n\n%s\n\n"
"against the following CREATE-statement:\n%s\n"
% (tblname, "\n".join([x.strip() for x in your_ddl.splitlines()])
, "\n".join([x.strip() for x in ddl_stmt.splitlines()])))
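The `_create_table` helper above is idempotent: it only issues the DDL when the table is absent, and otherwise warns so the operator can diff the live schema against the expected one. A minimal standalone sketch of that pattern (with `run_sql` passed in as a hypothetical stand-in for the real database layer):

```python
import warnings

def create_table_if_missing(existing_tables, tblname, ddl_stmt, run_sql):
    """Run ddl_stmt only when tblname is not already present.

    When the table exists, warn instead so the operator can compare
    the live schema against the expected CREATE statement by hand.
    """
    if tblname not in existing_tables:
        run_sql(ddl_stmt)
        return True
    warnings.warn("Table '%s' already exists; compare schemas manually."
                  % tblname)
    return False

# The stub below simply records what would have been executed.
executed = []
create_table_if_missing([], "aidCACHE", "CREATE TABLE aidCACHE (...)",
                        executed.append)
```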
def _get_table_info(tblname):
""" Retrieve fields and indexes in table. """
try:
tblinfo = {'fields': {}, 'indexes': {}}
for f in run_sql("SHOW FIELDS FROM %s" % tblname):
tblinfo['fields'][f[0]] = f[1:]
for f in run_sql("SHOW INDEXES FROM %s" % tblname):
tblinfo['indexes'][f[2]] = f
return tblinfo
except Exception:
return {'fields': {}, 'indexes': {}}
def _get_tables():
""" Retrieve list of tables in current database. """
return [x[0] for x in run_sql("SHOW TABLES;")]
def _invenio_schema_version_guesser():
""" Introspect database to guess Invenio schema version
Note: 1.0.1 did not include any database changes, thus the 1.0.0 and 1.0.1
schemas are identical.
@return: One of the values pre-0.99.0, 0.99.0, 0.99.x, 0.99.x-1.0.0, 1.0.0,
1.0.2, master, unknown
"""
tables = [x[0] for x in run_sql("SHOW TABLES;")]
invenio_version = {
'pre-0.99.0': 0,
'0.99.0': 0,
'0.99.x': 0,
'1.0.0': 0,
'1.0.2': 0,
'1.1.0': 0,
}
# 0.92.x indicators
if 'oaiHARVEST' in tables:
tblinfo = _get_table_info('oaiHARVEST')
if 'bibfilterprogram' not in tblinfo['fields']:
invenio_version['pre-0.99.0'] += 1
if 'idxINDEX' in tables:
tblinfo = _get_table_info('idxINDEX')
if 'stemming_language' not in tblinfo['fields']:
invenio_version['pre-0.99.0'] += 1
if 'format' in tables:
tblinfo = _get_table_info('format')
if 'visibility' not in tblinfo['fields']:
invenio_version['pre-0.99.0'] += 1
# 0.99.0 indicators
if 'bibdoc' in tables:
tblinfo = _get_table_info('bibdoc')
if 'more_info' not in tblinfo['fields'] and \
'creation_date' in tblinfo['indexes']:
invenio_version['0.99.0'] += 1
if 'schTASK' in tables:
tblinfo = _get_table_info('schTASK')
if 'priority' not in tblinfo['fields']:
invenio_version['0.99.0'] += 1
if 'sbmAPPROVAL' in tables:
tblinfo = _get_table_info('sbmAPPROVAL')
if 'note' not in tblinfo['fields']:
invenio_version['0.99.0'] += 1
# 0.99.1-5 indicators
if 'oaiARCHIVE' in tables and 'oaiREPOSITORY' not in tables:
if 'sbmAPPROVAL' in tables:
tblinfo = _get_table_info('sbmAPPROVAL')
if 'note' in tblinfo['fields']:
invenio_version['0.99.x'] += 1
if 'bibdoc' in tables:
tblinfo = _get_table_info('bibdoc')
if 'text_extraction_date' not in tblinfo['fields'] \
and 'more_info' in tblinfo['fields']:
invenio_version['0.99.x'] += 1
if 'collection' in tables:
tblinfo = _get_table_info('collection')
if 'restricted' in tblinfo['fields']:
invenio_version['0.99.x'] += 1
# 1.0.0 indicators
if 'collection' in tables:
tblinfo = _get_table_info('collection')
if 'restricted' not in tblinfo['fields']:
invenio_version['1.0.0'] += 1
if 'oaiARCHIVE' not in tables and 'oaiREPOSITORY' in tables:
invenio_version['1.0.0'] += 1
if 'cmtRECORDCOMMENT' in tables:
tblinfo = _get_table_info('cmtRECORDCOMMENT')
if 'status' in tblinfo['fields']:
invenio_version['1.0.0'] += 1
# 1.0.2 indicators
if 'session' in tables:
tblinfo = _get_table_info('session')
if 'session_expiry' in tblinfo['indexes']:
invenio_version['1.0.2'] += 1
# '1.1.0/master' indicators
if 'accMAILCOOKIE' in tables:
tblinfo = _get_table_info('accMAILCOOKIE')
if 'expiration' in tblinfo['indexes']:
invenio_version['1.1.0'] += 1
if 'schTASK' in tables:
tblinfo = _get_table_info('schTASK')
if 'sequenceid' in tblinfo['fields']:
invenio_version['1.1.0'] += 1
if 'format' in tables:
tblinfo = _get_table_info('format')
if 'last_updated' in tblinfo['fields']:
invenio_version['1.1.0'] += 1
if invenio_version['pre-0.99.0'] > 1:
return 'pre-0.99.0'
if invenio_version['0.99.0'] >= 1:
return '0.99.0'
if invenio_version['0.99.x'] >= 1:
if invenio_version['1.0.0'] == 0:
return '0.99.x'
else:
return '0.99.x-1.0.0'
if invenio_version['1.0.0'] >= 1:
if invenio_version['1.1.0'] == 0:
if invenio_version['1.0.2'] == 0:
return '1.0.0'
else:
return '1.0.2'
else:
return 'master'
return 'unknown'
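The guesser above works by accumulating one point per schema "indicator" and then resolving the counts in order from oldest to newest. A reduced sketch of just the resolution step (a hypothetical helper, not the real function), which makes the precedence rules easier to read:

```python
def resolve_version(scores):
    """Map indicator counts to a single version label, oldest first."""
    if scores.get('pre-0.99.0', 0) > 1:
        return 'pre-0.99.0'
    if scores.get('0.99.0', 0) >= 1:
        return '0.99.0'
    if scores.get('0.99.x', 0) >= 1:
        # 1.0.0 indicators alongside 0.99.x suggest a partially applied upgrade
        return '0.99.x' if scores.get('1.0.0', 0) == 0 else '0.99.x-1.0.0'
    if scores.get('1.0.0', 0) >= 1:
        if scores.get('1.1.0', 0):
            return 'master'
        return '1.0.2' if scores.get('1.0.2', 0) else '1.0.0'
    return 'unknown'
```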
def _run_sql_ignore(query, *args, **kwargs):
""" Execute SQL query but ignore any errors. """
try:
run_sql(query, *args, **kwargs)
except Exception, e:
warnings.warn("Failed to execute query %s: %s" % (query, unicode(e)))
diff --git a/invenio/modules/workflows/tasks/marcxml_tasks.py b/invenio/modules/workflows/tasks/marcxml_tasks.py
index c36a742cf..55a738f2d 100644
--- a/invenio/modules/workflows/tasks/marcxml_tasks.py
+++ b/invenio/modules/workflows/tasks/marcxml_tasks.py
@@ -1,142 +1,142 @@
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
def add_metadata_to_extra_data(obj, eng):
"""
Creates bibrecord from object data and
populates extra_data with metadata
"""
from invenio.legacy.bibrecord import create_record, record_get_field_value
record = create_record(obj.data)
obj.extra_data['redis_search']['category'] =\
record_get_field_value(record[0], '037', code='c')
obj.extra_data['redis_search']['title'] =\
record_get_field_value(record[0], '245', code='a')
obj.extra_data['redis_search']['source'] =\
record_get_field_value(record[0], '035', code='9')
add_metadata_to_extra_data.__title__ = "Metadata Extraction"
add_metadata_to_extra_data.__description__ = "Populates object's extra_data with metadata"
def approve_record(obj, eng):
"""
Will add the approval widget to the record
"""
obj.extra_data["last_task_name"] = 'Record Approval'
try:
obj.extra_data['message'] = 'Record needs approval. Click on widget to resolve.'
obj.extra_data['widget'] = 'approval_widget'
eng.halt("Record needs approval")
except KeyError:
# Log the error
obj.extra_data["error_msg"] = 'Could not assign widget'
approve_record.__title__ = "Record Approval"
approve_record.__description__ = "This task assigns the approval widget to a record."
def convert_record(stylesheet="oaiarxiv2marcxml.xsl"):
def _convert_record(obj, eng):
"""
Will convert the object data, if XML, using the given stylesheet
"""
from invenio.bibconvert_xslt_engine import convert
obj.extra_data["last_task_name"] = 'Convert Record'
try:
obj.data = convert(obj.data, stylesheet)
except:
obj.extra_data["error_msg"] = 'Could not convert record'
raise
_convert_record.__title__ = "Convert Record"
_convert_record.__description__ = "This task converts a XML record."
return _convert_record
def download_fulltext(obj, eng):
"""
Will download the fulltext document
"""
- from invenio.bibdocfile import download_url
+ from invenio.legacy.bibdocfile.api import download_url
obj.extra_data["last_task_name"] = 'Download Fulltext'
try:
eng.log_info("Starting download of %s" % (obj.data['url']))
url = download_url(obj.data['url'])
obj.extra_data['tasks_results']['fulltext_url'] = url
except KeyError:
# Log the error
obj.extra_data["error_msg"] = 'Record does not include url'
eng.log.error("Error: %s" % (obj.extra_data["error_msg"],))
download_fulltext.__title__ = "Fulltext Download"
download_fulltext.__description__ = "This task downloads fulltext."
def match_record(obj, eng):
"""
Will try to find matches in stored records
"""
from invenio.legacy.bibrecord import create_record
from invenio.bibmatch_engine import match_records
obj.extra_data["last_task_name"] = 'Bibmatch Record'
rec = create_record(obj.data)
matches = match_records(records=[rec],
qrystrs=[("title", "[245__a]")])
obj.extra_data['tasks_results']['match_record'] = matches
if matches[2] or matches[3]:
# we have ambiguous or fuzzy results
# render holding pen corresponding template
eng.halt("Match resolution needed")
elif matches[0]:
pass
else:
results = matches[1][0][1]
obj.extra_data['widget'] = 'bibmatch_widget'
match_record.__title__ = "Bibmatch Record"
match_record.__description__ = "This task matches a XML record."
def print_record(obj, eng):
eng.log_info(obj.get_data())
print_record.__title__ = "Print Record"
print_record.__description__ = "Prints the record data to engine log"
def upload_record(mode="ir"):
def _upload_record(obj, eng):
- from invenio.bibtask import task_low_level_submission
+ from invenio.legacy.bibsched.bibtask import task_low_level_submission
obj.extra_data["last_task_name"] = 'Upload Record'
eng.log_info("Saving data to temporary file for upload")
filename = obj.save_to_file()
params = ["-%s" % (mode,), filename]
task_id = task_low_level_submission("bibupload", "bibworkflow",
*tuple(params))
eng.log_info("Submitted task #%s" % (task_id,))
_upload_record.__title__ = "Upload Record"
_upload_record.__description__ = "Uploads the record using BibUpload"
return _upload_record
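The tasks in this module follow a convention of attaching `__title__`/`__description__` attributes, and parameterised tasks such as `convert_record` and `upload_record` are factories that return a configured inner function. A minimal sketch of that factory pattern (with hypothetical names):

```python
def make_task(mode):
    """Return a task configured for the given mode, mirroring the
    upload_record(mode=...) factory style used above."""
    def _task(obj):
        # Record the CLI-style flag the configured mode would produce.
        obj.setdefault('params', []).append("-%s" % mode)
        return obj
    _task.__title__ = "Configured Task"
    _task.__description__ = "Runs with mode %r" % mode
    return _task

task = make_task("ir")
result = task({})
```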
diff --git a/invenio/modules/workflows/workers/worker_celery.py b/invenio/modules/workflows/workers/worker_celery.py
index 0e9be7870..cb81f07d5 100644
--- a/invenio/modules/workflows/workers/worker_celery.py
+++ b/invenio/modules/workflows/workers/worker_celery.py
@@ -1,83 +1,83 @@
## This file is part of Invenio.
## Copyright (C) 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-from invenio.bibworkflow_worker_engine import (run_worker,
- restart_worker,
- continue_worker)
from invenio.celery import celery
@celery.task(name='invenio.bibworkflow_workers.worker_celery.run_worker')
def celery_run(workflow_name, data, **kwargs):
"""
Runs the workflow with Celery
"""
+ from invenio.bibworkflow_worker_engine import run_worker
run_worker(workflow_name, data, **kwargs)
@celery.task(name='invenio.bibworkflow_workers.worker_celery.restart_worker')
def celery_restart(wid, **kwargs):
"""
Restarts the workflow with Celery
"""
+ from invenio.bibworkflow_worker_engine import restart_worker
restart_worker(wid, **kwargs)
@celery.task(name='invenio.bibworkflow_workers.worker_celery.continue_worker')
def celery_continue(oid, restart_point, **kwargs):
"""
Continues the workflow with Celery
"""
+ from invenio.bibworkflow_worker_engine import continue_worker
continue_worker(oid, restart_point, **kwargs)
class worker_celery(object):
def run_worker(self, workflow_name, data, **kwargs):
"""
Delegate to the celery_run task defined above.
@param workflow_name: name of the workflow to be run
@type workflow_name: string
@param data: list of objects for the workflow
@type data: list
"""
return celery_run.delay(workflow_name, data, **kwargs)
def restart_worker(self, wid, **kwargs):
"""
Delegate to the celery_restart task defined above.
@param wid: uuid of the workflow to be run
@type wid: string
"""
return celery_restart.delay(wid, **kwargs)
def continue_worker(self, oid, restart_point, **kwargs):
"""
Delegate to the celery_continue task defined above.
@param oid: uuid of the object to be started
@type oid: string
@param restart_point: sets the start point
@type restart_point: string
"""
return celery_continue.delay(oid, restart_point, **kwargs)
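The diff above moves the `bibworkflow_worker_engine` imports from module level into each task body: the usual way to break an import cycle, and a way to defer a heavy import until the task actually runs. The pattern in isolation (using `math` as a stand-in for the real dependency):

```python
def run_task(n):
    """Import the dependency only when the task executes, not when
    the module defining the task is imported."""
    import math  # stand-in for a heavyweight or cyclically imported module
    return math.factorial(n)
```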
diff --git a/invenio/testsuite/test_ext_email.py b/invenio/testsuite/test_ext_email.py
index c748f2c80..af8d9ced4 100644
--- a/invenio/testsuite/test_ext_email.py
+++ b/invenio/testsuite/test_ext_email.py
@@ -1,285 +1,290 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Test unit for the miscutil/mailutils module.
"""
import os
import sys
+import pkg_resources
from base64 import encodestring
from StringIO import StringIO
from flask import current_app
from invenio.ext.email import send_email
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
class MailTestCase(InvenioTestCase):
EMAIL_BACKEND = 'flask.ext.email.backends.console.Mail'
def setUp(self):
super(MailTestCase, self).setUp()
current_app.config['EMAIL_BACKEND'] = self.EMAIL_BACKEND
self.__stdout = sys.stdout
self.stream = sys.stdout = StringIO()
def tearDown(self):
del self.stream
sys.stdout = self.__stdout
del self.__stdout
super(MailTestCase, self).tearDown()
def flush_mailbox(self):
self.stream = sys.stdout = StringIO()
#def get_mailbox_content(self):
# messages = self.stream.getvalue().split('\n' + ('-' * 79) + '\n')
# return [message_from_string(m) for m in messages if m]
class TestMailUtils(MailTestCase):
"""
mailutils TestSuite.
"""
def test_console_send_email(self):
"""
Test that the console backend can be pointed at an arbitrary stream.
"""
msg_content = """Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Subject
From: from@example.com
To: to@example.com"""
send_email('from@example.com', ['to@example.com'], subject='Subject',
content='Content')
self.assertIn(msg_content, sys.stdout.getvalue())
self.flush_mailbox()
send_email('from@example.com', 'to@example.com', subject='Subject',
content='Content')
self.assertIn(msg_content, sys.stdout.getvalue())
self.flush_mailbox()
def test_email_text_template(self):
"""
Test email text template engine.
"""
from invenio.ext.template import render_template_to_string
contexts = {
'ctx1': {'content': 'Content 1'},
'ctx2': {'content': 'Content 2', 'header': 'Header 2'},
'ctx3': {'content': 'Content 3', 'footer': 'Footer 3'},
'ctx4': {'content': 'Content 4', 'header': 'Header 4', 'footer': 'Footer 4'}
}
msg_content = """Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: %s
From: from@example.com
To: to@example.com"""
for name, ctx in contexts.iteritems():
msg = render_template_to_string('mail_text.tpl', **ctx)
send_email('from@example.com', ['to@example.com'], subject=name,
**ctx)
email = sys.stdout.getvalue()
self.assertIn(msg_content % name, email)
self.assertIn(msg, email)
self.flush_mailbox()
def test_email_html_template(self):
"""
Test email html template engine.
"""
from invenio.ext.template import render_template_to_string
contexts = {
'ctx1': {'html_content': '<b>Content 1</b>'},
'ctx2': {'html_content': '<b>Content 2</b>',
'html_header': '<h1>Header 2</h1>'},
'ctx3': {'html_content': '<b>Content 3</b>',
'html_footer': '<i>Footer 3</i>'},
'ctx4': {'html_content': '<b>Content 4</b>',
'html_header': '<h1>Header 4</h1>',
'html_footer': '<i>Footer 4</i>'}
}
def strip_html_key(ctx):
return dict(map(lambda (k, v): (k[5:], v), ctx.iteritems()))
for name, ctx in contexts.iteritems():
msg = render_template_to_string('mail_html.tpl',
**strip_html_key(ctx))
send_email('from@example.com', ['to@example.com'], subject=name,
content='Content Text', **ctx)
email = sys.stdout.getvalue()
self.assertIn('Content-Type: multipart/alternative;', email)
self.assertIn('Content Text', email)
self.assertIn(msg, email)
self.flush_mailbox()
def test_email_html_image(self):
"""
Test sending html message with an image.
"""
- from invenio.config import CFG_WEBDIR
html_images = {
- 'img1': os.path.join(CFG_WEBDIR, 'img', 'journal_water_dog.gif')
+ 'img1': pkg_resources.resource_filename(
+ 'invenio.base',
+ os.path.join('static', 'img', 'journal_water_dog.gif')
+ )
}
send_email('from@example.com', ['to@example.com'],
subject='Subject', content='Content Text',
html_content='<img src="cid:img1"/>',
html_images=html_images)
email = sys.stdout.getvalue()
self.assertIn('Content Text', email)
self.assertIn('<img src="cid:img1"/>', email)
with open(html_images['img1'], 'r') as f:
self.assertIn(encodestring(f.read()), email)
self.flush_mailbox()
def test_sending_attachment(self):
"""
Test sending email with an attachment.
"""
- from invenio.config import CFG_WEBDIR
attachments = [
- os.path.join(CFG_WEBDIR, 'img', 'journal_header.png')
+ pkg_resources.resource_filename(
+ 'invenio.base',
+ os.path.join('static', 'img', 'journal_header.png')
+ )
]
send_email('from@example.com', ['to@example.com'],
subject='Subject', content='Content Text',
attachments=attachments)
email = sys.stdout.getvalue()
self.assertIn('Content Text', email)
# First attachment is image/png
self.assertIn('Content-Type: image/png', email)
for attachment in attachments:
with open(attachment, 'r') as f:
self.assertIn(encodestring(f.read()), email)
self.flush_mailbox()
def test_bcc_undisclosed_recipients(self):
"""
Test that the email receivers are hidden.
"""
msg_content = """Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Subject
From: from@example.com
To: Undisclosed.Recipients:"""
send_email('from@example.com', ['to@example.com', 'too@example.com'],
subject='Subject', content='Content')
email = sys.stdout.getvalue()
self.assertIn(msg_content, email)
self.assertIn('Bcc: to@example.com,too@example.com', email)
self.flush_mailbox()
send_email('from@example.com', 'to@example.com, too@example.com',
subject='Subject', content='Content')
email = sys.stdout.getvalue()
self.assertIn(msg_content, email)
self.assertIn('Bcc: to@example.com,too@example.com', email)
self.flush_mailbox()
class TestAdminMailBackend(MailTestCase):
EMAIL_BACKEND = 'invenio.ext.email.backends.console_adminonly.Mail'
ADMIN_MESSAGE = "This message would have been sent to the following recipients"
def test_simple_email_header(self):
"""
Test simple email header.
"""
from invenio.config import CFG_SITE_ADMIN_EMAIL
from invenio.ext.template import render_template_to_string
msg_content = """Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Subject
From: from@example.com
To: %s""" % (CFG_SITE_ADMIN_EMAIL, )
msg = render_template_to_string('mail_text.tpl', content='Content')
self.flush_mailbox()
send_email('from@example.com', ['to@example.com'], subject='Subject',
content='Content')
email = self.stream.getvalue()
self.assertIn(msg_content, email)
self.assertIn(self.ADMIN_MESSAGE, email)
self.assertNotIn('Bcc:', email)
self.assertIn(msg, email)
self.flush_mailbox()
send_email('from@example.com', 'to@example.com', subject='Subject',
content='Content')
email = self.stream.getvalue()
self.assertIn(msg_content, email)
self.assertIn(self.ADMIN_MESSAGE, email)
self.assertNotIn('Bcc:', email)
self.assertIn(msg, email)
self.flush_mailbox()
def test_cc_bcc_headers(self):
"""
Test that no Cc and Bcc headers are sent.
"""
from invenio.config import CFG_SITE_ADMIN_EMAIL
msg_content = """Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Subject
From: from@example.com
To: %s""" % (CFG_SITE_ADMIN_EMAIL, )
send_email('from@example.com', ['to@example.com', 'too@example.com'],
subject='Subject', content='Content')
email = self.stream.getvalue()
self.assertIn(msg_content, email)
self.assertIn(self.ADMIN_MESSAGE, email)
self.assertIn('to@example.com,too@example.com', email)
self.assertNotIn('Bcc: to@example.com,too@example.com', email)
self.flush_mailbox()
send_email('from@example.com', 'to@example.com, too@example.com',
subject='Subject', content='Content')
email = self.stream.getvalue()
self.assertIn(msg_content, email)
self.assertIn(self.ADMIN_MESSAGE, email)
self.assertIn('to@example.com,too@example.com', email)
self.assertNotIn('Bcc: to@example.com,too@example.com', email)
self.flush_mailbox()
TEST_SUITE = make_test_suite(TestMailUtils, TestAdminMailBackend)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
diff --git a/invenio/testsuite/test_utils_memoise.py b/invenio/testsuite/test_utils_memoise.py
index 941ff050e..47ceb0ec1 100644
--- a/invenio/testsuite/test_utils_memoise.py
+++ b/invenio/testsuite/test_utils_memoise.py
@@ -1,40 +1,40 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Unit tests for the memoise facility.
"""
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
class MemoiseTest(InvenioTestCase):
"""Unit test cases for Memoise."""
def test_memoise_fib(self):
"""memoiseutils - test fib() memoisation"""
from invenio.utils.memoise import Memoise
- from invenio.bibtaskex import fib
+ from invenio.legacy.bibsched.bibtaskex import fib
fib_memoised = Memoise(fib)
self.assertEqual(fib(17), fib_memoised(17))
TEST_SUITE = make_test_suite(MemoiseTest, )
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
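The test above only checks that the memoised call agrees with the plain one. A minimal sketch of what a `Memoise` wrapper of this kind does, assuming hashable positional arguments (the real `invenio.utils.memoise.Memoise` may differ in detail):

```python
class Memoise(object):
    """Cache a function's results, keyed by its positional arguments."""

    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        # Only hashable positional arguments are supported by this sketch.
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]


def fib(n):
    """Naive Fibonacci; exponential without caching."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


fib_memoised = Memoise(fib)
```

After the first call, repeated invocations with the same argument are dictionary lookups rather than recursive recomputation.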
diff --git a/invenio/testsuite/test_utils_shell.py b/invenio/testsuite/test_utils_shell.py
index 27bb726f1..6158391f4 100644
--- a/invenio/testsuite/test_utils_shell.py
+++ b/invenio/testsuite/test_utils_shell.py
@@ -1,211 +1,212 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Unit tests for shellutils library."""
__revision__ = "$Id$"
import time
import os
from invenio.utils.shell import escape_shell_arg, run_shell_command, \
run_process_with_timeout, Timeout, split_cli_ids_arg
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
class EscapeShellArgTest(InvenioTestCase):
"""Testing of escaping shell arguments."""
def test_escape_simple(self):
"""shellutils - escaping simple strings"""
self.assertEqual("'hello'",
escape_shell_arg("hello"))
def test_escape_backtick(self):
"""shellutils - escaping strings containing backticks"""
self.assertEqual(r"'hello `world`'",
escape_shell_arg(r'hello `world`'))
def test_escape_quoted(self):
"""shellutils - escaping strings containing single quotes"""
self.assertEqual("'hello'\\''world'",
escape_shell_arg("hello'world"))
def test_escape_double_quoted(self):
"""shellutils - escaping strings containing double-quotes"""
self.assertEqual("""'"hello world"'""",
escape_shell_arg('"hello world"'))
def test_escape_complex_quoted(self):
"""shellutils - escaping strings containing complex quoting"""
self.assertEqual(r"""'"Who is this `Eve'\'', Bob?", asked Alice.'""",
escape_shell_arg(r""""Who is this `Eve', Bob?", asked Alice."""))
def test_escape_windows_style_path(self):
"""shellutils - escaping strings containing windows-style file paths"""
self.assertEqual(r"'C:\Users\Test User\My Documents" \
"\funny file name (for testing).pdf'",
escape_shell_arg(r'C:\Users\Test User\My Documents' \
'\funny file name (for testing).pdf'))
def test_escape_unix_style_path(self):
"""shellutils - escaping strings containing unix-style file paths"""
self.assertEqual(r"'/tmp/z_temp.txt'",
escape_shell_arg(r'/tmp/z_temp.txt'))
def test_escape_number_sign(self):
"""shellutils - escaping strings containing the number sign"""
self.assertEqual(r"'Python comments start with #.'",
escape_shell_arg(r'Python comments start with #.'))
def test_escape_ampersand(self):
"""shellutils - escaping strings containing ampersand"""
self.assertEqual(r"'Today the weather is hot & sunny'",
escape_shell_arg(r'Today the weather is hot & sunny'))
def test_escape_greater_than(self):
"""shellutils - escaping strings containing the greater-than sign"""
self.assertEqual(r"'10 > 5'",
escape_shell_arg(r'10 > 5'))
def test_escape_less_than(self):
"""shellutils - escaping strings containing the less-than sign"""
self.assertEqual(r"'5 < 10'",
escape_shell_arg(r'5 < 10'))
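The expected values in the tests above all follow one rule: wrap the argument in single quotes and rewrite every embedded single quote as the close-quote/escaped-quote/reopen sequence `'\''`. A one-line sketch of that rule (a hypothetical reimplementation, not necessarily the exact `invenio.utils.shell` code):

```python
def escape_shell_arg(arg):
    # Single quotes make every character literal in POSIX shells;
    # an embedded ' is written as: close quote, escaped quote, reopen quote.
    return "'" + arg.replace("'", "'\\''") + "'"
```

Because everything between single quotes is literal, backticks, `#`, `&`, `<` and `>` need no further treatment, which is exactly what the test cases exercise.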
class RunShellCommandTest(InvenioTestCase):
"""Testing of running shell commands."""
def test_run_cmd_hello(self):
"""shellutils - running simple command"""
self.assertEqual((0, "hello world\n", ''),
run_shell_command("echo 'hello world'"))
def test_run_cmd_hello_args(self):
"""shellutils - running simple command with an argument"""
self.assertEqual((0, "hello world\n", ''),
run_shell_command("echo 'hello %s'", ("world",)))
def test_run_cmd_hello_quote(self):
"""shellutils - running simple command with an argument with quote"""
self.assertEqual((0, "hel'lo world\n", ''),
run_shell_command("echo %s %s", ("hel'lo", "world",)))
def test_run_cmd_errorneous(self):
"""shellutils - running wrong command should raise an exception"""
self.assertRaises(TypeError, run_shell_command,
"echo %s %s %s", ("hello", "world",))
class RunProcessWithTimeoutTest(InvenioTestCase):
"""Testing of running a process with timeout."""
def setUp(self):
- from invenio.config import CFG_TMPDIR
- self.script_path = os.path.join(CFG_TMPDIR, 'test_sleeping.sh')
+ from flask import current_app
+ self.script_path = os.path.join(current_app.instance_path, 'test_sleeping.sh')
script = open(self.script_path, 'w')
print >> script, "#!/bin/sh"
print >> script, "date"
print >> script, "echo 'foo'"
print >> script, "echo 'bar' > /dev/stderr"
print >> script, "sleep $1"
print >> script, "date"
script.close()
os.chmod(self.script_path, 0700)
- self.python_script_path = os.path.join(CFG_TMPDIR, 'test_sleeping.py')
+ self.python_script_path = os.path.join(current_app.instance_path, 'test_sleeping.py')
script = open(self.python_script_path, 'w')
print >> script, """\
#!/usr/bin/env python
import os
print os.getpid(), os.getpgrp()
if os.getpid() == os.getpgrp():
print "PID == PGID"
else:
print "PID != PGID"
"""
script.close()
os.chmod(self.python_script_path, 0700)
def tearDown(self):
os.remove(self.script_path)
os.remove(self.python_script_path)
def test_run_cmd_timeout(self):
"""shellutils - running simple command with expiring timeout"""
t1 = time.time()
self.assertRaises(Timeout, run_process_with_timeout, (self.script_path, '15'), timeout=5)
self.failUnless(time.time() - t1 < 8, "%s < 8" % (time.time() - t1))
def test_run_cmd_timeout_no_zombie(self):
"""shellutils - running simple command no zombie"""
self.assertRaises(Timeout, run_process_with_timeout, (self.script_path, '15', "THISISATEST"), timeout=5)
ps_output = run_shell_command('ps aux')[1]
self.failIf('THISISATEST' in ps_output, '"THISISATEST" was found in %s' % ps_output)
self.failIf('sleep 15' in ps_output, '"sleep 15" was found in %s' % ps_output)
def test_run_cmd_timeout_no_timeout(self):
"""shellutils - running simple command without expiring timeout"""
exitstatus, stdout, stderr = run_process_with_timeout([self.script_path, '5'], timeout=10)
self.failUnless('foo' in stdout)
self.failUnless('bar' in stderr)
self.assertEqual(exitstatus, 0)
def test_run_cmd_timeout_big_stdout(self):
"""shellutils - running simple command with a big standard output"""
- from invenio.config import CFG_PYLIBDIR
- test_file = os.path.join(CFG_PYLIBDIR, 'invenio', 'bibcirculation_templates.py')
+ import pkg_resources
+ #FIXME this file will be removed soon
+ test_file = pkg_resources.resource_filename('invenio.legacy.bibcirculation', 'templates.py')
exitstatus, stdout, stderr = run_process_with_timeout(['cat', test_file], timeout=10)
self.assertEqual(open(test_file).read(), stdout)
self.assertEqual(exitstatus, 0)
def test_run_cmd_timeout_pgid(self):
"""shellutils - running simple command should have PID == PGID"""
exitstatus, stdout, stderr = run_process_with_timeout([self.python_script_path, '5'])
self.failIf('PID != PGID' in stdout, 'PID != PGID was found in current output: %s (%s)' % (stdout, stderr))
self.failUnless('PID == PGID' in stdout, 'PID == PGID wasn\'t found in current output: %s (%s)' % (stdout, stderr))
def test_run_cmd_viasudo_no_password(self):
"""shellutils - running simple command via sudo should not wait for password"""
exitstatus, stdout, stderr = run_process_with_timeout([self.script_path, '5'], timeout=10, sudo='foo')
self.assertNotEqual(exitstatus, 0)
class SplitIdsTest(InvenioTestCase):
def test_one(self):
self.assertEqual(split_cli_ids_arg("1"), set([1]))
def test_range(self):
self.assertEqual(split_cli_ids_arg("1-5"), set([1, 2, 3, 4, 5]))
def test_multiple(self):
self.assertEqual(split_cli_ids_arg("1,5,7"), set([1, 5, 7]))
def test_complex(self):
self.assertEqual(split_cli_ids_arg("1-1,7,10-11,4"), set([1, 4, 7, 10, 11]))
TEST_SUITE = make_test_suite(EscapeShellArgTest,
RunShellCommandTest,
RunProcessWithTimeoutTest,
SplitIdsTest)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
diff --git a/invenio/testsuite/test_utils_solr.py b/invenio/testsuite/test_utils_solr.py
index b914eddc6..d921274a6 100644
--- a/invenio/testsuite/test_utils_solr.py
+++ b/invenio/testsuite/test_utils_solr.py
@@ -1,78 +1,82 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Unit tests for the solrutils library."""
from invenio.base.wrappers import lazy_import
from invenio.testsuite import make_test_suite, run_test_suite, InvenioTestCase
-replace_invalid_solr_characters = lazy_import('invenio.solrutils_bibindex_indexer:replace_invalid_solr_characters')
-get_collection_filter = lazy_import('invenio.solrutils_bibrank_searcher:get_collection_filter')
+replace_invalid_solr_characters = lazy_import(
+ 'invenio.legacy.miscutil.solrutils_bibindex_indexer:'
+ 'replace_invalid_solr_characters')
+get_collection_filter = lazy_import(
+ 'invenio.legacy.miscutil.solrutils_bibrank_searcher:'
+ 'get_collection_filter')
class TestReplaceInvalidCharacters(InvenioTestCase):
"""Test for removal of invalid Solr characters and control characters."""
def test_no_replacement(self):
"""solrutils - no characters to replace"""
utext_in = unicode('foo\nbar\tfab\n\r', 'utf-8')
utext_out = unicode('foo\nbar\tfab\n\r', 'utf-8')
self.assertEqual(utext_out, replace_invalid_solr_characters(utext_in))
def test_replace_control_characters(self):
"""solrutils - replacement of control characters"""
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u0000\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u0003\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u0008\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u000B\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u000C\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u000E\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u0012\nde'))
self.assertEqual(u'abc \nde', replace_invalid_solr_characters(u'abc\u001F\nde'))
def test_replace_invalid_chars(self):
"""solrutils - replacement of invalid characters"""
self.assertEqual(u'abc\nde', replace_invalid_solr_characters(u'abc\uD800\nde'))
self.assertEqual(u'abc\nde', replace_invalid_solr_characters(u'abc\uDF12\nde'))
self.assertEqual(u'abc\nde', replace_invalid_solr_characters(u'abc\uDFFF\nde'))
self.assertEqual(u'abc\nde', replace_invalid_solr_characters(u'abc\uFFFE\nde'))
self.assertEqual(u'abc\nde', replace_invalid_solr_characters(u'abc\uFFFF\nde'))
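The cases split into two families: C0 control characters other than tab, newline and carriage return are replaced by a space, while surrogate code points and the non-characters U+FFFE/U+FFFF are dropped entirely. A regex sketch of that behaviour (assumed equivalent in effect, not the actual Invenio implementation):

```python
import re

# Control characters invalid in Solr/XML documents, except \t \n \r.
CONTROL_RE = re.compile(u'[\u0000-\u0008\u000B\u000C\u000E-\u001F]')
# Lone surrogates and the two non-characters at the end of the BMP.
INVALID_RE = re.compile(u'[\uD800-\uDFFF\uFFFE\uFFFF]')


def replace_invalid_solr_characters(utext):
    # Controls become spaces (to preserve token boundaries);
    # outright invalid code points are removed.
    return INVALID_RE.sub(u'', CONTROL_RE.sub(u' ', utext))
```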
class TestSolrRankingHelpers(InvenioTestCase):
"""Test for Solr ranking helper functions."""
def test_get_collection_filter(self):
"""solrutils - creation of collection filter"""
from invenio import intbitset
self.assertEqual('', get_collection_filter(intbitset.intbitset([]), 0))
self.assertEqual('', get_collection_filter(intbitset.intbitset([]), 1))
self.assertEqual('', get_collection_filter(intbitset.intbitset([1, 2, 3, 4, 5]), 0))
self.assertEqual('id:(5)', get_collection_filter(intbitset.intbitset([1, 2, 3, 4, 5]), 1))
self.assertEqual('id:(4 5)', get_collection_filter(intbitset.intbitset([1, 2, 3, 4, 5]), 2))
self.assertEqual('id:(1 2 3 4 5)', get_collection_filter(intbitset.intbitset([1, 2, 3, 4, 5]), 5))
self.assertEqual('id:(1 2 3 4 5)', get_collection_filter(intbitset.intbitset([1, 2, 3, 4, 5]), 6))
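Reading the assertions backwards, the helper keeps only the highest `cutoff` record ids and renders them as a Solr id query, returning an empty filter when either the id set is empty or the cutoff is zero. A list-based sketch of that contract (the real helper operates on `intbitset`):

```python
def get_collection_filter(ids, cutoff):
    """Build a Solr filter string for the `cutoff` highest record ids."""
    ids = sorted(ids)
    if not ids or cutoff <= 0:
        return ''
    # A cutoff larger than the set simply keeps every id.
    return 'id:(%s)' % ' '.join(str(i) for i in ids[-cutoff:])
```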
TEST_SUITE = make_test_suite(TestReplaceInvalidCharacters, TestSolrRankingHelpers)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
diff --git a/invenio/utils/connector.py b/invenio/utils/connector.py
index 9dc08c36f..5e565bb51 100644
--- a/invenio/utils/connector.py
+++ b/invenio/utils/connector.py
@@ -1,618 +1,618 @@
# -*- coding: utf-8 -*-
#
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Tools to connect to distant Invenio servers using Invenio APIs.
Example of use:
from InvenioConnector import *
cds = InvenioConnector("http://cds.cern.ch")
results = cds.search("higgs")
for record in results:
print record["245__a"][0]
print record["520__b"][0]
for author in record["100__"]:
print author["a"][0], author["u"][0]
FIXME:
- implement cache expiration
- exceptions handling
- parsing of <!-- Search-Engine-Total-Number-Of-Results: N -->
- better checking of input parameters (especially InvenioConnector.__init__ "url")
- improve behaviour when running locally (perform_request_search *requiring* "req" object)
"""
import urllib
import urllib2
import xml.sax
import re
import tempfile
import os
import time
import sys
MECHANIZE_CLIENTFORM_VERSION_CHANGE = (0, 2, 0)
try:
import mechanize
if mechanize.__version__ < MECHANIZE_CLIENTFORM_VERSION_CHANGE:
OLD_MECHANIZE_VERSION = True
import ClientForm
else:
OLD_MECHANIZE_VERSION = False
MECHANIZE_AVAILABLE = True
except ImportError:
MECHANIZE_AVAILABLE = False
try:
# if we are running locally, we can optimize :-)
from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_SITE_RECORD, CFG_CERN_SITE
- from invenio.bibtask import task_low_level_submission
+ from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.legacy.search_engine import perform_request_search, collection_restricted_p
from invenio.modules.formatter import format_records
from invenio.utils.url import make_user_agent_string
LOCAL_SITE_URLS = [CFG_SITE_URL, CFG_SITE_SECURE_URL]
CFG_USER_AGENT = make_user_agent_string("invenio_connector")
except ImportError:
LOCAL_SITE_URLS = None
CFG_CERN_SITE = 0
CFG_USER_AGENT = "invenio_connector"
CFG_CDS_URL = "http://cds.cern.ch/"
class InvenioConnectorAuthError(Exception):
"""
This exception is raised by InvenioConnector when authentication fails during
remote or local connections.
"""
def __init__(self, value):
"""
Set the internal "value" attribute to that of the passed "value" parameter.
@param value: an error string to display to the user.
@type value: string
"""
Exception.__init__(self)
self.value = value
def __str__(self):
"""
Return oneself as a string (actually, return the contents of self.value).
@return: representation of error
@rtype: string
"""
return str(self.value)
class InvenioConnectorServerError(Exception):
"""
This exception is raised by InvenioConnector when it is used on a machine with no
Invenio installed and no remote server URL is given during instantiation.
"""
def __init__(self, value):
"""
Set the internal "value" attribute to that of the passed "value" parameter.
@param value: an error string to display to the user.
@type value: string
"""
Exception.__init__(self)
self.value = value
def __str__(self):
"""
Return oneself as a string (actually, return the contents of self.value).
@return: representation of error
@rtype: string
"""
return str(self.value)
class InvenioConnector(object):
"""
Creates a connector to a server running Invenio
"""
def __init__(self, url=None, user="", password="", login_method="Local", local_import_path="invenio"):
"""
Initialize a new instance of the server at given URL.
If the server happens to be running on the local machine, the
access will be done directly using the Python APIs. In that case
you can choose from which base path to import the necessary file
specifying the local_import_path parameter.
@param url: the url to which this instance will be connected.
Defaults to CFG_SITE_URL, if available.
@type url: string
@param user: the optional username for interacting with the Invenio
instance in an authenticated way.
@type user: string
@param password: the corresponding password.
@type password: string
@param login_method: the name of the login method the Invenio instance
is expecting for this user (in case there is more than one).
@type login_method: string
@param local_import_path: the base path from which the connector should
try to load the local connector, if available. Eg "invenio" will
lead to "import invenio.dbquery"
@type local_import_path: string
@raise InvenioConnectorAuthError: if no secure URL is given for authentication
@raise InvenioConnectorServerError: if no URL is given on a machine without Invenio installed
"""
if url == None and LOCAL_SITE_URLS != None:
self.server_url = LOCAL_SITE_URLS[0] # Default to CFG_SITE_URL
elif url == None:
raise InvenioConnectorServerError("You do not seem to have Invenio installed and no remote URL is given")
else:
self.server_url = url
self.local = LOCAL_SITE_URLS and self.server_url in LOCAL_SITE_URLS
self.cached_queries = {}
self.cached_records = {}
self.cached_baskets = {}
self.user = user
self.password = password
self.login_method = login_method
self.browser = None
if self.user:
if not self.server_url.startswith('https://'):
raise InvenioConnectorAuthError("You have to use a secure URL (HTTPS) to login")
if MECHANIZE_AVAILABLE:
self._init_browser()
self._check_credentials()
else:
self.user = None
raise InvenioConnectorAuthError("The Python module Mechanize (and ClientForm" \
" if Mechanize version < 0.2.0) must" \
" be installed to perform authenticated requests.")
def _init_browser(self):
"""
Override this method with the appropriate way to prepare a logged-in
browser.
"""
self.browser = mechanize.Browser()
self.browser.set_handle_robots(False)
self.browser.open(self.server_url + "/youraccount/login")
self.browser.select_form(nr=0)
try:
self.browser['nickname'] = self.user
self.browser['password'] = self.password
except:
self.browser['p_un'] = self.user
self.browser['p_pw'] = self.password
# Set login_method to be writable
self.browser.form.find_control('login_method').readonly = False
self.browser['login_method'] = self.login_method
self.browser.submit()
def _check_credentials(self):
out = self.browser.response().read()
if not 'youraccount/logout' in out:
raise InvenioConnectorAuthError("It was not possible to successfully login with the provided credentials" + out)
def search(self, read_cache=True, **kwparams):
"""
Returns records corresponding to the given search query.
See docstring of invenio.legacy.search_engine.perform_request_search()
for an overview of available parameters.
@raise InvenioConnectorAuthError: if authentication fails
"""
parse_results = False
of = kwparams.get('of', "")
if of == "":
parse_results = True
of = "xm"
kwparams['of'] = of
params = urllib.urlencode(kwparams, doseq=1)
# Are we running locally? If so, better to access the
# search engine directly
if self.local and of != 't':
# See if user tries to search any restricted collection
c = kwparams.get('c', "")
if c != "":
if type(c) is list:
colls = c
else:
colls = [c]
for collection in colls:
if collection_restricted_p(collection):
if self.user:
self._check_credentials()
continue
raise InvenioConnectorAuthError("You are trying to search a restricted collection. Please authenticate yourself.\n")
kwparams['of'] = 'id'
results = perform_request_search(**kwparams)
if of.lower() != 'id':
results = format_records(results, of)
else:
if not self.cached_queries.has_key(params + str(parse_results)) or not read_cache:
if self.user:
results = self.browser.open(self.server_url + "/search?" + params)
else:
results = urllib2.urlopen(self.server_url + "/search?" + params)
if 'youraccount/login' in results.geturl():
# Current user not able to search collection
raise InvenioConnectorAuthError("You are trying to search a restricted collection. Please authenticate yourself.\n")
else:
return self.cached_queries[params + str(parse_results)]
if parse_results:
# FIXME: we should not try to parse if results is string
parsed_records = self._parse_results(results, self.cached_records)
self.cached_queries[params + str(parse_results)] = parsed_records
return parsed_records
else:
# pylint: disable=E1103
# The whole point of the following code is to make sure we can
# handle two types of variable.
try:
res = results.read()
except AttributeError:
res = results
# pylint: enable=E1103
if of == "id":
try:
if type(res) is str:
# Transform to list
res = [int(recid.strip()) for recid in \
res.strip("[]").split(",") if recid.strip() != ""]
res.reverse()
except (ValueError, AttributeError):
res = []
self.cached_queries[params + str(parse_results)] = res
return self.cached_queries[params + str(parse_results)]
def search_with_retry(self, sleeptime=3.0, retrycount=3, **params):
"""
This function performs a search given a dictionary of search(..)
parameters. It accounts for server timeouts as necessary and
will retry some number of times.
@param sleeptime: number of seconds to sleep between retries
@type sleeptime: float
@param retrycount: number of times to retry given search
@type retrycount: int
@param params: search parameters
@type params: **kwds
@rtype: list
@return: returns records in given format
"""
results = []
count = 0
while count < retrycount:
try:
results = self.search(**params)
break
except urllib2.URLError:
sys.stderr.write("Timeout while searching...Retrying\n")
time.sleep(sleeptime)
count += 1
else:
sys.stderr.write("Aborting search after %d attempts.\n" % (retrycount,))
return results
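`search_with_retry` is an instance of a generic retry-with-sleep loop; the same shape can be extracted as a standalone helper (a sketch — the exception type, argument names and give-up behaviour here are assumptions, not the Invenio API):

```python
import time


def call_with_retry(call, retrycount=3, sleeptime=3.0, exceptions=(OSError,)):
    """Invoke `call` up to `retrycount` times, sleeping between failures."""
    for attempt in range(retrycount):
        try:
            return call()
        except exceptions:
            if attempt + 1 == retrycount:
                raise  # out of attempts: surface the last error
            time.sleep(sleeptime)
```

Unlike this sketch, the method above swallows the final failure and returns an empty result list, which is a deliberate choice for a best-effort search client.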
def search_similar_records(self, recid):
"""
Returns the records similar to the given one
"""
return self.search(p="recid:" + str(recid), rm="wrd")
def search_records_cited_by(self, recid):
"""
Returns records cited by the given one
"""
return self.search(p="recid:" + str(recid), rm="citation")
def get_records_from_basket(self, bskid, group_basket=False, read_cache=True):
"""
Returns the records from the (public) basket with given bskid
"""
if not self.cached_baskets.has_key(bskid) or not read_cache:
if self.user:
if group_basket:
group_basket = '&category=G'
else:
group_basket = ''
results = self.browser.open(self.server_url + \
"/yourbaskets/display?of=xm&bskid=" + str(bskid) + group_basket)
else:
results = urllib2.urlopen(self.server_url + \
"/yourbaskets/display_public?of=xm&bskid=" + str(bskid))
else:
return self.cached_baskets[bskid]
parsed_records = self._parse_results(results, self.cached_records)
self.cached_baskets[bskid] = parsed_records
return parsed_records
def get_record(self, recid, read_cache=True):
"""
Returns the record with given recid
"""
if self.cached_records.has_key(recid) and read_cache:
return self.cached_records[recid]
else:
return self.search(p="recid:" + str(recid))
def upload_marcxml(self, marcxml, mode):
"""
Uploads a record to the server
Parameters:
marcxml - *str* the XML to upload.
mode - *str* the mode to use for the upload.
"-i" insert new records
"-r" replace existing records
"-c" correct fields of records
"-a" append fields to records
"-ir" insert record or replace if it exists
"""
if mode not in ["-i", "-r", "-c", "-a", "-ir"]:
raise NameError, "Incorrect mode " + str(mode)
# Are we running locally? If so, submit directly
if self.local:
(code, marcxml_filepath) = tempfile.mkstemp(prefix="upload_%s" % \
time.strftime("%Y%m%d_%H%M%S_",
time.localtime()))
marcxml_file_d = os.fdopen(code, "w")
marcxml_file_d.write(marcxml)
marcxml_file_d.close()
return task_low_level_submission("bibupload", "", mode, marcxml_filepath)
else:
params = urllib.urlencode({'file': marcxml,
'mode': mode})
## We don't use self.browser as batchuploader is protected by IP
opener = urllib2.build_opener()
opener.addheaders = [('User-Agent', CFG_USER_AGENT)]
return opener.open(self.server_url + "/batchuploader/robotupload", params,)
def _parse_results(self, results, cached_records):
"""
Parses the given results (in MARCXML format).
The given "cached_records" list is a pool of
already existing parsed records (in order to
avoid keeping several times the same records in memory)
"""
parser = xml.sax.make_parser()
handler = RecordsHandler(cached_records)
parser.setContentHandler(handler)
parser.parse(results)
return handler.records
class Record(dict):
"""
Represents an Invenio record
"""
def __init__(self, recid=None, marcxml=None, server_url=None):
#dict.__init__(self)
self.recid = recid
self.marcxml = ""
if marcxml is not None:
self.marcxml = marcxml
#self.record = {}
self.server_url = server_url
def __setitem__(self, item, value):
tag, ind1, ind2, subcode = decompose_code(item)
if subcode is not None:
#if not dict.has_key(self, tag + ind1 + ind2):
# dict.__setitem__(self, tag + ind1 + ind2, [])
dict.__setitem__(self, tag + ind1 + ind2, [{subcode: [value]}])
else:
dict.__setitem__(self, tag + ind1 + ind2, value)
def __getitem__(self, item):
tag, ind1, ind2, subcode = decompose_code(item)
datafields = dict.__getitem__(self, tag + ind1 + ind2)
if subcode is not None:
subfields = []
for datafield in datafields:
if datafield.has_key(subcode):
subfields.extend(datafield[subcode])
return subfields
else:
return datafields
def __contains__(self, item):
return dict.__contains__(self, item)
def __repr__(self):
return "Record(" + dict.__repr__(self) + ")"
def __str__(self):
return self.marcxml
def export(self, of="marcxml"):
"""
Returns the record in chosen format
"""
return self.marcxml
def url(self):
"""
Returns the URL to this record.
Returns None if not known
"""
if self.server_url is not None and \
self.recid is not None:
return self.server_url + "/"+ CFG_SITE_RECORD +"/" + str(self.recid)
else:
return None
if MECHANIZE_AVAILABLE:
class _SGMLParserFactory(mechanize.DefaultFactory):
"""
Black magic to be able to interact with CERN SSO forms.
"""
def __init__(self, i_want_broken_xhtml_support=False):
if OLD_MECHANIZE_VERSION:
forms_factory = mechanize.FormsFactory(
form_parser_class=ClientForm.XHTMLCompatibleFormParser)
else:
forms_factory = mechanize.FormsFactory(
form_parser_class=mechanize.XHTMLCompatibleFormParser)
mechanize.Factory.__init__(
self,
forms_factory=forms_factory,
links_factory=mechanize.LinksFactory(),
title_factory=mechanize.TitleFactory(),
response_type_finder=mechanize._html.ResponseTypeFinder(
allow_xhtml=i_want_broken_xhtml_support),
)
class CDSInvenioConnector(InvenioConnector):
def __init__(self, user="", password="", local_import_path="invenio"):
"""
This is a specialized InvenioConnector class suitable to connect
to the CERN Document Server (CDS), which uses centralized SSO.
"""
cds_url = CFG_CDS_URL
if user:
cds_url = cds_url.replace('http', 'https')
super(CDSInvenioConnector, self).__init__(cds_url, user, password, local_import_path=local_import_path)
def _init_browser(self):
"""
@note: update this every time the CERN SSO login form is refactored.
"""
self.browser = mechanize.Browser(factory=_SGMLParserFactory(i_want_broken_xhtml_support=True))
self.browser.set_handle_robots(False)
self.browser.open(self.server_url)
self.browser.follow_link(text_regex="Sign in")
self.browser.select_form(nr=0)
self.browser.form['ctl00$ctl00$NICEMasterPageBodyContent$SiteContentPlaceholder$txtFormsLogin'] = self.user
self.browser.form['ctl00$ctl00$NICEMasterPageBodyContent$SiteContentPlaceholder$txtFormsPassword'] = self.password
self.browser.submit()
self.browser.select_form(nr=0)
self.browser.submit()
class RecordsHandler(xml.sax.handler.ContentHandler):
"MARCXML Parser"
def __init__(self, records):
"""
Parameters:
records - *dict* a dictionary with an already existing cache of records
"""
self.cached_records = records
self.records = []
self.in_record = False
self.in_controlfield = False
self.in_datafield = False
self.in_subfield = False
self.cur_tag = None
self.cur_subfield = None
self.cur_controlfield = None
self.cur_datafield = None
self.cur_record = None
self.recid = 0
self.buffer = ""
self.counts = 0
def startElement(self, name, attributes):
if name == "record":
self.cur_record = Record()
self.in_record = True
elif name == "controlfield":
tag = attributes["tag"]
self.cur_datafield = ""
self.cur_tag = tag
self.cur_controlfield = []
if tag not in self.cur_record:
self.cur_record[tag] = self.cur_controlfield
self.in_controlfield = True
elif name == "datafield":
tag = attributes["tag"]
self.cur_tag = tag
ind1 = attributes["ind1"]
if ind1 == " ": ind1 = "_"
ind2 = attributes["ind2"]
if ind2 == " ": ind2 = "_"
if (tag + ind1 + ind2) not in self.cur_record:
self.cur_record[tag + ind1 + ind2] = []
self.cur_datafield = {}
self.cur_record[tag + ind1 + ind2].append(self.cur_datafield)
self.in_datafield = True
elif name == "subfield":
subcode = attributes["code"]
if subcode not in self.cur_datafield:
self.cur_subfield = []
self.cur_datafield[subcode] = self.cur_subfield
else:
self.cur_subfield = self.cur_datafield[subcode]
self.in_subfield = True
def characters(self, data):
if self.in_subfield:
self.buffer += data
elif self.in_controlfield:
self.buffer += data
elif "Search-Engine-Total-Number-Of-Results:" in data:
print data
match_obj = re.search("\d+", data)
if match_obj:
print int(match_obj.group())
self.counts = int(match_obj.group())
def endElement(self, name):
if name == "record":
self.in_record = False
elif name == "controlfield":
if self.cur_tag == "001":
self.recid = int(self.buffer)
if self.recid not in self.cached_records:
# Add record to the global cache
self.cached_records[self.recid] = self.cur_record
# Add record to the ordered list of results
self.records.append(self.cached_records[self.recid])
self.cur_controlfield.append(self.buffer)
self.in_controlfield = False
self.buffer = ""
elif name == "datafield":
self.in_datafield = False
elif name == "subfield":
self.in_subfield = False
self.cur_subfield.append(self.buffer)
self.buffer = ""
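A minimal, self-contained sketch of the SAX buffering pattern RecordsHandler uses: accumulate character data in characters() and flush it on endElement(). The MARCXML snippet and TinyHandler class are hypothetical, not part of Invenio.

```python
import xml.sax

class TinyHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.buffer = ""
        self.cur_tag = None
        self.recid = 0

    def startElement(self, name, attributes):
        # Reset the buffer whenever a new controlfield opens.
        if name == "controlfield":
            self.cur_tag = attributes["tag"]
            self.buffer = ""

    def characters(self, data):
        # SAX may deliver text in several chunks, so accumulate.
        self.buffer += data

    def endElement(self, name):
        # Flush the buffered text when the element closes.
        if name == "controlfield" and self.cur_tag == "001":
            self.recid = int(self.buffer)

handler = TinyHandler()
xml.sax.parseString(
    b'<record><controlfield tag="001">42</controlfield></record>', handler)
print(handler.recid)  # 42
```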
def decompose_code(code):
"""
Decomposes a MARC "code" into tag, ind1, ind2, subcode
"""
code = "%-6s" % code
ind1 = code[3:4]
if ind1 == " ": ind1 = "_"
ind2 = code[4:5]
if ind2 == " ": ind2 = "_"
subcode = code[5:6]
if subcode == " ": subcode = None
return (code[0:3], ind1, ind2, subcode)
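Usage sketch for decompose_code (the definition is repeated verbatim here so the example is self-contained; the sample codes are hypothetical).

```python
def decompose_code(code):
    # Pad to six characters so the slices below are always valid.
    code = "%-6s" % code
    ind1 = code[3:4]
    if ind1 == " ": ind1 = "_"
    ind2 = code[4:5]
    if ind2 == " ": ind2 = "_"
    subcode = code[5:6]
    if subcode == " ": subcode = None
    return (code[0:3], ind1, ind2, subcode)

print(decompose_code("245__a"))  # ('245', '_', '_', 'a')
print(decompose_code("001"))     # ('001', '_', '_', None)
```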
diff --git a/invenio/utils/data/__init__.py b/invenio/utils/data/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/invenio/utils/mimetype.py b/invenio/utils/mimetype.py
index 0fd2389bf..d6816dde3 100644
--- a/invenio/utils/mimetype.py
+++ b/invenio/utils/mimetype.py
@@ -1,244 +1,244 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Invenio mimetype helper functions.
Usage example:
TODO
"""
import re
from invenio.base.globals import cfg
from mimetypes import MimeTypes
from werkzeug import cached_property, LocalProxy
from thread import get_ident
try:
import magic
if hasattr(magic, "open"):
CFG_HAS_MAGIC = 1
elif hasattr(magic, "Magic"):
CFG_HAS_MAGIC = 2
except ImportError:
CFG_HAS_MAGIC = 0
_magic_cookies = {}
if CFG_HAS_MAGIC == 1:
def _get_magic_cookies():
"""
@return: a tuple of magic object.
@rtype: (MAGIC_NONE, MAGIC_COMPRESS, MAGIC_MIME, MAGIC_COMPRESS + MAGIC_MIME)
@note: ... not real magic. Just see: man file(1)
"""
thread_id = get_ident()
if thread_id not in _magic_cookies:
_magic_cookies[thread_id] = {
magic.MAGIC_NONE: magic.open(magic.MAGIC_NONE),
magic.MAGIC_COMPRESS: magic.open(magic.MAGIC_COMPRESS),
magic.MAGIC_MIME: magic.open(magic.MAGIC_MIME),
magic.MAGIC_COMPRESS + magic.MAGIC_MIME: magic.open(magic.MAGIC_COMPRESS + magic.MAGIC_MIME),
magic.MAGIC_MIME_TYPE: magic.open(magic.MAGIC_MIME_TYPE),
}
for key in _magic_cookies[thread_id].keys():
_magic_cookies[thread_id][key].load()
return _magic_cookies[thread_id]
elif CFG_HAS_MAGIC == 2:
def _magic_wrapper(local_path, mime=True, mime_encoding=False):
thread_id = get_ident()
if (thread_id, mime, mime_encoding) not in _magic_cookies:
magic_object = _magic_cookies[thread_id, mime, mime_encoding] = magic.Magic(mime=mime, mime_encoding=mime_encoding)
else:
magic_object = _magic_cookies[thread_id, mime, mime_encoding]
return magic_object.from_file(local_path) # pylint: disable=E1103
class LazyMimeCache(object):
@cached_property
def mimes(self):
"""
Returns extended MimeTypes.
"""
_mimes = MimeTypes(strict=False)
_mimes.suffix_map.update({'.tbz2' : '.tar.bz2'})
_mimes.encodings_map.update({'.bz2' : 'bzip2'})
if cfg['CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES']:
for key, value in cfg['CFG_BIBDOCFILE_ADDITIONAL_KNOWN_MIMETYPES'].iteritems():
_mimes.add_type(key, value)
del key, value
return _mimes
@cached_property
def extensions(self):
"""
Generate the regular expression to match all the known extensions.
@return: the regular expression.
@rtype: regular expression object
"""
_tmp_extensions = self.mimes.encodings_map.keys() + \
self.mimes.suffix_map.keys() + \
self.mimes.types_map[1].keys() + \
cfg['CFG_BIBDOCFILE_ADDITIONAL_KNOWN_FILE_EXTENSIONS']
extensions = []
for ext in _tmp_extensions:
if ext.startswith('.'):
extensions.append(ext)
else:
extensions.append('.' + ext)
extensions.sort()
extensions.reverse()
extensions = set([ext.lower() for ext in extensions])
extensions = '\\' + '$|\\'.join(extensions) + '$'
extensions = extensions.replace('+', '\\+')
return re.compile(extensions, re.I)
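A sketch of the extension-alternation regex built by LazyMimeCache.extensions above, applied to a toy extension set (the real set comes from MimeTypes plus configuration).

```python
import re

# Hypothetical extension list, longest first, mirroring the sort+reverse above.
exts = ['.tar.gz', '.pdf', '.gz']
# Escape the leading dot of each alternative and anchor it at end-of-string.
pattern = '\\' + '$|\\'.join(exts) + '$'
pattern = pattern.replace('+', '\\+')
ext_re = re.compile(pattern, re.I)

print(ext_re.sub('', 'paper.tar.gz'))  # 'paper'
```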
#: Lazy mime and extensions cache.
_mime_cache = LazyMimeCache()
#: MimeTypes instance.
_mimes = LocalProxy(lambda: _mime_cache.mimes)
#: Regular expression to recognize extensions.
_extensions = LocalProxy(lambda: _mime_cache.extensions)
## Use only the functions below in your code:
def file_strip_ext(afile, skip_version=False, only_known_extensions=False, allow_subformat=True):
"""
Strip in the best way the extension from a filename.
>>> file_strip_ext("foo.tar.gz")
'foo'
>>> file_strip_ext("foo.buz.gz")
'foo.buz'
>>> file_strip_ext("foo.buz")
'foo'
>>> file_strip_ext("foo.buz", only_known_extensions=True)
'foo.buz'
>>> file_strip_ext("foo.buz;1", skip_version=False,
... only_known_extensions=True)
'foo.buz;1'
>>> file_strip_ext("foo.gif;icon")
'foo'
>>> file_strip_ext("foo.gif;icon", only_know_extensions=True,
... allow_subformat=False)
'foo.gif;icon'
@param afile: the path/name of a file.
@type afile: string
@param skip_version: whether to skip a trailing ";version".
@type skip_version: bool
@param only_known_extensions: whether to strip out only known extensions or
to consider as extension anything that follows a dot.
@type only_known_extensions: bool
@param allow_subformat: whether to consider also subformats as part of
the extension.
@type allow_subformat: bool
@return: the name/path without the extension (and version).
@rtype: string
"""
import os
afile = afile.split(';')
if len(afile)>1 and allow_subformat and not afile[-1].isdigit():
afile = afile[0:-1]
if len(afile)>1 and skip_version and afile[-1].isdigit():
afile = afile[0:-1]
afile = ';'.join(afile)
nextfile = _extensions.sub('', afile)
if nextfile == afile and not only_known_extensions:
nextfile = os.path.splitext(afile)[0]
while nextfile != afile:
afile = nextfile
nextfile = _extensions.sub('', afile)
return nextfile
def guess_mimetype_and_encoding(afile):
"""
Tries to guess mimetype and encoding of a file.
@param afile: the path/name of a file
@type afile: string
@return: the mimetype and encoding
@rtype: tuple
"""
return _mimes.guess_type(afile)
def guess_extension(amimetype, normalize=False):
"""
Tries to guess extension for a mimetype.
@param amimetype: name of a mimetype
@type amimetype: string
@return: the extension
@rtype: string
"""
ext = _mimes.guess_extension(amimetype)
if ext and normalize:
## Normalize some common magic misinterpretations
ext = {'.asc': '.txt', '.obj': '.bin'}.get(ext, ext)
- from invenio.bibdocfile_normalizer import normalize_format
+ from invenio.legacy.bibdocfile.api_normalizer import normalize_format
return normalize_format(ext)
return ext
def get_magic_guesses(fullpath):
"""
Return all the possible guesses from the magic library about
the content of the file.
@param fullpath: location of the file
@type fullpath: string
@return: guesses about content of the file
@rtype: tuple
"""
if CFG_HAS_MAGIC == 1:
magic_cookies = _get_magic_cookies()
magic_result = []
for key in magic_cookies.keys():
magic_result.append(magic_cookies[key].file(fullpath))
return tuple(magic_result)
elif CFG_HAS_MAGIC == 2:
magic_result = []
for key in ({'mime': False, 'mime_encoding': False},
{'mime': True, 'mime_encoding': False},
{'mime': False, 'mime_encoding': True}):
magic_result.append(_magic_wrapper(fullpath, **key))
return tuple(magic_result)
def guess_extension_from_path(local_path):
try:
if CFG_HAS_MAGIC == 1:
magic_cookie = _get_magic_cookies()[magic.MAGIC_MIME_TYPE]
mimetype = magic_cookie.file(local_path)
elif CFG_HAS_MAGIC == 2:
mimetype = _magic_wrapper(local_path, mime=True, mime_encoding=False)
if CFG_HAS_MAGIC:
return guess_extension(mimetype, normalize=True)
except Exception:
pass
diff --git a/invenio/utils/plotextractor/cli.py b/invenio/utils/plotextractor/cli.py
index 276d244aa..2ca305c8e 100644
--- a/invenio/utils/plotextractor/cli.py
+++ b/invenio/utils/plotextractor/cli.py
@@ -1,1310 +1,1310 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import sys
import os
import getopt
import re
import time
from invenio.utils.shell import run_shell_command, Timeout, run_process_with_timeout
from invenio.invenio_connector import InvenioConnector
from invenio.utils.text import wrap_text_in_a_box, \
wait_for_user
from invenio.config import CFG_TMPDIR, CFG_SITE_URL, \
CFG_PLOTEXTRACTOR_DISALLOWED_TEX, \
CFG_PLOTEXTRACTOR_CONTEXT_WORD_LIMIT, \
CFG_PLOTEXTRACTOR_CONTEXT_SENTENCE_LIMIT, \
CFG_PLOTEXTRACTOR_CONTEXT_EXTRACT_LIMIT
-from invenio.bibtask import task_low_level_submission
+from invenio.legacy.bibsched.bibtask import task_low_level_submission
from invenio.plotextractor_getter import get_list_of_all_matching_files, \
parse_and_download, \
make_single_directory, \
tarballs_by_recids, \
tarballs_by_arXiv_id
from invenio.plotextractor_converter import untar, extract_text, \
convert_images
from invenio.plotextractor_output_utils import assemble_caption, \
find_open_and_close_braces, \
create_MARC, get_tex_location, \
get_image_location, \
create_contextfiles, \
prepare_image_data, \
write_message, remove_dups
from tempfile import mkstemp
"""
This programme will take a tarball from arXiv, untar it, convert all its
associated images to PNG, find the captions to the images detailed in the
included TeX document, and write MARCXML that reflects these associations.
"""
ARXIV_HEADER = 'arXiv:'
PLOTS_DIR = 'plots'
MAIN_CAPTION_OR_IMAGE = 0
SUB_CAPTION_OR_IMAGE = 1
def main():
"""
The main program loop.
"""
help_param = 'help'
verbose_param = 'verbose'
tarball_param = 'tarball'
tardir_param = 'tdir'
infile_param = 'input'
sdir_param = 'sdir'
extract_text_param = 'extract-text'
force_param = 'force'
upload_param = 'call-bibupload'
yes_i_know_param = 'yes-i-know'
recid_param = 'recid'
arXiv_param = 'arXiv'
squash_param = 'squash'
refno_url_param = 'refno-url'
refno_param = 'skip-refno'
clean_param = 'clean'
param_abbrs = 'h:t:d:s:i:a:l:xfuyrqck'
params = [help_param, tarball_param + '=', tardir_param + '=', \
sdir_param + '=', infile_param + '=', arXiv_param + '=', refno_url_param + '=', \
extract_text_param, force_param, upload_param, yes_i_know_param, recid_param, \
squash_param, clean_param]
try:
opts, args = getopt.getopt(sys.argv[1:], param_abbrs, params)
except getopt.GetoptError, err:
write_message(str(err))
usage()
sys.exit(2)
tarball = None
sdir = None
infile = None
tdir = None
xtract_text = False
upload_plots = False
force = False
squash = False
squash_path = ""
yes_i_know = False
recids = None
arXiv = None
clean = False
refno_url = CFG_SITE_URL
skip_refno = False
for opt, arg in opts:
if opt in ['-h', help_param]:
usage()
sys.exit()
elif opt in ['-t', tarball_param]:
tarball = arg
elif opt in ['-d', tardir_param]:
tdir = arg
elif opt in ['-i', infile_param]:
infile = arg
elif opt in ['-r', recid_param]:
recids = arg
elif opt in ['-a', arXiv_param]:
arXiv = arg
elif opt in ['-s', sdir_param]:
sdir = arg
elif opt in ['-x', extract_text_param]:
xtract_text = True
elif opt in ['-f', force_param]:
force = True
elif opt in ['-u', upload_param]:
upload_plots = True
elif opt in ['-q', squash_param]:
squash = True
elif opt in ['-y', yes_i_know_param]:
yes_i_know = True
elif opt in ['-c', clean_param]:
clean = True
elif opt in ['-l', refno_url_param]:
refno_url = arg
elif opt in ['-k', refno_param]:
skip_refno = True
else:
usage()
sys.exit()
if sdir == None:
sdir = CFG_TMPDIR
elif not os.path.isdir(sdir):
try:
os.makedirs(sdir)
except OSError:
write_message('Error: cannot create or use this sdir; ' + \
'falling back to CFG_TMPDIR')
sdir = CFG_TMPDIR
if skip_refno:
refno_url = ""
tars_and_gzips = []
if tarball != None:
tars_and_gzips.append(tarball)
if tdir != None:
filetypes = ['gzip compressed', 'tar archive', 'Tar archive'] # FIXME
write_message('Currently processing any tarballs in ' + tdir)
tars_and_gzips.extend(get_list_of_all_matching_files(tdir, filetypes))
if infile != None:
tars_and_gzips.extend(parse_and_download(infile, sdir))
if recids != None:
tars_and_gzips.extend(tarballs_by_recids(recids, sdir))
if arXiv != None:
tars_and_gzips.extend(tarballs_by_arXiv_id([arXiv], sdir))
if tars_and_gzips == []:
write_message('Error: no tarballs to process!')
sys.exit(1)
if squash:
squash_fd, squash_path = mkstemp(suffix="_" + time.strftime("%Y%m%d%H%M%S") + ".xml", \
prefix="plotextractor_", dir=sdir)
os.write(squash_fd, '<?xml version="1.0" encoding="UTF-8"?>\n<collection>\n')
os.close(squash_fd)
for tarball in tars_and_gzips:
process_single(tarball, sdir=sdir, xtract_text=xtract_text, \
upload_plots=upload_plots, force=force, squash=squash_path, \
yes_i_know=yes_i_know, refno_url=refno_url, \
clean=clean)
if squash:
squash_fd = open(squash_path, "a")
squash_fd.write("</collection>\n")
squash_fd.close()
write_message("generated %s" % (squash_path,))
if upload_plots:
upload_to_site(squash_path, yes_i_know)
def process_single(tarball, sdir=CFG_TMPDIR, xtract_text=False, \
upload_plots=False, force=False, squash="", \
yes_i_know=False, refno_url="", \
clean=False):
"""
Processes one tarball end-to-end.
@param: tarball (string): the absolute location of the tarball we wish
to process
@param: sdir (string): where we should put all the intermediate files for
the processing. if you're uploading, this directory should be one
of the ones specified in CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS, else
the upload won't work
@param: xtract_text (boolean): true iff you want to run pdftotext on the
pdf versions of the tarfiles. this programme assumes that the pdfs
are named the same as the tarballs but with a .pdf extension.
@param: upload_plots (boolean): true iff you want to bibupload the plots
extracted by this process
@param: force (boolean): force creation of new xml file
@param: squash: write MARCXML output into a specified 'squash' file
instead of single files.
@param: yes_i_know: if True, no user interaction if upload_plots is True
@param: refno_url: URL to the invenio-instance to query for refno.
@param: clean: if True, everything except the original tarball, plots and
context- files will be removed
@return: marc_name(string): path to generated marcxml file
"""
sub_dir, refno = get_defaults(tarball, sdir, refno_url)
if not squash:
marc_name = os.path.join(sub_dir, '%s.xml' % (refno,))
if (force or not os.path.exists(marc_name)):
marc_fd = open(marc_name, 'w')
marc_fd.write('<?xml version="1.0" encoding="UTF-8"?>\n<collection>\n')
marc_fd.close()
else:
marc_name = squash
if xtract_text:
extract_text(tarball)
try:
extracted_files_list, image_list, tex_files = untar(tarball, sub_dir)
except Timeout:
write_message('Timeout during tarball extraction on %s' % (tarball,))
return
if not tex_files:
write_message('%s is not a tarball' % (os.path.split(tarball)[-1],))
run_shell_command('rm -r %s', (sub_dir,))
return
converted_image_list = convert_images(image_list)
write_message('converted %d of %d images found for %s' % (len(converted_image_list), \
len(image_list), \
os.path.basename(tarball)))
extracted_image_data = []
for tex_file in tex_files:
# Extract images, captions and labels
partly_extracted_image_data = extract_captions(tex_file, sub_dir, \
converted_image_list)
if partly_extracted_image_data != []:
# Add proper filepaths and do various cleaning
cleaned_image_data = prepare_image_data(partly_extracted_image_data, \
tex_file, converted_image_list)
# Using prev. extracted info, get contexts for each image found
extracted_image_data.extend((extract_context(tex_file, cleaned_image_data)))
extracted_image_data = remove_dups(extracted_image_data)
if extracted_image_data == []:
write_message('No plots detected in %s' % (refno,))
else:
if refno_url == "":
refno = None
create_contextfiles(extracted_image_data)
marc_xml = create_MARC(extracted_image_data, tarball, refno)
if not squash:
marc_xml += "\n</collection>"
if marc_name != None:
marc_fd = open(marc_name, 'a')
marc_fd.write('%s\n' % (marc_xml,))
marc_fd.close()
if not squash:
write_message('generated %s' % (marc_name,))
if upload_plots:
upload_to_site(marc_name, yes_i_know)
if clean:
clean_up(extracted_files_list, image_list)
write_message('work complete on %s' % (os.path.split(tarball)[-1],))
return marc_name
def clean_up(extracted_files_list, image_list):
"""
Removes all the intermediate stuff.
@param: extracted_files_list ([string, string, ...]): list of all extracted files
@param: image_list ([string, string, ...]): list of the images to keep
"""
for extracted_file in extracted_files_list:
# Remove everything that is not in the image_list or is not a directory
if extracted_file not in image_list and extracted_file[-1] != os.sep:
run_shell_command('rm %s', (extracted_file,))
def get_defaults(tarball, sdir, refno_url):
"""
A function for parameter-checking.
@param: tarball (string): the location of the tarball to be extracted
@param: sdir (string): the location of the scratch directory for untarring,
conversions, and the ultimate destination of the MARCXML
@param: refno_url (string): server location on where to look for refno
@return: (sdir, refno) (string, string): the scratch sub-directory to use
and the reference number (or a fallback name) for this tarball.
"""
if sdir == None:
# Missing sdir: using default directory: CFG_TMPDIR
sdir = CFG_TMPDIR
else:
sdir = os.path.split(tarball)[0]
# make a subdir in the scratch directory for each tarball
sdir = make_single_directory(sdir, \
os.path.split(tarball)[-1] + '_' + PLOTS_DIR)
if refno_url != "":
refno = get_reference_number(tarball, refno_url)
if refno == None:
refno = os.path.basename(tarball)
write_message('Error: can\'t find record id for %s' % (refno,))
else:
refno = os.path.basename(tarball)
write_message("Skipping ref-no check")
return sdir, refno
def get_reference_number(tarball, refno_url):
"""
Attempts to determine the reference number of the file by searching.
@param: tarball (string): the name of the tarball as downloaded from
arXiv
@param: refno_url (string): url of repository to check for a
reference number for this record. If not set; returns None
@return: refno (string): the reference number of the paper
"""
if refno_url:
server = InvenioConnector(refno_url)
# we just need the name of the file
tarball = os.path.split(tarball)[1]
prefix = '037__a:'
# the name right now looks like arXiv:hep-ph_9703009
# or arXiv:0910.0476
if tarball.startswith(ARXIV_HEADER):
if len(tarball.split('_')) > 1:
tarball = tarball.split(':')[1]
arXiv_record = tarball.replace('_', '/')
else:
arXiv_record = tarball
result = server.search(p=prefix + arXiv_record, of='id')
if len(result) == 0:
return None
return str(result[0])
arXiv_record = re.findall('(([a-zA-Z\\-]+/\\d+)|(\\d+\\.\\d+))', tarball)
if arXiv_record:
arXiv_record = arXiv_record[0][0]
result = server.search(p=prefix + arXiv_record, of='id')
if len(result) > 0:
return str(result[0])
tarball_mod = tarball.replace('_', '/')
arXiv_record = re.findall('(([a-zA-Z\\-]+/\\d+)|(\\d+\\.\\d+))', \
tarball_mod)
if arXiv_record:
arXiv_record = arXiv_record[0][0]
result = server.search(p=prefix + arXiv_record, of='id')
if len(result) > 0:
return str(result[0])
return None
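The arXiv-id pattern used in get_reference_number, applied to sample tarball names (the identifiers below are hypothetical examples of the two id styles).

```python
import re

# Group 2 matches old-style ids (category/number); group 3 matches
# new-style ids (NNNN.NNNN).
pat = re.compile('(([a-zA-Z\\-]+/\\d+)|(\\d+\\.\\d+))')

print(pat.findall('hep-ph/9703009'))  # old-style id, group 2 populated
print(pat.findall('0910.0476'))       # new-style id, group 3 populated
```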
def rotate_image(filename, line, sdir, image_list):
"""
Given a filename and a line, figure out what it is that the author
wanted to do wrt changing the rotation of the image and convert the
file so that this rotation is reflected in its presentation.
@param: filename (string): the name of the file as specified in the TeX
@param: line (string): the line where the rotate command was found
@output: the image file rotated in accordance with the rotate command
@return: True if something was rotated
"""
file_loc = get_image_location(filename, sdir, image_list)
degrees = re.findall('(angle=[-\\d]+|rotate=[-\\d]+)', line)
if len(degrees) < 1:
return False
degrees = degrees[0].split('=')[-1].strip()
if file_loc == None or file_loc == 'ERROR' or\
not re.match('-*\\d+', degrees):
return False
degrees = str(0 - int(degrees))
cmd_list = ['mogrify', '-rotate', degrees, file_loc]
dummy, dummy, cmd_err = run_process_with_timeout(cmd_list)
if cmd_err != '':
return False
else:
return True
def get_context(lines, backwards=False):
"""
Given a relevant string from a TeX file, this function will extract text
from it as far as it is deemed contextually relevant, either backwards or forwards
in the text. The level of relevance allowed is configurable. When it reaches some
point in the text that is determined to be out of scope from the current context,
like text that is identified as a new paragraph, a complex TeX structure
('/begin', '/end', etc.) etc., it will return the previously allocated text.
For use when extracting text with contextual value for a figure or plot.
@param lines (string): string to examine
@param backwards (bool): are we searching backwards?
@return context (string): extracted context
"""
tex_tag = re.compile(r".*\\(\w+).*")
sentence = re.compile(r"(?<=[.?!])[\s]+(?=[A-Z])")
context = []
word_list = lines.split()
if backwards:
word_list.reverse()
# For each word we do the following:
# 1. Check if we have reached word limit
# 2. If not, see if this is a TeX tag and see if its 'illegal'
# 3. Otherwise, add word to context
for word in word_list:
if len(context) >= CFG_PLOTEXTRACTOR_CONTEXT_WORD_LIMIT:
break
match = tex_tag.match(word)
if (match and match.group(1) in CFG_PLOTEXTRACTOR_DISALLOWED_TEX):
# TeX Construct matched, return
if backwards:
# When reversed we need to go back and
# remove unwanted data within brackets
temp_word = ""
while len(context):
temp_word = context.pop()
if '}' in temp_word:
break
break
context.append(word)
if backwards:
context.reverse()
text = " ".join(context)
sentence_list = sentence.split(text)
if backwards:
sentence_list.reverse()
if len(sentence_list) > CFG_PLOTEXTRACTOR_CONTEXT_SENTENCE_LIMIT:
return " ".join(sentence_list[:CFG_PLOTEXTRACTOR_CONTEXT_SENTENCE_LIMIT])
else:
return " ".join(sentence_list)
def extract_context(tex_file, extracted_image_data):
"""
Given a .tex file and a label name, this function will extract the text before
and after for all the references made to this label in the text. The number
of characters to extract before and after is configurable.
@param tex_file (list): path to .tex file
@param extracted_image_data ([(string, string, list), ...]):
a list of tuples of images matched to labels and captions from
this document.
@return extracted_image_data ([(string, string, list, list),
(string, string, list, list),...)]: the same list, but now containing
extracted contexts
"""
if os.path.isdir(tex_file) or not os.path.exists(tex_file):
return []
fd = open(tex_file)
lines = fd.read()
fd.close()
# Generate context for each image and its assoc. labels
new_image_data = []
for image, caption, label in extracted_image_data:
context_list = []
# Generate a list of index tuples for all matches
indices = [match.span() \
for match in re.finditer(r"(\\(?:fig|ref)\{%s\})" % (re.escape(label),), \
lines)]
for startindex, endindex in indices:
# Retrieve all lines before label until beginning of file
i = startindex - CFG_PLOTEXTRACTOR_CONTEXT_EXTRACT_LIMIT
if i < 0:
text_before = lines[:startindex]
else:
text_before = lines[i:startindex]
context_before = get_context(text_before, backwards=True)
# Retrieve all lines from label until end of file and get context
i = endindex + CFG_PLOTEXTRACTOR_CONTEXT_EXTRACT_LIMIT
text_after = lines[endindex:i]
context_after = get_context(text_after)
context_list.append(context_before + ' \\ref{' + label + '} ' + context_after)
new_image_data.append((image, caption, label, context_list))
return new_image_data
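A sketch of the label-reference matching in extract_context: collect the (start, end) span of every `\ref{label}` occurrence so context can be extracted around each. The label and text are hypothetical.

```python
import re

label = "fig:a"
lines = r"As shown in \ref{fig:a}, the peak shifts; see \ref{fig:a} again."

# re.escape protects any regex metacharacters inside the label.
spans = [m.span()
         for m in re.finditer(r"(\\(?:fig|ref)\{%s\})" % re.escape(label),
                              lines)]
print(len(spans))  # 2
```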
def extract_captions(tex_file, sdir, image_list, primary=True):
"""
Take the TeX file and the list of images in the tarball (which all,
presumably, are used in the TeX file) and figure out which captions
in the text are associated with which images
@param: lines (list): list of lines of the TeX file
@param: tex_file (string): the name of the TeX file which mentions
the images
@param: sdir (string): path to current sub-directory
@param: image_list (list): list of images in tarball
@param: primary (bool): is this the primary call to extract_caption?
@return: images_and_captions_and_labels ([(string, string, list),
(string, string, list), ...]):
a list of tuples representing the names of images and their
corresponding figure labels from the TeX file
"""
if os.path.isdir(tex_file) or not os.path.exists(tex_file):
return []
fd = open(tex_file)
lines = fd.readlines()
fd.close()
# possible figure lead-ins
figure_head = '\\begin{figure' # also matches figure*
figure_tail = '\\end{figure' # also matches figure*
picture_head = '\\begin{picture}'
displaymath_head = '\\begin{displaymath}'
subfloat_head = '\\subfloat'
subfig_head = '\\subfigure'
includegraphics_head = '\\includegraphics'
epsfig_head = '\\epsfig'
input_head = '\\input'
# possible caption lead-ins
caption_head = '\\caption'
figcaption_head = '\\figcaption'
label_head = '\\label'
rotate = 'rotate='
angle = 'angle='
eps_tail = '.eps'
ps_tail = '.ps'
doc_head = '\\begin{document}'
doc_tail = '\\end{document}'
extracted_image_data = []
cur_image = ''
caption = ''
labels = []
active_label = ""
# strip everything before \begin{document}
if primary:
for line_index in range(len(lines)):
if lines[line_index].find(doc_head) < 0:
lines[line_index] = ''
else:
break
# are we using commas in filenames here?
commas_okay = False
for dummy1, dummy2, filenames in \
os.walk(os.path.split(os.path.split(tex_file)[0])[0]):
for filename in filenames:
if filename.find(',') > -1:
commas_okay = True
break
# a comment is a % not preceded by a \
comment = re.compile("(?<!\\\\)%")
for line_index in range(len(lines)):
# get rid of pesky comments by splitting where the comment is
# and keeping only the part before the %
line = comment.split(lines[line_index])[0]
line = line.strip()
lines[line_index] = line
in_figure_tag = 0
for line_index in range(len(lines)):
line = lines[line_index]
if line == '':
continue
if line.find(doc_tail) > -1:
return extracted_image_data
"""
FIGURE -
structure of a figure:
\begin{figure}
\formatting...
\includegraphics[someoptions]{FILENAME}
\caption{CAPTION} %caption and includegraphics may be switched!
\end{figure}
"""
index = line.find(figure_head)
if index > -1:
in_figure_tag = 1
# some documents put content outside the figure environment, so
# check whether anything is sitting outside of it when we find
# the figure tag
cur_image, caption, extracted_image_data = \
put_it_together(cur_image, caption, active_label, extracted_image_data, \
line_index, lines)
# inclusion types are hard to detect reliably, so check several
# possible markers (.eps/.ps suffixes and \epsfig)
index = max([line.find(eps_tail), line.find(ps_tail), \
line.find(epsfig_head)])
if index > -1:
if line.find(eps_tail) > -1 or line.find(ps_tail) > -1:
ext = True
else:
ext = False
filenames = intelligently_find_filenames(line, ext=ext,
commas_okay=commas_okay)
# try to look ahead! sometimes there are better matches after
if line_index < len(lines) - 1:
filenames.extend(\
intelligently_find_filenames(lines[line_index + 1],
commas_okay=commas_okay))
if line_index < len(lines) - 2:
filenames.extend(\
intelligently_find_filenames(lines[line_index + 2],
commas_okay=commas_okay))
for filename in filenames:
filename = str(filename)
if cur_image == '':
cur_image = filename
elif type(cur_image) == list:
if type(cur_image[SUB_CAPTION_OR_IMAGE]) == list:
cur_image[SUB_CAPTION_OR_IMAGE].append(filename)
else:
cur_image[SUB_CAPTION_OR_IMAGE] = [filename]
else:
cur_image = ['', [cur_image, filename]]
"""
Rotate and angle
"""
index = max(line.find(rotate), line.find(angle))
if index > -1:
# which is the image associated to it?
filenames = intelligently_find_filenames(line,
commas_okay=commas_okay)
# try the line after and the line before
if line_index + 1 < len(lines):
filenames.extend(intelligently_find_filenames(lines[line_index + 1],
commas_okay=commas_okay))
if line_index > 1:
filenames.extend(intelligently_find_filenames(lines[line_index - 1],
commas_okay=commas_okay))
already_tried = []
for filename in filenames:
if filename != 'ERROR' and filename not in already_tried:
if rotate_image(filename, line, sdir, image_list):
break
already_tried.append(filename)
"""
INCLUDEGRAPHICS -
structure of includegraphics:
\includegraphics[someoptions]{FILENAME}
"""
index = line.find(includegraphics_head)
if index > -1:
open_curly, open_curly_line, close_curly, dummy = \
find_open_and_close_braces(line_index, index, '{', lines)
filename = lines[open_curly_line][open_curly + 1:close_curly]
if cur_image == '':
cur_image = filename
elif type(cur_image) == list:
if type(cur_image[SUB_CAPTION_OR_IMAGE]) == list:
cur_image[SUB_CAPTION_OR_IMAGE].append(filename)
else:
cur_image[SUB_CAPTION_OR_IMAGE] = [filename]
else:
cur_image = ['', [cur_image, filename]]
"""
{\input{FILENAME}}
\caption{CAPTION}
This input is ambiguous, since input is also used for things like
inclusion of data from other LaTeX files directly.
"""
index = line.find(input_head)
if index > -1:
new_tex_names = intelligently_find_filenames(line, TeX=True, \
commas_okay=commas_okay)
for new_tex_name in new_tex_names:
if new_tex_name != 'ERROR':
new_tex_file = get_tex_location(new_tex_name, tex_file)
if new_tex_file != None and primary: #to kill recursion
extracted_image_data.extend(extract_captions(\
new_tex_file, sdir, \
image_list,
primary=False))
"""PICTURE"""
index = line.find(picture_head)
if index > -1:
# structure of a picture:
# \begin{picture}
# ....not worrying about this now
#write_message('found picture tag')
#FIXME
pass
"""DISPLAYMATH"""
index = line.find(displaymath_head)
if index > -1:
# structure of a displaymath:
# \begin{displaymath}
# ....not worrying about this now
#write_message('found displaymath tag')
#FIXME
pass
"""
CAPTIONS -
structure of a caption:
\caption[someoptions]{CAPTION}
or
\caption{CAPTION}
or
\caption{{options}{CAPTION}}
"""
index = max([line.find(caption_head), line.find(figcaption_head)])
if index > -1:
open_curly, open_curly_line, close_curly, close_curly_line = \
find_open_and_close_braces(line_index, index, '{', lines)
cap_begin = open_curly + 1
cur_caption = assemble_caption(open_curly_line, cap_begin, \
close_curly_line, close_curly, lines)
if caption == '':
caption = cur_caption
elif type(caption) == list:
if type(caption[SUB_CAPTION_OR_IMAGE]) == list:
caption[SUB_CAPTION_OR_IMAGE].append(cur_caption)
else:
caption[SUB_CAPTION_OR_IMAGE] = [cur_caption]
elif caption != cur_caption:
caption = ['', [caption, cur_caption]]
"""
SUBFLOATS -
structure of a subfloat (inside of a figure tag):
\subfloat[CAPTION]{options{FILENAME}}
also associated with the overall caption of the enclosing figure
"""
index = line.find(subfloat_head)
if index > -1:
# if we are dealing with subfloats, we need a different
# sort of structure to keep track of captions and subcaptions
if type(cur_image) != list:
cur_image = [cur_image, []]
if type(caption) != list:
caption = [caption, []]
open_square, open_square_line, close_square, close_square_line = \
find_open_and_close_braces(line_index, index, '[', lines)
cap_begin = open_square + 1
sub_caption = assemble_caption(open_square_line, \
cap_begin, close_square_line, close_square, lines)
caption[SUB_CAPTION_OR_IMAGE].append(sub_caption)
open_curly, open_curly_line, close_curly, dummy = \
find_open_and_close_braces(close_square_line, \
close_square, '{', lines)
sub_image = lines[open_curly_line][open_curly + 1:close_curly]
cur_image[SUB_CAPTION_OR_IMAGE].append(sub_image)
"""
SUBFIGURES -
structure of a subfigure (inside a figure tag):
\subfigure[CAPTION]{
\includegraphics[options]{FILENAME}}
also associated with the overall caption of the enclosing figure
"""
index = line.find(subfig_head)
if index > -1:
# like with subfloats, we need a different structure for keeping
# track of this stuff
if type(cur_image) != list:
cur_image = [cur_image, []]
if type(caption) != list:
caption = [caption, []]
open_square, open_square_line, close_square, close_square_line = \
find_open_and_close_braces(line_index, index, '[', lines)
cap_begin = open_square + 1
sub_caption = assemble_caption(open_square_line, \
cap_begin, close_square_line, close_square, lines)
caption[SUB_CAPTION_OR_IMAGE].append(sub_caption)
index_cpy = index
# find the graphics tag to get the filename
# it is okay if we eat lines here
index = line.find(includegraphics_head)
while index == -1 and (line_index + 1) < len(lines):
line_index = line_index + 1
line = lines[line_index]
index = line.find(includegraphics_head)
if index == -1:
# didn't find the image name on any line; restore our position
line_index = index_cpy
open_curly, open_curly_line, close_curly, dummy = \
find_open_and_close_braces(line_index, \
index, '{', lines)
sub_image = lines[open_curly_line][open_curly + 1:close_curly]
cur_image[SUB_CAPTION_OR_IMAGE].append(sub_image)
"""
LABELS -
structure of a label:
\label{somelabelnamewhichprobablyincludesacolon}
Labels are used to tag images and will later be used in ref tags
to reference them. This is interesting because the refs to a plot
are, in effect, additional captions for it.
Notes: labels can be used for many more things than just plots.
We'll have to experiment with how to best associate a label with an
image. If it's in the caption, it's easy. If it's in a figure, it's
still okay... but the images that aren't in figure tags are numerous.
"""
index = line.find(label_head)
if index > -1 and in_figure_tag:
open_curly, open_curly_line, close_curly, dummy = \
find_open_and_close_braces(line_index, \
index, '{', lines)
label = lines[open_curly_line][open_curly + 1:close_curly]
if label not in labels:
active_label = label
labels.append(label)
"""
FIGURE
important: we put the check for the end of the figure at the end
of the loop in case some pathological person puts everything in one
line
"""
index = max([line.find(figure_tail), line.find(doc_tail)])
if index > -1:
in_figure_tag = 0
cur_image, caption, extracted_image_data = \
put_it_together(cur_image, caption, active_label, extracted_image_data, \
line_index, lines)
"""
END DOCUMENT
we shouldn't look at anything after the end document tag is found
"""
index = line.find(doc_tail)
if index > -1:
break
return extracted_image_data
def put_it_together(cur_image, caption, context, extracted_image_data, line_index, \
lines):
"""
Takes the current image(s) and caption(s) and assembles them into
something useful in the extracted_image_data list.
@param: cur_image (string || list): the image currently being dealt with, or
the list of images, in the case of subimages
@param: caption (string || list): the caption or captions currently in scope
@param: context (string): the active label, passed along as extra context
@param: extracted_image_data ([(string, string), (string, string), ...]):
a list of tuples of images matched to captions from this document.
@param: line_index (int): the index where we are in the lines (for
searchback and searchforward purposes)
@param: lines ([string, string, ...]): the lines in the TeX
@return: (cur_image, caption, extracted_image_data): the same arguments it
was sent, processed appropriately
"""
if type(cur_image) == list:
if cur_image[MAIN_CAPTION_OR_IMAGE] == 'ERROR':
cur_image[MAIN_CAPTION_OR_IMAGE] = ''
if type(cur_image[SUB_CAPTION_OR_IMAGE]) == list:
cur_image[SUB_CAPTION_OR_IMAGE] = [image for image in
cur_image[SUB_CAPTION_OR_IMAGE] if image != 'ERROR']
if cur_image != '' and caption != '':
if type(cur_image) == list and type(caption) == list:
if cur_image[MAIN_CAPTION_OR_IMAGE] != '' and\
caption[MAIN_CAPTION_OR_IMAGE] != '':
extracted_image_data.append(
(cur_image[MAIN_CAPTION_OR_IMAGE],
caption[MAIN_CAPTION_OR_IMAGE],
context))
if type(cur_image[MAIN_CAPTION_OR_IMAGE]) == list:
# why is the main image a list?
# it's a good idea to attach the main caption to other
# things, but the main image can only be used once
cur_image[MAIN_CAPTION_OR_IMAGE] = ''
if type(cur_image[SUB_CAPTION_OR_IMAGE]) == list:
if type(caption[SUB_CAPTION_OR_IMAGE]) == list:
for index in \
range(len(cur_image[SUB_CAPTION_OR_IMAGE])):
if index < len(caption[SUB_CAPTION_OR_IMAGE]):
long_caption = \
caption[MAIN_CAPTION_OR_IMAGE] + ' : ' + \
caption[SUB_CAPTION_OR_IMAGE][index]
else:
long_caption = \
caption[MAIN_CAPTION_OR_IMAGE] + ' : ' + \
'Caption not extracted'
extracted_image_data.append(
(cur_image[SUB_CAPTION_OR_IMAGE][index],
long_caption, context))
else:
long_caption = caption[MAIN_CAPTION_OR_IMAGE] + \
' : ' + caption[SUB_CAPTION_OR_IMAGE]
for sub_image in cur_image[SUB_CAPTION_OR_IMAGE]:
extracted_image_data.append(
(sub_image, long_caption, context))
else:
if type(caption[SUB_CAPTION_OR_IMAGE]) == list:
long_caption = caption[MAIN_CAPTION_OR_IMAGE]
for sub_cap in caption[SUB_CAPTION_OR_IMAGE]:
long_caption = long_caption + ' : ' + sub_cap
extracted_image_data.append(
(cur_image[SUB_CAPTION_OR_IMAGE], long_caption, context))
else:
# neither sub-entry is a list here; pair them directly
extracted_image_data.append(
(cur_image[SUB_CAPTION_OR_IMAGE],
caption[SUB_CAPTION_OR_IMAGE], context))
elif type(cur_image) == list:
if cur_image[MAIN_CAPTION_OR_IMAGE] != '':
extracted_image_data.append(
(cur_image[MAIN_CAPTION_OR_IMAGE], caption, context))
if type(cur_image[SUB_CAPTION_OR_IMAGE]) == list:
for image in cur_image[SUB_CAPTION_OR_IMAGE]:
extracted_image_data.append((image, caption, context))
else:
extracted_image_data.append(
(cur_image[SUB_CAPTION_OR_IMAGE], caption, context))
elif type(caption) == list:
if caption[MAIN_CAPTION_OR_IMAGE] != '':
extracted_image_data.append(
(cur_image, caption[MAIN_CAPTION_OR_IMAGE], context))
if type(caption[SUB_CAPTION_OR_IMAGE]) == list:
# multiple caps for one image:
long_caption = caption[MAIN_CAPTION_OR_IMAGE]
for subcap in caption[SUB_CAPTION_OR_IMAGE]:
if long_caption != '':
long_caption += ' : '
long_caption += subcap
extracted_image_data.append((cur_image, long_caption, context))
else:
extracted_image_data.append(
(cur_image, caption[SUB_CAPTION_OR_IMAGE], context))
else:
extracted_image_data.append((cur_image, caption, context))
elif cur_image != '' and caption == '':
# we may have missed the caption somewhere.
REASONABLE_SEARCHBACK = 25
REASONABLE_SEARCHFORWARD = 5
curly_no_tag_preceding = '(?<!\\w){'
for searchback in range(REASONABLE_SEARCHBACK):
if line_index - searchback < 0:
continue
back_line = lines[line_index - searchback]
m = re.search(curly_no_tag_preceding, back_line)
if m != None:
open_curly = m.start()
open_curly, open_curly_line, close_curly, \
close_curly_line = find_open_and_close_braces(\
line_index - searchback, open_curly, '{', lines)
cap_begin = open_curly + 1
caption = assemble_caption(open_curly_line, cap_begin, \
close_curly_line, close_curly, lines)
if type(cur_image) == list:
extracted_image_data.append(
(cur_image[MAIN_CAPTION_OR_IMAGE], caption, context))
for sub_img in cur_image[SUB_CAPTION_OR_IMAGE]:
extracted_image_data.append((sub_img, caption, context))
else:
extracted_image_data.append((cur_image, caption, context))
break
if caption == '':
for searchforward in range(REASONABLE_SEARCHFORWARD):
if line_index + searchforward >= len(lines):
break
fwd_line = lines[line_index + searchforward]
m = re.search(curly_no_tag_preceding, fwd_line)
if m != None:
open_curly = m.start()
open_curly, open_curly_line, close_curly, \
close_curly_line = find_open_and_close_braces(\
line_index + searchforward, open_curly, '{', lines)
cap_begin = open_curly + 1
caption = assemble_caption(open_curly_line, \
cap_begin, close_curly_line, close_curly, lines)
if type(cur_image) == list:
extracted_image_data.append(
(cur_image[MAIN_CAPTION_OR_IMAGE], caption, context))
for sub_img in cur_image[SUB_CAPTION_OR_IMAGE]:
extracted_image_data.append((sub_img, caption, context))
else:
extracted_image_data.append((cur_image, caption, context))
break
if caption == '':
if type(cur_image) == list:
extracted_image_data.append(
(cur_image[MAIN_CAPTION_OR_IMAGE], 'No caption found', context))
for sub_img in cur_image[SUB_CAPTION_OR_IMAGE]:
extracted_image_data.append((sub_img, 'No caption', context))
else:
extracted_image_data.append(
(cur_image, 'No caption found', context))
elif caption != '' and cur_image == '':
if type(caption) == list:
long_caption = caption[MAIN_CAPTION_OR_IMAGE]
for subcap in caption[SUB_CAPTION_OR_IMAGE]:
long_caption = long_caption + ': ' + subcap
else:
long_caption = caption
extracted_image_data.append(('', 'noimg' + long_caption, context))
# if we're leaving the figure, no sense keeping the data
cur_image = ''
caption = ''
return (cur_image, caption, extracted_image_data)
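The subfigure branch above pairs each sub-image with a long caption of the form "main caption : sub caption", falling back to a placeholder when captions run out. A minimal self-contained sketch of that pairing logic (illustrative names, not the module's API):

```python
# Sketch of the sub-image/sub-caption pairing done in put_it_together.
# Index 0 holds the main caption or image; index 1 holds the sub-items.
MAIN, SUB = 0, 1

def pair_subfigures(cur_image, caption):
    """Pair each sub-image with 'main caption : sub caption'."""
    pairs = []
    for i, img in enumerate(cur_image[SUB]):
        if i < len(caption[SUB]):
            long_cap = caption[MAIN] + ' : ' + caption[SUB][i]
        else:
            long_cap = caption[MAIN] + ' : ' + 'Caption not extracted'
        pairs.append((img, long_cap))
    return pairs

print(pair_subfigures(['', ['a.eps', 'b.eps']],
                      ['Overall', ['left panel', 'right panel']]))
```
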
def intelligently_find_filenames(line, TeX=False, ext=False, commas_okay=False):
"""
Find the filename in the line. We don't support all filenames! Just eps
and ps for now.
@param: line (string): the line we want to get a filename out of
@return: filename ([string, ...]): what is probably the name of the file(s)
"""
files_included = ['ERROR']
if commas_okay:
valid_for_filename = '\\s*[A-Za-z0-9\\-\\=\\+/\\\\_\\.,%#]+'
else:
valid_for_filename = '\\s*[A-Za-z0-9\\-\\=\\+/\\\\_\\.%#]+'
if ext:
valid_for_filename = valid_for_filename + '\.e*ps[texfi2]*'
if TeX:
valid_for_filename = valid_for_filename + '[\.latex]*'
file_inclusion = re.findall('=' + valid_for_filename + '[ ,]', line)
if len(file_inclusion) > 0:
# right now it looks like '=FILENAME,' or '=FILENAME '
for file_included in file_inclusion:
files_included.append(file_included[1:-1])
file_inclusion = re.findall('(?:[ps]*file=|figure=)' + \
valid_for_filename + '[,\\]} ]*', line)
if len(file_inclusion) > 0:
# still has the =
for file_included in file_inclusion:
part_before_equals = file_included.split('=')[0]
if len(part_before_equals) != len(file_included):
file_included = file_included[len(part_before_equals) + 1:].strip()
if not file_included in files_included:
files_included.append(file_included)
file_inclusion = re.findall('["\'{\\[]' + valid_for_filename + '[}\\],"\']', \
line)
if len(file_inclusion) > 0:
# right now it's got the {} or [] or "" or '' around it still
for file_included in file_inclusion:
file_included = file_included[1:-1]
file_included = file_included.strip()
if not file_included in files_included:
files_included.append(file_included)
file_inclusion = re.findall('^' + valid_for_filename + '$', line)
if len(file_inclusion) > 0:
for file_included in file_inclusion:
file_included = file_included.strip()
if not file_included in files_included:
files_included.append(file_included)
file_inclusion = re.findall('^' + valid_for_filename + '[,\\} $]', line)
if len(file_inclusion) > 0:
for file_included in file_inclusion:
file_included = file_included.strip()
if not file_included in files_included:
files_included.append(file_included)
file_inclusion = re.findall('\\s*' + valid_for_filename + '\\s*$', line)
if len(file_inclusion) > 0:
for file_included in file_inclusion:
file_included = file_included.strip()
if not file_included in files_included:
files_included.append(file_included)
if files_included != ['ERROR']:
files_included = files_included[1:] # cut off the dummy
for file_included in files_included:
if file_included == '':
files_included.remove(file_included)
if ' ' in file_included:
for subfile in file_included.split(' '):
if not subfile in files_included:
files_included.append(subfile)
if ',' in file_included:
for subfile in file_included.split(','):
if not subfile in files_included:
files_included.append(subfile)
return files_included
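A simplified, self-contained illustration of the 'file=NAME' extraction performed above. The character class is abbreviated and the function name is hypothetical; this is not the module's API:

```python
import re

# Simplified version of the filename character class used above
# (no commas, as in the commas_okay=False case).
VALID = r'[A-Za-z0-9\-=\+/\\_\.%#]+'

def find_file_equals(line):
    """Return filenames appearing as [ps]file= or figure= values."""
    return [m.strip() for m in
            re.findall(r'(?:[ps]*file=|figure=)(' + VALID + ')', line)]

print(find_file_equals(r'\psfig{file=plots/fig1.eps,width=0.5}'))
```
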
def upload_to_site(marcxml, yes_i_know):
"""
Makes the appropriate calls to bibupload to get the MARCXML record onto
the site.
@param: marcxml (string): the absolute location of the MARCXML that was
generated by this programme
@param: yes_i_know (boolean): if true, no confirmation. if false, prompt.
@output: a new record on the invenio site
@return: None
"""
if not yes_i_know:
wait_for_user(wrap_text_in_a_box('You are going to upload new ' + \
'plots to the server.'))
task_low_level_submission('bibupload', 'admin', '-a', marcxml)
help_string = """
name: plotextractor
usage:
python plotextractor.py -d tar/dir -s scratch/dir
python plotextractor.py -i inputfile -u
python plotextractor.py --arXiv=arXiv_id
python plotextractor.py --recid=recids
example:
python plotextractor.py -d /some/path/with/tarballs
python plotextractor.py -i input.txt --no-sdir --extract-text
python plotextractor.py --arXiv=hep-ex/0101001
python plotextractor.py --recid=13-20,29
options:
-d, --tardir=
if you wish to do a batch of tarballs, search the tree
rooted at this directory for them
-s, --scratchdir=
the directory for scratchwork (untarring, conversion, etc.).
make sure that this directory is one of the allowed dirs in
CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS to avoid errors. with an
sdir selected, one xml file will be generated for the whole
batch of files processed, and it will live in this sdir.
-i, --input=
if you wish to give an input file for downloading files from
arXiv (or wherever), this is the pointer to that file, which
should contain urls to download, no more than 1 per line. each
line should be the url of a tarball or gzipped tarball, and
each downloaded item will then be processed.
-x, --extract-text
if a pdf exists with the same base name as each tarball this is
being run on, running with the -x parameter will run pdftotext on
each of these pdfs and store the result in the folder
-f, --force
if you want to overwrite everything that was done before, just
force the script to overwrite it. otherwise it will only run on
things that haven't been run on yet (for use with tardir).
-c, --clean
if you wish to delete all non-essential files that were extracted.
-u, --call-bibupload, --yes-i-know
if you want to upload the plots, ask to call bibupload. appending
the --yes-i-know flag bypasses bibupload's prompt to upload
-l, --refno-url
Specify a URL to the Invenio instance to query for refno.
Defaults to CFG_SITE_URL.
-k, --skip-refno
allows you to skip any refno check
-r, --recid=
if you want to process the tarball of one recid, use this tag. it
will also accept ranges (e.g. --recid=13-20)
-a, --arXiv=
if you want to process the tarball of one arXiv id, use this tag.
-t, --tarball=
for processing one tarball.
-q, --squash
if you want to squash all MARC into a single MARC file (for easier
and faster bibuploading)
-h, --help
Print this help and exit.
description: extracts plots from a tarfile from arXiv and generates
MARCXML that links figures and their captions. converts all
images to PNG format.
"""
def usage():
write_message(help_string)
if __name__ == '__main__':
main()
diff --git a/invenio/utils/text.py b/invenio/utils/text.py
index 14a93c1a6..5d0f91a0e 100644
--- a/invenio/utils/text.py
+++ b/invenio/utils/text.py
@@ -1,772 +1,778 @@
# -*- coding: utf-8 -*-
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Functions useful for text wrapping (in a box) and indenting.
"""
__revision__ = "$Id$"
import sys
import re
import textwrap
import htmlentitydefs
+import pkg_resources
from invenio.base.globals import cfg
try:
import chardet
CHARDET_AVAILABLE = True
except ImportError:
CHARDET_AVAILABLE = False
try:
from unidecode import unidecode
UNIDECODE_AVAILABLE = True
except ImportError:
UNIDECODE_AVAILABLE = False
CFG_LATEX_UNICODE_TRANSLATION_CONST = {}
CFG_WRAP_TEXT_IN_A_BOX_STYLES = {
'__DEFAULT' : {
'horiz_sep' : '*',
'max_col' : 72,
'min_col' : 40,
'tab_str' : ' ',
'tab_num' : 0,
'border' : ('**', '*', '**', '** ', ' **', '**', '*', '**'),
'prefix' : '\n',
'suffix' : '\n',
'break_long' : False,
'force_horiz' : False,
},
'squared' : {
'horiz_sep' : '-',
'border' : ('+', '-', '+', '| ', ' |', '+', '-', '+')
},
'double_sharp' : {
'horiz_sep' : '#',
'border' : ('##', '#', '##', '## ', ' ##', '##', '#', '##')
},
'single_sharp' : {
'horiz_sep' : '#',
'border' : ('#', '#', '#', '# ', ' #', '#', '#', '#')
},
'single_star' : {
'border' : ('*', '*', '*', '* ', ' *', '*', '*', '*',)
},
'double_star' : {
},
'no_border' : {
'horiz_sep' : '',
'border' : ('', '', '', '', '', '', '', ''),
'prefix' : '',
'suffix' : ''
},
'conclusion' : {
'border' : ('', '', '', '', '', '', '', ''),
'prefix' : '',
'horiz_sep' : '-',
'force_horiz' : True,
},
'important' : {
'tab_num' : 1,
},
'ascii' : {
'horiz_sep' : (u'├', u'─', u'┤'),
'border' : (u'┌', u'─', u'┐', u'│ ', u' │', u'└', u'─', u'┘'),
},
'ascii_double' : {
'horiz_sep' : (u'╠', u'═', u'╣'),
'border' : (u'╔', u'═', u'╗', u'║ ', u' ║', u'╚', u'═', u'╝'),
}
}
re_unicode_lowercase_a = re.compile(unicode(r"(?u)[áàäâãå]", "utf-8"))
re_unicode_lowercase_ae = re.compile(unicode(r"(?u)[æ]", "utf-8"))
re_unicode_lowercase_oe = re.compile(unicode(r"(?u)[œ]", "utf-8"))
re_unicode_lowercase_e = re.compile(unicode(r"(?u)[éèëê]", "utf-8"))
re_unicode_lowercase_i = re.compile(unicode(r"(?u)[íìïî]", "utf-8"))
re_unicode_lowercase_o = re.compile(unicode(r"(?u)[óòöôõø]", "utf-8"))
re_unicode_lowercase_u = re.compile(unicode(r"(?u)[úùüû]", "utf-8"))
re_unicode_lowercase_y = re.compile(unicode(r"(?u)[ýÿ]", "utf-8"))
re_unicode_lowercase_c = re.compile(unicode(r"(?u)[çć]", "utf-8"))
re_unicode_lowercase_n = re.compile(unicode(r"(?u)[ñ]", "utf-8"))
re_unicode_uppercase_a = re.compile(unicode(r"(?u)[ÁÀÄÂÃÅ]", "utf-8"))
re_unicode_uppercase_ae = re.compile(unicode(r"(?u)[Æ]", "utf-8"))
re_unicode_uppercase_oe = re.compile(unicode(r"(?u)[Œ]", "utf-8"))
re_unicode_uppercase_e = re.compile(unicode(r"(?u)[ÉÈËÊ]", "utf-8"))
re_unicode_uppercase_i = re.compile(unicode(r"(?u)[ÍÌÏÎ]", "utf-8"))
re_unicode_uppercase_o = re.compile(unicode(r"(?u)[ÓÒÖÔÕØ]", "utf-8"))
re_unicode_uppercase_u = re.compile(unicode(r"(?u)[ÚÙÜÛ]", "utf-8"))
re_unicode_uppercase_y = re.compile(unicode(r"(?u)[Ý]", "utf-8"))
re_unicode_uppercase_c = re.compile(unicode(r"(?u)[ÇĆ]", "utf-8"))
re_unicode_uppercase_n = re.compile(unicode(r"(?u)[Ñ]", "utf-8"))
re_latex_lowercase_a = re.compile("\\\\[\"H'`~^vu=k]\{?a\}?")
re_latex_lowercase_ae = re.compile("\\\\ae\\{\\}?")
re_latex_lowercase_oe = re.compile("\\\\oe\\{\\}?")
re_latex_lowercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?e\\}?")
re_latex_lowercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?i\\}?")
re_latex_lowercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?o\\}?")
re_latex_lowercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?u\\}?")
re_latex_lowercase_y = re.compile("\\\\[\"']\\{?y\\}?")
re_latex_lowercase_c = re.compile("\\\\['uc]\\{?c\\}?")
re_latex_lowercase_n = re.compile("\\\\[c'~^vu]\\{?n\\}?")
re_latex_uppercase_a = re.compile("\\\\[\"H'`~^vu=k]\\{?A\\}?")
re_latex_uppercase_ae = re.compile("\\\\AE\\{?\\}?")
re_latex_uppercase_oe = re.compile("\\\\OE\\{?\\}?")
re_latex_uppercase_e = re.compile("\\\\[\"H'`~^vu=k]\\{?E\\}?")
re_latex_uppercase_i = re.compile("\\\\[\"H'`~^vu=k]\\{?I\\}?")
re_latex_uppercase_o = re.compile("\\\\[\"H'`~^vu=k]\\{?O\\}?")
re_latex_uppercase_u = re.compile("\\\\[\"H'`~^vu=k]\\{?U\\}?")
re_latex_uppercase_y = re.compile("\\\\[\"']\\{?Y\\}?")
re_latex_uppercase_c = re.compile("\\\\['uc]\\{?C\\}?")
re_latex_uppercase_n = re.compile("\\\\[c'~^vu]\\{?N\\}?")
+
+def get_kb_filename(filename='latex-to-unicode.kb'):
+ return pkg_resources.resource_filename('invenio.utils.data', filename)
+
+
def indent_text(text,
nb_tabs=0,
tab_str=" ",
linebreak_input="\n",
linebreak_output="\n",
wrap=False):
"""
add tabs to each line of text
@param text: the text to indent
@param nb_tabs: number of tabs to add
@param tab_str: type of tab (could be, for example, "\t"; default: 2 spaces)
@param linebreak_input: linebreak on input
@param linebreak_output: linebreak on output
@param wrap: whether to apply smart text wrapping.
(by means of wrap_text_in_a_box)
@return: indented text as string
"""
if not wrap:
lines = text.split(linebreak_input)
tabs = nb_tabs*tab_str
output = ""
for line in lines:
output += tabs + line + linebreak_output
return output
else:
return wrap_text_in_a_box(body=text, style='no_border',
tab_str=tab_str, tab_num=nb_tabs)
_RE_BEGINNING_SPACES = re.compile(r'^\s*')
_RE_NEWLINES_CLEANER = re.compile(r'\n+')
_RE_LONELY_NEWLINES = re.compile(r'\b\n\b')
def wrap_text_in_a_box(body='', title='', style='double_star', **args):
"""Return a nicely formatted text box:
e.g.
******************
** title **
**--------------**
** body **
******************
Indentation and newline are respected.
@param body: the main text
@param title: an optional title
@param style: the name of one of the style in CFG_WRAP_STYLES. By default
the double_star style is used.
You can further tune the desired style by setting various optional
parameters:
@param horiz_sep: a string that is repeated in order to produce a
separator row between the title and the body (if needed)
or a tuple of three characters in the form (l, c, r)
@param max_col: the maximum number of columns used by the box
(including indentation)
@param min_col: the symmetrical minimum number of columns
@param tab_str: a string to represent indentation
@param tab_num: the number of levels of indentation
@param border: a tuple of 8 elements in the form
(tl, t, tr, l, r, bl, b, br) of strings that represent the
different corners and sides of the box
@param prefix: a prefix string added before the box
@param suffix: a suffix string added after the box
@param break_long: whether to break long words in order to respect
max_col
@param force_horiz: True in order to print the horizontal line even when
there is no title
e.g.:
print wrap_text_in_a_box(title='prova',
body=' 123 prova.\n Vediamo come si indenta',
horiz_sep='-', style='no_border', max_col=20, tab_num=1)
prova
----------------
123 prova.
Vediamo come
si indenta
"""
def _wrap_row(row, max_col, break_long):
"""Wrap a single row"""
spaces = _RE_BEGINNING_SPACES.match(row).group()
row = row[len(spaces):]
spaces = spaces.expandtabs()
return textwrap.wrap(row, initial_indent=spaces,
subsequent_indent=spaces, width=max_col,
break_long_words=break_long)
def _clean_newlines(text):
text = _RE_LONELY_NEWLINES.sub(' \n', text)
return _RE_NEWLINES_CLEANER.sub(lambda x: x.group()[:-1], text)
body = unicode(body, 'utf-8')
title = unicode(title, 'utf-8')
astyle = dict(CFG_WRAP_TEXT_IN_A_BOX_STYLES['__DEFAULT'])
if CFG_WRAP_TEXT_IN_A_BOX_STYLES.has_key(style):
astyle.update(CFG_WRAP_TEXT_IN_A_BOX_STYLES[style])
astyle.update(args)
horiz_sep = astyle['horiz_sep']
border = astyle['border']
tab_str = astyle['tab_str'] * astyle['tab_num']
max_col = max(astyle['max_col'] \
- len(border[3]) - len(border[4]) - len(tab_str), 1)
min_col = astyle['min_col']
prefix = astyle['prefix']
suffix = astyle['suffix']
force_horiz = astyle['force_horiz']
break_long = astyle['break_long']
body = _clean_newlines(body)
tmp_rows = [_wrap_row(row, max_col, break_long)
for row in body.split('\n')]
body_rows = []
for rows in tmp_rows:
if rows:
body_rows += rows
else:
body_rows.append('')
if not ''.join(body_rows).strip():
# Effectively empty body
body_rows = []
title = _clean_newlines(title)
tmp_rows = [_wrap_row(row, max_col, break_long)
for row in title.split('\n')]
title_rows = []
for rows in tmp_rows:
if rows:
title_rows += rows
else:
title_rows.append('')
if not ''.join(title_rows).strip():
# Effectively empty title
title_rows = []
max_col = max([len(row) for row in body_rows + title_rows] + [min_col])
mid_top_border_len = max_col \
+ len(border[3]) + len(border[4]) - len(border[0]) - len(border[2])
mid_bottom_border_len = max_col \
+ len(border[3]) + len(border[4]) - len(border[5]) - len(border[7])
top_border = border[0] \
+ (border[1] * mid_top_border_len)[:mid_top_border_len] + border[2]
bottom_border = border[5] \
+ (border[6] * mid_bottom_border_len)[:mid_bottom_border_len] \
+ border[7]
if type(horiz_sep) is tuple and len(horiz_sep) == 3:
horiz_line = horiz_sep[0] + (horiz_sep[1] * (max_col + 2))[:(max_col + 2)] + horiz_sep[2]
else:
horiz_line = border[3] + (horiz_sep * max_col)[:max_col] + border[4]
title_rows = [tab_str + border[3] + row
+ ' ' * (max_col - len(row)) + border[4] for row in title_rows]
body_rows = [tab_str + border[3] + row
+ ' ' * (max_col - len(row)) + border[4] for row in body_rows]
ret = []
if top_border:
ret += [tab_str + top_border]
ret += title_rows
if title_rows or force_horiz:
ret += [tab_str + horiz_line]
ret += body_rows
if bottom_border:
ret += [tab_str + bottom_border]
return (prefix + '\n'.join(ret) + suffix).encode('utf-8')
def wait_for_user(msg=""):
"""
Print MSG and a confirmation prompt, waiting for user's
confirmation, unless silent '--yes-i-know' command line option was
used, in which case the function returns immediately without
printing anything.
"""
if '--yes-i-know' in sys.argv:
return
print msg
try:
answer = raw_input("Please confirm by typing 'Yes, I know!': ")
except KeyboardInterrupt:
print
answer = ''
if answer != 'Yes, I know!':
sys.stderr.write("ERROR: Aborted.\n")
sys.exit(1)
return
def guess_minimum_encoding(text, charsets=('ascii', 'latin1', 'utf8')):
"""Try to guess the minimum charset that is able to represent the given
text using the provided charsets. text is supposed to be encoded in utf8.
Returns (encoded_text, charset) where charset is the first charset
in the sequence being able to encode text.
Returns (text_in_utf8, 'utf8') in case no charset is able to encode text.
@note: If the input text is not in strict UTF-8, then replace any
non-UTF-8 chars inside it.
"""
text_in_unicode = text.decode('utf8', 'replace')
for charset in charsets:
try:
return (text_in_unicode.encode(charset), charset)
except (UnicodeEncodeError, UnicodeDecodeError):
pass
return (text_in_unicode.encode('utf8'), 'utf8')
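The charset walk above tries each candidate encoding in order and keeps the first one that fits. A Python 3 analogue of the same idea (the module itself is Python 2 era; this sketch is illustrative, not the module function):

```python
def minimum_charset(text, charsets=('ascii', 'latin-1', 'utf-8')):
    """Return the first charset in the sequence able to encode text."""
    for charset in charsets:
        try:
            text.encode(charset)
            return charset
        except UnicodeEncodeError:
            pass
    return 'utf-8'

print(minimum_charset('hello'))   # pure ASCII
print(minimum_charset('héllo'))   # needs latin-1
```
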
def encode_for_xml(text, wash=False, xml_version='1.0', quote=False):
"""Encodes special characters in a text so that it would be
XML-compliant.
@param text: text to encode
@return: an encoded text"""
text = text.replace('&', '&amp;')
text = text.replace('<', '&lt;')
if quote:
text = text.replace('"', '&quot;')
if wash:
text = wash_for_xml(text, xml_version=xml_version)
return text
try:
unichr(0x100000)
RE_ALLOWED_XML_1_0_CHARS = re.compile(u'[^\U00000009\U0000000A\U0000000D\U00000020-\U0000D7FF\U0000E000-\U0000FFFD\U00010000-\U0010FFFF]')
RE_ALLOWED_XML_1_1_CHARS = re.compile(u'[^\U00000001-\U0000D7FF\U0000E000-\U0000FFFD\U00010000-\U0010FFFF]')
except ValueError:
# oops, we are running on a narrow UTF/UCS Python build,
# so we have to limit the UTF/UCS char range:
RE_ALLOWED_XML_1_0_CHARS = re.compile(u'[^\U00000009\U0000000A\U0000000D\U00000020-\U0000D7FF\U0000E000-\U0000FFFD]')
RE_ALLOWED_XML_1_1_CHARS = re.compile(u'[^\U00000001-\U0000D7FF\U0000E000-\U0000FFFD]')
def wash_for_xml(text, xml_version='1.0'):
"""
Removes any character which is not in the range of allowed
characters for XML. The allowed characters depends on the version
of XML.
- XML 1.0:
<http://www.w3.org/TR/REC-xml/#charsets>
- XML 1.1:
<http://www.w3.org/TR/xml11/#charsets>
@param text: input string to wash.
@param xml_version: version of the XML for which we wash the
input. Value for this parameter can be '1.0' or '1.1'
"""
if xml_version == '1.0':
return RE_ALLOWED_XML_1_0_CHARS.sub('', unicode(text, 'utf-8')).encode('utf-8')
else:
return RE_ALLOWED_XML_1_1_CHARS.sub('', unicode(text, 'utf-8')).encode('utf-8')
def wash_for_utf8(text, correct=True):
"""Return UTF-8 encoded binary string with incorrect characters washed away.
@param text: input string to wash (can be either a binary string or a Unicode string)
@param correct: whether to correct bad characters or throw exception
"""
if isinstance(text, unicode):
return text.encode('utf-8')
ret = []
while True:
try:
text.decode("utf-8")
except UnicodeDecodeError, e:
if correct:
ret.append(text[:e.start])
text = text[e.end:]
else:
raise e
else:
break
ret.append(text)
return ''.join(ret)
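The repair loop above keeps the valid prefix before each bad byte run and resumes after it. A Python 3 sketch of the same chop-and-continue approach (illustrative, using `UnicodeDecodeError.start`/`.end` exactly as the original does):

```python
def wash_utf8(raw: bytes) -> str:
    """Drop undecodable byte runs, keeping the valid UTF-8 around them."""
    pieces = []
    while True:
        try:
            pieces.append(raw.decode('utf-8'))
            break
        except UnicodeDecodeError as err:
            # keep what decoded cleanly, skip the offending bytes
            pieces.append(raw[:err.start].decode('utf-8'))
            raw = raw[err.end:]
    return ''.join(pieces)

print(wash_utf8(b'ok \xff here'))
```
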
def nice_number(number, thousands_separator=',', max_ndigits_after_dot=None):
"""
Return nicely printed number NUMBER using the
given THOUSANDS_SEPARATOR character.
If max_ndigits_after_dot is specified and the number is float, the
number is rounded by taking in consideration up to max_ndigits_after_dot
digit after the dot.
This version does not pay attention to locale. See
tmpl_nice_number_via_locale().
"""
if type(number) is float:
if max_ndigits_after_dot is not None:
number = round(number, max_ndigits_after_dot)
int_part, frac_part = str(number).split('.')
return '%s.%s' % (nice_number(int(int_part), thousands_separator),
frac_part)
else:
chars_in = list(str(number))
number = len(chars_in)
chars_out = []
for i in range(0, number):
if i % 3 == 0 and i != 0:
chars_out.append(thousands_separator)
chars_out.append(chars_in[number - i - 1])
chars_out.reverse()
return ''.join(chars_out)
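The integer branch above walks the digits from the right, inserting the separator every three places. A compact self-contained sketch of the same grouping idea (hypothetical helper name):

```python
def group_thousands(number, sep=','):
    """Insert sep every three digits, counting from the right."""
    out = []
    for i, ch in enumerate(reversed(str(number))):
        if i and i % 3 == 0:
            out.append(sep)
        out.append(ch)
    return ''.join(reversed(out))

print(group_thousands(1234567))
```
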
def nice_size(size):
"""
@param size: the size.
@type size: int
@return: a nicely printed size.
@rtype: string
"""
unit = 'B'
if size > 1024:
size /= 1024.0
unit = 'KB'
if size > 1024:
size /= 1024.0
unit = 'MB'
if size > 1024:
size /= 1024.0
unit = 'GB'
return '%s %s' % (nice_number(size, max_ndigits_after_dot=2), unit)
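The cascade of three identical `if size > 1024` steps can be folded into a loop over the units; a minimal sketch of the same behaviour (without the thousands-separator formatting of `nice_number`):

```python
def nice_size_sketch(size):
    """Render a byte count with a binary unit, mirroring the cascade above."""
    unit = "B"
    for next_unit in ("KB", "MB", "GB"):
        if size > 1024:
            size /= 1024.0
            unit = next_unit
    return "%s %s" % (round(size, 2), unit)
```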
def remove_line_breaks(text):
"""
Remove line breaks from input, including unicode 'line
separator', 'paragraph separator', and 'next line' characters.
"""
return unicode(text, 'utf-8').replace('\f', '').replace('\n', '').replace('\r', '').replace(u'\u2028', '').replace(u'\u2029', '').replace(u'\x85', '').encode('utf-8')
def decode_to_unicode(text, default_encoding='utf-8'):
"""
Decode input text into Unicode representation by first using the default
encoding utf-8.
If the operation fails, it detects the type of encoding used in the given text.
For optimal results, it is recommended that the 'chardet' module be installed.
NOTE: Beware that this might be slow for *very* large strings.
If chardet detection fails, it will try to decode the string using the basic
detection function guess_minimum_encoding().
Also, bear in mind that it is impossible to detect the correct encoding at all
times, other than by taking educated guesses. With that said, this function will
always return some decoded Unicode string, however the data returned may not
be the same as original data in some cases.
@param text: the text to decode
@type text: string
@param default_encoding: the character encoding to use. Optional.
@type default_encoding: string
@return: input text as Unicode
@rtype: string
"""
if not text:
return ""
try:
return text.decode(default_encoding)
except (UnicodeError, LookupError):
pass
detected_encoding = None
if CHARDET_AVAILABLE:
# We can use chardet to perform detection
res = chardet.detect(text)
if res['confidence'] >= 0.8:
detected_encoding = res['encoding']
if detected_encoding is None:
# No chardet detection, try to make a basic guess
dummy, detected_encoding = guess_minimum_encoding(text)
return text.decode(detected_encoding)
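The default-then-detect strategy above can be sketched without chardet by trying a fixed list of candidate encodings (an illustrative helper, not the module's API; latin-1 accepts any byte sequence, so it acts as the guaranteed last-resort guess much like guess_minimum_encoding()):

```python
def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Try each candidate encoding in turn and return the first success."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Unreachable with latin-1 in the list, but safe for custom lists.
    return data.decode("utf-8", errors="replace")
```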
def translate_latex2unicode(text, kb_file=None):
"""
This function will take given text, presumably containing LaTeX symbols,
and attempts to translate it to Unicode using the given or default KB
translation table located under CFG_ETCDIR/bibconvert/KB/latex-to-unicode.kb.
The translated Unicode string will then be returned.
If the translation table and compiled regular expression object have not
been generated earlier in the current session, they will be created and cached.
@param text: a text presumably containing LaTeX symbols.
@type text: string
@param kb_file: full path to file containing latex2unicode translations.
Defaults to CFG_ETCDIR/bibconvert/KB/latex-to-unicode.kb
@type kb_file: string
@return: Unicode representation of translated text
@rtype: unicode
"""
if kb_file is None:
- kb_file = "%s/bibconvert/KB/latex-to-unicode.kb" % (cfg['CFG_ETCDIR'],)
+ kb_file = get_kb_filename()
# First decode input text to Unicode
try:
text = decode_to_unicode(text)
except UnicodeDecodeError:
text = unicode(wash_for_utf8(text))
# Load translation table, if required
if CFG_LATEX_UNICODE_TRANSLATION_CONST == {}:
_load_latex2unicode_constants(kb_file)
# Find all matches and replace text
for match in CFG_LATEX_UNICODE_TRANSLATION_CONST['regexp_obj'].finditer(text):
# If LaTeX style markers {, } and $ appear before or after the matching
# text, they are replaced as well
text = re.sub("[\{\$]?%s[\}\$]?" % (re.escape(match.group()),), \
CFG_LATEX_UNICODE_TRANSLATION_CONST['table'][match.group()], \
text)
# Return Unicode representation of translated text
return text
def _load_latex2unicode_constants(kb_file=None):
"""
Load LaTeX2Unicode translation table dictionary and regular expression object
from KB to a global dictionary.
@param kb_file: full path to file containing latex2unicode translations.
Defaults to CFG_ETCDIR/bibconvert/KB/latex-to-unicode.kb
@type kb_file: string
@return: dict of type: {'regexp_obj': regexp match object,
'table': dict of LaTeX -> Unicode mappings}
@rtype: dict
"""
if kb_file is None:
- kb_file = "%s/bibconvert/KB/latex-to-unicode.kb" % (cfg['CFG_ETCDIR'],)
+ kb_file = get_kb_filename()
try:
data = open(kb_file)
except IOError:
# File not found or similar
sys.stderr.write("\nCould not open LaTeX to Unicode KB file. Aborting translation.\n")
return CFG_LATEX_UNICODE_TRANSLATION_CONST
latex_symbols = []
translation_table = {}
for line in data:
# The file has form of latex|--|utf-8. First decode to Unicode.
line = line.decode('utf-8')
mapping = line.split('|--|')
translation_table[mapping[0].rstrip('\n')] = mapping[1].rstrip('\n')
latex_symbols.append(re.escape(mapping[0].rstrip('\n')))
data.close()
CFG_LATEX_UNICODE_TRANSLATION_CONST['regexp_obj'] = re.compile("|".join(latex_symbols))
CFG_LATEX_UNICODE_TRANSLATION_CONST['table'] = translation_table
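The loading step above boils down to: build a dict of mappings, then compile one alternation regexp over the escaped LaTeX keys. A self-contained sketch with a hypothetical two-entry table (the real one is read from latex-to-unicode.kb, one "latex|--|utf-8" mapping per line):

```python
import re

# Hypothetical in-memory translation table.
TABLE = {r"\alpha": u"\u03b1", r"\pm": u"\u00b1"}
REGEXP = re.compile("|".join(re.escape(symbol) for symbol in TABLE))

def latex_to_unicode_sketch(text):
    """Replace every known LaTeX symbol via a single compiled alternation."""
    return REGEXP.sub(lambda match: TABLE[match.group()], text)
```

Compiling the alternation once is what makes repeated calls to the translator cheap.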
def translate_to_ascii(values):
"""
Transliterate the string contents of the given sequence into ascii representation.
Returns a sequence with the modified values if the module 'unidecode' is
available. Otherwise it will fall back to the inferior strip_accents function.
For example: H\xc3\xb6hne becomes Hohne.
Note: Passed strings are returned as a list.
@param values: sequence of strings to transform
@type values: sequence
@return: sequence with values transformed to ascii
@rtype: sequence
"""
if not values:
return values
if type(values) == str:
values = [values]
for index, value in enumerate(values):
if not value:
continue
if not UNIDECODE_AVAILABLE:
ascii_text = strip_accents(value)
else:
encoded_text, encoding = guess_minimum_encoding(value)
unicode_text = unicode(encoded_text.decode(encoding))
ascii_text = unidecode(unicode_text).encode('ascii')
values[index] = ascii_text
return values
def xml_entities_to_utf8(text, skip=('lt', 'gt', 'amp')):
"""
Removes HTML or XML character references and entities from a text string
and replaces them with their UTF-8 representation, if possible.
@param text: The HTML (or XML) source text.
@type text: string
@param skip: list of entity names to skip when transforming.
@type skip: iterable
@return: the text with entities replaced by their UTF-8 encoded characters.
@author: Based on http://effbot.org/zone/re-sub.htm#unescape-html
"""
def fixup(m):
text = m.group(0)
if text[:2] == "&#":
# character reference
try:
if text[:3] == "&#x":
return unichr(int(text[3:-1], 16)).encode("utf-8")
else:
return unichr(int(text[2:-1])).encode("utf-8")
except ValueError:
pass
else:
# named entity
if text[1:-1] not in skip:
try:
text = unichr(htmlentitydefs.name2codepoint[text[1:-1]]).encode("utf-8")
except KeyError:
pass
return text # leave as is
return re.sub("&#?\w+;", fixup, text)
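A Python 3 sketch of the same entity-replacement scheme, leaning on the standard library's html.unescape for the lookup while preserving the skip-list behaviour (illustrative helper name, not the module's API):

```python
import html
import re

def entities_to_text(text, skip=("lt", "gt", "amp")):
    """Replace character references and named entities, leaving the
    entities in `skip` untouched so markup stays escaped."""
    def fixup(match):
        token = match.group(0)
        name = token[1:-1]
        if not name.startswith("#") and name in skip:
            return token  # keep e.g. &lt; literal
        # html.unescape returns unknown entities unchanged, matching
        # the KeyError pass-through above.
        return html.unescape(token)
    return re.sub(r"&#?\w+;", fixup, text)
```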
def strip_accents(x):
"""
Strip accents in the input phrase X (assumed in UTF-8) by replacing
accented characters with their unaccented cousins (e.g. é by e).
@param x: the input phrase to strip.
@type x: string
@return: Return such a stripped X.
"""
x = re_latex_lowercase_a.sub("a", x)
x = re_latex_lowercase_ae.sub("ae", x)
x = re_latex_lowercase_oe.sub("oe", x)
x = re_latex_lowercase_e.sub("e", x)
x = re_latex_lowercase_i.sub("i", x)
x = re_latex_lowercase_o.sub("o", x)
x = re_latex_lowercase_u.sub("u", x)
x = re_latex_lowercase_y.sub("y", x)
x = re_latex_lowercase_c.sub("c", x)
x = re_latex_lowercase_n.sub("n", x)
x = re_latex_uppercase_a.sub("A", x)
x = re_latex_uppercase_ae.sub("AE", x)
x = re_latex_uppercase_oe.sub("OE", x)
x = re_latex_uppercase_e.sub("E", x)
x = re_latex_uppercase_i.sub("I", x)
x = re_latex_uppercase_o.sub("O", x)
x = re_latex_uppercase_u.sub("U", x)
x = re_latex_uppercase_y.sub("Y", x)
x = re_latex_uppercase_c.sub("C", x)
x = re_latex_uppercase_n.sub("N", x)
# convert input into Unicode string:
try:
y = unicode(x, "utf-8")
except:
return x # something went wrong, probably the input wasn't UTF-8
# asciify Latin-1 lowercase characters:
y = re_unicode_lowercase_a.sub("a", y)
y = re_unicode_lowercase_ae.sub("ae", y)
y = re_unicode_lowercase_oe.sub("oe", y)
y = re_unicode_lowercase_e.sub("e", y)
y = re_unicode_lowercase_i.sub("i", y)
y = re_unicode_lowercase_o.sub("o", y)
y = re_unicode_lowercase_u.sub("u", y)
y = re_unicode_lowercase_y.sub("y", y)
y = re_unicode_lowercase_c.sub("c", y)
y = re_unicode_lowercase_n.sub("n", y)
# asciify Latin-1 uppercase characters:
y = re_unicode_uppercase_a.sub("A", y)
y = re_unicode_uppercase_ae.sub("AE", y)
y = re_unicode_uppercase_oe.sub("OE", y)
y = re_unicode_uppercase_e.sub("E", y)
y = re_unicode_uppercase_i.sub("I", y)
y = re_unicode_uppercase_o.sub("O", y)
y = re_unicode_uppercase_u.sub("U", y)
y = re_unicode_uppercase_y.sub("Y", y)
y = re_unicode_uppercase_c.sub("C", y)
y = re_unicode_uppercase_n.sub("N", y)
# return UTF-8 representation of the Unicode string:
return y.encode("utf-8")
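Most of the per-letter regexps above can be collapsed with Unicode normalization: decompose each accented character into base letter plus combining mark, then drop the marks. A sketch (note that ligature-like letters such as æ/œ do not decompose this way, which is one reason the function keeps explicit regexps for them):

```python
import unicodedata

def strip_accents_sketch(text):
    """Drop combining marks after NFKD decomposition (é -> e, ö -> o)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```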
def show_diff(original, modified, prefix="<pre>", sufix="</pre>"):
"""
Returns the diff view between source and changed strings.
Function checks both arguments line by line and returns a string
with additional css classes for difference view
@param original: base string
@param modified: changed string
@param prefix: prefix of the output string
@param sufix: suffix of the output string
@return: string with the comparison of the records
@rtype: string
"""
import difflib
differ = difflib.Differ()
result = [prefix]
for line in differ.compare(modified.splitlines(), original.splitlines()):
if line[0] == ' ':
result.append(line.strip())
elif line[0] == '-':
# Mark as deleted
result.append('<strong class="diff_field_deleted">' + line[2:].strip() + "</strong>")
elif line[0] == '+':
# Mark as added/modified
result.append('<strong class="diff_field_added">' + line[2:].strip() + "</strong>")
else:
continue
result.append(sufix)
return '\n'.join(result)
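The marker-to-CSS mapping above can be exercised standalone. A runnable Python 3 condensation of the same difflib.Differ logic (a sketch with an illustrative name, keeping the CSS classes used above):

```python
import difflib

def html_diff(original, modified):
    """Wrap difflib.Differ output lines in the diff CSS classes."""
    out = ["<pre>"]
    for line in difflib.Differ().compare(modified.splitlines(),
                                         original.splitlines()):
        if line.startswith("  "):
            out.append(line.strip())
        elif line.startswith("- "):
            out.append('<strong class="diff_field_deleted">%s</strong>'
                       % line[2:].strip())
        elif line.startswith("+ "):
            out.append('<strong class="diff_field_added">%s</strong>'
                       % line[2:].strip())
        # '? ' hint lines fall through and are dropped
    out.append("</pre>")
    return "\n".join(out)
```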
def transliterate_ala_lc(value):
"""
Transliterate a string.
Compatibility with the ALA-LC romanization standard:
http://www.loc.gov/catdir/cpso/roman.html
Maps from one system of writing into another, letter by letter.
Uses 'unidecode' if available.
@param value: string to transform
@type value: string
@return: transliterated string
@rtype: string
"""
if not value:
return value
if UNIDECODE_AVAILABLE:
text = unidecode(value)
else:
text = translate_to_ascii(value)
text = text.pop()
return text
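Several functions in this module (translate_to_ascii, transliterate_ala_lc, decode_to_unicode) rely on the same optional-dependency pattern: probe the import once at module load, remember the result in a flag, and degrade gracefully. A minimal sketch of that pattern with a hypothetical fallback:

```python
# Probe the optional dependency once; callers branch on the flag.
try:
    from unidecode import unidecode
    UNIDECODE_AVAILABLE = True
except ImportError:
    UNIDECODE_AVAILABLE = False

def to_ascii(text):
    """Transliterate when unidecode is present; crude ASCII strip otherwise."""
    if UNIDECODE_AVAILABLE:
        return unidecode(text)
    return text.encode("ascii", errors="ignore").decode("ascii")
```

The flag lets the inferior fallback path be unit-tested explicitly, as the module does with strip_accents.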
diff --git a/invenio_demosite/testsuite/regression/test_batchuploader.py b/invenio_demosite/testsuite/regression/test_batchuploader.py
index b63b75db5..dcb3bc913 100644
--- a/invenio_demosite/testsuite/regression/test_batchuploader.py
+++ b/invenio_demosite/testsuite/regression/test_batchuploader.py
@@ -1,226 +1,226 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Regression tests for the BatchUploader."""
from invenio.testutils import InvenioTestCase
import os
import os.path
import urllib2
import urlparse
import socket
from urllib import urlencode
from invenio.testsuite import make_test_suite, run_test_suite
from invenio.legacy.dbquery import run_sql
from invenio.utils.json import json
from invenio.config import CFG_DEVEL_SITE, CFG_SITE_URL, CFG_TMPDIR, CFG_BINDIR
-from invenio.bibsched import get_last_taskid, delete_task
+from invenio.legacy.bibsched.scripts.bibsched import get_last_taskid, delete_task
from invenio.utils.shell import run_shell_command
-from invenio.bibupload_regression_tests import GenericBibUploadTest
+from invenio.legacy.bibupload.engine_regression_tests import GenericBibUploadTest
from invenio.utils.url import make_user_agent_string
CFG_HAS_CURL = os.path.exists("/usr/bin/curl")
## NOTE: default invenio.conf authorization are granted only to 127.0.0.1
## or 127.0.1.1, a.k.a. localhost, so the following checks if the current host
## is well recognized as localhost. Otherwise the tests are disabled, since
## they would fail due to insufficient authorizations.
CFG_LOCALHOST_OK = socket.gethostbyname(urlparse.urlparse(CFG_SITE_URL)[1].split(':')[0]) in ('127.0.0.1', '127.0.1.1')
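The one-liner above packs several steps: extract the host part of the site URL, resolve it, and compare against the loopback addresses. Unpacked as a Python 3 sketch (illustrative helper name):

```python
import socket
from urllib.parse import urlparse

def is_localhost(site_url):
    """Resolve the host in site_url and check it maps to a loopback address."""
    host = urlparse(site_url).netloc.split(":")[0]  # strip any :port
    try:
        return socket.gethostbyname(host) in ("127.0.0.1", "127.0.1.1")
    except socket.gaierror:
        return False
```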
class BatchUploaderRobotUploadTests(GenericBibUploadTest):
"""
Testing Class for robotupload
"""
def setUp(self):
GenericBibUploadTest.setUp(self)
self.callback_result_path = os.path.join(CFG_TMPDIR, 'robotupload.json')
self.callback_url = CFG_SITE_URL + '/httptest/post2?%s' % urlencode({
"save": self.callback_result_path})
self.oracle_callback_url = CFG_SITE_URL + '/httptest/oraclefriendly?%s' % urlencode({
"save": self.callback_result_path})
if os.path.exists(self.callback_result_path):
os.remove(self.callback_result_path)
self.last_taskid = get_last_taskid()
self.marcxml = """\
<record>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Doe, John</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">The title</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">TEST</subfield>
</datafield>
</record>"""
self.req = urllib2.Request(CFG_SITE_URL + '/batchuploader/robotupload/insert')
self.req.add_header('Content-Type', 'application/marcxml+xml')
self.req.add_header('User-Agent', make_user_agent_string('BatchUploader'))
self.req.add_data(self.marcxml)
self.req_callback = urllib2.Request(CFG_SITE_URL + '/batchuploader/robotupload/insert?' + urlencode({
'callback_url': self.callback_url}))
self.req_callback.add_header('Content-Type', 'application/marcxml+xml')
self.req_callback.add_header('User-Agent', 'invenio_webupload')
self.req_callback.add_data(self.marcxml)
self.nonce_url = CFG_SITE_URL + '/batchuploader/robotupload/insert?' + urlencode({
'nonce': "1234",
'callback_url': self.callback_url})
self.req_nonce = urllib2.Request(self.nonce_url)
self.req_nonce.add_header('Content-Type', 'application/marcxml+xml')
self.req_nonce.add_header('User-Agent', 'invenio_webupload')
self.req_nonce.add_data(self.marcxml)
self.oracle_url = CFG_SITE_URL + '/batchuploader/robotupload/insert?' + urlencode({
'special_treatment': 'oracle',
'callback_url': self.oracle_callback_url})
self.req_oracle = urllib2.Request(self.oracle_url)
self.req_oracle.add_header('Content-Type', 'application/marcxml+xml')
self.req_oracle.add_header('User-Agent', 'invenio_webupload')
self.req_oracle.add_data(self.marcxml)
self.legacy_url = CFG_SITE_URL + '/batchuploader/robotupload'
def tearDown(self):
GenericBibUploadTest.tearDown(self)
if os.path.exists(self.callback_result_path):
os.remove(self.callback_result_path)
current_task = get_last_taskid()
if current_task != self.last_taskid:
delete_task(current_task)
if CFG_LOCALHOST_OK:
def test_bad_marcxml(self):
"""batchuploader - robotupload bad MARCXML"""
self.req.add_data("BLABLA")
result = urllib2.urlopen(self.req).read()
self.assertEqual(result, "[ERROR] MARCXML is not valid.\n")
if CFG_LOCALHOST_OK:
def test_bad_agent(self):
"""batchuploader - robotupload bad agent"""
self.req.add_header('User-Agent', 'badagent')
result = urllib2.urlopen(self.req).read()
self.assertEqual(result, "[ERROR] Sorry, the badagent useragent cannot use the service.\n")
if CFG_LOCALHOST_OK:
def test_simple_insert(self):
"""batchuploader - robotupload simple insert"""
from invenio.legacy.search_engine import get_record
result = urllib2.urlopen(self.req).read()
self.failUnless("[INFO]" in result)
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
current_recid = run_sql("SELECT MAX(id) FROM bibrec")[0][0]
self.failIfEqual(self.last_recid, current_recid)
record = get_record(current_recid)
self.assertEqual(record['245'][0][0], [('a', 'The title')])
if CFG_DEVEL_SITE and CFG_LOCALHOST_OK:
## This expects a particular testing web handler that is available
## only when CFG_DEVEL_SITE is set up correctly
def test_insert_with_callback(self):
"""batchuploader - robotupload insert with callback"""
result = urllib2.urlopen(self.req_callback).read()
self.failUnless("[INFO]" in result, '"%s" did not contain [INFO]' % result)
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
results = json.loads(open(self.callback_result_path).read())
self.failUnless('results' in results)
self.assertEqual(len(results['results']), 1)
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Doe, John</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
def test_insert_with_nonce(self):
"""batchuploader - robotupload insert with nonce"""
result = urllib2.urlopen(self.req_nonce).read()
self.failUnless("[INFO]" in result, '"%s" did not contain "[INFO]"' % result)
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
results = json.loads(open(self.callback_result_path).read())
self.failUnless('results' in results, '"%s" did not contain "results" key' % results)
self.assertEqual(len(results['results']), 1)
self.assertEqual(results['nonce'], "1234")
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Doe, John</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
def test_insert_with_oracle(self):
"""batchuploader - robotupload insert with oracle special treatment"""
import os
if os.path.exists('/opt/invenio/var/log/invenio.err'):
os.remove('/opt/invenio/var/log/invenio.err')
result = urllib2.urlopen(self.req_oracle).read()
self.failUnless("[INFO]" in result, '"%s" did not contain "[INFO]"' % result)
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
results = json.loads(open(self.callback_result_path).read())
self.failUnless('results' in results, '"%s" did not contain "results" key' % results)
self.assertEqual(len(results['results']), 1)
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Doe, John</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
if CFG_HAS_CURL:
def test_insert_via_curl(self):
"""batchuploader - robotupload insert via CLI curl"""
curl_input_file = os.path.join(CFG_TMPDIR, 'curl_test.xml')
open(curl_input_file, "w").write(self.marcxml)
try:
result = run_shell_command('/usr/bin/curl -T %s %s -A %s -H "Content-Type: application/marcxml+xml"', [curl_input_file, self.nonce_url, make_user_agent_string('BatchUploader')])[1]
self.failUnless("[INFO]" in result)
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
results = json.loads(open(self.callback_result_path).read())
self.failUnless('results' in results, '"%s" did not contain "results" key' % results)
self.assertEqual(len(results['results']), 1)
self.assertEqual(results['nonce'], "1234")
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Doe, John</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
finally:
os.remove(curl_input_file)
def test_legacy_insert_via_curl(self):
"""batchuploader - robotupload legacy insert via CLI curl"""
curl_input_file = os.path.join(CFG_TMPDIR, 'curl_test.xml')
open(curl_input_file, "w").write(self.marcxml)
try:
## curl -F 'file=@localfile.xml' -F 'mode=-i' [-F 'callback_url=http://...'] [-F 'nonce=1234'] http://cds.cern.ch/batchuploader/robotupload -A invenio_webupload
code, result, err = run_shell_command("/usr/bin/curl -v -F file=@%s -F 'mode=-i' -F callback_url=%s -F nonce=1234 %s -A %s", [curl_input_file, self.callback_url, self.legacy_url, make_user_agent_string('BatchUploader')])
self.failUnless("[INFO]" in result, '[INFO] not found in results: %s, %s' % (result, err))
current_task = get_last_taskid()
run_shell_command("%s/bibupload %%s" % CFG_BINDIR, [str(current_task)])
results = json.loads(open(self.callback_result_path).read())
self.failUnless('results' in results, '"%s" did not contain "results" key' % results)
self.assertEqual(len(results['results']), 1)
self.assertEqual(results['nonce'], "1234")
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Doe, John</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
finally:
os.remove(curl_input_file)
TEST_SUITE = make_test_suite(BatchUploaderRobotUploadTests)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_bibdocfile.py b/invenio_demosite/testsuite/regression/test_bibdocfile.py
index ffeeb388a..568c16a2d 100644
--- a/invenio_demosite/testsuite/regression/test_bibdocfile.py
+++ b/invenio_demosite/testsuite/regression/test_bibdocfile.py
@@ -1,621 +1,621 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibDocFile Regression Test Suite."""
__revision__ = "$Id$"
import os
import shutil
import time
from datetime import datetime
from invenio.base.wrappers import lazy_import
from invenio.testsuite import InvenioTestCase, make_test_suite, run_test_suite
from invenio.modules.access.local_config import CFG_WEBACCESS_WARNING_MSGS
from invenio.config import \
CFG_SITE_URL, \
CFG_PREFIX, \
CFG_BIBDOCFILE_FILEDIR, \
CFG_SITE_RECORD, \
CFG_WEBDIR, \
CFG_TMPDIR, \
CFG_PATH_MD5SUM
from invenio.utils.mimetype import CFG_HAS_MAGIC
MoreInfo = lazy_import('invenio.bibdocfile:MoreInfo')
Md5Folder = lazy_import('invenio.bibdocfile:Md5Folder')
guess_format_from_url = lazy_import('invenio.bibdocfile:guess_format_from_url')
class BibDocFsInfoTest(InvenioTestCase):
"""Regression tests about the table bibdocfsinfo"""
def setUp(self):
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
self.my_bibrecdoc = BibRecDocs(2)
self.unique_name = self.my_bibrecdoc.propose_unique_docname('file')
self.my_bibdoc = self.my_bibrecdoc.add_new_file(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', docname=self.unique_name)
self.my_bibdoc_id = self.my_bibdoc.id
def tearDown(self):
self.my_bibdoc.expunge()
def test_hard_delete(self):
"""bibdocfile - test correct update of bibdocfsinfo when hard-deleting"""
from invenio.legacy.dbquery import run_sql
self.assertEqual(run_sql("SELECT MAX(version) FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.my_bibdoc_id, ))[0][0], 1)
self.assertEqual(run_sql("SELECT last_version FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=1 AND format='.jpg'", (self.my_bibdoc_id, ))[0][0], True)
self.my_bibdoc.add_file_new_version(CFG_PREFIX + '/lib/webtest/invenio/test.gif')
self.assertEqual(run_sql("SELECT MAX(version) FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.my_bibdoc_id, ))[0][0], 2)
self.assertEqual(run_sql("SELECT last_version FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=2 AND format='.gif'", (self.my_bibdoc_id, ))[0][0], True)
self.assertEqual(run_sql("SELECT last_version FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=1 AND format='.jpg'", (self.my_bibdoc_id, ))[0][0], False)
self.my_bibdoc.delete_file('.gif', 2)
self.assertEqual(run_sql("SELECT MAX(version) FROM bibdocfsinfo WHERE id_bibdoc=%s", (self.my_bibdoc_id, ))[0][0], 1)
self.assertEqual(run_sql("SELECT last_version FROM bibdocfsinfo WHERE id_bibdoc=%s AND version=1 AND format='.jpg'", (self.my_bibdoc_id, ))[0][0], True)
class BibDocFileGuessFormat(InvenioTestCase):
"""Regression tests for guess_format_from_url"""
def test_guess_format_from_url_local_no_ext(self):
"""bibdocfile - guess_format_from_url(), local URL, no extension"""
self.assertEqual(guess_format_from_url(os.path.join(CFG_WEBDIR, 'img', 'test')), '.bin')
def test_guess_format_from_url_local_no_ext_with_magic(self):
"""bibdocfile - guess_format_from_url(), local URL, no extension, magic"""
if CFG_HAS_MAGIC:
## with magic
self.assertEqual(guess_format_from_url(os.path.join(CFG_WEBDIR, 'img', 'testgif')), '.gif')
else:
## no magic
self.assertEqual(guess_format_from_url(os.path.join(CFG_WEBDIR, 'img', 'testgif')), '.bin')
def test_guess_format_from_url_local_unknown_ext(self):
"""bibdocfile - guess_format_from_url(), local URL, unknown extension"""
self.assertEqual(guess_format_from_url(os.path.join(CFG_WEBDIR, 'img', 'test.foo')), '.foo')
def test_guess_format_from_url_local_known_ext(self):
"""bibdocfile - guess_format_from_url(), local URL, known extension"""
self.assertEqual(guess_format_from_url(os.path.join(CFG_WEBDIR, 'img', 'test.gif')), '.gif')
def test_guess_format_from_url_remote_no_ext(self):
"""bibdocfile - guess_format_from_url(), remote URL, no extension"""
self.assertEqual(guess_format_from_url(CFG_SITE_URL + '/img/test'), '.bin')
def test_guess_format_from_url_remote_no_ext_with_magic(self):
"""bibdocfile - guess_format_from_url(), remote URL, no extension, magic"""
if CFG_HAS_MAGIC:
self.assertEqual(guess_format_from_url(CFG_SITE_URL + '/img/testgif'), '.gif')
else:
self.failUnless(guess_format_from_url(CFG_SITE_URL + '/img/testgif') in ('.bin', '.gif'))
def test_guess_format_from_url_remote_unknown_ext(self):
"""bibdocfile - guess_format_from_url(), remote URL, unknown extension, magic"""
if CFG_HAS_MAGIC:
self.assertEqual(guess_format_from_url(CFG_SITE_URL + '/img/test.foo'), '.gif')
else:
self.failUnless(guess_format_from_url(CFG_SITE_URL + '/img/test.foo') in ('.bin', '.gif'))
def test_guess_format_from_url_remote_known_ext(self):
"""bibdocfile - guess_format_from_url(), remote URL, known extension"""
self.assertEqual(guess_format_from_url(CFG_SITE_URL + '/img/test.gif'), '.gif')
def test_guess_format_from_url_local_gpl_license(self):
local_path = os.path.join(CFG_TMPDIR, 'LICENSE')
print >> open(local_path, 'w'), """
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
[...]
"""
try:
if CFG_HAS_MAGIC:
self.assertEqual(guess_format_from_url(local_path), '.txt')
else:
self.assertEqual(guess_format_from_url(local_path), '.bin')
finally:
os.remove(local_path)
class BibRecDocsTest(InvenioTestCase):
"""regression tests about BibRecDocs"""
def test_BibRecDocs(self):
"""bibdocfile - BibRecDocs functions"""
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
my_bibrecdoc = BibRecDocs(2)
#add bibdoc
my_bibrecdoc.add_new_file(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', 'Main', 'img_test', False, 'test add new file', 'test', '.jpg')
my_bibrecdoc.add_bibdoc(doctype='Main', docname='file', never_fail=False)
self.assertEqual(len(my_bibrecdoc.list_bibdocs()), 3)
my_added_bibdoc = my_bibrecdoc.get_bibdoc('file')
#add bibdocfile in empty bibdoc
my_added_bibdoc.add_file_new_version(CFG_PREFIX + '/lib/webtest/invenio/test.gif', \
description= 'added in empty bibdoc', comment=None, docformat=None, flags=['PERFORM_HIDE_PREVIOUS'])
#propose unique docname
self.assertEqual(my_bibrecdoc.propose_unique_docname('file'), 'file_2')
#has docname
self.assertEqual(my_bibrecdoc.has_docname_p('file'), True)
#merge 2 bibdocs
my_bibrecdoc.merge_bibdocs('img_test', 'file')
self.assertEqual(len(my_bibrecdoc.get_bibdoc("img_test").list_all_files()), 2)
#check file exists
self.assertEqual(my_bibrecdoc.check_file_exists(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', '.jpg'), True)
#get bibdoc names
# we can not rely on the order !
names = set([my_bibrecdoc.get_bibdoc_names('Main')[0], my_bibrecdoc.get_bibdoc_names('Main')[1]])
self.assertTrue('0104007_02' in names)
self.assertTrue('img_test' in names)
#get total size
self.assertEqual(my_bibrecdoc.get_total_size(), 1647591)
#get total size latest version
self.assertEqual(my_bibrecdoc.get_total_size_latest_version(), 1647591)
#display
#value = my_bibrecdoc.display(docname='img_test', version='', doctype='', ln='en', verbose=0, display_hidden=True)
#self.assert_("<small><b>Main</b>" in value)
#get xml 8564
value = my_bibrecdoc.get_xml_8564()
self.assert_('/'+ CFG_SITE_RECORD +'/2/files/img_test.jpg</subfield>' in value)
#check duplicate docnames
self.assertEqual(my_bibrecdoc.check_duplicate_docnames(), True)
def tearDown(self):
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
my_bibrecdoc = BibRecDocs(2)
#delete
my_bibrecdoc.delete_bibdoc('img_test')
my_bibrecdoc.delete_bibdoc('file')
my_bibrecdoc.delete_bibdoc('test')
class BibDocsTest(InvenioTestCase):
"""regression tests about BibDocs"""
def test_BibDocs(self):
"""bibdocfile - BibDocs functions"""
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
#add file
my_bibrecdoc = BibRecDocs(2)
timestamp1 = datetime(*(time.strptime("2011-10-09 08:07:06", "%Y-%m-%d %H:%M:%S")[:6]))
my_bibrecdoc.add_new_file(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', 'Main', 'img_test', False, 'test add new file', 'test', '.jpg', modification_date=timestamp1)
my_new_bibdoc = my_bibrecdoc.get_bibdoc("img_test")
value = my_bibrecdoc.list_bibdocs()
self.assertEqual(len(value), 2)
#get total file (bibdoc)
self.assertEqual(my_new_bibdoc.get_total_size(), 91750)
#get recid
self.assertEqual(my_new_bibdoc.bibrec_links[0]["recid"], 2)
#change name
my_new_bibdoc.change_name(2, 'new_name')
#get docname
my_bibrecdoc = BibRecDocs(2)
self.assertEqual(my_bibrecdoc.get_docname(my_new_bibdoc.id), 'new_name')
#get type
self.assertEqual(my_new_bibdoc.get_type(), 'Main')
#get id
self.assert_(my_new_bibdoc.get_id() > 80)
#set status
my_new_bibdoc.set_status('new status')
#get status
self.assertEqual(my_new_bibdoc.get_status(), 'new status')
#get base directory
self.assert_(my_new_bibdoc.get_base_dir().startswith(CFG_BIBDOCFILE_FILEDIR))
#get file number
self.assertEqual(my_new_bibdoc.get_file_number(), 1)
#add file new version
timestamp2 = datetime(*(time.strptime("2010-09-08 07:06:05", "%Y-%m-%d %H:%M:%S")[:6]))
my_new_bibdoc.add_file_new_version(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', description= 'the new version', comment=None, docformat=None, flags=["PERFORM_HIDE_PREVIOUS"], modification_date=timestamp2)
self.assertEqual(my_new_bibdoc.list_versions(), [1, 2])
#revert
timestamp3 = datetime.now()
time.sleep(2) # so we can see a difference between now() and the time of the revert
my_new_bibdoc.revert(1)
self.assertEqual(my_new_bibdoc.list_versions(), [1, 2, 3])
self.assertEqual(my_new_bibdoc.get_description('.jpg', version=3), 'test add new file')
#get total size latest version
self.assertEqual(my_new_bibdoc.get_total_size_latest_version(), 91750)
#get latest version
self.assertEqual(my_new_bibdoc.get_latest_version(), 3)
#list latest files
self.assertEqual(len(my_new_bibdoc.list_latest_files()), 1)
self.assertEqual(my_new_bibdoc.list_latest_files()[0].get_version(), 3)
#list version files
self.assertEqual(len(my_new_bibdoc.list_version_files(1, list_hidden=True)), 1)
#display # no display facility inside an object!
# value = my_new_bibdoc.display(version='', ln='en', display_hidden=True)
# self.assert_('>test add new file<' in value)
#format already exist
self.assertEqual(my_new_bibdoc.format_already_exists_p('.jpg'), True)
#get file
self.assertEqual(my_new_bibdoc.get_file('.jpg', version='1').get_version(), 1)
#set description
my_new_bibdoc.set_description('new description', '.jpg', version=1)
#get description
self.assertEqual(my_new_bibdoc.get_description('.jpg', version=1), 'new description')
#set comment
my_new_bibdoc.set_comment('new comment', '.jpg', version=1)
#get comment
self.assertEqual(my_new_bibdoc.get_comment('.jpg', version=1), 'new comment')
#get history
assert len(my_new_bibdoc.get_history()) > 0
#check modification date
self.assertEqual(my_new_bibdoc.get_file('.jpg', version=1).md, timestamp1)
self.assertEqual(my_new_bibdoc.get_file('.jpg', version=2).md, timestamp2)
assert my_new_bibdoc.get_file('.jpg', version=3).md > timestamp3
#delete file
my_new_bibdoc.delete_file('.jpg', 2)
#list all files
self.assertEqual(len(my_new_bibdoc.list_all_files()), 2)
#delete file
my_new_bibdoc.delete_file('.jpg', 3)
#add new format
timestamp4 = datetime(*(time.strptime("2012-11-10 09:08:07", "%Y-%m-%d %H:%M:%S")[:6]))
my_new_bibdoc.add_file_new_format(CFG_PREFIX + '/lib/webtest/invenio/test.gif', version=None, description=None, comment=None, docformat=None, modification_date=timestamp4)
self.assertEqual(len(my_new_bibdoc.list_all_files()), 2)
#check modification time
self.assertEqual(my_new_bibdoc.get_file('.jpg', version=1).md, timestamp1)
self.assertEqual(my_new_bibdoc.get_file('.gif', version=1).md, timestamp4)
#change the format name
my_new_bibdoc.change_docformat('.gif', '.gif;icon-640')
self.assertEqual(my_new_bibdoc.format_already_exists_p('.gif'), False)
self.assertEqual(my_new_bibdoc.format_already_exists_p('.gif;icon-640'), True)
#delete file
my_new_bibdoc.delete_file('.jpg', 1)
#delete file
my_new_bibdoc.delete_file('.gif;icon-640', 1)
#empty bibdoc
self.assertEqual(my_new_bibdoc.empty_p(), True)
#hidden?
self.assertEqual(my_new_bibdoc.hidden_p('.jpg', version=1), False)
#hide
my_new_bibdoc.set_flag('HIDDEN', '.jpg', version=1)
#hidden?
self.assertEqual(my_new_bibdoc.hidden_p('.jpg', version=1), True)
#add and get icon
my_new_bibdoc.add_icon( CFG_PREFIX + '/lib/webtest/invenio/icon-test.gif', modification_date=timestamp4)
my_bibrecdoc = BibRecDocs(2)
value = my_bibrecdoc.get_bibdoc("new_name")
self.assertEqual(value.get_icon().docid, my_new_bibdoc.get_icon().docid)
self.assertEqual(value.get_icon().version, my_new_bibdoc.get_icon().version)
self.assertEqual(value.get_icon().format, my_new_bibdoc.get_icon().format)
#check modification time
self.assertEqual(my_new_bibdoc.get_icon().md, timestamp4)
#delete icon
my_new_bibdoc.delete_icon()
#get icon
self.assertEqual(my_new_bibdoc.get_icon(), None)
#delete
my_new_bibdoc.delete()
self.assertEqual(my_new_bibdoc.deleted_p(), True)
#undelete
my_new_bibdoc.undelete(previous_status='', recid=2)
#expunging
my_new_bibdoc.expunge()
my_bibrecdoc.build_bibdoc_list()
self.failIf('new_name' in my_bibrecdoc.get_bibdoc_names())
self.failUnless(my_bibrecdoc.get_bibdoc_names())
def tearDown(self):
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
my_bibrecdoc = BibRecDocs(2)
#delete
my_bibrecdoc.delete_bibdoc('img_test')
my_bibrecdoc.delete_bibdoc('new_name')
class BibRelationTest(InvenioTestCase):
""" regression tests for BibRelation"""
def test_RelationCreation_Version(self):
"""
Testing relations between particular versions of a document
We create two relations differing only on the BibDoc version
number and verify that they are indeed different (store different data)
"""
- from invenio.bibdocfile import BibRelation
+ from invenio.legacy.bibdocfile.api import BibRelation
rel1 = BibRelation.create(bibdoc1_id = 10, bibdoc2_id=12,
bibdoc1_ver = 1, bibdoc2_ver = 1,
rel_type = "some_rel")
rel2 = BibRelation.create(bibdoc1_id = 10, bibdoc2_id=12,
bibdoc1_ver = 1, bibdoc2_ver = 2,
rel_type = "some_rel")
rel1["key1"] = "value1"
rel1["key2"] = "value2"
rel2["key1"] = "value3"
# now testing the retrieval of data
new_rel1 = BibRelation(bibdoc1_id = 10, bibdoc2_id = 12,
rel_type = "some_rel", bibdoc1_ver = 1,
bibdoc2_ver = 1)
new_rel2 = BibRelation(bibdoc1_id = 10, bibdoc2_id = 12,
rel_type = "some_rel", bibdoc1_ver = 1,
bibdoc2_ver = 2)
self.assertEqual(new_rel1["key1"], "value1")
self.assertEqual(new_rel1["key2"], "value2")
self.assertEqual(new_rel2["key1"], "value3")
# now testing the deletion of relations
new_rel1.delete()
new_rel2.delete()
newer_rel1 = BibRelation.create(bibdoc1_id = 10, bibdoc2_id=12,
bibdoc1_ver = 1, bibdoc2_ver = 1,
rel_type = "some_rel")
newer_rel2 = BibRelation.create(bibdoc1_id = 10, bibdoc2_id=12,
bibdoc1_ver = 1, bibdoc2_ver = 2,
rel_type = "some_rel")
self.assertEqual("key1" in newer_rel1, False)
self.assertEqual("key1" in newer_rel2, False)
newer_rel1.delete()
newer_rel2.delete()
class BibDocFilesTest(InvenioTestCase):
"""regression tests about BibDocFiles"""
def test_BibDocFiles(self):
"""bibdocfile - BibDocFile functions """
#add bibdoc
- from invenio.bibdocfile import BibRecDocs
+ from invenio.legacy.bibdocfile.api import BibRecDocs
my_bibrecdoc = BibRecDocs(2)
timestamp = datetime(*(time.strptime("2010-09-08 07:06:05", "%Y-%m-%d %H:%M:%S")[:6]))
my_bibrecdoc.add_new_file(CFG_PREFIX + '/lib/webtest/invenio/test.jpg', 'Main', 'img_test', False, 'test add new file', 'test', '.jpg', modification_date=timestamp)
my_new_bibdoc = my_bibrecdoc.get_bibdoc("img_test")
my_new_bibdocfile = my_new_bibdoc.list_all_files()[0]
#get url
self.assertEqual(my_new_bibdocfile.get_url(), CFG_SITE_URL + '/%s/2/files/img_test.jpg' % CFG_SITE_RECORD)
#get type
self.assertEqual(my_new_bibdocfile.get_type(), 'Main')
#get path
# we should not test for a particular path! that is the concern of the underlying
# implementation, not of the interface, which is what should be tested
# self.assert_(my_new_bibdocfile.get_path().startswith(CFG_BIBDOCFILE_FILEDIR))
# self.assert_(my_new_bibdocfile.get_path().endswith('/img_test.jpg;1'))
#get bibdocid
self.assertEqual(my_new_bibdocfile.get_bibdocid(), my_new_bibdoc.get_id())
#get name
self.assertEqual(my_new_bibdocfile.get_name() , 'img_test')
#get full name
self.assertEqual(my_new_bibdocfile.get_full_name() , 'img_test.jpg')
#get full path
#self.assert_(my_new_bibdocfile.get_full_path().startswith(CFG_BIBDOCFILE_FILEDIR))
#self.assert_(my_new_bibdocfile.get_full_path().endswith('/img_test.jpg;1'))
#get format
self.assertEqual(my_new_bibdocfile.get_format(), '.jpg')
#get version
self.assertEqual(my_new_bibdocfile.get_version(), 1)
#get description
self.assertEqual(my_new_bibdocfile.get_description(), my_new_bibdoc.get_description('.jpg', version=1))
#get comment
self.assertEqual(my_new_bibdocfile.get_comment(), my_new_bibdoc.get_comment('.jpg', version=1))
#get recid
self.assertEqual(my_new_bibdocfile.get_recid(), 2)
#get status
self.assertEqual(my_new_bibdocfile.get_status(), '')
#get size
self.assertEqual(my_new_bibdocfile.get_size(), 91750)
#get checksum
self.assertEqual(my_new_bibdocfile.get_checksum(), '28ec893f9da735ad65de544f71d4ad76')
#check
self.assertEqual(my_new_bibdocfile.check(), True)
#display
import invenio.legacy.template
tmpl = invenio.legacy.template.load("bibdocfile")
value = tmpl.tmpl_display_bibdocfile(my_new_bibdocfile, ln='en')
assert 'files/img_test.jpg?version=1">' in value
#hidden?
self.assertEqual(my_new_bibdocfile.hidden_p(), False)
#check modification date
self.assertEqual(my_new_bibdocfile.md, timestamp)
#delete
my_new_bibdoc.delete()
self.assertEqual(my_new_bibdoc.deleted_p(), True)
class CheckBibDocAuthorizationTest(InvenioTestCase):
"""Regression tests for check_bibdoc_authorization function."""
def test_check_bibdoc_authorization(self):
"""bibdocfile - check_bibdoc_authorization function"""
- from invenio.bibdocfile import check_bibdoc_authorization
+ from invenio.legacy.bibdocfile.api import check_bibdoc_authorization
from invenio.legacy.webuser import collect_user_info, get_uid_from_email
jekyll = collect_user_info(get_uid_from_email('jekyll@cds.cern.ch'))
self.assertEqual(check_bibdoc_authorization(jekyll, 'role:thesesviewer'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(jekyll, 'role: thesesviewer'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(jekyll, 'role: thesesviewer'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(jekyll, 'Role: thesesviewer'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(jekyll, 'email: jekyll@cds.cern.ch'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(jekyll, 'email: jekyll@cds.cern.ch'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
juliet = collect_user_info(get_uid_from_email('juliet.capulet@cds.cern.ch'))
self.assertEqual(check_bibdoc_authorization(juliet, 'restricted_picture'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertEqual(check_bibdoc_authorization(juliet, 'status: restricted_picture'), (0, CFG_WEBACCESS_WARNING_MSGS[0]))
self.assertNotEqual(check_bibdoc_authorization(juliet, 'restricted_video')[0], 0)
self.assertNotEqual(check_bibdoc_authorization(juliet, 'status: restricted_video')[0], 0)
class BibDocFileURLTest(InvenioTestCase):
"""Regression tests for bibdocfile_url_p function."""
def test_bibdocfile_url_p(self):
"""bibdocfile - check bibdocfile_url_p() functionality"""
- from invenio.bibdocfile import bibdocfile_url_p
+ from invenio.legacy.bibdocfile.api import bibdocfile_url_p
self.failUnless(bibdocfile_url_p(CFG_SITE_URL + '/%s/98/files/9709037.pdf' % CFG_SITE_RECORD))
self.failUnless(bibdocfile_url_p(CFG_SITE_URL + '/%s/098/files/9709037.pdf' % CFG_SITE_RECORD))
class MoreInfoTest(InvenioTestCase):
"""regression tests about BibDocFiles"""
def test_initialData(self):
"""Testing if passing the initial data really enriches the existing structure"""
more_info = MoreInfo(docid = 134)
more_info.set_data("ns1", "k1", "vsrjklfh23478956@#%@#@#%")
more_info2 = MoreInfo(docid = 134, initial_data = {"ns1" : { "k2" : "weucb2324@#%@#$%@"}})
self.assertEqual(more_info.get_data("ns1", "k2"), "weucb2324@#%@#$%@")
self.assertEqual(more_info.get_data("ns1", "k1"), "vsrjklfh23478956@#%@#@#%")
self.assertEqual(more_info2.get_data("ns1", "k2"), "weucb2324@#%@#$%@")
self.assertEqual(more_info2.get_data("ns1", "k1"), "vsrjklfh23478956@#%@#@#%")
more_info3 = MoreInfo(docid = 134)
self.assertEqual(more_info3.get_data("ns1", "k2"), "weucb2324@#%@#$%@")
self.assertEqual(more_info3.get_data("ns1", "k1"), "vsrjklfh23478956@#%@#@#%")
more_info.del_key("ns1", "k1")
more_info.del_key("ns1", "k2")
def test_createSeparateRead(self):
"""MoreInfo - testing if information saved using one instance is accessible via
a new one"""
more_info = MoreInfo(docid = 13)
more_info.set_data("some_namespace", "some_key", "vsrjklfh23478956@#%@#@#%")
more_info2 = MoreInfo(docid = 13)
self.assertEqual(more_info.get_data("some_namespace", "some_key"), "vsrjklfh23478956@#%@#@#%")
self.assertEqual(more_info2.get_data("some_namespace", "some_key"), "vsrjklfh23478956@#%@#@#%")
more_info2.del_key("some_namespace", "some_key")
def test_DictionaryBehaviour(self):
"""moreinfo - tests assignments of data, both using the general interface and using
namespaces"""
more_info = MoreInfo()
more_info.set_data("namespace1", "key1", "val1")
more_info.set_data("namespace1", "key2", "val2")
more_info.set_data("namespace2", "key1", "val3")
self.assertEqual(more_info.get_data("namespace1", "key1"), "val1")
self.assertEqual(more_info.get_data("namespace1", "key2"), "val2")
self.assertEqual(more_info.get_data("namespace2", "key1"), "val3")
def test_inMemoryMoreInfo(self):
"""test that MoreInfo is really stored only in memory (no database accesses)"""
m1 = MoreInfo(docid = 101, version = 12, cache_only = True)
m2 = MoreInfo(docid = 101, version = 12, cache_reads = False) # The most direct DB access
m1.set_data("n1", "k1", "v1")
self.assertEqual(m2.get_data("n1","k1"), None)
self.assertEqual(m1.get_data("n1","k1"), "v1")
def test_readCacheMoreInfo(self):
"""we verify that if value is not present in the cache, read will happen from the database"""
m1 = MoreInfo(docid = 102, version = 12)
m2 = MoreInfo(docid = 102, version = 12) # The most direct DB access
self.assertEqual(m2.get_data("n11","k11"), None)
self.assertEqual(m1.get_data("n11","k11"), None)
m1.set_data("n11", "k11", "some value")
self.assertEqual(m1.get_data("n11","k11"), "some value")
self.assertEqual(m2.get_data("n11","k11"), "some value") # read from a different instance
m1.delete()
m2.delete()
class BibDocFileMd5FolderTests(InvenioTestCase):
"""Regression test class for the Md5Folder class"""
def setUp(self):
self.path = os.path.join(CFG_TMPDIR, 'md5_tests')
if not os.path.exists(self.path):
os.makedirs(self.path)
def tearDown(self):
shutil.rmtree(self.path)
def test_empty_md5folder(self):
"""bibdocfile - empty Md5Folder"""
self.assertEqual(Md5Folder(self.path).md5s, {})
def test_one_file_md5folder(self):
"""bibdocfile - one file in Md5Folder"""
open(os.path.join(self.path, 'test.txt'), "w").write("test")
md5s = Md5Folder(self.path)
self.assertEqual(md5s.md5s, {'test.txt': '098f6bcd4621d373cade4e832627b4f6'})
def test_adding_one_more_file_md5folder(self):
"""bibdocfile - one more file in Md5Folder"""
open(os.path.join(self.path, 'test.txt'), "w").write("test")
md5s = Md5Folder(self.path)
self.assertEqual(md5s.md5s, {'test.txt': '098f6bcd4621d373cade4e832627b4f6'})
open(os.path.join(self.path, 'test2.txt'), "w").write("second test")
md5s.update()
self.assertEqual(md5s.md5s, {'test.txt': '098f6bcd4621d373cade4e832627b4f6', 'test2.txt': 'f5a6496b3ed4f2d6e5d602c7be8e6b42'})
def test_detect_corruption(self):
"""bibdocfile - detect corruption in Md5Folder"""
open(os.path.join(self.path, 'test.txt'), "w").write("test")
md5s = Md5Folder(self.path)
open(os.path.join(self.path, 'test.txt'), "w").write("second test")
self.failIf(md5s.check('test.txt'))
md5s.update(only_new=False)
self.failUnless(md5s.check('test.txt'))
self.assertEqual(md5s.get_checksum('test.txt'), 'f5a6496b3ed4f2d6e5d602c7be8e6b42')
if CFG_PATH_MD5SUM:
def test_md5_algorithms(self):
"""bibdocfile - compare md5 algorithms"""
- from invenio.bibdocfile import calculate_md5, \
+ from invenio.legacy.bibdocfile.api import calculate_md5, \
calculate_md5_external
filepath = os.path.join(self.path, 'test.txt')
open(filepath, "w").write("test")
self.assertEqual(calculate_md5(filepath, force_internal=True),
calculate_md5_external(filepath))
TEST_SUITE = make_test_suite(BibDocFileMd5FolderTests,
BibRecDocsTest,
BibDocsTest,
BibDocFilesTest,
MoreInfoTest,
BibRelationTest,
BibDocFileURLTest,
CheckBibDocAuthorizationTest,
BibDocFsInfoTest,
BibDocFileGuessFormat)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_bibupload.py b/invenio_demosite/testsuite/regression/test_bibupload.py
index ad4e880e0..9d5761282 100644
--- a/invenio_demosite/testsuite/regression/test_bibupload.py
+++ b/invenio_demosite/testsuite/regression/test_bibupload.py
@@ -1,5991 +1,5991 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# pylint: disable=C0301
"""Regression tests for the BibUpload."""
__revision__ = "$Id$"
import base64
import cPickle
import re
import os
import pprint
import sys
import time
from marshal import loads
from zlib import decompress
from urllib import urlencode
from urllib2 import urlopen
from invenio.config import CFG_OAI_ID_FIELD, CFG_PREFIX, CFG_SITE_URL, CFG_TMPDIR, \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG, \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, \
CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG, \
CFG_BINDIR, \
CFG_SITE_RECORD, \
CFG_DEVEL_SITE, \
CFG_BIBUPLOAD_REFERENCE_TAG, \
CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE
from invenio.utils.json import json
from invenio.legacy.dbquery import run_sql, get_table_status_info
from invenio.testsuite import InvenioTestCase, make_test_suite, run_test_suite, test_web_page_content
from invenio.base.wrappers import lazy_import
from invenio.utils.hash import md5
from invenio.utils.shell import run_shell_command
BibRecDocs = lazy_import('invenio.bibdocfile:BibRecDocs')
BibRelation = lazy_import('invenio.bibdocfile:BibRelation')
MoreInfo = lazy_import('invenio.bibdocfile:MoreInfo')
bibupload = lazy_import('invenio.bibupload')
print_record = lazy_import('invenio.legacy.search_engine:print_record')
get_record = lazy_import('invenio.legacy.search_engine:get_record')
create_record = lazy_import('invenio.legacy.bibrecord:create_record')
records_identical = lazy_import('invenio.legacy.bibrecord:records_identical')
encode_for_xml = lazy_import('invenio.utils.text:encode_for_xml')
# helper functions:
RE_005 = re.compile(re.escape('tag="005"'))
def get_record_from_bibxxx(recid):
"""Return a recstruct built from bibxxx tables"""
record = "<record>"
record += """ <controlfield tag="001">%s</controlfield>\n""" % recid
# controlfields
query = "SELECT b.tag,b.value,bb.field_number FROM bib00x AS b, bibrec_bib00x AS bb "\
"WHERE bb.id_bibrec=%s AND b.id=bb.id_bibxxx AND b.tag LIKE '00%%' "\
"ORDER BY bb.field_number, b.tag ASC"
res = run_sql(query, (recid, ))
for row in res:
field, value = row[0], row[1]
value = encode_for_xml(value)
record += """ <controlfield tag="%s">%s</controlfield>\n""" % \
(encode_for_xml(field[0:3]), value)
# datafields
i = 1 # Do not process bib00x and bibrec_bib00x, as
# they are controlfields. So start at bib01x and
# bibrec_bib01x (and set i = 0 at the end of
# first loop)
for digit1 in range(0, 10):
for digit2 in range(i, 10):
bx = "bib%d%dx" % (digit1, digit2)
bibx = "bibrec_bib%d%dx" % (digit1, digit2)
query = "SELECT b.tag,b.value,bb.field_number FROM %s AS b, %s AS bb "\
"WHERE bb.id_bibrec=%%s AND b.id=bb.id_bibxxx AND b.tag LIKE %%s"\
"ORDER BY bb.field_number, b.tag ASC" % (bx, bibx)
res = run_sql(query, (recid, str(digit1)+str(digit2)+'%'))
field_number_old = -999
field_old = ""
for row in res:
field, value, field_number = row[0], row[1], row[2]
ind1, ind2 = field[3], field[4]
if ind1 == "_" or ind1 == "":
ind1 = " "
if ind2 == "_" or ind2 == "":
ind2 = " "
if field_number != field_number_old or field[:-1] != field_old[:-1]:
if field_number_old != -999:
record += """ </datafield>\n"""
record += """ <datafield tag="%s" ind1="%s" ind2="%s">\n""" % \
(encode_for_xml(field[0:3]), encode_for_xml(ind1), encode_for_xml(ind2))
field_number_old = field_number
field_old = field
# print subfield value
value = encode_for_xml(value)
record += """ <subfield code="%s">%s</subfield>\n""" % \
(encode_for_xml(field[-1:]), value)
# all fields/subfields printed in this run, so close the tag:
if field_number_old != -999:
record += """ </datafield>\n"""
i = 0 # Next loop should start looking at bib00x and bibrec_bib00x
# we are at the end of printing the record:
record += " </record>\n"
return record
def remove_tag_001_from_xmbuffer(xmbuffer):
"""Remove tag 001 from MARCXML buffer. Useful for testing two
MARCXML buffers without paying attention to recIDs attributed
during the bibupload.
"""
return re.sub(r'<controlfield tag="001">.*</controlfield>', '', xmbuffer)
def compare_xmbuffers(xmbuffer1, xmbuffer2):
"""Compare two XM (XML MARC) buffers by removing whitespaces and version
numbers in tags 005 before testing.
"""
def remove_blanks_from_xmbuffer(xmbuffer):
"""Remove \n and blanks from XMBUFFER."""
out = xmbuffer.replace("\n", "")
out = out.replace(" ", "")
return out
# remove 005 revision numbers:
xmbuffer1 = re.sub(r'<controlfield tag="005">.*?</controlfield>', '', xmbuffer1)
xmbuffer2 = re.sub(r'<controlfield tag="005">.*?</controlfield>', '', xmbuffer2)
# remove whitespace:
xmbuffer1 = remove_blanks_from_xmbuffer(xmbuffer1)
xmbuffer2 = remove_blanks_from_xmbuffer(xmbuffer2)
if len(RE_005.findall(xmbuffer1)) > 1:
return "More than 1 005 tag has been found in the first XM: %s" % xmbuffer1
if len(RE_005.findall(xmbuffer2)) > 1:
return "More than 1 005 tag has been found in the second XM: %s" % xmbuffer2
if xmbuffer1 != xmbuffer2:
return "\n=" + xmbuffer1 + "=\n" + '!=' + "\n=" + xmbuffer2 + "=\n"
return ''
def remove_tag_001_from_hmbuffer(hmbuffer):
"""Remove tag 001 from HTML MARC buffer. Useful for testing two
HTML MARC buffers without paying attention to recIDs attributed
during the bibupload.
"""
return re.sub(r'(^|\n)(<pre>)?[0-9]{9}\s001__\s\d+($|\n)', '', hmbuffer)
def compare_hmbuffers(hmbuffer1, hmbuffer2):
"""Compare two HM (HTML MARC) buffers by removing whitespaces
before testing.
"""
hmbuffer1 = hmbuffer1.strip()
hmbuffer2 = hmbuffer2.strip()
# remove any <pre>...</pre> formatting:
hmbuffer1 = re.sub(r'^<pre>', '', hmbuffer1)
hmbuffer2 = re.sub(r'^<pre>', '', hmbuffer2)
hmbuffer1 = re.sub(r'</pre>$', '', hmbuffer1)
hmbuffer2 = re.sub(r'</pre>$', '', hmbuffer2)
# remove 005 revision numbers:
hmbuffer1 = re.sub(r'(^|\n)[0-9]{9}\s005.*($|\n)', '\n', hmbuffer1)
hmbuffer2 = re.sub(r'(^|\n)[0-9]{9}\s005.*($|\n)', '\n', hmbuffer2)
hmbuffer1 = hmbuffer1.strip()
hmbuffer2 = hmbuffer2.strip()
# remove leading recid, leaving only field values:
hmbuffer1 = re.sub(r'(^|\n)[0-9]{9}\s', '', hmbuffer1)
hmbuffer2 = re.sub(r'(^|\n)[0-9]{9}\s', '', hmbuffer2)
# remove leading whitespace:
hmbuffer1 = re.sub(r'(^|\n)\s+', '', hmbuffer1)
hmbuffer2 = re.sub(r'(^|\n)\s+', '', hmbuffer2)
compared_hmbuffers = hmbuffer1 == hmbuffer2
if not compared_hmbuffers:
return "\n=" + hmbuffer1 + "=\n" + '!=' + "\n=" + hmbuffer2 + "=\n"
return ''
def wipe_out_record_from_all_tables(recid):
"""
Wipe out completely the record and all its traces of RECID from
the database (bibrec, bibrec_bibxxx, bibxxx, bibfmt). Useful for
the time being for test cases.
"""
# delete all the linked bibdocs
try:
for bibdoc in BibRecDocs(recid).list_bibdocs():
bibdoc.expunge()
# delete from bibrec:
run_sql("DELETE FROM bibrec WHERE id=%s", (recid,))
# delete from bibrec_bibxxx:
for i in range(0, 10):
for j in range(0, 10):
run_sql("DELETE FROM %(bibrec_bibxxx)s WHERE id_bibrec=%%s" % # kwalitee: disable=sql
{'bibrec_bibxxx': "bibrec_bib%i%ix" % (i, j)},
(recid,))
# delete all unused bibxxx values:
for i in range(0, 10):
for j in range(0, 10):
run_sql("DELETE %(bibxxx)s FROM %(bibxxx)s " \
" LEFT JOIN %(bibrec_bibxxx)s " \
" ON %(bibxxx)s.id=%(bibrec_bibxxx)s.id_bibxxx " \
" WHERE %(bibrec_bibxxx)s.id_bibrec IS NULL" % \
{'bibxxx': "bib%i%ix" % (i, j),
'bibrec_bibxxx': "bibrec_bib%i%ix" % (i, j)})
# delete from bibfmt:
run_sql("DELETE FROM bibfmt WHERE id_bibrec=%s", (recid,))
# delete from bibrec_bibdoc:
run_sql("DELETE FROM bibrec_bibdoc WHERE id_bibrec=%s", (recid,))
# delete from holdingpen
run_sql("DELETE FROM bibHOLDINGPEN WHERE id_bibrec=%s", (recid,))
# delete from hstRECORD
run_sql("DELETE FROM hstRECORD WHERE id_bibrec=%s", (recid,))
except Exception, err:
print >> sys.stderr, "Exception captured while wiping records: %s" % err
def try_url_download(url):
"""Try to download a given URL"""
try:
open_url = urlopen(url)
open_url.read()
except Exception, e:
raise StandardError("Downloading %s is impossible because of %s"
% (url, str(e)))
return True
def force_webcoll(recid):
from invenio.bibindex_engine_config import CFG_BIBINDEX_INDEX_TABLE_TYPE
from invenio import bibindex_engine
reload(bibindex_engine)
from invenio import websearch_webcoll
reload(websearch_webcoll)
index_id, index_name, index_tags = bibindex_engine.get_word_tables(["collection"])[0]
bibindex_engine.WordTable(index_name, index_id, index_tags, "idxWORD%02dF", wordtable_type=CFG_BIBINDEX_INDEX_TABLE_TYPE["Words"], tag_to_tokenizer_map={'8564_u': "BibIndexFulltextTokenizer"}).add_recIDs([[recid, recid]], 1)
#sleep 1s to make sure all tables are ready
time.sleep(1)
c = websearch_webcoll.Collection()
c.calculate_reclist()
c.update_reclist()
class GenericBibUploadTest(InvenioTestCase):
"""Generic BibUpload testing class with predefined
setUp and tearDown methods.
"""
def setUp(self):
- from invenio.bibtask import task_set_task_param, setup_loggers
+ from invenio.legacy.bibsched.bibtask import task_set_task_param, setup_loggers
self.verbose = 0
setup_loggers()
task_set_task_param('verbose', self.verbose)
self.last_recid = run_sql("SELECT MAX(id) FROM bibrec")[0][0]
def tearDown(self):
for recid in run_sql("SELECT id FROM bibrec WHERE id>%s", (self.last_recid,)):
wipe_out_record_from_all_tables(recid[0])
def check_record_consistency(self, recid):
rec_in_history = create_record(decompress(run_sql("SELECT marcxml FROM hstRECORD WHERE id_bibrec=%s ORDER BY job_date DESC LIMIT 1", (recid, ))[0][0]))[0]
rec_in_xm = create_record(decompress(run_sql("SELECT value FROM bibfmt WHERE id_bibrec=%s AND format='xm'", (recid, ))[0][0]))[0]
rec_in_bibxxx = create_record(get_record_from_bibxxx(recid))[0]
self.failUnless(records_identical(rec_in_xm, rec_in_history, skip_005=False), "\n%s\n!=\n%s\n" % (rec_in_xm, rec_in_history))
self.failUnless(records_identical(rec_in_xm, rec_in_bibxxx, skip_005=False, ignore_duplicate_subfields=True, ignore_duplicate_controlfields=True), "\n%s\n!=\n%s\n" % (rec_in_xm, rec_in_bibxxx))
if CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE:
rec_in_recstruct = loads(decompress(run_sql("SELECT value FROM bibfmt WHERE id_bibrec=%s AND format='recstruct'", (recid, ))[0][0]))
self.failUnless(records_identical(rec_in_xm, rec_in_recstruct, skip_005=False, ignore_subfield_order=True), "\n%s\n!=\n%s\n" % (rec_in_xm, rec_in_recstruct))
class BibUploadRealCaseRemovalDOIViaBibEdit(GenericBibUploadTest):
def test_removal_of_doi_via_bibedit(self):
test = """<record>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">HEP</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Fiore, Gaetano</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On quantum mechanics with a magnetic field on R**n and on a torus T**n, and their relation</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="p">Int.J.Theor.Phys.</subfield>
<subfield code="v">52</subfield>
<subfield code="c">877-896</subfield>
<subfield code="y">2013</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="2">INSPIRE</subfield>
<subfield code="a">General Physics</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">Published</subfield>
</datafield>
<datafield tag="300" ind1=" " ind2=" ">
<subfield code="a">20</subfield>
</datafield>
<datafield tag="269" ind1=" " ind2=" ">
<subfield code="c">2013</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">author</subfield>
<subfield code="a">Bloch theory with magnetic field</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">author</subfield>
<subfield code="a">Fiber bundles</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">author</subfield>
<subfield code="a">Gauge symmetry</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">author</subfield>
<subfield code="a">Quantization on manifolds</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="9">Springer</subfield>
<subfield code="a">We show in elementary terms the equivalence in a general gauge of a U(1)-gauge theory of a scalar charged particle on a torus to the analogous theory on ℝ( )n( ) constrained by quasiperiodicity under translations in the lattice Λ. The latter theory provides a global description of the former: the quasiperiodic wavefunctions ψ defined on ℝ( )n( ) play the role of sections of the associated hermitean line bundle E on , since also E admits a global description as a quotient. The components of the covariant derivatives corresponding to a constant (necessarily integral) magnetic field B=dA generate a Lie algebra g ( )Q( ) and together with the periodic functions the algebra of observables . The non-abelian part of g ( )Q( ) is a Heisenberg Lie algebra with the electric charge operator Q as the central generator, the corresponding Lie group G ( )Q( ) acts on the Hilbert space as the translation group up to phase factors. Also the space of sections of E is mapped into itself by g∈G ( )Q( ). We identify the socalled magnetic translation group as a subgroup of the observables’ group Y ( )Q( ). We determine the unitary irreducible representations of corresponding to integer charges and for each of them an associated orthonormal basis explicitly in configuration space. We also clarify how in the n=2m case a holomorphic structure and Theta functions arise on the associated complex torus.</subfield>
</datafield>
<datafield tag="024" ind1="7" ind2=" ">
<subfield code="2">DOI</subfield>
<subfield code="a">10.1007/s10773-012-1396-z</subfield>
</datafield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="a">Fiore:2013nua</subfield>
<subfield code="9">INSPIRETeX</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">Published</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">Citeable</subfield>
</datafield>
</record>
"""
recs = create_record(test)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
new_rec = get_record(recid)
del new_rec['024'] ## let's delete DOI
_, recid2, _ = bibupload.bibupload(new_rec, opt_mode='replace')
self.assertEqual(recid, recid2)
self.check_record_consistency(recid2)
class BibUploadTypicalBibEditSessionTest(GenericBibUploadTest):
"""Testing a typical BibEdit session"""
def setUp(self):
GenericBibUploadTest.setUp(self)
self.test = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
recs = bibupload.xml_marc_to_records(self.test)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(self.recid)
# We retrieve the inserted xml
inserted_xm = print_record(self.recid, 'xm')
# Compare if the two MARCXML are the same
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(inserted_xm),
self.test), '')
self.history = run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, )) # kwalitee: disable=sql
self.timestamp = run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,))
self.tag005 = get_record(self.recid)['005'][0][3]
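The setUp above compares the freshly inserted MARCXML with the input only after removing the record-id field, because the inserted record gains a controlfield 001 the input XML did not have. A minimal sketch of that pattern, assuming nothing about the real `remove_tag_001_from_xmbuffer`/`compare_xmbuffers` helpers:

```python
import re

# Hypothetical helper (not the Invenio one) mirroring the comparison pattern:
# strip any <controlfield tag="001"> element before diffing the two buffers.
def strip_tag_001(marcxml):
    """Remove any record-id controlfield from a MARCXML string."""
    return re.sub(r'\s*<controlfield tag="001">[^<]*</controlfield>', '', marcxml)

original = '<record><datafield tag="100"><subfield code="a">X</subfield></datafield></record>'
uploaded = '<record><controlfield tag="001">17</controlfield><datafield tag="100"><subfield code="a">X</subfield></datafield></record>'
```

With the 001 field stripped, `uploaded` compares equal to `original`.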
def test_simple_replace(self):
"""BibUpload - test a simple replace as in BibEdit"""
marc_to_replace1 = """
<record>
<controlfield tag="001">%(recid)s</controlfield>
<controlfield tag="005">%(tag005)s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Foo</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">bla bla bla</subfield>
</datafield>
</record>
""" % {'recid': self.recid, 'tag005': self.tag005}
recs = bibupload.xml_marc_to_records(marc_to_replace1)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(self.recid)
## The change should have been applied!
self.failUnless(records_identical(recs[0], get_record(self.recid)), "\n%s\n!=\n%s\n" % (recs[0], get_record(self.recid)))
marc_to_replace2 = """
<record>
<controlfield tag="001">%(recid)s</controlfield>
<controlfield tag="005">%(tag005)s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Queen Elisabeth</subfield>
<subfield code="u">Great Britain</subfield>
</datafield>
</record>
""" % {'recid': self.recid, 'tag005': self.tag005}
expected_marc = """
<record>
<controlfield tag="001">%(recid)s</controlfield>
<controlfield tag="005">%(tag005)s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Foo</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">bla bla bla</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Queen Elisabeth</subfield>
<subfield code="u">Great Britain</subfield>
</datafield>
</record>
""" % {'recid': self.recid, 'tag005': self.tag005}
recs = bibupload.xml_marc_to_records(marc_to_replace2)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(self.recid)
## The change should have been merged with the previous without conflict
self.failUnless(records_identical(bibupload.xml_marc_to_records(expected_marc)[0], get_record(self.recid)))
def test_replace_with_conflict(self):
"""BibUpload - test a replace as in BibEdit that leads to conflicts"""
marc_to_replace1 = """
<record>
<controlfield tag="001">%(recid)s</controlfield>
<controlfield tag="005">%(tag005)s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Foo</subfield>
<subfield code="u">Test Institute2</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">bla bla bla</subfield>
</datafield>
</record>
""" % {'recid': self.recid, 'tag005': self.tag005}
recs = bibupload.xml_marc_to_records(marc_to_replace1)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(self.recid)
## The change should have been applied!
self.failUnless(records_identical(recs[0], get_record(self.recid)), "\n%s\n!=\n%s" % (recs[0], get_record(self.recid)))
marc_to_replace2 = """
<record>
<controlfield tag="001">%(recid)s</controlfield>
<controlfield tag="005">%(tag005)s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Queen Elisabeth</subfield>
<subfield code="u">Great Britain</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">No more Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">bla bla bla</subfield>
</datafield>
</record>
""" % {'recid': self.recid, 'tag005': self.tag005}
recs = bibupload.xml_marc_to_records(marc_to_replace2)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(self.recid)
## The conflicting change should not have been applied: the record keeps the first version and the new changeset goes to the holding pen
self.failUnless(records_identical(bibupload.xml_marc_to_records(marc_to_replace1)[0], get_record(self.recid)), "%s != %s" % (bibupload.xml_marc_to_records(marc_to_replace1)[0], get_record(self.recid)))
self.failUnless(records_identical(bibupload.xml_marc_to_records(marc_to_replace2)[0], bibupload.xml_marc_to_records(run_sql("SELECT changeset_xml FROM bibHOLDINGPEN WHERE id_bibrec=%s", (self.recid,))[0][0])[0]))
class BibUploadNoUselessHistoryTest(GenericBibUploadTest):
"""Testing generation of history only when necessary"""
def setUp(self):
GenericBibUploadTest.setUp(self)
self.test = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
recs = bibupload.xml_marc_to_records(self.test)
# We call the main function with the record as a parameter
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(self.recid)
# We retrieve the inserted xml
inserted_xm = print_record(self.recid, 'xm')
# Compare if the two MARCXML are the same
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(inserted_xm),
self.test), '')
self.history = run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, )) # kwalitee: disable=sql
self.timestamp = run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,))
def test_replace_identical_record(self):
"""bibupload - replace with identical record does not touch history"""
xml_to_upload = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
""" % self.recid
recs = bibupload.xml_marc_to_records(xml_to_upload)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(recid)
self.assertEqual(self.recid, recid)
self.assertEqual(self.history, run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) # kwalitee: disable=sql
self.assertEqual(self.timestamp, run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,)))
def test_correct_identical_correction(self):
"""bibupload - correct with identical correction does not touch history"""
xml_to_upload = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
</record>
""" % self.recid
recs = bibupload.xml_marc_to_records(xml_to_upload)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
self.assertEqual(self.recid, recid)
self.maxDiff = None
self.assertEqual(self.history, run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) # kwalitee: disable=sql
self.assertEqual(self.timestamp, run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,)))
def test_replace_different_record(self):
"""bibupload - replace with different records does indeed touch history"""
xml_to_upload = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
""" % self.recid
recs = bibupload.xml_marc_to_records(xml_to_upload)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(recid)
self.assertEqual(self.recid, recid)
self.assertNotEqual(self.history, run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) # kwalitee: disable=sql
self.failUnless(len(self.history) == 1 and len(run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) == 2) # kwalitee: disable=sql
self.assertNotEqual(self.timestamp, run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,)))
def test_correct_different_correction(self):
"""bibupload - correct with different correction does indeed touch history"""
xml_to_upload = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">FooBar</controlfield>
</record>
""" % self.recid
recs = bibupload.xml_marc_to_records(xml_to_upload)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
self.assertEqual(self.recid, recid)
self.assertNotEqual(self.history, run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) # kwalitee: disable=sql
self.failUnless(len(self.history) == 1 and len(run_sql("SELECT * FROM hstRECORD WHERE id_bibrec=%s", (self.recid, ))) == 2) # kwalitee: disable=sql
self.assertNotEqual(self.timestamp, run_sql("SELECT modification_date FROM bibrec WHERE id=%s", (self.recid,)))
class BibUploadCallbackURLTest(GenericBibUploadTest):
"""Testing usage of CLI callback_url"""
def setUp(self):
GenericBibUploadTest.setUp(self)
self.test = """<record>
<datafield tag ="245" ind1=" " ind2=" ">
<subfield code="a">something</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, J Y</subfield>
<subfield code="u">MIT</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, K J</subfield>
<subfield code="u">CERN2</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, G</subfield>
<subfield code="u">CERN3</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test11</subfield>
<subfield code="c">test31</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test12</subfield>
<subfield code="c">test32</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test13</subfield>
<subfield code="c">test33</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="b">test21</subfield>
<subfield code="d">test41</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="b">test22</subfield>
<subfield code="d">test42</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test14</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="e">test51</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="e">test52</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">CERN</subfield>
</datafield>
</record>"""
self.testfile_path = os.path.join(CFG_TMPDIR, 'bibupload_regression_test_input.xml')
open(self.testfile_path, "w").write(self.test)
self.resultfile_path = os.path.join(CFG_TMPDIR, 'bibupload_regression_test_result.json')
if CFG_DEVEL_SITE:
def test_simple_insert_callback_url(self):
"""bibupload - --callback-url with simple insert"""
- from invenio.bibtask import task_low_level_submission
+ from invenio.legacy.bibsched.bibtask import task_low_level_submission
taskid = task_low_level_submission('bibupload', 'test', '-i', self.testfile_path, '--callback-url', CFG_SITE_URL + '/httptest/post2?%s' % urlencode({"save": self.resultfile_path}), '-v0')
run_shell_command(CFG_BINDIR + '/bibupload %s', [str(taskid)])
results = json.loads(open(self.resultfile_path).read())
self.failUnless('results' in results)
self.assertEqual(len(results['results']), 1)
self.failUnless(results['results'][0]['success'])
self.failUnless(results['results'][0]['recid'] > 0)
self.failUnless("""<subfield code="a">Tester, J Y</subfield>""" in results['results'][0]['marcxml'], results['results'][0]['marcxml'])
class BibUploadBibRelationsTest(GenericBibUploadTest):
def setUp(self):
GenericBibUploadTest.setUp(self)
self.upload_xml = """<record>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">A very wise author</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(url_site)s/img/user-icon-1-20x20.gif</subfield>
<subfield code="t">Main</subfield>
<subfield code="n">docname</subfield>
<subfield code="i">TMP:id_identifier1</subfield>
<subfield code="v">TMP:ver_identifier1</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(url_site)s/record/8/files/9812226.pdf?version=1</subfield>
<subfield code="t">Main</subfield>
<subfield code="n">docname2</subfield>
<subfield code="i">TMP:id_identifier2</subfield>
<subfield code="v">TMP:ver_identifier2</subfield>
</datafield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="i">TMP:id_identifier1</subfield>
<subfield code="v">TMP:ver_identifier1</subfield>
<subfield code="j">TMP:id_identifier2</subfield>
<subfield code="w">TMP:ver_identifier2</subfield>
<subfield code="t">is_extracted_from</subfield>
</datafield>
</record>""" % {'url_site' : CFG_SITE_URL}
def test_upload_with_tmpids(self):
"""bibupload - Trying to upload a relation between two new documents ... and then to delete"""
recs = bibupload.xml_marc_to_records(self.upload_xml)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
# retrieve document numbers and check if there exists a relation between them
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of documents attached to a record")
rels = docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from")
self.assertEqual(1, len(rels), "Incorrect number of relations retrieved from the first document")
rels = docs[1].get_incoming_relations("is_extracted_from") + docs[1].get_outgoing_relations("is_extracted_from")
self.assertEqual(1, len(rels), "Incorrect number of relations retrieved from the second document")
created_relation_id = rels[0].id
rels = docs[0].get_incoming_relations("different_type_of_relation") + docs[0].get_outgoing_relations("different_type_of_relation")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
upload_xml_2 = """<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="r">%(rel_id)s</subfield>
<subfield code="d">DELETE</subfield>
</datafield>
</record>""" % {'rel_id' : created_relation_id, 'rec_id' : recid}
recs = bibupload.xml_marc_to_records(upload_xml_2)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of documents attached to a record")
rels = docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
rels = docs[1].get_incoming_relations("is_extracted_from") + docs[1].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the second document")
rels = docs[0].get_incoming_relations("different_type_of_relation") + docs[0].get_outgoing_relations("different_type_of_relation")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
def test_delete_by_docids(self):
"""bibupload - delete relation entry by the docid inside the currently modified record
Uploading a sample relation and trying to modify it by refering to other parameters than
the relation number"""
recs = bibupload.xml_marc_to_records(self.upload_xml)
dummyerr, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of attached documents")
rel = (docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from"))[0]
upload_xml_2 = """<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="i">%(first_docid)s</subfield>
<subfield code="v">%(first_docver)s</subfield>
<subfield code="j">%(second_docid)s</subfield>
<subfield code="w">%(second_docver)s</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="d">DELETE</subfield>
</datafield>
</record>""" % { 'rec_id' : recid,
'first_docid': rel.bibdoc1_id,
'first_docver' : rel.bibdoc1_ver,
'second_docid': rel.bibdoc2_id,
'second_docver' : rel.bibdoc2_ver}
recs = bibupload.xml_marc_to_records(upload_xml_2)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of documents attached to a record")
rels = docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
rels = docs[1].get_incoming_relations("is_extracted_from") + docs[1].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the second document")
rels = docs[0].get_incoming_relations("different_type_of_relation") + docs[0].get_outgoing_relations("different_type_of_relation")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
def test_remove_by_name(self):
"""bibupload - trying removing relation by providing bibdoc names rather than relation numbers"""
recs = bibupload.xml_marc_to_records(self.upload_xml)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of attached documents")
rel = (docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from"))[0]
upload_xml_2 = """<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="n">docname</subfield>
<subfield code="v">%(first_docver)s</subfield>
<subfield code="o">docname2</subfield>
<subfield code="w">%(second_docver)s</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="d">DELETE</subfield>
</datafield>
</record>""" % {'rec_id' : recid,
'first_docver' : rel.bibdoc1_ver,
'second_docver' : rel.bibdoc2_ver}
# the above is correct: we assert that the relation has been removed
recs = bibupload.xml_marc_to_records(upload_xml_2)
_ = bibupload.bibupload_records(recs, opt_mode='correct')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of documents attached to a record")
rels = docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
rels = docs[1].get_incoming_relations("is_extracted_from") + docs[1].get_outgoing_relations("is_extracted_from")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the second document")
rels = docs[0].get_incoming_relations("different_type_of_relation") + docs[0].get_outgoing_relations("different_type_of_relation")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
def test_remove_by_name_incorrect(self):
"""bibupload - trying removing relation by providing bibdoc names rather than relation numbers, but providing incorrect name"""
recs = bibupload.xml_marc_to_records(self.upload_xml)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of attached documents")
rel = (docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from"))[0]
upload_xml_2 = """<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="n">docname1</subfield>
<subfield code="v">%(first_docver)s</subfield>
<subfield code="o">docname2</subfield>
<subfield code="w">%(second_docver)s</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="d">DELETE</subfield>
</datafield>
</record>""" % { 'rec_id' : recid,
'first_docver' : rel.bibdoc1_ver,
'second_docver' : rel.bibdoc2_ver}
# the docname above is incorrect! we assert that nothing has been removed
recs = bibupload.xml_marc_to_records(upload_xml_2)
_ = bibupload.bibupload_records(recs, opt_mode='correct')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of documents attached to a record")
rels = docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from")
self.assertEqual(1, len(rels), "Incorrect number of relations retrieved from the first document")
rels = docs[1].get_incoming_relations("is_extracted_from") + docs[1].get_outgoing_relations("is_extracted_from")
self.assertEqual(1, len(rels), "Incorrect number of relations retrieved from the second document")
rels = docs[0].get_incoming_relations("different_type_of_relation") + docs[0].get_outgoing_relations("different_type_of_relation")
self.assertEqual(0, len(rels), "Incorrect number of relations retrieved from the first document")
def _upload_initial_moreinfo_key(self):
"""Prepare MoreInfo with sample keys and check it has been correctly uploaded
uploaded dic: {"ns1" : {"k1":"val1", "k2":[1,2,3,"something"], "k3" : (1,3,2,"something else"), "k4" : {"a":"b", 1:2}}}
... after encoding gives KGRwMQpTJ25zMScKcDIKKGRwMwpTJ2szJwpwNAooSTEKSTMKSTIKUydzb21ldGhpbmcgZWxzZScKdHA1CnNTJ2syJwpwNgoobHA3CkkxCmFJMgphSTMKYVMnc29tZXRoaW5nJwpwOAphc1MnazEnCnA5ClMndmFsMScKcDEwCnNTJ2s0JwpwMTEKKGRwMTIKUydhJwpTJ2InCnNJMQpJMgpzc3Mu
"""
moreinfo_str = "KGRwMQpTJ25zMScKcDIKKGRwMwpTJ2szJwpwNAooSTEKSTMKSTIKUydzb21ldGhpbmcgZWxzZScKdHA1CnNTJ2syJwpwNgoobHA3CkkxCmFJMgphSTMKYVMnc29tZXRoaW5nJwpwOAphc1MnazEnCnA5ClMndmFsMScKcDEwCnNTJ2s0JwpwMTEKKGRwMTIKUydhJwpTJ2InCnNJMQpJMgpzc3Mu"
xml_to_upload = """<record>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">A very wise author</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(url_site)s/img/user-icon-1-20x20.gif</subfield>
<subfield code="t">Main</subfield>
<subfield code="n">docname</subfield>
<subfield code="i">TMP:id_identifier1</subfield>
<subfield code="v">TMP:ver_identifier1</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(url_site)s/record/8/files/9812226.pdf?version=1</subfield>
<subfield code="t">Main</subfield>
<subfield code="n">docname2</subfield>
<subfield code="i">TMP:id_identifier2</subfield>
<subfield code="v">TMP:ver_identifier2</subfield>
</datafield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="i">TMP:id_identifier1</subfield>
<subfield code="v">TMP:ver_identifier1</subfield>
<subfield code="j">TMP:id_identifier2</subfield>
<subfield code="w">TMP:ver_identifier2</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="m">%(moreinfo_str)s</subfield>
</datafield>
</record>""" % {'url_site' : CFG_SITE_URL, 'moreinfo_str' : moreinfo_str}
recs = bibupload.xml_marc_to_records(xml_to_upload)
dummyerr, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
brd = BibRecDocs(recid)
docs = brd.list_bibdocs()
self.assertEqual(2, len(docs), "Incorrect number of attached documents")
return ((docs[0].get_incoming_relations("is_extracted_from") + docs[0].get_outgoing_relations("is_extracted_from"))[0], recid)
def test_add_relation_moreinfo_key(self):
"""bibupload - upload new MoreInfo key into the dictionary related to a relation"""
rel, _ = self._upload_initial_moreinfo_key()
# asserting correctness of data
self.assertEqual(rel.more_info.get_data("ns1", "k1"), "val1", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k1)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[0], 1, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[2], 3, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[3], "something", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k3"), (1,3,2,"something else") , "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k3)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")["a"], "b", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
def test_modify_relation_moreinfo_key(self):
"""bibupload - modify existing MoreInfo key """
#the update : {"ns1":{"k1": "different value"}}
rel, recid = self._upload_initial_moreinfo_key()
moreinfo_str = "KGRwMQpTJ25zMScKcDIKKGRwMwpTJ2sxJwpwNApTJ2RpZmZlcmVudCB2YWx1ZScKcDUKc3Mu"
upload_xml = """
<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="n">docname</subfield>
<subfield code="o">docname2</subfield>
<subfield code="v">1</subfield>
<subfield code="w">1</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="m">%(moreinfo_str)s</subfield>
</datafield>
</record>""" % {"rec_id" : recid, "moreinfo_str": moreinfo_str}
recs = bibupload.xml_marc_to_records(upload_xml)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
rel = BibRelation(rel_id = rel.id)
self.assertEqual(rel.more_info.get_data("ns1", "k1"), "different value", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k1)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[0], 1, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[2], 3, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[3], "something", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k3"), (1,3,2,"something else") , "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k3)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")["a"], "b", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
self.assertEqual(rel.more_info.get_data("ns2", "k4"), None, "Retrieved not none value for nonexisting namespace !")
def test_remove_relation_moreinfo_key(self):
"""bibupload - remove existing MoreInfo key """
#the update : {"ns1":{"k3": None}}
rel, recid = self._upload_initial_moreinfo_key()
moreinfo_str = "KGRwMQpTJ25zMScKcDIKKGRwMwpTJ2szJwpwNApOc3Mu"
upload_xml = """
<record>
<controlfield tag="001">%(rec_id)s</controlfield>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="n">docname</subfield>
<subfield code="o">docname2</subfield>
<subfield code="v">1</subfield>
<subfield code="w">1</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="m">%(moreinfo_str)s</subfield>
</datafield>
</record>""" % {"rec_id" : recid, "moreinfo_str": moreinfo_str}
recs = bibupload.xml_marc_to_records(upload_xml)
bibupload.bibupload_records(recs, opt_mode='correct')
rel = BibRelation(rel_id = rel.id)
self.assertEqual(rel.more_info.get_data("ns1", "k1"), "val1", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k1)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[0], 1, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[2], 3, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k2")[3], "something", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k2)")
self.assertEqual(rel.more_info.get_data("ns1", "k3"), None , "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k3)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")[1], 2, "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
self.assertEqual(rel.more_info.get_data("ns1", "k4")["a"], "b", "Retrieved incorrect data from the MoreInfo Dictionary (namespace : ns1 key: k4)")
class BibUploadMoreInfoTest(GenericBibUploadTest):
"""bibupload - Testing upload of different types of MoreInfo """
def _dict_checker(self, dic, more_info, equal = True):
""" Check the more_info for being conform with the dictionary
@param equal - The mode of conformity. True means that the dictionary
has to be equal with the MoreInfo. False means that dictionary
has to be contained in the MoreInfo
"""
for namespace in dic:
for key in dic[namespace]:
self.assertEqual(cPickle.dumps(dic[namespace][key]),
cPickle.dumps(more_info.get_data(namespace, key)),
"Different values for the key %s in the namespace %s inside of the MoreInfo object" % \
(key, namespace))
if equal:
for namespace in more_info.get_namespaces():
for key in more_info.get_keys(namespace):
self.assertTrue(namespace in dic,
"namespace %s present in the MoreInfo, but not present in the dictionary" % \
(namespace, ))
self.assertTrue(key in dic[namespace],
"key %s present in the namespace %s of the MoreInfo but not present in the dictionary" % \
(namespace, key))
self.assertEqual(cPickle.dumps(more_info.get_data(namespace, key)),
cPickle.dumps(dic[namespace][key]),
"Value for namespace '%s' and key '%s' varies between MoreInfo and the dictionary. moreinfo value: '%s' dictionary value: '%s'" % \
(namespace, key, repr(more_info.get_data(namespace, key)), repr(dic[namespace][key])))
def test_relation_moreinfo_insert(self):
"""bibupload - Testing the upload of BibRelation and corresponding MoreInfo field"""
# Cleaning existing data
rels = BibRelation.get_relations(bibdoc1_id = 70, bibdoc2_id = 71, rel_type = "is_extracted_from")
for rel in rels:
rel.delete()
# Uploading
relation_upload_template = """
<record>
<datafield tag="BDR" ind1=" " ind2=" ">
<subfield code="i">70</subfield>
<subfield code="j">71</subfield>
<subfield code="t">is_extracted_from</subfield>
<subfield code="m">%s</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Some author</subfield>
</datafield>
</record>"""
data_to_insert = {"first namespace": {"k1" : "val1", "k2" : "val2"},
"second" : {"k1" : "#@$#$@###!!!", "k123": {1:2, 9: (6,2,7)}}}
serialised = base64.b64encode(cPickle.dumps(data_to_insert))
recs = bibupload.xml_marc_to_records(relation_upload_template % (serialised, ))
bibupload.bibupload_records(recs, opt_mode='insert')[0]
# Verifying the correctness of the uploaded data
rels = BibRelation.get_relations(bibdoc1_id = 70, bibdoc2_id = 71, rel_type = "is_extracted_from")
self.assertEqual(len(rels), 1)
rel = rels[0]
self.assertEqual(rel.bibdoc1_id, 70)
self.assertEqual(rel.bibdoc2_id, 71)
self.assertEqual(rel.get_data("first namespace", "k1"), "val1")
self.assertEqual(rel.get_data("first namespace", "k2"), "val2")
self.assertEqual(rel.get_data("second", "k1"), "#@$#$@###!!!")
self.assertEqual(rel.get_data("second", "k123")[1], 2)
self.assertEqual(rel.get_data("second", "k123")[9], (6,2,7))
self._dict_checker(data_to_insert, rel.more_info)
# Cleaning up after the upload, just in case more than one relation was selected
for rel in rels:
rel.delete()
def _serialise_data(self, data):
return base64.b64encode(cPickle.dumps(data))
# Subfield tags used to upload particular types of MoreInfo
_mi_bibdoc = "w"
_mi_bibdoc_version = "p"
_mi_bibdoc_version_format = "b"
_mi_bibdoc_format = "u"
def _generate_moreinfo_tag(self, mi_type, data):
"""
"""
serialised = self._serialise_data(data)
return """<subfield code="%s">%s</subfield>""" % (mi_type, serialised)
def test_document_moreinfo_insert(self):
"""bibupload - Inserting new MoreInfo to the document
1) Inserting new MoreInfo to new document
2) Inserting new MoreInfo keys into an existing document version
3) Removing keys from MoreInfo
4) Removing the document and asserting that MoreInfo gets removed as well
5) Overriding MoreInfo keys
"""
moreinfo_upload_template = """
<record>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="n">0106015_01</subfield>
<subfield code="f">.jpg</subfield>
<subfield code="r">restricted_picture</subfield>
%%(additional_content)s
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Some author</subfield>
</datafield>
</record>""" % {"siteurl": CFG_SITE_URL}
sfs = []
sfs.append(self._generate_moreinfo_tag(BibUploadMoreInfoTest._mi_bibdoc,
{"first namespace" :
{"type": "document moreinfo"}}))
sfs.append(self._generate_moreinfo_tag(BibUploadMoreInfoTest._mi_bibdoc_version,
{"first namespace" :
{"type": "Bibdoc - version moreinfo"}}))
sfs.append(self._generate_moreinfo_tag(BibUploadMoreInfoTest._mi_bibdoc_version_format,
{"first namespace" :
{"type": "Bibdoc - version, format moreinfo"}}))
sfs.append(self._generate_moreinfo_tag(BibUploadMoreInfoTest._mi_bibdoc_format,
{"first namespace" :
{"type": "Bibdoc - format moreinfo"}}))
marcxml_1 = moreinfo_upload_template % {"additional_content" : "\n".join(sfs)}
recs = bibupload.xml_marc_to_records(marcxml_1)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
# now checking if all the data has been uploaded correctly
bdr = BibRecDocs(recid)
doc = bdr.list_bibdocs()[0]
docid = doc.get_id()
mi_doc = MoreInfo(docid = docid)
mi_doc_ver = MoreInfo(docid = docid, version = 1)
mi_doc_ver_fmt = MoreInfo(docid = docid, version = 1, docformat=".jpg")
mi_doc_fmt = MoreInfo(docid = docid, docformat=".jpg")
self._dict_checker({"first namespace" : {"type": "document moreinfo"}},
mi_doc, equal=False) # in case of the document only inclusive check
self._dict_checker({"first namespace" : {"type": "Bibdoc - version moreinfo"}},
mi_doc_ver)
self._dict_checker({"first namespace" : {
"type": "Bibdoc - version, format moreinfo"}},
mi_doc_ver_fmt)
self._dict_checker({"first namespace" : {"type": "Bibdoc - format moreinfo"}},
mi_doc_fmt)
# now appending to a particular version of MoreInfo
# upload a new key to an existing dictionary of a version
def _get_mit_template(recid, bibdocid=None, bibdocname=None,
version=None, docformat=None, relation=None, data=None):
if data is None:
ser = None
else:
ser = base64.b64encode(cPickle.dumps(data))
subfields = []
for s_code, val in (("r", relation), ("i", bibdocid),
("n", bibdocname), ("v", version),
("f", docformat), ("m", ser)):
if val is not None:
subfields.append("""<subfield code="%s">%s</subfield>""" % \
(s_code, val))
return """<record>
<controlfield tag="001">%s</controlfield>
<datafield tag="BDM" ind1=" " ind2=" ">
%s
</datafield>
</record>""" % (str(recid), ("\n".join(subfields)))
marcxml_2 = _get_mit_template(recid, version = 1, bibdocid = docid,
data= {"first namespace" :
{"new key": {1:2, 987:678}}})
recs = bibupload.xml_marc_to_records(marcxml_2)
bibupload.bibupload_records(recs, opt_mode='append')
mi = MoreInfo(docid = docid, version = 1)
self._dict_checker({
"first namespace" : {"type": "Bibdoc - version moreinfo",
"new key": {1:2, 987:678}
}
}, mi)
# removing the entire old content of the MoreInfo and uploading new
data = {"ns1" : {"nk1": 12, "mk1": "this is new content"},
"namespace two" : {"ddd" : "bbb"}}
marcxml_3 = _get_mit_template(recid, version = 1, bibdocid = docid,
data= data)
recs = bibupload.xml_marc_to_records(marcxml_3)
bibupload.bibupload_records(recs, opt_mode='correct')
mi = MoreInfo(docid = docid, version = 1)
self._dict_checker(data, mi)
# removing a particular key
marcxml_4 = _get_mit_template(recid, version = 1, bibdocid = docid,
data= {"ns1": {"nk1" : None}})
recs = bibupload.xml_marc_to_records(marcxml_4)
bibupload.bibupload_records(recs, opt_mode='append')
mi = MoreInfo(docid = docid, version = 1)
self._dict_checker( {"ns1" : { "mk1": "this is new content"},
"namespace two" : {"ddd" : "bbb"}}, mi)
# adding new key
marcxml_5 = _get_mit_template(recid, version = 1, bibdocid = docid,
data= {"ns1": {"newkey" : "newvalue"}})
recs = bibupload.xml_marc_to_records(marcxml_5)
bibupload.bibupload_records(recs, opt_mode='append')
mi = MoreInfo(docid = docid, version = 1)
self._dict_checker( {"ns1" : { "mk1": "this is new content", "newkey" : "newvalue"},
"namespace two" : {"ddd" : "bbb"}}, mi)
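The BDM uploads exercised above follow a simple pattern: opt_mode='correct' replaces the stored MoreInfo dictionary wholesale, while opt_mode='append' merges the update per namespace, with a key mapped to None deleting that key. A hedged sketch of those semantics as this test observes them (apply_moreinfo_update is a hypothetical illustration, not part of bibupload):

```python
def apply_moreinfo_update(current, update, mode):
    # 'correct' discards the old dictionary entirely.
    if mode == 'correct':
        return {ns: dict(keys) for ns, keys in update.items()}
    # 'append' merges namespace by namespace; None means "delete this key".
    merged = {ns: dict(keys) for ns, keys in current.items()}
    for ns, keys in update.items():
        target = merged.setdefault(ns, {})
        for key, value in keys.items():
            if value is None:
                target.pop(key, None)
            else:
                target[key] = value
    return merged
```

This mirrors the marcxml_2 through marcxml_5 steps above: append adds "new key", correct wipes and reinstalls the dictionary, and appending {"nk1": None} removes just that key.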
class BibUploadInsertModeTest(GenericBibUploadTest):
"""Testing insert mode."""
def setUp(self):
# pylint: disable=C0103
"""Initialise the MARCXML variable"""
GenericBibUploadTest.setUp(self)
self.test = """<record>
<datafield tag ="245" ind1=" " ind2=" ">
<subfield code="a">something</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, J Y</subfield>
<subfield code="u">MIT</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, K J</subfield>
<subfield code="u">CERN2</subfield>
</datafield>
<datafield tag ="700" ind1=" " ind2=" ">
<subfield code="a">Tester, G</subfield>
<subfield code="u">CERN3</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test11</subfield>
<subfield code="c">test31</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test12</subfield>
<subfield code="c">test32</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test13</subfield>
<subfield code="c">test33</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="b">test21</subfield>
<subfield code="d">test41</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="b">test22</subfield>
<subfield code="d">test42</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="a">test14</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="e">test51</subfield>
</datafield>
<datafield tag ="111" ind1=" " ind2=" ">
<subfield code="e">test52</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">CERN</subfield>
</datafield>
</record>"""
self.test_hm = """
100__ $$aTester, T$$uCERN
111__ $$atest11$$ctest31
111__ $$atest12$$ctest32
111__ $$atest13$$ctest33
111__ $$btest21$$dtest41
111__ $$btest22$$dtest42
111__ $$atest14
111__ $$etest51
111__ $$etest52
245__ $$asomething
700__ $$aTester, J Y$$uMIT
700__ $$aTester, K J$$uCERN2
700__ $$aTester, G$$uCERN3
"""
def test_create_record_id(self):
"""bibupload - insert mode, trying to create a new record ID in the database"""
rec_id = bibupload.create_new_record()
self.assertNotEqual(None, rec_id)
def test_create_specific_record_id(self):
"""bibupload - insert mode, trying to create a new specifc record ID in the database"""
expected_rec_id = run_sql("SELECT MAX(id) FROM bibrec")[0][0] + 1
rec_id = bibupload.create_new_record(expected_rec_id)
self.assertEqual(rec_id, expected_rec_id)
def test_no_retrieve_record_id(self):
"""bibupload - insert mode, detection of record ID in the input file"""
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test)
# We call the function which should retrieve the record id
rec_id = bibupload.retrieve_rec_id(recs[0], 'insert')
# We compare the value found with None
self.assertEqual(None, rec_id)
def test_insert_complete_xmlmarc(self):
"""bibupload - insert mode, trying to insert complete MARCXML file"""
# Initialize the global variable
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# We retrieve the inserted xml
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
# Compare if the two MARCXML are the same
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(inserted_xm),
self.test), '')
self.assertEqual(compare_hmbuffers(remove_tag_001_from_hmbuffer(inserted_hm),
self.test_hm), '')
def test_retrieve_005_tag(self):
"""bibupload - insert mode, verifying insertion of 005 control field for record """
# Convert marc xml into record structure
from invenio.legacy.bibrecord import record_has_field, record_get_field_value
recs = bibupload.xml_marc_to_records(self.test)
dummy, recid, dummy = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
# Retrieve the inserted record based on the record id
rec = get_record(recid)
# We retrieve the creation date from the database
query = """SELECT DATE_FORMAT(last_updated,'%%Y%%m%%d%%H%%i%%s') FROM bibfmt where id_bibrec=%s AND format='xm'"""
res = run_sql(query, (recid, ))
self.assertEqual(record_has_field(rec, '005'), True)
self.assertEqual(str(res[0][0]) + '.0', record_get_field_value(rec, '005', '', ''))
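The 005 assertions in this class and the next hinge on MySQL's DATE_FORMAT(..., '%Y%m%d%H%i%s') producing the same digits as Python's strftime('%Y%m%d%H%M%S'), with '.0' appended for the fractional-seconds part of the MARC 005 field. A small sketch of that expected value (expected_005 is a hypothetical helper, not part of the test suite):

```python
import datetime

def expected_005(timestamp):
    # MySQL's '%i' formats minutes, matching strftime's '%M'; the tests
    # append '.0' because MARC 005 carries tenths of a second.
    return timestamp.strftime('%Y%m%d%H%M%S') + '.0'
```

For example, a last_updated of 2013-01-02 03:04:05 yields '20130102030405.0', which is the exact string compared against record_get_field_value(rec, '005', '', '').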
class BibUploadAppendModeTest(GenericBibUploadTest):
"""Testing append mode."""
def setUp(self):
# pylint: disable=C0103
"""Initialize the MARCXML variable"""
GenericBibUploadTest.setUp(self)
self.test_existing = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
self.test_to_append = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, U</subfield>
<subfield code="u">CERN</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
self.test_expected_xm = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, U</subfield>
<subfield code="u">CERN</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
self.test_expected_hm = """
001__ 123456789
100__ $$aTester, T$$uDESY
100__ $$aTester, U$$uCERN
970__ $$a0003719PHOPHO
"""
# insert test record:
test_to_upload = self.test_existing.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_to_upload)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
self.test_recid = recid
# replace test buffers with real recid of inserted test record:
self.test_existing = self.test_existing.replace('123456789',
str(self.test_recid))
self.test_to_append = self.test_to_append.replace('123456789',
str(self.test_recid))
self.test_expected_xm = self.test_expected_xm.replace('123456789',
str(self.test_recid))
self.test_expected_hm = self.test_expected_hm.replace('123456789',
str(self.test_recid))
def test_retrieve_record_id(self):
"""bibupload - append mode, the input file should contain a record ID"""
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test_to_append)
# We call the function which should retrieve the record id
rec_id = bibupload.retrieve_rec_id(recs[0], 'append')
# We compare the value found with the known record ID
self.assertEqual(self.test_recid, rec_id)
# clean up after ourselves:
def test_update_modification_record_date(self):
"""bibupload - append mode, checking the update of the modification date"""
from invenio.utils.date import convert_datestruct_to_datetext
# Initialize the global variable
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test_existing)
# We call the function which should retrieve the record id
rec_id = bibupload.retrieve_rec_id(recs[0], opt_mode='append')
# Retrieve current localtime
now = time.localtime()
# We update the modification date
bibupload.update_bibrec_date(convert_datestruct_to_datetext(now), rec_id, False)
# We retrieve the modification date from the database
query = """SELECT DATE_FORMAT(modification_date,'%%Y-%%m-%%d %%H:%%i:%%s') FROM bibrec where id = %s"""
res = run_sql(query, (str(rec_id), ))
# We compare the two results
self.assertEqual(res[0][0], convert_datestruct_to_datetext(now))
# clean up after ourselves:
def test_append_complete_xml_marc(self):
"""bibupload - append mode, appending complete MARCXML file"""
# Now we append a datafield
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test_to_append)
# We call the main function with the record as a parameter
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='append')[0]
self.check_record_consistency(recid)
# We retrieve the inserted xm
after_append_xm = print_record(recid, 'xm')
after_append_hm = print_record(recid, 'hm')
# Compare if the two MARCXML are the same
self.assertEqual(compare_xmbuffers(after_append_xm, self.test_expected_xm), '')
self.assertEqual(compare_hmbuffers(after_append_hm, self.test_expected_hm), '')
def test_retrieve_updated_005_tag(self):
"""bibupload - append mode, updating 005 control tag after modifiction """
from invenio.legacy.bibrecord import record_get_field_value
recs = bibupload.xml_marc_to_records(self.test_to_append)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='append')
self.check_record_consistency(recid)
rec = get_record(recid)
query = """SELECT DATE_FORMAT(MAX(job_date),'%%Y%%m%%d%%H%%i%%s') FROM hstRECORD where id_bibrec = %s"""
res = run_sql(query, (str(recid), ))
self.assertEqual(str(res[0][0]) + '.0', record_get_field_value(rec, '005', '', ''))
class BibUploadCorrectModeTest(GenericBibUploadTest):
"""
Testing correction of a record containing similar tags (identical
tag, different indicators). Currently Invenio replaces only
those tags that have matching indicators too, unlike ALEPH500,
which pays no attention to indicators and corrects all fields with
the same tag, regardless of the indicator values.
"""
def setUp(self):
"""Initialize the MARCXML test record."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
self.testrec1_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
10047 $$aTest, John$$uTest University
10048 $$aCool
10047 $$aTest, Jim$$uTest Laboratory
"""
self.testrec1_xm_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test2, Joseph</subfield>
<subfield code="u">Test2 Academy</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test2, Joseph</subfield>
<subfield code="u">Test2 Academy</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
10048 $$aCool
10047 $$aTest, Joseph$$uTest Academy
10047 $$aTest2, Joseph$$uTest2 Academy
"""
# insert test record:
test_record_xm = self.testrec1_xm.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_record_xm)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
self.testrec1_xm = self.testrec1_xm.replace('123456789', str(recid))
self.testrec1_hm = self.testrec1_hm.replace('123456789', str(recid))
self.testrec1_xm_to_correct = self.testrec1_xm_to_correct.replace('123456789', str(recid))
self.testrec1_corrected_xm = self.testrec1_corrected_xm.replace('123456789', str(recid))
self.testrec1_corrected_hm = self.testrec1_corrected_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm, self.testrec1_hm), '')
def test_record_correction(self):
"""bibupload - correct mode, similar MARCXML tags/indicators"""
# correct some tags:
recs = bibupload.xml_marc_to_records(self.testrec1_xm_to_correct)
_, self.recid, _ = bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(self.recid)
corrected_xm = print_record(self.recid, 'xm')
corrected_hm = print_record(self.recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(corrected_xm, self.testrec1_corrected_xm), '')
self.assertEqual(compare_hmbuffers(corrected_hm, self.testrec1_corrected_hm), '')
# clean up after ourselves:
return
class BibUploadDeleteModeTest(GenericBibUploadTest):
"""
Testing deletion of specific tags from a record while keeping everything
else untouched. Currently Invenio deletes only those tags that have
matching indicators too, unlike ALEPH500, which pays no attention to
indicators and deletes all fields with the same tag, regardless of the
indicator values.
"""
def setUp(self):
"""Initialize the MARCXML test record."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>
"""
self.testrec1_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
10047 $$aTest, John$$uTest University
10048 $$aCool
10047 $$aTest, Jim$$uTest Laboratory
888__ $$adumb text
"""
self.testrec1_xm_to_delete = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Johnson</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_hm = """
001__ 123456789
003__ SzGeCERN
10047 $$aTest, John$$uTest University
10047 $$aTest, Jim$$uTest Laboratory
"""
# insert test record:
test_record_xm = self.testrec1_xm.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_record_xm)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
self.testrec1_xm = self.testrec1_xm.replace('123456789', str(recid))
self.testrec1_hm = self.testrec1_hm.replace('123456789', str(recid))
self.testrec1_xm_to_delete = self.testrec1_xm_to_delete.replace('123456789', str(recid))
self.testrec1_corrected_xm = self.testrec1_corrected_xm.replace('123456789', str(recid))
self.testrec1_corrected_hm = self.testrec1_corrected_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm, self.testrec1_hm), '')
# Checking dumb text is in bibxxx
self.failUnless(run_sql("SELECT id_bibrec from bibrec_bib88x WHERE id_bibrec=%s", (recid, )))
def test_record_tags_deletion(self):
"""bibupload - delete mode, deleting specific tags"""
# correct some tags:
recs = bibupload.xml_marc_to_records(self.testrec1_xm_to_delete)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='delete')[0]
self.check_record_consistency(recid)
corrected_xm = print_record(recid, 'xm')
corrected_hm = print_record(recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(corrected_xm, self.testrec1_corrected_xm), '')
self.assertEqual(compare_hmbuffers(corrected_hm, self.testrec1_corrected_hm), '')
# Checking dumb text is no more in bibxxx
self.failIf(run_sql("SELECT id_bibrec from bibrec_bib88x WHERE id_bibrec=%s", (recid, )))
# clean up after ourselves:
class BibUploadReplaceModeTest(GenericBibUploadTest):
"""Testing replace mode."""
def test_record_replace(self):
"""bibupload - replace mode, similar MARCXML tags/indicators"""
# replace some tags:
testrec1_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
testrec1_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
10047 $$aTest, John$$uTest University
10048 $$aCool
10047 $$aTest, Jim$$uTest Laboratory
"""
testrec1_xm_to_replace = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test2, Joseph</subfield>
<subfield code="u">Test2 Academy</subfield>
</datafield>
</record>
"""
testrec1_replaced_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test2, Joseph</subfield>
<subfield code="u">Test2 Academy</subfield>
</datafield>
</record>
"""
testrec1_replaced_hm = """
001__ 123456789
10047 $$aTest, Joseph$$uTest Academy
10047 $$aTest2, Joseph$$uTest2 Academy
"""
# insert test record:
test_record_xm = testrec1_xm.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_record_xm)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
testrec1_xm = testrec1_xm.replace('123456789', str(recid))
testrec1_hm = testrec1_hm.replace('123456789', str(recid))
testrec1_xm_to_replace = testrec1_xm_to_replace.replace('123456789', str(recid))
testrec1_replaced_xm = testrec1_replaced_xm.replace('123456789', str(recid))
testrec1_replaced_hm = testrec1_replaced_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, testrec1_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm, testrec1_hm), '')
recs = bibupload.xml_marc_to_records(testrec1_xm_to_replace)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(recid)
replaced_xm = print_record(recid, 'xm')
replaced_hm = print_record(recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(replaced_xm, testrec1_replaced_xm), '')
self.assertEqual(compare_hmbuffers(replaced_hm, testrec1_replaced_hm), '')
def test_record_replace_force_non_existing(self):
"""bibupload - replace mode, force non existing recid"""
from invenio.legacy.bibsched.bibtask import task_set_option
# replace some tags:
the_recid = self.last_recid + 1
testrec1_xm = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
""" % the_recid
testrec1_hm = """
001__ %s
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
10047 $$aTest, John$$uTest University
10048 $$aCool
10047 $$aTest, Jim$$uTest Laboratory
""" % the_recid
recs = bibupload.xml_marc_to_records(testrec1_xm)
task_set_option('force', True)
try:
err, recid, msg = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(recid)
finally:
task_set_option('force', False)
replaced_xm = print_record(recid, 'xm')
replaced_hm = print_record(recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(replaced_xm, testrec1_xm), '')
self.assertEqual(compare_hmbuffers(replaced_hm, testrec1_hm), '')
self.assertEqual(recid, the_recid)
def test_record_replace_non_existing(self):
"""bibupload - replace mode, non existing recid"""
# replace some tags:
the_recid = self.last_recid + 1
testrec1_xm = """
<record>
<controlfield tag="001">%s</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
""" % the_recid
recs = bibupload.xml_marc_to_records(testrec1_xm)
err, recid, _ = bibupload.bibupload(recs[0], opt_mode='replace')
self.assertEqual((err, recid), (1, -1))
def test_record_replace_two_recids(self):
"""bibupload - replace mode, two recids"""
# replace some tags:
testrec1_xm = """
<record>
<controlfield tag="001">300</controlfield>
<controlfield tag="001">305</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="8">
<subfield code="a">Cool</subfield>
</datafield>
<datafield tag="100" ind1="4" ind2="7">
<subfield code="a">Test, Jim</subfield>
<subfield code="u">Test Laboratory</subfield>
</datafield>
</record>
"""
recs = bibupload.xml_marc_to_records(testrec1_xm)
err, recid, _ = bibupload.bibupload(recs[0], opt_mode='replace')
# did it work?
self.assertEqual((err, recid), (1, -1))
class BibUploadReferencesModeTest(GenericBibUploadTest):
"""Testing references mode.
NOTE: in the past this was done by calling bibupload --reference|-z,
which now simply implies bibupload --correct.
"""
def setUp(self):
"""Initialize the MARCXML variable"""
GenericBibUploadTest.setUp(self)
self.test_insert = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">CERN</subfield>
</datafield>
</record>"""
self.test_reference = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag =\"""" + CFG_BIBUPLOAD_REFERENCE_TAG + """\" ind1="C" ind2="5">
<subfield code="m">M. Lüscher and P. Weisz, String excitation energies in SU(N) gauge theories beyond the free-string approximation,</subfield>
<subfield code="s">J. High Energy Phys. 07 (2004) 014</subfield>
</datafield>
</record>"""
self.test_reference_expected_xm = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">CERN</subfield>
</datafield>
<datafield tag =\"""" + CFG_BIBUPLOAD_REFERENCE_TAG + """\" ind1="C" ind2="5">
<subfield code="m">M. Lüscher and P. Weisz, String excitation energies in SU(N) gauge theories beyond the free-string approximation,</subfield>
<subfield code="s">J. High Energy Phys. 07 (2004) 014</subfield>
</datafield>
</record>"""
self.test_insert_hm = """
001__ 123456789
100__ $$aTester, T$$uCERN
"""
self.test_reference_expected_hm = """
001__ 123456789
100__ $$aTester, T$$uCERN
%(reference_tag)sC5 $$mM. Lüscher and P. Weisz, String excitation energies in SU(N) gauge theories beyond the free-string approximation,$$sJ. High Energy Phys. 07 (2004) 014
""" % {'reference_tag': CFG_BIBUPLOAD_REFERENCE_TAG}
# insert test record:
test_insert = self.test_insert.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_insert)
_, recid, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
self.test_insert = self.test_insert.replace('123456789', str(recid))
self.test_insert_hm = self.test_insert_hm.replace('123456789', str(recid))
self.test_reference = self.test_reference.replace('123456789', str(recid))
self.test_reference_expected_xm = self.test_reference_expected_xm.replace('123456789', str(recid))
self.test_reference_expected_hm = self.test_reference_expected_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, self.test_insert), '')
self.assertEqual(compare_hmbuffers(inserted_hm, self.test_insert_hm), '')
self.test_recid = recid
def test_reference_complete_xml_marc(self):
"""bibupload - reference mode, inserting references MARCXML file"""
# We create the record out of the XML MARC
recs = bibupload.xml_marc_to_records(self.test_reference)
# We call the main function with the record as a parameter
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='reference')[0]
self.check_record_consistency(recid)
# We retrieve the inserted xml
reference_xm = print_record(recid, 'xm')
reference_hm = print_record(recid, 'hm')
# Compare if the two MARCXML are the same
self.assertEqual(compare_xmbuffers(reference_xm, self.test_reference_expected_xm), '')
self.assertEqual(compare_hmbuffers(reference_hm, self.test_reference_expected_hm), '')
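The assertions above rely on `compare_xmbuffers`/`compare_hmbuffers` returning an empty string when two buffers match. A minimal sketch of that convention, with a hypothetical `compare_buffers` helper (not the actual Invenio implementation, which does more MARC-aware normalization):

```python
def compare_buffers(expected, got):
    """Return '' when the buffers match after whitespace normalization,
    else a short diff message (hypothetical mini-version of the
    compare_*buffers helpers used in these tests)."""
    def norm(s):
        # ignore leading/trailing whitespace and blank lines
        return [line.strip() for line in s.splitlines() if line.strip()]
    if norm(expected) == norm(got):
        return ''
    return 'buffers differ'
```

This is why the tests assert equality with `''` rather than a boolean: a non-empty return value doubles as the failure message.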
class BibUploadRecordsWithSYSNOTest(GenericBibUploadTest):
"""Testing uploading of records that have external SYSNO present."""
def setUp(self):
# pylint: disable=C0103
"""Initialize the MARCXML test records."""
GenericBibUploadTest.setUp(self)
# Note that SYSNO fields are repeated but with different
# subfields; this tests that bibupload does not
# mistakenly pick up the wrong values.
self.xm_testrec1 = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="%(sysnosubfieldcode)s">sysno1</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="0">sysno2</subfield>
</datafield>
</record>
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] or " ",
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] or " ",
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.hm_testrec1 = """
001__ 123456789
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$%(sysnosubfieldcode)ssysno1
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$0sysno2
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4],
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5],
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.xm_testrec1_to_update = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="%(sysnosubfieldcode)s">sysno1</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="0">sysno2</subfield>
</datafield>
</record>
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] or " ",
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] or " ",
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.xm_testrec1_updated = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="%(sysnosubfieldcode)s">sysno1</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="0">sysno2</subfield>
</datafield>
</record>
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] or " ",
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] or " ",
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.hm_testrec1_updated = """
001__ 123456789
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1 Updated
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$%(sysnosubfieldcode)ssysno1
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$0sysno2
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4],
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5],
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.xm_testrec2 = """
<record>
<controlfield tag="001">987654321</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 2</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="%(sysnosubfieldcode)s">sysno2</subfield>
</datafield>
<datafield tag="%(sysnotag)s" ind1="%(sysnoind1)s" ind2="%(sysnoind2)s">
<subfield code="0">sysno1</subfield>
</datafield>
</record>
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4] or " ",
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5] or " ",
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
self.hm_testrec2 = """
001__ 987654321
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 2
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$%(sysnosubfieldcode)ssysno2
%(sysnotag)s%(sysnoind1)s%(sysnoind2)s $$0sysno1
""" % {'sysnotag': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[0:3],
'sysnoind1': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[3:4],
'sysnoind2': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[4:5],
'sysnosubfieldcode': CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG[5:6],
}
def test_insert_the_same_sysno_record(self):
"""bibupload - SYSNO tag, refuse to insert the same SYSNO record"""
# initialize bibupload mode:
if self.verbose:
print "test_insert_the_same_sysno_record() started"
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# insert record 2 first time:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr2, recid2, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid2)
inserted_xm = print_record(recid2, 'xm')
inserted_hm = print_record(recid2, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec2 = self.xm_testrec2.replace('987654321', str(recid2))
self.hm_testrec2 = self.hm_testrec2.replace('987654321', str(recid2))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec2), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec2), '')
# try to insert updated record 1, it should fail:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.assertEqual(-1, recid1_updated)
if self.verbose:
print "test_insert_the_same_sysno_record() finished"
def test_insert_or_replace_the_same_sysno_record(self):
"""bibupload - SYSNO tag, allow to insert or replace the same SYSNO record"""
# initialize bibupload mode:
if self.verbose:
print "test_insert_or_replace_the_same_sysno_record() started"
# insert/replace record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to insert/replace updated record 1, it should be okay:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs,
opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1_updated)
inserted_xm = print_record(recid1_updated, 'xm')
inserted_hm = print_record(recid1_updated, 'hm')
self.assertEqual(recid1, recid1_updated)
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1_updated = self.xm_testrec1_updated.replace('123456789', str(recid1))
self.hm_testrec1_updated = self.hm_testrec1_updated.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1_updated), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1_updated), '')
if self.verbose:
print "test_insert_or_replace_the_same_sysno_record() finished"
def test_replace_nonexisting_sysno_record(self):
"""bibupload - SYSNO tag, refuse to replace non-existing SYSNO record"""
# initialize bibupload mode:
if self.verbose:
print "test_replace_nonexisting_sysno_record() started"
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummy, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to replace record 2, it should fail:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummy, recid2, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.assertEqual(-1, recid2)
if self.verbose:
print "test_replace_nonexisting_sysno_record() finished"
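The `setUp` methods above repeatedly slice 6-character tag configuration strings such as `CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG`, mapping `"_"` indicator placeholders to spaces via the Python 2 `and/or` idiom. A sketch of that decomposition, using a hypothetical helper name and an assumed example value `"970__a"`:

```python
def split_marc_tag(cfg_tag):
    """Split a 6-char MARC tag config string (e.g. '970__a') into
    (tag, ind1, ind2, subfield code), with '_' indicators mapped to
    spaces. Hypothetical helper illustrating the slicing used in the
    test fixtures above."""
    tag = cfg_tag[0:3]                                    # field tag, e.g. '970'
    ind1 = cfg_tag[3:4] if cfg_tag[3:4] != "_" else " "   # first indicator
    ind2 = cfg_tag[4:5] if cfg_tag[4:5] != "_" else " "   # second indicator
    subfield = cfg_tag[5:6]                               # subfield code
    return tag, ind1, ind2, subfield
```

The fixtures write the same logic inline as `cfg[3:4] != "_" and cfg[3:4] or " "`, the pre-2.5 spelling of the conditional expression.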
class BibUploadRecordsWithEXTOAIIDTest(GenericBibUploadTest):
"""Testing uploading of records that have external EXTOAIID present."""
def setUp(self):
# pylint: disable=C0103
"""Initialize the MARCXML test records."""
GenericBibUploadTest.setUp(self)
# Note that EXTOAIID fields are repeated but with different
# subfields; this tests that bibupload does not
# mistakenly pick up the wrong values.
self.xm_testrec1 = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="%(extoaiidsubfieldcode)s">extoaiid1</subfield>
<subfield code="%(extoaisrcsubfieldcode)s">extoaisrc1</subfield>
</datafield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="0">extoaiid2</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1</subfield>
</datafield>
</record>
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or " ",
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or " ",
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.hm_testrec1 = """
001__ 123456789
003__ SzGeCERN
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$%(extoaisrcsubfieldcode)sextoaisrc1$$%(extoaiidsubfieldcode)sextoaiid1
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$0extoaiid2
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4],
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5],
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.xm_testrec1_to_update = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="%(extoaiidsubfieldcode)s">extoaiid1</subfield>
<subfield code="%(extoaisrcsubfieldcode)s">extoaisrc1</subfield>
</datafield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="0">extoaiid2</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
</record>
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or " ",
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or " ",
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.xm_testrec1_updated = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="%(extoaiidsubfieldcode)s">extoaiid1</subfield>
<subfield code="%(extoaisrcsubfieldcode)s">extoaisrc1</subfield>
</datafield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="0">extoaiid2</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
</record>
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or " ",
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or " ",
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.hm_testrec1_updated = """
001__ 123456789
003__ SzGeCERN
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$%(extoaisrcsubfieldcode)sextoaisrc1$$%(extoaiidsubfieldcode)sextoaiid1
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$0extoaiid2
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1 Updated
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4],
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5],
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.xm_testrec2 = """
<record>
<controlfield tag="001">987654321</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="%(extoaiidsubfieldcode)s">extoaiid2</subfield>
<subfield code="%(extoaisrcsubfieldcode)s">extoaisrc1</subfield>
</datafield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="0">extoaiid1</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 2</subfield>
</datafield>
</record>
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or " ",
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or " ",
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
self.hm_testrec2 = """
001__ 987654321
003__ SzGeCERN
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$%(extoaisrcsubfieldcode)sextoaisrc1$$%(extoaiidsubfieldcode)sextoaiid2
%(extoaiidtag)s%(extoaiidind1)s%(extoaiidind2)s $$0extoaiid1
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 2
""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4],
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5],
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'extoaisrcsubfieldcode' : CFG_BIBUPLOAD_EXTERNAL_OAIID_PROVENANCE_TAG[5:6],
}
def test_insert_the_same_extoaiid_record(self):
"""bibupload - EXTOAIID tag, refuse to insert the same EXTOAIID record"""
# initialize bibupload mode:
if self.verbose:
print "test_insert_the_same_extoaiid_record() started"
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# insert record 2 first time:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr2, recid2, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid2)
inserted_xm = print_record(recid2, 'xm')
inserted_hm = print_record(recid2, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec2 = self.xm_testrec2.replace('987654321', str(recid2))
self.hm_testrec2 = self.hm_testrec2.replace('987654321', str(recid2))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec2), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec2), '')
# try to insert updated record 1, it should fail:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.assertEqual(-1, recid1_updated)
if self.verbose:
print "test_insert_the_same_extoaiid_record() finished"
def test_insert_or_replace_the_same_extoaiid_record(self):
"""bibupload - EXTOAIID tag, allow to insert or replace the same EXTOAIID record"""
# initialize bibupload mode:
if self.verbose:
print "test_insert_or_replace_the_same_extoaiid_record() started"
# insert/replace record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to insert/replace updated record 1, it should be okay:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1_updated)
inserted_xm = print_record(recid1_updated, 'xm')
inserted_hm = print_record(recid1_updated, 'hm')
self.assertEqual(recid1, recid1_updated)
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1_updated = self.xm_testrec1_updated.replace('123456789', str(recid1))
self.hm_testrec1_updated = self.hm_testrec1_updated.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1_updated), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1_updated), '')
if self.verbose:
print "test_insert_or_replace_the_same_extoaiid_record() finished"
def test_replace_nonexisting_extoaiid_record(self):
"""bibupload - EXTOAIID tag, refuse to replace non-existing EXTOAIID record"""
# initialize bibupload mode:
if self.verbose:
print "test_replace_nonexisting_extoaiid_record() started"
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to replace record 2, it should fail:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr2, recid2, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.assertEqual(-1, recid2)
if self.verbose:
print "test_replace_nonexisting_extoaiid_record() finished"
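Each test above strips the `001` control field from the fixture before the first insert, so that bibupload assigns a fresh record ID; the fixture's placeholder recid is then substituted with the real one. A sketch of the first step, with a hypothetical helper name:

```python
def strip_recid(marcxml, recid):
    """Remove the 001 control field carrying the given recid, so an
    'insert'-mode upload allocates a new record ID (hypothetical helper
    mirroring the inline .replace() calls in the tests above)."""
    return marcxml.replace(
        '<controlfield tag="001">%s</controlfield>' % recid, '')
```

After the upload, the tests do the inverse substitution, e.g. `self.xm_testrec1.replace('123456789', str(recid1))`, before comparing buffers.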
class BibUploadRecordsWithOAIIDTest(GenericBibUploadTest):
"""Testing uploading of records that have OAI ID present."""
def setUp(self):
"""Initialize the MARCXML test records."""
GenericBibUploadTest.setUp(self)
# Note that OAI fields are repeated but with different
# subfields; this tests that bibupload does not
# mistakenly pick up the wrong values.
self.xm_testrec1 = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="%(oaisubfieldcode)s">oai:foo:1</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="0">oai:foo:2</subfield>
</datafield>
</record>
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4] != "_" and \
CFG_OAI_ID_FIELD[3:4] or " ",
'oaiind2': CFG_OAI_ID_FIELD[4:5] != "_" and \
CFG_OAI_ID_FIELD[4:5] or " ",
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.hm_testrec1 = """
001__ 123456789
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1
%(oaitag)s%(oaiind1)s%(oaiind2)s $$%(oaisubfieldcode)soai:foo:1
%(oaitag)s%(oaiind1)s%(oaiind2)s $$0oai:foo:2
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4],
'oaiind2': CFG_OAI_ID_FIELD[4:5],
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.xm_testrec1_to_update = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="%(oaisubfieldcode)s">oai:foo:1</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="0">oai:foo:2</subfield>
</datafield>
</record>
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4] != "_" and \
CFG_OAI_ID_FIELD[3:4] or " ",
'oaiind2': CFG_OAI_ID_FIELD[4:5] != "_" and \
CFG_OAI_ID_FIELD[4:5] or " ",
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.xm_testrec1_updated = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="%(oaisubfieldcode)s">oai:foo:1</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="0">oai:foo:2</subfield>
</datafield>
</record>
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4] != "_" and \
CFG_OAI_ID_FIELD[3:4] or " ",
'oaiind2': CFG_OAI_ID_FIELD[4:5] != "_" and \
CFG_OAI_ID_FIELD[4:5] or " ",
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.hm_testrec1_updated = """
001__ 123456789
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1 Updated
%(oaitag)s%(oaiind1)s%(oaiind2)s $$%(oaisubfieldcode)soai:foo:1
%(oaitag)s%(oaiind1)s%(oaiind2)s $$0oai:foo:2
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4],
'oaiind2': CFG_OAI_ID_FIELD[4:5],
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.xm_testrec2 = """
<record>
<controlfield tag="001">987654321</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 2</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="%(oaisubfieldcode)s">oai:foo:2</subfield>
</datafield>
<datafield tag="%(oaitag)s" ind1="%(oaiind1)s" ind2="%(oaiind2)s">
<subfield code="0">oai:foo:1</subfield>
</datafield>
</record>
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4] != "_" and \
CFG_OAI_ID_FIELD[3:4] or " ",
'oaiind2': CFG_OAI_ID_FIELD[4:5] != "_" and \
CFG_OAI_ID_FIELD[4:5] or " ",
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
self.hm_testrec2 = """
001__ 987654321
003__ SzGeCERN
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 2
%(oaitag)s%(oaiind1)s%(oaiind2)s $$%(oaisubfieldcode)soai:foo:2
%(oaitag)s%(oaiind1)s%(oaiind2)s $$0oai:foo:1
""" % {'oaitag': CFG_OAI_ID_FIELD[0:3],
'oaiind1': CFG_OAI_ID_FIELD[3:4],
'oaiind2': CFG_OAI_ID_FIELD[4:5],
'oaisubfieldcode': CFG_OAI_ID_FIELD[5:6],
}
def test_insert_the_same_oai_record(self):
"""bibupload - OAIID tag, refuse to insert the same OAI record"""
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# insert record 2 first time:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr2, recid2, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid2)
inserted_xm = print_record(recid2, 'xm')
inserted_hm = print_record(recid2, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec2 = self.xm_testrec2.replace('987654321', str(recid2))
self.hm_testrec2 = self.hm_testrec2.replace('987654321', str(recid2))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec2), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec2), '')
# try to insert updated record 1, it should fail:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.assertEqual(-1, recid1_updated)
def test_insert_or_replace_the_same_oai_record(self):
"""bibupload - OAIID tag, allow to insert or replace the same OAI record"""
# insert/replace record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to insert/replace updated record 1, it should be okay:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
dummyerr1_updated, recid1_updated, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1_updated)
inserted_xm = print_record(recid1_updated, 'xm')
inserted_hm = print_record(recid1_updated, 'hm')
self.assertEqual(recid1, recid1_updated)
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1_updated = self.xm_testrec1_updated.replace('123456789', str(recid1))
self.hm_testrec1_updated = self.hm_testrec1_updated.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1_updated), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1_updated), '')
def test_replace_nonexisting_oai_record(self):
"""bibupload - OAIID tag, refuse to replace non-existing OAI record"""
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='replace_or_insert')[0]
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to replace record 2, it should fail:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
dummyerr2, recid2, _ = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.assertEqual(-1, recid2)

class BibUploadRecordsWithDOITest(GenericBibUploadTest):
"""Testing uploading of records with DOI."""
def setUp(self):
"""Initialize the MARCXML test records."""
GenericBibUploadTest.setUp(self)
self.xm_testrec1 = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789</subfield>
</datafield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">nondoi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789-0</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.hm_testrec1 = """
001__ 123456789
003__ SzGeCERN
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)sdoi$$%(doisubfieldcodevalue)s10.5170/123-456-789
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)snondoi$$%(doisubfieldcodevalue)s10.5170/123-456-789-0
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': '_',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec1_to_update = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789</subfield>
</datafield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">nondoi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789-0</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec1_updated = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789</subfield>
</datafield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">nondoi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789-0</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 1 Updated</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.hm_testrec1_updated = """
001__ 123456789
003__ SzGeCERN
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)sdoi$$%(doisubfieldcodevalue)s10.5170/123-456-789
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)snondoi$$%(doisubfieldcodevalue)s10.5170/123-456-789-0
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 1 Updated
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': '_',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec2 = """
<record>
<controlfield tag="001">987654321</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/987-654-321</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 2</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.hm_testrec2 = """
001__ 987654321
003__ SzGeCERN
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)sdoi$$%(doisubfieldcodevalue)s10.5170/987-654-321
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 2
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': '_',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec2_to_update = """
<record>
<controlfield tag="001">987654321</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec3 = """
<record>
<controlfield tag="001">192837645</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789-0</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 4</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.hm_testrec3 = """
001__ 192837645
003__ SzGeCERN
%(doitag)s%(doiind1)s%(doiind2)s $$%(doisubfieldcodesource)sdoi$$%(doisubfieldcodevalue)s10.5170/123-456-789-0
100__ $$aBar, Baz$$uFoo
245__ $$aOn the quux and huux 4
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': '_',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec4 = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789-non-existing</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 5</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
self.xm_testrec5 = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/123-456-789</subfield>
</datafield>
<datafield tag="%(doitag)s" ind1="%(doiind1)s" ind2="%(doiind2)s">
<subfield code="%(doisubfieldcodesource)s">doi</subfield>
<subfield code="%(doisubfieldcodevalue)s">10.5170/987-654-321</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Bar, Baz</subfield>
<subfield code="u">Foo</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">On the quux and huux 6</subfield>
</datafield>
</record>
""" % {'doitag': '024',
'doiind1': '7',
'doiind2': ' ',
'doisubfieldcodevalue': 'a',
'doisubfieldcodesource': '2'
}
def test_insert_the_same_doi_matching_on_doi(self):
"""bibupload - DOI tag, refuse to "insert" twice same DOI (matching on DOI)"""
# insert record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# insert record 2 first time:
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err2, recid2, msg2 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid2)
inserted_xm = print_record(recid2, 'xm')
inserted_hm = print_record(recid2, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec2 = self.xm_testrec2.replace('987654321', str(recid2))
self.hm_testrec2 = self.hm_testrec2.replace('987654321', str(recid2))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec2), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec2), '')
# try to insert again record 1 (without recid, matching on DOI)
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='insert')
self.assertEqual(-1, recid1_updated)
# if we try to update, append or correct, the same record is matched
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='correct')
self.check_record_consistency(recid1_updated)
self.assertEqual(recid1, recid1_updated)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='append')
self.check_record_consistency(recid1_updated)
self.assertEqual(recid1, recid1_updated)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(recid1_updated)
self.assertEqual(recid1, recid1_updated)
def test_insert_the_same_doi_matching_on_recid(self):
"""bibupload - DOI tag, refuse to "insert" twice same DOI (matching on recid)"""
# First upload 2 test records
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid1)
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err2, recid2, msg2 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid2)
# try to update record 2 with DOI already in record 1. It must fail:
testrec_to_update = self.xm_testrec2_to_update.replace('<controlfield tag="001">987654321</controlfield>',
'<controlfield tag="001">%s</controlfield>' % recid2)
recs = bibupload.xml_marc_to_records(testrec_to_update)
err, recid, msg = bibupload.bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(recid)
self.assertEqual(1, err)
# Ditto in correct and append mode
recs = bibupload.xml_marc_to_records(testrec_to_update)
err, recid, msg = bibupload.bibupload(recs[0], opt_mode='correct')
self.check_record_consistency(recid)
self.assertEqual(1, err)
recs = bibupload.xml_marc_to_records(testrec_to_update)
err, recid, msg = bibupload.bibupload(recs[0], opt_mode='append')
self.check_record_consistency(recid)
self.assertEqual(1, err)
def test_insert_or_replace_the_same_doi_record(self):
"""bibupload - DOI tag, allow to insert or replace matching on DOI"""
# insert/replace record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='replace_or_insert')
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to insert/replace updated record 1, it should be okay:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='replace_or_insert')
self.check_record_consistency(recid1_updated)
inserted_xm = print_record(recid1_updated, 'xm')
inserted_hm = print_record(recid1_updated, 'hm')
self.assertEqual(recid1, recid1_updated)
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1_updated = self.xm_testrec1_updated.replace('123456789', str(recid1))
self.hm_testrec1_updated = self.hm_testrec1_updated.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1_updated), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1_updated), '')
def test_correct_the_same_doi_record(self):
"""bibupload - DOI tag, allow to correct matching on DOI"""
# insert/replace record 1 first time:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='replace_or_insert')
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# try to correct updated record 1, it should be okay:
recs = bibupload.xml_marc_to_records(self.xm_testrec1_to_update)
err1_updated, recid1_updated, msg1_updated = bibupload.bibupload(recs[0], opt_mode='correct')
self.check_record_consistency(recid1_updated)
inserted_xm = print_record(recid1_updated, 'xm')
inserted_hm = print_record(recid1_updated, 'hm')
self.assertEqual(recid1, recid1_updated)
# use real recID in test buffers when comparing whether it worked:
self.xm_testrec1_updated = self.xm_testrec1_updated.replace('123456789', str(recid1))
self.hm_testrec1_updated = self.hm_testrec1_updated.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1_updated), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1_updated), '')
def test_replace_nonexisting_doi_record(self):
"""bibupload - DOI tag, refuse to replace non-existing DOI record (matching on DOI)"""
testrec_to_insert_first = self.xm_testrec4
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err4, recid4, msg4 = bibupload.bibupload(recs[0], opt_mode='replace')
self.assertEqual(-1, recid4)
def test_matching_on_doi_source_field(self):
"""bibupload - DOI tag, test matching records using DOI value AND source field ($2)"""
# insert test record 1, with a "fake" doi (not "doi" in source field):
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid1)
inserted_xm = print_record(recid1, 'xm')
inserted_hm = print_record(recid1, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec1 = self.xm_testrec1.replace('123456789', str(recid1))
self.hm_testrec1 = self.hm_testrec1.replace('123456789', str(recid1))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec1), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec1), '')
# insert record 3, whose DOI value equals record 1's "fake"
# (non-doi) identifier; since matching also checks the source
# field ($2), the insert should work.
testrec_to_insert_first = self.xm_testrec3.replace('<controlfield tag="001">192837645</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err3, recid3, msg3 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid3)
inserted_xm = print_record(recid3, 'xm')
inserted_hm = print_record(recid3, 'hm')
# use real recID when comparing whether it worked:
self.xm_testrec3 = self.xm_testrec3.replace('192837645', str(recid3))
self.hm_testrec3 = self.hm_testrec3.replace('192837645', str(recid3))
self.assertEqual(compare_xmbuffers(inserted_xm,
self.xm_testrec3), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
self.hm_testrec3), '')
def test_replace_or_update_record_with_ambiguous_doi(self):
"""bibupload - DOI tag, refuse to replace/correct/append on the basis of ambiguous DOI"""
# First upload 2 test records with two different DOIs:
testrec_to_insert_first = self.xm_testrec1.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err1, recid1, msg1 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid1)
self.assertEqual(0, err1)
testrec_to_insert_first = self.xm_testrec2.replace('<controlfield tag="001">987654321</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec_to_insert_first)
err2, recid2, msg2 = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid2)
self.assertEqual(0, err2)
# Now try to insert record with DOIs matching the records
# previously uploaded. It must fail.
testrec = self.xm_testrec5.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='insert')
self.assertEqual(1, err5)
# Ditto for other modes:
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='replace_or_insert')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='replace')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='correct')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='append')
self.assertEqual(1, err5)
# The same is true if a recid exists in the input MARCXML (as
# long as DOIs are ambiguous):
testrec = self.xm_testrec5.replace('<controlfield tag="001">123456789</controlfield>',
'<controlfield tag="001">%s</controlfield>' % recid1)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='replace_or_insert')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='replace')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='correct')
self.assertEqual(1, err5)
recs = bibupload.xml_marc_to_records(testrec)
err5, recid5, msg5 = bibupload.bibupload(recs[0], opt_mode='append')
self.assertEqual(1, err5)

class BibUploadIndicatorsTest(GenericBibUploadTest):
"""
Testing uploading of a MARCXML record with indicators having
either a blank space (as per the MARC schema) or an empty string
value (old behaviour).
"""
def setUp(self):
"""Initialize the MARCXML test record."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
</record>
"""
self.testrec1_hm = """
003__ SzGeCERN
100__ $$aTest, John$$uTest University
"""
self.testrec2_xm = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
</record>
"""
self.testrec2_hm = """
003__ SzGeCERN
100__ $$aTest, John$$uTest University
"""
def test_record_with_spaces_in_indicators(self):
"""bibupload - inserting MARCXML with spaces in indicators"""
recs = bibupload.xml_marc_to_records(self.testrec1_xm)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(inserted_xm),
self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(remove_tag_001_from_hmbuffer(inserted_hm),
self.testrec1_hm), '')
def test_record_with_no_spaces_in_indicators(self):
"""bibupload - inserting MARCXML with no spaces in indicators"""
recs = bibupload.xml_marc_to_records(self.testrec2_xm)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(inserted_xm),
self.testrec2_xm), '')
self.assertEqual(compare_hmbuffers(remove_tag_001_from_hmbuffer(inserted_hm),
self.testrec2_hm), '')

class BibUploadUpperLowerCaseTest(GenericBibUploadTest):
"""
Testing treatment of similar records with only upper and lower
case value differences in the bibxxx table.
"""
def setUp(self):
"""Initialize the MARCXML test records."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
</record>
"""
self.testrec1_hm = """
003__ SzGeCERN
100__ $$aTest, John$$uTest University
"""
self.testrec2_xm = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1="" ind2="">
<subfield code="a">TeSt, JoHn</subfield>
<subfield code="u">Test UniVeRsity</subfield>
</datafield>
</record>
"""
self.testrec2_hm = """
003__ SzGeCERN
100__ $$aTeSt, JoHn$$uTest UniVeRsity
"""
def test_record_with_upper_lower_case_letters(self):
"""bibupload - inserting similar MARCXML records with upper/lower case"""
# insert test record #1:
recs = bibupload.xml_marc_to_records(self.testrec1_xm)
dummyerr1, recid1, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid1)
recid1_inserted_xm = print_record(recid1, 'xm')
recid1_inserted_hm = print_record(recid1, 'hm')
# insert test record #2:
recs = bibupload.xml_marc_to_records(self.testrec2_xm)
dummyerr1, recid2, _ = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid2)
recid2_inserted_xm = print_record(recid2, 'xm')
recid2_inserted_hm = print_record(recid2, 'hm')
# let us compare stuff now:
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(recid1_inserted_xm),
self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(remove_tag_001_from_hmbuffer(recid1_inserted_hm),
self.testrec1_hm), '')
self.assertEqual(compare_xmbuffers(remove_tag_001_from_xmbuffer(recid2_inserted_xm),
self.testrec2_xm), '')
self.assertEqual(compare_hmbuffers(remove_tag_001_from_hmbuffer(recid2_inserted_hm),
self.testrec2_hm), '')

class BibUploadControlledProvenanceTest(GenericBibUploadTest):
"""Testing treatment of tags under controlled provenance in the correct mode."""
def setUp(self):
"""Initialize the MARCXML test record."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Test title</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">blabla</subfield>
<subfield code="9">sam</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">blublu</subfield>
<subfield code="9">sim</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">human</subfield>
</datafield>
</record>
"""
self.testrec1_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
245__ $$aTest title
6531_ $$9sam$$ablabla
6531_ $$9sim$$ablublu
6531_ $$ahuman
"""
self.testrec1_xm_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">bleble</subfield>
<subfield code="9">sim</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">bloblo</subfield>
<subfield code="9">som</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Test title</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">blabla</subfield>
<subfield code="9">sam</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">human</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">bleble</subfield>
<subfield code="9">sim</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">bloblo</subfield>
<subfield code="9">som</subfield>
</datafield>
</record>
"""
self.testrec1_corrected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
245__ $$aTest title
6531_ $$9sam$$ablabla
6531_ $$ahuman
6531_ $$9sim$$ableble
6531_ $$9som$$abloblo
"""
# insert test record:
test_record_xm = self.testrec1_xm.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_record_xm)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
self.testrec1_xm = self.testrec1_xm.replace('123456789', str(recid))
self.testrec1_hm = self.testrec1_hm.replace('123456789', str(recid))
self.testrec1_xm_to_correct = self.testrec1_xm_to_correct.replace('123456789', str(recid))
self.testrec1_corrected_xm = self.testrec1_corrected_xm.replace('123456789', str(recid))
self.testrec1_corrected_hm = self.testrec1_corrected_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm, self.testrec1_hm), '')
def test_controlled_provenance_persistence(self):
"""bibupload - correct mode, tags with controlled provenance"""
# correct metadata tags; will the protected tags be kept?
recs = bibupload.xml_marc_to_records(self.testrec1_xm_to_correct)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
corrected_xm = print_record(recid, 'xm')
corrected_hm = print_record(recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(corrected_xm, self.testrec1_corrected_xm), '')
self.assertEqual(compare_hmbuffers(corrected_hm, self.testrec1_corrected_hm), '')
class BibUploadStrongTagsTest(GenericBibUploadTest):
"""Testing treatment of strong tags and the replace mode."""
def setUp(self):
"""Initialize the MARCXML test record."""
GenericBibUploadTest.setUp(self)
self.testrec1_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Jane</subfield>
<subfield code="u">Test Institute</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Test title</subfield>
</datafield>
<datafield tag="%(strong_tag)s" ind1=" " ind2=" ">
<subfield code="a">A value</subfield>
<subfield code="b">Another value</subfield>
</datafield>
</record>
""" % {'strong_tag': bibupload.CFG_BIBUPLOAD_STRONG_TAGS[0]}
self.testrec1_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, Jane$$uTest Institute
245__ $$aTest title
%(strong_tag)s__ $$aA value$$bAnother value
""" % {'strong_tag': bibupload.CFG_BIBUPLOAD_STRONG_TAGS[0]}
self.testrec1_xm_to_replace = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
</record>
"""
self.testrec1_replaced_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Joseph</subfield>
<subfield code="u">Test Academy</subfield>
</datafield>
<datafield tag="%(strong_tag)s" ind1=" " ind2=" ">
<subfield code="a">A value</subfield>
<subfield code="b">Another value</subfield>
</datafield>
</record>
""" % {'strong_tag': bibupload.CFG_BIBUPLOAD_STRONG_TAGS[0]}
self.testrec1_replaced_hm = """
001__ 123456789
100__ $$aTest, Joseph$$uTest Academy
%(strong_tag)s__ $$aA value$$bAnother value
""" % {'strong_tag': bibupload.CFG_BIBUPLOAD_STRONG_TAGS[0]}
# insert test record:
test_record_xm = self.testrec1_xm.replace('<controlfield tag="001">123456789</controlfield>',
'')
recs = bibupload.xml_marc_to_records(test_record_xm)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recID:
self.testrec1_xm = self.testrec1_xm.replace('123456789', str(recid))
self.testrec1_hm = self.testrec1_hm.replace('123456789', str(recid))
self.testrec1_xm_to_replace = self.testrec1_xm_to_replace.replace('123456789', str(recid))
self.testrec1_replaced_xm = self.testrec1_replaced_xm.replace('123456789', str(recid))
self.testrec1_replaced_hm = self.testrec1_replaced_hm.replace('123456789', str(recid))
# test of the inserted record:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm, self.testrec1_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm, self.testrec1_hm), '')
def test_strong_tags_persistence(self):
"""bibupload - strong tags, persistence in replace mode"""
# replace all metadata tags; will the strong tags be kept?
recs = bibupload.xml_marc_to_records(self.testrec1_xm_to_replace)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(recid)
replaced_xm = print_record(recid, 'xm')
replaced_hm = print_record(recid, 'hm')
# did it work?
self.assertEqual(compare_xmbuffers(replaced_xm, self.testrec1_replaced_xm), '')
self.assertEqual(compare_hmbuffers(replaced_hm, self.testrec1_replaced_hm), '')
class BibUploadPretendTest(GenericBibUploadTest):
"""
Testing bibupload --pretend correctness.
"""
def setUp(self):
- from invenio.bibtask import task_set_task_param
+ from invenio.legacy.bibsched.bibtask import task_set_task_param
GenericBibUploadTest.setUp(self)
self.demo_data = bibupload.xml_marc_to_records(open(os.path.join(CFG_TMPDIR, 'demobibdata.xml')).read())[0]
self.before = self._get_tables_fingerprint()
task_set_task_param('pretend', True)
def tearDown(self):
- from invenio.bibtask import task_set_task_param
+ from invenio.legacy.bibsched.bibtask import task_set_task_param
GenericBibUploadTest.tearDown(self)
task_set_task_param('pretend', False)
@staticmethod
def _get_tables_fingerprint():
"""
Take the length and last modification time of all the tables that
"""
fingerprint = {}
tables = ['bibrec', 'bibdoc', 'bibrec_bibdoc', 'bibdoc_bibdoc', 'bibfmt', 'hstDOCUMENT', 'hstRECORD', 'bibHOLDINGPEN', 'bibdocmoreinfo', 'bibdocfsinfo']
for i in xrange(100):
tables.append('bib%02dx' % i)
tables.append('bibrec_bib%02dx' % i)
for table in tables:
fingerprint[table] = get_table_status_info(table)
return fingerprint
@staticmethod
def _checks_tables_fingerprints(before, after):
"""
Checks differences in table_fingerprints.
"""
for table in before.keys():
if before[table] != after[table]:
raise StandardError("Table %s has been modified: before was [%s], after was [%s]" % (table, pprint.pformat(before[table]), pprint.pformat(after[table])))
return True
def test_pretend_insert(self):
"""bibupload - pretend insert"""
bibupload.bibupload_records([self.demo_data], opt_mode='insert', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_correct(self):
"""bibupload - pretend correct"""
bibupload.bibupload_records([self.demo_data], opt_mode='correct', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_replace(self):
"""bibupload - pretend replace"""
bibupload.bibupload_records([self.demo_data], opt_mode='replace', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_append(self):
"""bibupload - pretend append"""
bibupload.bibupload_records([self.demo_data], opt_mode='append', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_replace_or_insert(self):
"""bibupload - pretend replace or insert"""
bibupload.bibupload_records([self.demo_data], opt_mode='replace_or_insert', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_holdingpen(self):
"""bibupload - pretend holdingpen"""
bibupload.bibupload_records([self.demo_data], opt_mode='holdingpen', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_delete(self):
"""bibupload - pretend delete"""
bibupload.bibupload_records([self.demo_data], opt_mode='delete', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
def test_pretend_reference(self):
"""bibupload - pretend reference"""
bibupload.bibupload_records([self.demo_data], opt_mode='reference', pretend=True)
self.failUnless(self._checks_tables_fingerprints(self.before, self._get_tables_fingerprint()))
class BibUploadHoldingPenTest(GenericBibUploadTest):
"""
Testing the Holding Pen usage.
"""
def setUp(self):
- from invenio.bibtask import task_set_task_param, setup_loggers
+ from invenio.legacy.bibsched.bibtask import task_set_task_param, setup_loggers
GenericBibUploadTest.setUp(self)
self.verbose = 9
setup_loggers()
task_set_task_param('verbose', self.verbose)
self.recid = 10
self.oai_id = "oai:cds.cern.ch:CERN-EP-2001-094"
def test_holding_pen_upload_with_recid(self):
"""bibupload - holding pen upload with recid"""
test_to_upload = """<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<controlfield tag="001">%s</controlfield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Kleefeld, F</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Newcomer, Y</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Rupp, G</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Scadron, M D</subfield>
</datafield>
</record>
</collection>""" % self.recid
recs = bibupload.xml_marc_to_records(test_to_upload)
bibupload.insert_record_into_holding_pen(recs[0], "")
res = run_sql("SELECT changeset_xml FROM bibHOLDINGPEN WHERE id_bibrec=%s", (self.recid, ))
self.failUnless("Rupp, G" in res[0][0])
def test_holding_pen_upload_with_oai_id(self):
"""bibupload - holding pen upload with oai_id"""
test_to_upload = """<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Kleefeld, F</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Newcomer, Y</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Rupp, G</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Scadron, M D</subfield>
</datafield>
<datafield tag="%(extoaiidtag)s" ind1="%(extoaiidind1)s" ind2="%(extoaiidind2)s">
<subfield code="%(extoaiidsubfieldcode)s">%(value)s</subfield>
</datafield>
</record>
</collection>""" % {'extoaiidtag': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[0:3],
'extoaiidind1': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[3:4] or " ",
'extoaiidind2': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] != "_" and \
CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[4:5] or " ",
'extoaiidsubfieldcode': CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG[5:6],
'value': self.oai_id
}
recs = bibupload.xml_marc_to_records(test_to_upload)
bibupload.insert_record_into_holding_pen(recs[0], self.oai_id)
res = run_sql("SELECT changeset_xml FROM bibHOLDINGPEN WHERE id_bibrec=%s AND oai_id=%s", (self.recid, self.oai_id))
self.failUnless("Rupp, G" in res[0][0])
def tearDown(self):
GenericBibUploadTest.tearDown(self)
run_sql("DELETE FROM bibHOLDINGPEN WHERE id_bibrec=%s", (self.recid, ))
class BibUploadFFTModeTest(GenericBibUploadTest):
"""
Testing treatment of fulltext file transfer import mode.
"""
def _test_bibdoc_status(self, recid, docname, status):
res = run_sql('SELECT bd.status FROM bibrec_bibdoc as bb JOIN bibdoc as bd ON bb.id_bibdoc = bd.id WHERE bb.id_bibrec = %s AND bb.docname = %s', (recid, docname))
self.failUnless(res)
self.assertEqual(status, res[0][0])
def test_writing_rights(self):
"""bibupload - FFT has writing rights"""
self.failUnless(bibupload.writing_rights_p())
def test_simple_fft_insert(self):
"""bibupload - simple FFT insert"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.failUnless(try_url_download(testrec_expected_url))
def test_fft_insert_with_valid_embargo(self):
"""bibupload - FFT insert with valid embargo"""
# define the test case:
future_date = time.strftime('%Y-%m-%d', time.gmtime(time.time() + 24 * 3600 * 2))
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="r">firerole: deny until '%(future_date)s'
allow any</subfield>
</datafield>
</record>
""" % {
'future_date': future_date,
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
result = urlopen(testrec_expected_url).read()
self.failUnless("This file is restricted." in result, result)
def test_fft_insert_with_expired_embargo(self):
"""bibupload - FFT insert with expired embargo"""
# define the test case:
past_date = time.strftime('%Y-%m-%d', time.gmtime(time.time() - 24 * 3600 * 2))
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="r">firerole: deny until '%(past_date)s'
allow any</subfield>
</datafield>
</record>
""" % {
'past_date': past_date,
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
980__ $$aARTICLE
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
result = urlopen(testrec_expected_url).read()
self.failIf("If you already have an account, please login using the form below." in result, result)
self.assertEqual(test_web_page_content(testrec_expected_url, 'hyde', 'h123yde', expected_text='Authorization failure'), [])
force_webcoll(recid)
self.assertEqual(test_web_page_content(testrec_expected_url, 'hyde', 'h123yde', expected_text=urlopen("%(siteurl)s/img/site_logo.gif" % {
'siteurl': CFG_SITE_URL
}).read()), [])
def test_exotic_format_fft_append(self):
"""bibupload - exotic format FFT append"""
# define the test case:
from invenio.modules.access.local_config import CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS
testfile = os.path.join(CFG_TMPDIR, 'test.ps.Z')
open(testfile, 'w').write('TEST')
email_tag = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][0:3]
email_ind1 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][3]
email_ind2 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][4]
email_code = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][5]
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
</record>
""" % {
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_to_append = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%s</subfield>
</datafield>
</record>
""" % testfile
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/test.ps.Z</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
%(email_tag)s%(email_ind1)s%(email_ind2)s $$%(email_code)sjekyll@cds.cern.ch
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/test.ps.Z
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == ' ' and '_' or email_ind1,
'email_ind2': email_ind2 == ' ' and '_' or email_ind2,
'email_code': email_code}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/test.ps.Z" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url2 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/test?format=ps.Z" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_to_append = testrec_to_append.replace('123456789',
str(recid))
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
testrec_expected_url2 = testrec_expected_url2.replace('123456789',
str(recid))
recs = bibupload.xml_marc_to_records(testrec_to_append)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='append')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.assertEqual(test_web_page_content(testrec_expected_url, 'jekyll', 'j123ekyll', expected_text='TEST'), [])
self.assertEqual(test_web_page_content(testrec_expected_url2, 'jekyll', 'j123ekyll', expected_text='TEST'), [])
def test_fft_check_md5_through_bibrecdoc_str(self):
"""bibupload - simple FFT insert, check md5 through BibRecDocs.str()"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%s/img/head.gif</subfield>
</datafield>
</record>
""" % CFG_SITE_URL
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
original_md5 = md5(urlopen('%s/img/head.gif' % CFG_SITE_URL).read()).hexdigest()
bibrec_str = str(BibRecDocs(int(recid)))
md5_found = False
for row in bibrec_str.split('\n'):
if 'checksum' in row:
if original_md5 in row:
md5_found = True
self.failUnless(md5_found)
def test_detailed_fft_insert(self):
"""bibupload - detailed FFT insert"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="t">SuperMain</subfield>
<subfield code="d">This is a description</subfield>
<subfield code="z">This is a comment</subfield>
<subfield code="n">CIDIESSE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/rss.png</subfield>
<subfield code="t">SuperMain</subfield>
<subfield code="f">.jpeg</subfield>
<subfield code="d">This is a description</subfield>
<subfield code="z">This is a second comment</subfield>
<subfield code="n">CIDIESSE</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.gif</subfield>
<subfield code="y">This is a description</subfield>
<subfield code="z">This is a comment</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.jpeg</subfield>
<subfield code="y">This is a description</subfield>
<subfield code="z">This is a second comment</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.gif$$yThis is a description$$zThis is a comment
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.jpeg$$yThis is a description$$zThis is a second comment
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url1 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.gif" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url2 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/CIDIESSE.jpeg" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url1 = testrec_expected_url1.replace('123456789',
str(recid))
testrec_expected_url2 = testrec_expected_url2.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.failUnless(try_url_download(testrec_expected_url1))
self.failUnless(try_url_download(testrec_expected_url2))
def test_simple_fft_insert_with_restriction(self):
"""bibupload - simple FFT insert with restriction"""
# define the test case:
from invenio.modules.access.local_config import CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS
email_tag = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][0:3]
email_ind1 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][3]
email_ind2 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][4]
email_code = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][5]
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="r">thesis</subfield>
<subfield code="x">%(siteurl)s/img/sb.gif</subfield>
</datafield>
</record>
""" % {'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code,
'siteurl': CFG_SITE_URL}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon</subfield>
<subfield code="x">icon</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
%(email_tag)s%(email_ind1)s%(email_ind2)s $$%(email_code)sjekyll@cds.cern.ch
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon$$xicon
980__ $$aARTICLE
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == ' ' and '_' or email_ind1,
'email_ind2': email_ind2 == ' ' and '_' or email_ind2,
'email_code': email_code}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_icon = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
testrec_expected_icon = testrec_expected_icon.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.assertEqual(test_web_page_content(testrec_expected_icon, 'jekyll', 'j123ekyll', expected_text=urlopen('%(siteurl)s/img/sb.gif' % {
'siteurl': CFG_SITE_URL
}).read()), [])
self.assertEqual(test_web_page_content(testrec_expected_icon, 'hyde', 'h123yde', expected_text='Authorization failure'), [])
force_webcoll(recid)
self.assertEqual(test_web_page_content(testrec_expected_icon, 'hyde', 'h123yde', expected_text=urlopen('%(siteurl)s/img/restricted.gif' % {'siteurl': CFG_SITE_URL}).read()), [])
self.failUnless("HTTP Error 401: Unauthorized" in test_web_page_content(testrec_expected_url, 'hyde', 'h123yde')[0])
self.failUnless("This file is restricted." in urlopen(testrec_expected_url).read())

def test_simple_fft_insert_with_icon(self):
"""bibupload - simple FFT insert with icon"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="x">%(siteurl)s/img/sb.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon</subfield>
<subfield code="x">icon</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon$$xicon
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_icon = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
testrec_expected_icon = testrec_expected_icon.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.failUnless(try_url_download(testrec_expected_url))
self.failUnless(try_url_download(testrec_expected_icon))

def test_multiple_fft_insert(self):
"""bibupload - multiple FFT insert"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/head.gif</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/%(CFG_SITE_RECORD)s/95/files/9809057.pdf</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(prefix)s/var/tmp/demobibdata.xml</subfield>
</datafield>
</record>
""" % {
'prefix': CFG_PREFIX,
'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/9809057.pdf</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/demobibdata.xml</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/9809057.pdf
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/demobibdata.xml
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_urls = []
for files in ('site_logo.gif', 'head.gif', '9809057.pdf', 'demobibdata.xml'):
testrec_expected_urls.append('%(siteurl)s/%(CFG_SITE_RECORD)s/%(recid)s/files/%(files)s' % {'siteurl' : CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD, 'files' : files, 'recid' : recid})
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
for url in testrec_expected_urls:
self.failUnless(try_url_download(url))
self._test_bibdoc_status(recid, 'head', '')
self._test_bibdoc_status(recid, '9809057', '')
self._test_bibdoc_status(recid, 'site_logo', '')
self._test_bibdoc_status(recid, 'demobibdata', '')

def test_simple_fft_correct(self):
"""bibupload - simple FFT correct"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/sb.gif</subfield>
<subfield code="n">site_logo</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'site_logo', '')

def test_fft_correct_already_exists(self):
"""bibupload - FFT correct with already identical existing file"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">a description</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/help.png</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="d">another description</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/rss.png</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/line.gif</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/merge.png</subfield>
<subfield code="n">line</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">a second description</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/help.png</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="d">another second description</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/refresh.png</subfield>
<subfield code="n">rss</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/line.gif</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/merge-small.png</subfield>
<subfield code="n">line</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.png</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/rss.png</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
<subfield code="y">a second description</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png</subfield>
<subfield code="y">another second description</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.gif
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.png
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/rss.png
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif$$ya second description
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png$$yanother second description
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url2 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/rss.png" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url3 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url4 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.png" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url5 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/line.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
testrec_expected_url2 = testrec_expected_url2.replace('123456789',
str(recid))
testrec_expected_url3 = testrec_expected_url3.replace('123456789',
str(recid))
testrec_expected_url4 = testrec_expected_url4.replace('123456789',
str(recid))
testrec_expected_url5 = testrec_expected_url5.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload(recs[0], opt_mode='correct')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.failUnless(try_url_download(testrec_expected_url2))
self.failUnless(try_url_download(testrec_expected_url3))
self.failUnless(try_url_download(testrec_expected_url4))
self.failUnless(try_url_download(testrec_expected_url5))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
bibrecdocs = BibRecDocs(recid)
self.assertEqual(bibrecdocs.get_bibdoc('rss').list_versions(), [1, 2])
self.assertEqual(bibrecdocs.get_bibdoc('site_logo').list_versions(), [1])
self.assertEqual(bibrecdocs.get_bibdoc('line').list_versions(), [1, 2])

def test_fft_correct_modify_doctype(self):
"""bibupload - FFT correct with different doctype"""
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">a description</subfield>
<subfield code="t">TEST1</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="n">site_logo</subfield>
<subfield code="t">TEST2</subfield>
</datafield>
</record>
"""
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
<subfield code="y">a description</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='insert')
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
bibrecdocs = BibRecDocs(recid)
self.assertEqual(bibrecdocs.get_bibdoc('site_logo').doctype, 'TEST1')
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload(recs[0], opt_mode='correct')
# compare expected results:
inserted_xm = print_record(recid, 'xm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
bibrecdocs = BibRecDocs(recid)
self.assertEqual(bibrecdocs.get_bibdoc('site_logo').doctype, 'TEST2')

def test_fft_append_already_exists(self):
"""bibupload - FFT append with already identical existing file"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">a description</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_append = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">a second description</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/help.png</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="d">another second description</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
<subfield code="y">a description</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png</subfield>
<subfield code="y">another second description</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif$$ya description
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png$$yanother second description
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url2 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.png" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
_, recid, _ = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_append = test_to_append.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_append)
err, recid, msg = bibupload.bibupload(recs[0], opt_mode='append')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.failUnless(try_url_download(testrec_expected_url2))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')

def test_fft_implicit_fix_marc(self):
"""bibupload - FFT implicit FIX-MARC"""
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="0" ind2=" ">
<subfield code="f">foo@bar.com</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="f">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="856" ind1="0" ind2=" ">
<subfield code="f">foo@bar.com</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="0" ind2=" ">
<subfield code="f">foo@bar.com</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8560_ $$ffoo@bar.com
8564_ $$u%(siteurl)s/img/site_logo.gif
""" % {
'siteurl': CFG_SITE_URL
}
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
test_to_correct = test_to_correct.replace('123456789',
str(recid))
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
# correct test record with implicit FIX-MARC:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')

def test_fft_vs_bibedit(self):
"""bibupload - FFT vs. BibEdit compatibility"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_replace = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">http://www.google.com/</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="z">BibEdit Comment</subfield>
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
<subfield code="y">BibEdit Description</subfield>
<subfield code="x">01</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">http://cern.ch/</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_xm = test_to_replace
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$uhttp://www.google.com/
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif$$x01$$yBibEdit Description$$zBibEdit Comment
8564_ $$uhttp://cern.ch/
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_replace = test_to_replace.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_replace)
bibupload.bibupload_records(recs, opt_mode='replace')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'site_logo', '')
bibrecdocs = BibRecDocs(recid)
bibdoc = bibrecdocs.get_bibdoc('site_logo')
self.assertEqual(bibdoc.get_description('.gif'), 'BibEdit Description')

def test_detailed_fft_correct(self):
"""bibupload - detailed FFT correct"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">Try</subfield>
<subfield code="z">Comment</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/head.gif</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="m">patata</subfield>
<subfield code="d">Next Try</subfield>
<subfield code="z">KEEP-OLD-VALUE</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif</subfield>
<subfield code="y">Next Try</subfield>
<subfield code="z">Comment</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif$$yNext Try$$zComment
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '', "buffers not equal: %s and %s" % (inserted_xm, testrec_expected_xm))
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '', "buffers not equal: %s and %s" % (inserted_hm, testrec_expected_hm))
self._test_bibdoc_status(recid, 'patata', '')

def test_no_url_fft_correct(self):
"""bibupload - no_url FFT correct"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">Try</subfield>
<subfield code="z">Comment</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="n">site_logo</subfield>
<subfield code="m">patata</subfield>
<subfield code="f">.gif</subfield>
<subfield code="d">KEEP-OLD-VALUE</subfield>
<subfield code="z">Next Comment</subfield>
</datafield>
</record>
"""
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif</subfield>
<subfield code="y">Try</subfield>
<subfield code="z">Next Comment</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif$$yTry$$zNext Comment
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'patata', '')
def test_new_icon_fft_append(self):
"""bibupload - new icon FFT append"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
</record>
"""
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="n">site_logo</subfield>
<subfield code="x">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon</subfield>
<subfield code="x">icon</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon$$xicon
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif?subformat=icon" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='append')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'site_logo', '')
def test_multiple_fft_correct(self):
"""bibupload - multiple FFT correct"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="d">Try</subfield>
<subfield code="z">Comment</subfield>
<subfield code="r">Restricted</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/okay.gif</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="f">.jpeg</subfield>
<subfield code="d">Try jpeg</subfield>
<subfield code="z">Comment jpeg</subfield>
<subfield code="r">Restricted</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/loading.gif</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="m">patata</subfield>
<subfield code="f">.gif</subfield>
<subfield code="r">New restricted</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/patata.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless("This file is restricted." in urlopen(testrec_expected_url).read())
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'patata', 'New restricted')
def test_purge_fft_correct(self):
"""bibupload - purge FFT correct"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/head.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_purge = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="t">PURGE</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
test_to_purge = test_to_purge.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')[0]
self.check_record_consistency(recid)
# purge test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_purge)
bibupload.bibupload_records(recs, opt_mode='correct')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'site_logo', '')
self._test_bibdoc_status(recid, 'head', '')
def test_revert_fft_correct(self):
"""bibupload - revert FFT correct"""
# define the test case:
from invenio.modules.access.local_config import CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS
email_tag = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][0:3]
email_ind1 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][3]
email_ind2 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][4]
email_code = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][5]
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/iconpen.gif</subfield>
<subfield code="n">site_logo</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%s/img/head.gif</subfield>
<subfield code="n">site_logo</subfield>
</datafield>
</record>
""" % CFG_SITE_URL
test_to_revert = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="n">site_logo</subfield>
<subfield code="t">REVERT</subfield>
<subfield code="v">1</subfield>
</datafield>
</record>
"""
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
%(email_tag)s%(email_ind1)s%(email_ind2)s $$%(email_code)sjekyll@cds.cern.ch
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
""" % {'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == ' ' and '_' or email_ind1,
'email_ind2': email_ind2 == ' ' and '_' or email_ind2,
'email_code': email_code}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
test_to_revert = test_to_revert.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
bibupload.bibupload_records(recs, opt_mode='correct')
self.check_record_consistency(recid)
# revert test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_revert)
bibupload.bibupload_records(recs, opt_mode='correct')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self._test_bibdoc_status(recid, 'site_logo', '')
expected_content_version1 = urlopen('%s/img/iconpen.gif' % CFG_SITE_URL).read()
expected_content_version2 = urlopen('%s/img/head.gif' % CFG_SITE_URL).read()
expected_content_version3 = expected_content_version1
self.assertEqual(test_web_page_content('%s/%s/%s/files/site_logo.gif?version=1' % (CFG_SITE_URL, CFG_SITE_RECORD, recid), 'jekyll', 'j123ekyll', expected_content_version1), [])
self.assertEqual(test_web_page_content('%s/%s/%s/files/site_logo.gif?version=2' % (CFG_SITE_URL, CFG_SITE_RECORD, recid), 'jekyll', 'j123ekyll', expected_content_version2), [])
self.assertEqual(test_web_page_content('%s/%s/%s/files/site_logo.gif?version=3' % (CFG_SITE_URL, CFG_SITE_RECORD, recid), 'jekyll', 'j123ekyll', expected_content_version3), [])
def test_simple_fft_replace(self):
"""bibupload - simple FFT replace"""
# define the test case:
from invenio.modules.access.local_config import CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS
email_tag = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][0:3]
email_ind1 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][3]
email_ind2 = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][4]
email_code = CFG_ACC_GRANT_AUTHOR_RIGHTS_TO_EMAILS_IN_TAGS[0][5]
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/iconpen.gif</subfield>
<subfield code="n">site_logo</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
test_to_replace = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/head.gif</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="%(email_tag)s" ind1="%(email_ind1)s" ind2="%(email_ind2)s">
<subfield code="%(email_code)s">jekyll@cds.cern.ch</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == '_' and ' ' or email_ind1,
'email_ind2': email_ind2 == '_' and ' ' or email_ind2,
'email_code': email_code}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
%(email_tag)s%(email_ind1)s%(email_ind2)s $$%(email_code)sjekyll@cds.cern.ch
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif
""" % {
'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
'email_tag': email_tag,
'email_ind1': email_ind1 == ' ' and '_' or email_ind1,
'email_ind2': email_ind2 == ' ' and '_' or email_ind2,
'email_code': email_code}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/head.gif" % { 'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_replace = test_to_replace.replace('123456789',
str(recid))
# replace test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_replace)
bibupload.bibupload_records(recs, opt_mode='replace')
self.check_record_consistency(recid)
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.failUnless(try_url_download(testrec_expected_url))
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
expected_content_version = urlopen('%s/img/head.gif' % CFG_SITE_URL).read()
self.assertEqual(test_web_page_content(testrec_expected_url, 'hyde', 'h123yde', expected_text='Authorization failure'), [])
self.assertEqual(test_web_page_content(testrec_expected_url, 'jekyll', 'j123ekyll', expected_text=expected_content_version), [])
def test_simple_fft_insert_with_modification_time(self):
"""bibupload - simple FFT insert with modification time"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="s">2006-05-04 03:02:01</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_xm = """
<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
</record>
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_hm = """
001__ 123456789
003__ SzGeCERN
100__ $$aTest, John$$uTest University
8564_ $$u%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif
980__ $$aARTICLE
""" % {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/site_logo.gif" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
testrec_expected_url2 = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload_records(recs, opt_mode='insert')[0]
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_xm = testrec_expected_xm.replace('123456789',
str(recid))
testrec_expected_hm = testrec_expected_hm.replace('123456789',
str(recid))
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
testrec_expected_url2 = testrec_expected_url2.replace('123456789',
str(recid))
# compare expected results:
inserted_xm = print_record(recid, 'xm')
inserted_hm = print_record(recid, 'hm')
self.assertEqual(compare_xmbuffers(inserted_xm,
testrec_expected_xm), '')
self.assertEqual(compare_hmbuffers(inserted_hm,
testrec_expected_hm), '')
self.failUnless(try_url_download(testrec_expected_url))
force_webcoll(recid)
self.assertEqual(test_web_page_content(testrec_expected_url2, expected_text='<em>04 May 2006, 03:02</em>'), [])
def test_multiple_fft_insert_with_modification_time(self):
"""bibupload - multiple FFT insert with modification time"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="s">2006-05-04 03:02:01</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/head.gif</subfield>
<subfield code="s">2007-05-04 03:02:01</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/%(CFG_SITE_RECORD)s/95/files/9809057.pdf</subfield>
<subfield code="s">2008-05-04 03:02:01</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(prefix)s/var/tmp/demobibdata.xml</subfield>
<subfield code="s">2009-05-04 03:02:01</subfield>
</datafield>
</record>
""" % {
'prefix': CFG_PREFIX,
'siteurl': CFG_SITE_URL,
'CFG_SITE_RECORD': CFG_SITE_RECORD,
}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
force_webcoll(recid)
self.assertEqual(test_web_page_content(testrec_expected_url, expected_text=['<em>04 May 2006, 03:02</em>', '<em>04 May 2007, 03:02</em>', '<em>04 May 2008, 03:02</em>', '<em>04 May 2009, 03:02</em>']), [])
def test_simple_fft_correct_with_modification_time(self):
"""bibupload - simple FFT correct with modification time"""
# define the test case:
test_to_upload = """
<record>
<controlfield tag="003">SzGeCERN</controlfield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, John</subfield>
<subfield code="u">Test University</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">ARTICLE</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/site_logo.gif</subfield>
<subfield code="s">2007-05-04 03:02:01</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
test_to_correct = """
<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">%(siteurl)s/img/sb.gif</subfield>
<subfield code="n">site_logo</subfield>
<subfield code="s">2008-05-04 03:02:01</subfield>
</datafield>
</record>
""" % {
'siteurl': CFG_SITE_URL
}
testrec_expected_url = "%(siteurl)s/%(CFG_SITE_RECORD)s/123456789/files/" \
% {'siteurl': CFG_SITE_URL, 'CFG_SITE_RECORD': CFG_SITE_RECORD}
# insert test record:
recs = bibupload.xml_marc_to_records(test_to_upload)
dummy, recid, dummy = bibupload.bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(recid)
# replace test buffers with real recid of inserted test record:
testrec_expected_url = testrec_expected_url.replace('123456789',
str(recid))
test_to_correct = test_to_correct.replace('123456789',
str(recid))
# correct test record with new FFT:
recs = bibupload.xml_marc_to_records(test_to_correct)
err, recid, msg = bibupload.bibupload(recs[0], opt_mode='correct')
self.check_record_consistency(recid)
force_webcoll(recid)
self.assertEqual(test_web_page_content(testrec_expected_url, expected_text=['<em>04 May 2008, 03:02</em>']), [])
TEST_SUITE = make_test_suite(BibUploadNoUselessHistoryTest,
BibUploadHoldingPenTest,
BibUploadInsertModeTest,
BibUploadAppendModeTest,
BibUploadCorrectModeTest,
BibUploadDeleteModeTest,
BibUploadReplaceModeTest,
BibUploadReferencesModeTest,
BibUploadRecordsWithSYSNOTest,
BibUploadRecordsWithEXTOAIIDTest,
BibUploadRecordsWithOAIIDTest,
BibUploadIndicatorsTest,
BibUploadUpperLowerCaseTest,
BibUploadControlledProvenanceTest,
BibUploadStrongTagsTest,
BibUploadFFTModeTest,
BibUploadPretendTest,
BibUploadCallbackURLTest,
BibUploadMoreInfoTest,
BibUploadBibRelationsTest,
BibUploadRecordsWithDOITest,
BibUploadTypicalBibEditSessionTest,
BibUploadRealCaseRemovalDOIViaBibEdit,
)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_bibupload_revisionverifier.py b/invenio_demosite/testsuite/regression/test_bibupload_revisionverifier.py
index f809681c8..719e89461 100644
--- a/invenio_demosite/testsuite/regression/test_bibupload_revisionverifier.py
+++ b/invenio_demosite/testsuite/regression/test_bibupload_revisionverifier.py
@@ -1,1118 +1,1118 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""
Contains test cases for the Revision Verifier module used along with BibUpload.
"""
from invenio.base.wrappers import lazy_import
from invenio.testsuite import make_test_suite, run_test_suite, nottest
get_record = lazy_import('invenio.legacy.search_engine:get_record')
print_record = lazy_import('invenio.legacy.search_engine:print_record')
bibupload = lazy_import('invenio.bibupload:bibupload')
xml_marc_to_records = lazy_import('invenio.bibupload:xml_marc_to_records')
record_get_field_value = lazy_import('invenio.legacy.bibrecord:record_get_field_value')
record_xml_output = lazy_import('invenio.legacy.bibrecord:record_xml_output')
-from invenio.bibupload_revisionverifier \
+from invenio.legacy.bibupload.revisionverifier \
import RevisionVerifier, \
InvenioBibUploadConflictingRevisionsError, \
InvenioBibUploadMissing005Error, \
InvenioBibUploadUnchangedRecordError, \
InvenioBibUploadInvalidRevisionError
-from invenio.bibupload_regression_tests import GenericBibUploadTest, \
+from invenio.legacy.bibupload.engine_regression_tests import GenericBibUploadTest, \
compare_xmbuffers
from invenio.testutils import make_test_suite, run_test_suite, nottest
from invenio.dbquery import run_sql
@nottest
def init_test_records():
"""
Initializes test records for revision-verifying scenarios.
Inserts the 1st version and then appends a new field every second
to create the 2nd and 3rd versions of the record.
Returns a dict of the following format:
{'id':recid,
'rev1':(rev1_rec, rev1_005),
'rev2':(rev2_rec, rev2_005tag),
'rev3':(rev3_rec, rev3_005tag)}
"""
# Rev 1 -- tag 100
rev1 = """ <record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
</record>"""
# Append 970 to Rev1
rev1_append = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
# Rev 2 -- Rev 1 + tag 970
rev2 = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
# Append 888 to Rev2
rev2_append = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
# Rev 3 -- Rev 2 + tag 888
rev3 = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
init_details = {}
insert_record = rev1.replace(
'<controlfield tag="001">123456789</controlfield>', '')
insert_record = insert_record.replace(
'<controlfield tag="005">20110101000000.0</controlfield>', '')
recs = xml_marc_to_records(insert_record)
# --> Revision 1 submitted
res = bibupload(recs[0], opt_mode='insert')
recid = res[1]
init_details['id'] = (str(recid), )
rec = get_record(recid)
rev_tag = record_get_field_value(rec, '005', '', '')
# update the test data
rev1 = rev1.replace('123456789', str(recid))
rev1 = rev1.replace('20110101000000.0', rev_tag)
rev1_append = rev1_append.replace('123456789', str(recid))
rev2 = rev2.replace('123456789', str(recid))
rev2 = rev2.replace('20110101000000.0', rev_tag)
rev2_append = rev2_append.replace('123456789', str(recid))
rev3 = rev3.replace('123456789', str(recid))
init_details['rev1'] = (rev1, rev_tag)
old_rev_tag = rev_tag
# --> Revision 2 submitted
recs = xml_marc_to_records(rev1_append)
res = bibupload(recs[0], opt_mode='append')
rec = get_record(recid)
rev_tag = record_get_field_value(rec, '005')
rev2 = rev2.replace(old_rev_tag, rev_tag)
rev3 = rev3.replace('20110101000000.0', rev_tag)
init_details['rev2'] = (rev2, rev_tag)
old_rev_tag = rev_tag
# --> Revision 3 submitted
recs = xml_marc_to_records(rev2_append)
res = bibupload(recs[0], opt_mode='append')
rec = get_record(recid)
rev_tag = record_get_field_value(rec, '005')
rev3 = rev3.replace(old_rev_tag, rev_tag)
init_details['rev3'] = (rev3, rev_tag)
return init_details
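# A minimal illustration (hypothetical literal values) of the dict shape
# returned by init_test_records(): 'id' is a 1-tuple holding the recid as a
# string, and each 'revN' key maps to an (xml_buffer, 005_revision_tag)
# pair -- which is why the tests below index self.data['id'][0] and
# self.data['rev2'][1]:
#
#     data = init_test_records()
#     recid = data['id'][0]              # e.g. '123456789'
#     rev2_xml, rev2_005 = data['rev2']  # MARCXML buffer and its 005 tag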
class RevisionVerifierForCorrectAddition(GenericBibUploadTest):
"""
Test Cases for Patch generation when fields added in Upload Record.
Scenarios:
* Field added in Upload Record and not added in Original Record
* Another instance of existing Field added in Upload Record and
not added in Original Record
"""
def setUp(self):
""" Sets up sample records for the added-field scenarios."""
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
# Rev 2 Update -- Rev2 + tag 300
self.rev2_add_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
<datafield tag="300" ind1=" " ind2=" ">
<subfield code="a">100P</subfield>
</datafield>
</record>"""
#Rev 2 Update -- Rev2 + tag 100*
self.rev2_add_sim_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="100" ind1="C" ind2="0">
<subfield code="a">Devel, D</subfield>
<subfield code="u">FUZZY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
# Record Patch -- Output for a New Field
self.patch = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag="300" ind1=" " ind2=" ">
<subfield code="a">100P</subfield>
</datafield>
</record>"""
# Record Patch -- Output for a New Identical Field
self.patch_identical_field = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1="C" ind2="0">
<subfield code="a">Devel, D</subfield>
<subfield code="u">FUZZY</subfield>
</datafield>
</record>"""
self.rev2_add_field = self.rev2_add_field.replace(
'123456789', self.data['id'][0])
self.rev2_add_field = self.rev2_add_field.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.rev2_add_sim_field = self.rev2_add_sim_field.replace(
'123456789', self.data['id'][0])
self.rev2_add_sim_field = self.rev2_add_sim_field.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.patch = self.patch.replace('123456789', self.data['id'][0])
self.patch_identical_field = self.patch_identical_field.replace(
'123456789', \
self.data['id'][0])
def test_add_new_field(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Added 300 to Rev2(100/970), Patch Generated for 300"""
upload_recs = xml_marc_to_records(self.rev2_add_field)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(upload_recs[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
self.assertEqual(compare_xmbuffers(record_xml_output(patch), self.patch), '')
def test_add_identical_field(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Added 100 to Rev2(100/970), Patch Generated for 100"""
upload_identical_rec = xml_marc_to_records(self.rev2_add_sim_field)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(upload_identical_rec[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
self.assertEqual(compare_xmbuffers(record_xml_output(patch), self.patch_identical_field), '')
class RevisionVerifierForConflictingAddition(GenericBibUploadTest):
"""
Test Cases for Conflicts when fields added in Upload Record.
Scenarios:
* Field added in Upload Record but also added in Original Record
* Field added in Upload Record but similar field modified in Original
"""
def setUp(self):
""" Sets up sample records for the added-field scenarios."""
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
# Rev 2 Update -- Rev2 + tag 888
self.rev2_add_conf_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
#Rev 2 Update -- Rev2 + tag 100*
self.rev2_add_sim_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="100" ind1="C" ind2="0">
<subfield code="a">Devel, D</subfield>
<subfield code="u">FUZZY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
</record>"""
#Rev 3 -- Rev2 + tag 100* +tag 888
self.rev3_add_sim_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="100" ind1="C" ind2="1">
<subfield code="a">Devel, D</subfield>
<subfield code="z">FUZZY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
# Rev 3 -- tag 100 updated from Rev 2 + Tag 888
self.rev3_mod = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="z">DEVEL, U</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
self.rev2_add_conf_field = self.rev2_add_conf_field.replace(
'123456789', self.data['id'][0])
self.rev2_add_conf_field = self.rev2_add_conf_field.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.rev2_add_sim_field = self.rev2_add_sim_field.replace(
'123456789', self.data['id'][0])
self.rev2_add_sim_field = self.rev2_add_sim_field.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.rev3_mod = self.rev3_mod.replace('123456789', self.data['id'][0])
self.rev3_mod = self.rev3_mod.replace('20110101000000.0', \
self.data['rev3'][1])
self.rev3_add_sim_field = self.rev3_add_sim_field.replace(
'123456789', \
self.data['id'][0])
self.rev3_add_sim_field = self.rev3_add_sim_field.replace(
'20110101000000.0', \
self.data['rev3'][1])
def test_add_conflict_field(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Added 888 to Rev2(100/970), Conflict Expected"""
upload_conf_rec = xml_marc_to_records(self.rev2_add_conf_field)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
self.assertRaises(InvenioBibUploadConflictingRevisionsError, \
rev_verifier.verify_revision, \
upload_conf_rec[0], \
orig_recs[0], \
'replace')
def test_conflicting_similarfield(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Added 100 to Rev2(100/970), 100 added to Rev3, Conflict Expected"""
upload_identical_rec = xml_marc_to_records(self.rev2_add_sim_field)
orig_recs = xml_marc_to_records(self.rev3_add_sim_field)
rev_verifier = RevisionVerifier()
self.assertRaises(InvenioBibUploadConflictingRevisionsError, \
rev_verifier.verify_revision, \
upload_identical_rec[0], \
orig_recs[0], \
'replace')
def test_conflicting_modfield(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Added 100 to Rev2(100/970), Rev3 100 modified, Conflict Expected"""
upload_identical_rec = xml_marc_to_records(self.rev2_add_sim_field)
orig_recs = xml_marc_to_records(self.rev3_mod)
rev_verifier = RevisionVerifier()
self.assertRaises(InvenioBibUploadConflictingRevisionsError, \
rev_verifier.verify_revision, \
upload_identical_rec[0], \
orig_recs[0], \
'replace')
class RevisionVerifierForCorrectModification(GenericBibUploadTest):
"""
Test Cases for Patch generation when fields are modified.
Scenarios:
* Fields modified in Upload Record but not modified in Original Record
"""
def setUp(self):
""" Sets up sample records for Modified Fields Scenarios."""
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
# Rev 2 Update -- Rev2 ~ tag 970 Modified
self.rev2_mod_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PZOOPZOO</subfield>
</datafield>
</record>"""
# Modify Record Patch Output
self.patch = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PZOOPZOO</subfield>
</datafield>
</record>"""
# Scenario 2 - 970CP added to existing record
# Rev 2 Update -- Rev2 ~ tag 970CP Added
self.rev2_mod_field_diff_ind = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHOPHO</subfield>
</datafield>
<datafield tag ="970" ind1="C" ind2="P">
<subfield code="a">0003719XYZOXYZO</subfield>
</datafield>
</record>"""
# Modify Record Patch Output
self.patch_diff_ind = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="970" ind1="C" ind2="P">
<subfield code="a">0003719XYZOXYZO</subfield>
</datafield>
</record>"""
# Scenario 3 - 970__ deleted and 970CP added to existing record
# Rev 2 Update -- Rev2 ~ tag 970CP Added
self.rev2_mod_del_one_add_one = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1="C" ind2="P">
<subfield code="a">0003719XYZOXYZO</subfield>
</datafield>
</record>"""
# Modify Record Patch Output - 1st possibility
self.patch_del_one_add_one = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="0">__DELETED_FIELDS__</subfield>
</datafield>
<datafield tag ="970" ind1="C" ind2="P">
<subfield code="a">0003719XYZOXYZO</subfield>
</datafield>
</record>"""
# Modify Record Patch Output - 2nd possibility
self.patch_del_one_add_one_2 = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="970" ind1="C" ind2="P">
<subfield code="a">0003719XYZOXYZO</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="0">__DELETED_FIELDS__</subfield>
</datafield>
</record>"""
self.rev2_mod_field = self.rev2_mod_field.replace(
'123456789', \
self.data['id'][0])
self.rev2_mod_field = self.rev2_mod_field.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.patch = self.patch.replace('123456789', self.data['id'][0])
self.rev2_mod_field_diff_ind = self.rev2_mod_field_diff_ind.replace(
'123456789', \
self.data['id'][0])
self.rev2_mod_field_diff_ind = self.rev2_mod_field_diff_ind.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.patch_diff_ind = self.patch_diff_ind.replace('123456789', self.data['id'][0])
self.rev2_mod_del_one_add_one = self.rev2_mod_del_one_add_one.replace(
'123456789', \
self.data['id'][0])
self.rev2_mod_del_one_add_one = self.rev2_mod_del_one_add_one.replace(
'20110101000000.0', \
self.data['rev2'][1])
self.patch_del_one_add_one = self.patch_del_one_add_one.replace('123456789', self.data['id'][0])
self.patch_del_one_add_one_2 = self.patch_del_one_add_one_2.replace('123456789', self.data['id'][0])
def test_modified_fields(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Modified 970 in Rev2(100/970), Patch Generated for 970"""
upload_recs = xml_marc_to_records(self.rev2_mod_field)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(
upload_recs[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
self.assertEqual(compare_xmbuffers(record_xml_output(patch), self.patch), '')
def test_correcting_added_field_with_diff_ind(self):
""" BibUpload Revision Verifier - Rev3-100/970__/888, Added 970CP in Rev2(100/970__), Patch Generated for 970CP"""
upload_recs = xml_marc_to_records(self.rev2_mod_field_diff_ind)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(
upload_recs[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
self.assertEqual(compare_xmbuffers(record_xml_output(patch), self.patch_diff_ind), '')
def test_correcting_del_field_add_field_diff_ind(self):
""" BibUpload Revision Verifier - Rev3-100/970__/888, Deleted 970__ and Added 970CP in Rev2(100/970__), Patch Generated for 970__/970CP"""
upload_recs = xml_marc_to_records(self.rev2_mod_del_one_add_one)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(
upload_recs[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
# NOTE: when a patch contains multiple fields, compare it against every
# possible ordering of the expected output, since the generated patch
# dictionary does not guarantee field order.
# compare_xmbuffers() returns '' on a match, so accept the patch if it
# matches either expected ordering.
self.assertTrue((compare_xmbuffers(record_xml_output(patch), self.patch_del_one_add_one) == '') \
or (compare_xmbuffers(record_xml_output(patch), self.patch_del_one_add_one_2) == ''))
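# A self-contained sketch (hypothetical helper, pure stdlib) of the idea
# behind comparing the patch against two expected variants above: the two
# buffers differ only in <datafield> order, so a patch can be accepted when
# its fields match any ordering. Here a trivial canonicaliser sorts the
# <datafield> blocks before comparing; compare_xmbuffers() itself belongs to
# the bibupload test utilities and is not redefined here.
import re

def _canonical_datafields(xml_buffer):
    """Return the <datafield> blocks of a MARCXML buffer as a sorted list,
    so two buffers that differ only in field order compare equal."""
    return sorted(re.findall(r'<datafield.*?</datafield>', xml_buffer, re.S))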
class RevisionVerifierForConflictingModification(GenericBibUploadTest):
"""
Test Cases for Revision Verifier when fields modified are conflicting.
Scenarios:
* Fields modified in both Upload Record and Original Record
* Fields modified in Upload record but deleted from Original Record
"""
def setUp(self):
""" Sets up sample records for Modified Fields Scenarios."""
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
# Rev 2 Update -- Rev2 ~ tag 970 Modified
self.rev2_mod_field = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PZOOPZOO</subfield>
</datafield>
</record>"""
# Rev 3 Modified -- Rev3 ~ tag 970 modified - Conflict with Rev2-Update
self.rev3_mod = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHYPHY</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
# Rev 3 Modified -- Rev3 ~ tag 970 deleted - Conflict with Rev2-Update
self.rev3_deleted = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
self.rev2_mod_field = self.rev2_mod_field.replace('123456789', \
self.data['id'][0])
self.rev2_mod_field = self.rev2_mod_field.replace('20110101000000.0', \
self.data['rev2'][1])
self.rev3_mod = self.rev3_mod.replace('123456789', self.data['id'][0])
self.rev3_mod = self.rev3_mod.replace('20110101000000.0', \
self.data['rev3'][1])
self.rev3_deleted = self.rev3_deleted.replace('123456789', \
self.data['id'][0])
self.rev3_deleted = self.rev3_deleted.replace('20110101000000.0', \
self.data['rev3'][1])
def test_conflicting_modified_field(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Modified 970 in Rev2(100/970), 970 modified in Rev3, Conflict Expected"""
upload_conf_recs = xml_marc_to_records(self.rev2_mod_field)
orig_recs = xml_marc_to_records(self.rev3_mod)
rev_verifier = RevisionVerifier()
self.assertRaises(
InvenioBibUploadConflictingRevisionsError, \
rev_verifier.verify_revision, \
upload_conf_recs[0], \
orig_recs[0], \
'replace')
def test_conflicting_deleted_field(self):
""" BibUpload Revision Verifier - Rev3-100/970/888, Modified 970 in Rev2(100/970), 970 removed in Rev3, Conflict Expected"""
upload_conf_recs = xml_marc_to_records(self.rev2_mod_field)
orig_recs = xml_marc_to_records(self.rev3_deleted)
rev_verifier = RevisionVerifier()
self.assertRaises(
InvenioBibUploadConflictingRevisionsError, \
rev_verifier.verify_revision, \
upload_conf_recs[0], \
orig_recs[0], \
'replace')
class RevisionVerifierForDeletingFields(GenericBibUploadTest):
"""
Test Cases for Revision Verifier when fields are to be deleted from the upload record.
Scenarios:
* Fields present in the Original Record but removed from the Upload Record,
so that a special DELETE-FIELD patch is generated
"""
def setUp(self):
""" Sets up sample records for Modified Fields Scenarios."""
GenericBibUploadTest.setUp(self)
# Rev 1
self.rev1 = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="300" ind1=" " ind2=" ">
<subfield code="a">Test, Field-1</subfield>
</datafield>
<datafield tag ="300" ind1=" " ind2=" ">
<subfield code="a">Test, Field-2</subfield>
</datafield>
<datafield tag ="300" ind1="C" ind2="P">
<subfield code="a">Test, Field-3</subfield>
</datafield>
</record>"""
# Rev 1 -- To Replace
self.rev1_mod = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
</record>"""
# Patch with SPECIAL DELETE FIELD-1
self.patch_1 = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="300" ind1=" " ind2=" ">
<subfield code="0">__DELETE_FIELDS__</subfield>
</datafield>
<datafield tag ="300" ind1="C" ind2="P">
<subfield code="0">__DELETE_FIELDS__</subfield>
</datafield>
</record>"""
# Patch with SPECIAL DELETE FIELD-2
self.patch_2 = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="300" ind1="C" ind2="P">
<subfield code="0">__DELETE_FIELDS__</subfield>
</datafield>
<datafield tag ="300" ind1=" " ind2=" ">
<subfield code="0">__DELETE_FIELDS__</subfield>
</datafield>
</record>"""
self.rev_to_insert = self.rev1.replace('<controlfield tag="001">123456789</controlfield>', '')
self.rev_to_insert = self.rev_to_insert.replace('<controlfield tag="005">20110101000000.0</controlfield>','')
rec = xml_marc_to_records(self.rev_to_insert)
dummy_error, self.recid, dummy_msg = bibupload(rec[0], opt_mode='insert')
self.check_record_consistency(self.recid)
self.rev1 = self.rev1.replace('123456789', str(self.recid))
self.rev1_mod = self.rev1_mod.replace('123456789', str(self.recid))
self.patch_1 = self.patch_1.replace('123456789', str(self.recid))
self.patch_2 = self.patch_2.replace('123456789', str(self.recid))
record = get_record(self.recid)
rev = record_get_field_value(record, '005')
self.rev1 = self.rev1.replace('20110101000000.0', rev)
self.rev1_mod = self.rev1_mod.replace('20110101000000.0', rev)
def test_for_special_delete_field(self):
""" BibUpload Revision Verifier - Rev1-100/300, Modified 100 in Rev1-Mod, Deleted 300 in Rev1-Mod (100/300), Patch for DELETE generated"""
upload_rec = xml_marc_to_records(self.rev1_mod)
orig_rec = xml_marc_to_records(self.rev1)
rev_verifier = RevisionVerifier()
(opt_mode, final_patch, dummy_affected_tags) = rev_verifier.verify_revision(upload_rec[0], \
orig_rec[0], \
'replace')
self.assertEqual('correct', opt_mode)
# compare_xmbuffers() returns '' on a match; the two expected patches only
# differ in field order, so accept either one.
self.assertTrue((compare_xmbuffers(self.patch_1, record_xml_output(final_patch)) == '') or \
(compare_xmbuffers(self.patch_2, record_xml_output(final_patch)) == ''))
class RevisionVerifierForInterchangedFields(GenericBibUploadTest):
"""
Contains Test Cases for Re-ordered Fields.
Scenarios include:
* Same set of fields but in different order
"""
def setUp(self):
""" Sets up sample records for Modified Fields Scenarios."""
GenericBibUploadTest.setUp(self)
# Rev 1 -- 100-1/100-2/100-3
self.rev1 = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester1, T</subfield>
<subfield code="u">DESY1</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester2, T</subfield>
<subfield code="u">DESY2</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester3, T</subfield>
<subfield code="u">DESY3</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHYPHY</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
# Rev 1 Modified -- 100-2/100-3/100-1
self.rev1_mod = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester2, T</subfield>
<subfield code="u">DESY2</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester3, T</subfield>
<subfield code="u">DESY3</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester1, T</subfield>
<subfield code="u">DESY1</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PHYPHY</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
self.patch = """<record>
<controlfield tag="001">123456789</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester2, T</subfield>
<subfield code="u">DESY2</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester3, T</subfield>
<subfield code="u">DESY3</subfield>
</datafield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester1, T</subfield>
<subfield code="u">DESY1</subfield>
</datafield>
</record>"""
insert_record = self.rev1.replace(
'<controlfield tag="001">123456789</controlfield>', '')
insert_record = insert_record.replace(
'<controlfield tag="005">20110101000000.0</controlfield>', '')
recs = xml_marc_to_records(insert_record)
# --> Revision 1 submitted
res = bibupload(recs[0], opt_mode='insert')
self.recid = res[1]
self.check_record_consistency(self.recid)
rec = get_record(self.recid)
rev_tag = record_get_field_value(rec, '005', '', '')
# update the test data
self.rev1 = self.rev1.replace('123456789', str(self.recid))
self.rev1 = self.rev1.replace('20110101000000.0', rev_tag)
self.rev1_mod = self.rev1_mod.replace('123456789', str(self.recid))
self.rev1_mod = self.rev1_mod.replace('20110101000000.0', rev_tag)
self.patch = self.patch.replace('123456789', str(self.recid))
def test_interchanged_fields(self):
""" BibUpload Revision Verifier - Rev1--100-1/100-2/100-3/970/888, Rev1-Up--100-2/100-3/100-1/970/888, Patch Generated for 100"""
upload_recs = xml_marc_to_records(self.rev1_mod)
orig_recs = xml_marc_to_records(self.rev1)
rev_verifier = RevisionVerifier()
(opt_mode, patch, dummy_affected_tags) = rev_verifier.verify_revision(
upload_recs[0], \
orig_recs[0], \
'replace')
self.assertEqual('correct', opt_mode)
self.assertEqual(compare_xmbuffers(record_xml_output(patch), self.patch), '')
class RevisionVerifierForCommonCases(GenericBibUploadTest):
"""
Contains Test Cases for Common Scenarios.
Scenarios include:
* Invalid Revision
* Invalid opt_mode value
* Missing Revision in Upload Record
"""
def setUp(self):
""" Set up all the sample records required for Test Cases."""
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
# Rev 2 Update -- Rev2 ~ tag 970 Modified
self.rev2_modified = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester, T</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="970" ind1=" " ind2=" ">
<subfield code="a">0003719PZOOPZOO</subfield>
</datafield>
</record>"""
self.rev2_modified = self.rev2_modified.replace('123456789', \
self.data['id'][0])
def test_unchanged_record_upload(self):
""" BibUpload Revision Verifier - Uploading Unchanged Record, Raise UnchangedRecordError"""
upload_recs = xml_marc_to_records(self.data['rev3'][0])
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
self.assertRaises(InvenioBibUploadUnchangedRecordError, \
rev_verifier.verify_revision, \
upload_recs[0], \
orig_recs[0], \
'replace')
def test_missing_revision(self):
""" BibUpload Revision Verifier - Missing 005 Tag scenario, Raise Missing005Error."""
self.rev2_modified = self.rev2_modified.replace(
'<controlfield tag="005">20110101000000.0</controlfield>', \
'')
upload_recs = xml_marc_to_records(self.rev2_modified)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
self.assertRaises(InvenioBibUploadMissing005Error, \
rev_verifier.verify_revision, \
upload_recs[0], \
orig_recs[0], \
'replace')
def test_invalid_operation(self):
""" BibUpload Revision Verifier - Incorrect opt_mode parameter."""
upload_recs = xml_marc_to_records(self.rev2_modified)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
rev_verifier = RevisionVerifier()
for item in ['append', 'format', 'insert', 'delete', 'reference']:
self.assertEqual(rev_verifier.verify_revision(
upload_recs[0], \
orig_recs[0], \
item), None)
def test_invalid_revision(self):
""" BibUpload Revision Verifier - Wrong Revision in the Upload Record, Raise InvalidRevisionError"""
self.rev2_modified = self.rev2_modified.replace(
'<controlfield tag="005">20110101000000.0</controlfield>', \
'<controlfield tag="005">20110101020304.0</controlfield>')
rev_verifier = RevisionVerifier()
upload_recs = xml_marc_to_records(self.rev2_modified)
orig_recs = xml_marc_to_records(self.data['rev3'][0])
self.assertRaises(InvenioBibUploadInvalidRevisionError, \
rev_verifier.verify_revision, \
upload_recs[0], \
orig_recs[0], \
'replace')
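# Summary (inferred from the assertions in this file) of the
# RevisionVerifier.verify_revision() contract exercised above:
#   * returns ('correct', patch_record, affected_tags) when a
#     non-conflicting patch can be generated in 'replace' mode;
#   * returns None for opt_mode values other than 'replace'
#     (see test_invalid_operation);
#   * raises InvenioBibUploadConflictingRevisionsError,
#     InvenioBibUploadMissing005Error,
#     InvenioBibUploadUnchangedRecordError or
#     InvenioBibUploadInvalidRevisionError for the corresponding
#     conflict scenarios.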
class RevisionVerifierFromBibUpload(GenericBibUploadTest):
""" Test Case for End-to-End Bibupload with Revision Verifier module Enabled """
def setUp(self):
""" Set up all the sample records required for Test Cases."""
GenericBibUploadTest.setUp(self)
# Rev 1 -- To Insert
self.rev1 = """<record>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="870" ind1=" " ind2=" ">
<subfield code="a">3719PZOOPZOO</subfield>
</datafield>
</record>"""
# Rev 1 Modified -- To Replace
self.rev1_modified = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="870" ind1=" " ind2=" ">
<subfield code="a">3719PZOOPZOO_modified</subfield>
</datafield>
</record>"""
# Rev 2 Update -- Rev2
self.rev2 = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="870" ind1=" " ind2=" ">
<subfield code="a">3719PZOOPZOO</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
# Rev 2 Modified -- Rev2 ~ tag 870 modified
self.rev2_modified = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="870" ind1=" " ind2=" ">
<subfield code="a">3719PZOOPZOO_another modification</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
self.final_xm = """<record>
<controlfield tag="001">123456789</controlfield>
<controlfield tag="005">20110101000000.0</controlfield>
<datafield tag ="100" ind1=" " ind2=" ">
<subfield code="a">Tester</subfield>
<subfield code="u">DESY</subfield>
</datafield>
<datafield tag ="870" ind1=" " ind2=" ">
<subfield code="a">3719PZOOPZOO_modified</subfield>
</datafield>
<datafield tag="888" ind1=" " ind2=" ">
<subfield code="a">dumb text</subfield>
</datafield>
</record>"""
def test_BibUpload_revision_verifier(self):
""" BibUpload Revision Verifier - Called from BibUpload Operation - Patch & Conflict Scenarios"""
recs = xml_marc_to_records(self.rev1)
# --> Revision 1 submitted
error, self.recid, dummy_msg = bibupload(recs[0], opt_mode='insert')
self.check_record_consistency(self.recid)
record = get_record(self.recid)
rev = record_get_field_value(record, '005', '', '')
self.rev2 = self.rev2.replace('123456789', str(self.recid))
self.rev2 = self.rev2.replace('20110101000000.0', rev)
self.rev1_modified = self.rev1_modified.replace('123456789', str(self.recid))
self.rev1_modified = self.rev1_modified.replace('20110101000000.0', rev)
self.final_xm = self.final_xm.replace('123456789', str(self.recid))
recs = xml_marc_to_records(self.rev2)
# --> Revision 2 submitted
error, self.recid, dummy_msg = bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(self.recid)
record = get_record(self.recid)
self.rev2 = self.rev2.replace(rev, record_get_field_value(record, '005', '', ''))
self.rev2_modified = self.rev2_modified.replace('123456789', str(self.recid))
self.rev2_modified = self.rev2_modified.replace('20110101000000.0', record_get_field_value(record, '005', '', ''))
# --> Revision 1 modified submitted
recs = xml_marc_to_records(self.rev1_modified)
error, self.recid, dummy_msg = bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(self.recid)
record = get_record(self.recid)
rev = record_get_field_value(record, '005', '', '')
self.final_xm = self.final_xm.replace('20110101000000.0', rev)
self.assertEqual(compare_xmbuffers(self.final_xm, print_record(self.recid, 'xm')), '')
# --> Revision 2 modified submitted
recs = xml_marc_to_records(self.rev2_modified)
error, self.recid, dummy_msg = bibupload(recs[0], opt_mode='replace')
self.check_record_consistency(self.recid)
self.assertEqual(error, 2)
class RevisionVerifierHistoryOfAffectedFields(GenericBibUploadTest):
"""Checks if column 'affected fields' from hstRECORD table
is filled correctly"""
def setUp(self):
GenericBibUploadTest.setUp(self)
self.data = init_test_records()
def test_inserted_record_with_no_affected_tags_in_hst(self):
"""Checks if inserted record has affected fields in hstRECORD table"""
query = "SELECT affected_fields from hstRECORD where id_bibrec=5 ORDER BY job_date DESC"
res = run_sql(query)
self.assertEqual(res[0][0], "")
def test_corrected_record_affected_tags(self):
"""Checks if corrected record has affected fields in hstRECORD table"""
query = "SELECT affected_fields from hstRECORD where id_bibrec=12 ORDER BY job_date DESC"
res = run_sql(query)
self.assertEqual(res[0][0], "005__%,8564_%,909C0%,909C1%,909C5%,909CO%,909CS%")
def test_append_to_record_affected_tags(self):
"""Checks if record with appended parts has proper affected fields in hstRECORD table"""
query = """SELECT affected_fields from hstRECORD where id_bibrec=%s
ORDER BY job_date DESC""" % self.data["id"][0]
res = run_sql(query)
self.assertEqual(res[0][0], '005__%,888__%')
self.assertEqual(res[1][0], '005__%,970__%')
self.assertEqual(res[2][0], '')
TEST_SUITE = make_test_suite(RevisionVerifierForCorrectAddition,
RevisionVerifierForCorrectModification,
RevisionVerifierForInterchangedFields,
RevisionVerifierForDeletingFields,
RevisionVerifierForConflictingAddition,
RevisionVerifierForConflictingModification,
RevisionVerifierForCommonCases,
RevisionVerifierFromBibUpload,
RevisionVerifierHistoryOfAffectedFields)
if __name__ == '__main__':
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_solrutils.py b/invenio_demosite/testsuite/regression/test_solrutils.py
index 6460bfe1a..0e1881e73 100644
--- a/invenio_demosite/testsuite/regression/test_solrutils.py
+++ b/invenio_demosite/testsuite/regression/test_solrutils.py
@@ -1,404 +1,404 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from invenio.testutils import InvenioTestCase
from invenio.config import CFG_SOLR_URL, CFG_SITE_URL, CFG_SITE_NAME
from invenio.testsuite import make_test_suite, \
run_test_suite, \
test_web_page_content, \
nottest
from invenio import intbitset
-from invenio.solrutils_bibindex_searcher import solr_get_bitset
+from invenio.legacy.miscutil.solrutils_bibindex_searcher import solr_get_bitset
from invenio.solrutils_bibrank_searcher import solr_get_ranked, solr_get_similar_ranked
from invenio.legacy.search_engine import get_collection_reclist
from invenio.legacy.bibrank.bridge_utils import get_external_word_similarity_ranker, \
get_logical_fields, \
get_tags, \
get_field_content_in_utf8
ROWS = 100
HITSETS = {
'Willnotfind': intbitset.intbitset([]),
'higgs': intbitset.intbitset([47, 48, 51, 52, 55, 56, 58, 68, 79, 85, 89, 96]),
'of': intbitset.intbitset([8, 10, 11, 12, 15, 43, 44, 45, 46, 47, 48, 49, 50, 51,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 64, 68, 74,
77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97]),
'"higgs boson"': intbitset.intbitset([55, 56]),
}
def get_topN(n, data):
"""Return, for each key in data, only the top (last) n ranked items (n=0 keeps all)."""
res = dict()
for key, value in data.iteritems():
res[key] = value[-n:]
return res
class TestSolrSearch(InvenioTestCase):
"""Test for Solr search. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
AND EITHER
Solr index built: ./bibindex -w fulltext for all records
OR
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
and ./bibrank -w wrd for all records
"""
def _get_result(self, query, index='fulltext'):
return solr_get_bitset(index, query)
@nottest
def test_get_bitset(self):
"""solrutils - search results"""
self.assertEqual(HITSETS['Willnotfind'], self._get_result('Willnotfind'))
self.assertEqual(HITSETS['higgs'], self._get_result('higgs'))
self.assertEqual(HITSETS['of'], self._get_result('of'))
self.assertEqual(HITSETS['"higgs boson"'], self._get_result('"higgs boson"'))
class TestSolrRanking(InvenioTestCase):
"""Test for Solr ranking. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
AND EITHER
Solr index built: ./bibindex -w fulltext for all records
OR
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
and ./bibrank -w wrd for all records
"""
def _get_ranked_result_sequence(self, query, index='fulltext', rows=ROWS, hitset=None):
if hitset is None:
hitset = HITSETS[query]
ranked_result = solr_get_ranked('%s:%s' % (index, query), hitset, self._get_ranking_params(), rows)
return tuple([pair[0] for pair in ranked_result[0]])
def _get_ranked_topN(self, n):
return get_topN(n, self._RANKED)
_RANKED = {
'Willnotfind': tuple(),
'higgs': (79, 51, 55, 47, 56, 96, 58, 68, 52, 48, 89, 85),
'of': (50, 61, 60, 54, 56, 53, 10, 68, 44, 57, 83, 95, 92, 91, 74, 45, 48, 62, 82,
49, 51, 89, 90, 96, 43, 8, 64, 97, 15, 85, 78, 46, 55, 79, 84, 88, 81, 52,
58, 86, 11, 80, 93, 77, 12, 59, 87, 47, 94),
'"higgs boson"': (55, 56),
}
def _get_ranking_params(self, cutoff_amount=10000, cutoff_time=2000):
"""
Default values from template_word_similarity_solr.cfg
"""
return {
'cutoff_amount': cutoff_amount,
'cutoff_time_ms': cutoff_time
}
@nottest
def test_get_ranked(self):
"""solrutils - ranking results"""
all_ranked = 0
ranked_top = self._get_ranked_topN(all_ranked)
self.assertEqual(ranked_top['Willnotfind'], self._get_ranked_result_sequence(query='Willnotfind'))
self.assertEqual(ranked_top['higgs'], self._get_ranked_result_sequence(query='higgs'))
self.assertEqual(ranked_top['of'], self._get_ranked_result_sequence(query='of'))
self.assertEqual(ranked_top['"higgs boson"'], self._get_ranked_result_sequence(query='"higgs boson"'))
@nottest
def test_get_ranked_top(self):
"""solrutils - ranking top results"""
top_n = 0
self.assertEqual(tuple(), self._get_ranked_result_sequence(query='Willnotfind', rows=top_n))
self.assertEqual(tuple(), self._get_ranked_result_sequence(query='higgs', rows=top_n))
self.assertEqual(tuple(), self._get_ranked_result_sequence(query='of', rows=top_n))
self.assertEqual(tuple(), self._get_ranked_result_sequence(query='"higgs boson"', rows=top_n))
top_n = 2
ranked_top = self._get_ranked_topN(top_n)
self.assertEqual(ranked_top['Willnotfind'], self._get_ranked_result_sequence(query='Willnotfind', rows=top_n))
self.assertEqual(ranked_top['higgs'], self._get_ranked_result_sequence(query='higgs', rows=top_n))
self.assertEqual(ranked_top['of'], self._get_ranked_result_sequence(query='of', rows=top_n))
self.assertEqual(ranked_top['"higgs boson"'], self._get_ranked_result_sequence(query='"higgs boson"', rows=top_n))
top_n = 10
ranked_top = self._get_ranked_topN(top_n)
self.assertEqual(ranked_top['Willnotfind'], self._get_ranked_result_sequence(query='Willnotfind', rows=top_n))
self.assertEqual(ranked_top['higgs'], self._get_ranked_result_sequence(query='higgs', rows=top_n))
self.assertEqual(ranked_top['of'], self._get_ranked_result_sequence(query='of', rows=top_n))
self.assertEqual(ranked_top['"higgs boson"'], self._get_ranked_result_sequence(query='"higgs boson"', rows=top_n))
@nottest
def test_get_ranked_smaller_hitset(self):
"""solrutils - ranking smaller hitset"""
hitset = intbitset.intbitset([47, 56, 58, 68, 85, 89])
self.assertEqual((47, 56, 58, 68, 89, 85), self._get_ranked_result_sequence(query='higgs', hitset=hitset))
hitset = intbitset.intbitset([45, 50, 61, 74, 94])
self.assertEqual((50, 61, 74, 45, 94), self._get_ranked_result_sequence(query='of', hitset=hitset))
self.assertEqual((74, 45, 94), self._get_ranked_result_sequence(query='of', hitset=hitset, rows=3))
@nottest
def test_get_ranked_larger_hitset(self):
"""solrutils - ranking larger hitset"""
hitset = intbitset.intbitset([47, 56, 58, 68, 85, 89])
self.assertEqual(tuple(), self._get_ranked_result_sequence(query='Willnotfind', hitset=hitset))
hitset = intbitset.intbitset([47, 56, 55, 56, 58, 68, 85, 89])
self.assertEqual((55, 56), self._get_ranked_result_sequence(query='"higgs boson"', hitset=hitset))
class TestSolrSimilarToRecid(InvenioTestCase):
"""Test for Solr similar ranking. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
./bibrank -w wrd for all records
"""
def _get_similar_result_sequence(self, recid, rows=ROWS):
similar_result = solr_get_similar_ranked(recid, self._all_records, self._get_similar_ranking_params(), rows)
return tuple([pair[0] for pair in similar_result[0]])[-rows:]
def _get_similar_topN(self, n):
return get_topN(n, self._SIMILAR)
_SIMILAR = {
30: (12, 95, 85, 82, 44, 1, 89, 64, 58, 15, 96, 61, 50, 86, 78, 77, 65, 62, 60,
47, 46, 100, 99, 102, 91, 80, 7, 5, 92, 88, 74, 57, 55, 108, 84, 81, 79, 54,
101, 11, 103, 94, 48, 83, 72, 63, 2, 68, 51, 53, 97, 93, 70, 45, 52, 14,
59, 6, 10, 32, 33, 29, 30),
59: (17, 69, 3, 20, 109, 14, 22, 33, 28, 24, 60, 6, 73, 113, 5, 107, 78, 4, 13,
8, 45, 72, 74, 46, 104, 63, 71, 44, 87, 70, 103, 92, 57, 49, 7, 88, 68, 77,
62, 10, 93, 2, 65, 55, 43, 94, 96, 1, 11, 99, 91, 61, 51, 15, 64, 97, 89, 101,
108, 80, 86, 90, 54, 95, 102, 47, 100, 79, 83, 48, 12, 81, 82, 58, 50, 56, 84,
85, 53, 52, 59)
}
def _get_similar_ranking_params(self, cutoff_amount=10000, cutoff_time=2000):
"""
Default values from template_word_similarity_solr.cfg
"""
return {
'cutoff_amount': cutoff_amount,
'cutoff_time_ms': cutoff_time,
'find_similar_to_recid': {
'more_results_factor': 5,
'mlt_fl': 'mlt',
'mlt_mintf': 0,
'mlt_mindf': 0,
'mlt_minwl': 0,
'mlt_maxwl': 0,
'mlt_maxqt': 25,
'mlt_maxntp': 1000,
'mlt_boost': 'false'
}
}
_all_records = get_collection_reclist(CFG_SITE_NAME)
@nottest
def test_get_similar_ranked(self):
"""solrutils - similar results"""
all_ranked = 0
similar_top = self._get_similar_topN(all_ranked)
recid = 30
self.assertEqual(similar_top[recid], self._get_similar_result_sequence(recid=recid))
recid = 59
self.assertEqual(similar_top[recid], self._get_similar_result_sequence(recid=recid))
@nottest
def test_get_similar_ranked_top(self):
"""solrutils - similar top results"""
top_n = 5
similar_top = self._get_similar_topN(top_n)
recid = 30
self.assertEqual(similar_top[recid], self._get_similar_result_sequence(recid=recid, rows=top_n))
recid = 59
self.assertEqual(similar_top[recid], self._get_similar_result_sequence(recid=recid, rows=top_n))
class TestSolrWebSearch(InvenioTestCase):
"""Test for webbased Solr search. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
AND EITHER
Solr index built: ./bibindex -w fulltext for all records
OR
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
and ./bibrank -w wrd for all records
"""
@nottest
def test_get_result(self):
"""solrutils - web search results"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3AWillnotfind&rg=100',
expected_text="[]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Ahiggs&rg=100',
expected_text="[12, 47, 48, 51, 52, 55, 56, 58, 68, 79, 80, 81, 85, 89, 96]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Aof&rg=100',
expected_text="[8, 10, 11, 12, 15, 43, 44, 45, 46, 47, 48, 49, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 64, 68, 74, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3A%22higgs+boson%22&rg=100',
expected_text="[12, 47, 51, 55, 56, 68, 81, 85]"))
class TestSolrWebRanking(InvenioTestCase):
"""Test for webbased Solr ranking. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
AND EITHER
Solr index built: ./bibindex -w fulltext for all records
OR
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
and ./bibrank -w wrd for all records
"""
@nottest
def test_get_ranked(self):
"""solrutils - web ranking results"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3AWillnotfind&rg=100&rm=wrd',
expected_text="[]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Ahiggs&rm=wrd',
expected_text="[12, 51, 79, 80, 81, 55, 47, 56, 96, 58, 68, 52, 48, 89, 85]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Ahiggs&rg=100&rm=wrd',
expected_text="[12, 80, 81, 79, 51, 55, 47, 56, 96, 58, 68, 52, 48, 89, 85]"))
# Record 77 is restricted
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Aof&rm=wrd',
expected_text="[8, 10, 15, 43, 44, 45, 46, 48, 49, 51, 52, 53, 54, 55, 56, 57, 58, 60, 61, 62, 64, 68, 74, 78, 79, 81, 82, 83, 84, 85, 88, 89, 90, 91, 92, 95, 96, 97, 86, 11, 80, 93, 77, 12, 59, 87, 47, 94]",
username='admin'))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3Aof&rg=100&rm=wrd',
expected_text="[61, 60, 54, 56, 53, 10, 68, 44, 57, 83, 95, 92, 91, 74, 45, 48, 62, 82, 49, 51, 89, 90, 96, 43, 8, 64, 97, 15, 85, 78, 46, 55, 79, 84, 88, 81, 52, 58, 86, 11, 80, 93, 77, 12, 59, 87, 47, 94]",
username='admin'))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=fulltext%3A%22higgs+boson%22&rg=100&rm=wrd',
expected_text="[12, 47, 51, 68, 81, 85, 55, 56]"))
class TestSolrWebSimilarToRecid(InvenioTestCase):
"""Test for webbased Solr similar ranking. Requires:
make install-solrutils
CFG_SOLR_URL set
fulltext index in idxINDEX containing 'SOLR' in indexer column
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
./bibrank -w wrd for all records
"""
@nottest
def test_get_similar_ranked(self):
"""solrutils - web similar results"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=recid%3A30&rm=wrd',
expected_text="[1, 3, 4, 8, 9, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 34, 43, 44, 49, 50, 56, 58, 61, 64, 66, 67, 69, 71, 73, 75, 76, 77, 78, 82, 85, 86, 87, 89, 90, 95, 96, 98, 104, 107, 109, 113, 65, 62, 60, 47, 46, 100, 99, 102, 91, 80, 7, 5, 92, 88, 74, 57, 55, 108, 84, 81, 79, 54, 101, 11, 103, 94, 48, 83, 72, 63, 2, 68, 51, 53, 97, 93, 70, 45, 52, 14, 59, 6, 10, 32, 33, 29, 30]"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/search?of=id&p=recid%3A30&rg=100&rm=wrd',
expected_text="[3, 4, 8, 9, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 34, 43, 49, 56, 66, 67, 69, 71, 73, 75, 76, 87, 90, 98, 104, 107, 109, 113, 12, 95, 85, 82, 44, 1, 89, 64, 58, 15, 96, 61, 50, 86, 78, 77, 65, 62, 60, 47, 46, 100, 99, 102, 91, 80, 7, 5, 92, 88, 74, 57, 55, 108, 84, 81, 79, 54, 101, 11, 103, 94, 48, 83, 72, 63, 2, 68, 51, 53, 97, 93, 70, 45, 52, 14, 59, 6, 10, 32, 33, 29, 30]"))
class TestSolrLoadLogicalFieldSettings(InvenioTestCase):
"""Test for loading Solr logical field settings. Requires:
make install-solrutils
CFG_SOLR_URL set
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
"""
@nottest
def test_load_logical_fields(self):
"""solrutils - load logical fields"""
self.assertEqual({'abstract': ['abstract'], 'author': ['author'], 'title': ['title'], 'keyword': ['keyword']},
get_logical_fields())
@nottest
def test_load_tags(self):
"""solrutils - load tags"""
self.assertEqual({'abstract': ['520__%'], 'author': ['100__a', '700__a'], 'title': ['245__%', '246__%'], 'keyword': ['6531_a']},
get_tags())
class TestSolrBuildFieldContent(InvenioTestCase):
"""Test for building Solr field content. Requires:
make install-solrutils
CFG_SOLR_URL set
WRD method referring to Solr: <invenio installation>/etc/bibrank$ cp template_word_similarity_solr.cfg wrd.cfg
"""
@nottest
def test_build_default_field_content(self):
"""solrutils - build default field content"""
tags = get_tags()
self.assertEqual(u'Ellis, J Enqvist, K Nanopoulos, D V',
get_field_content_in_utf8(18, 'author', tags))
self.assertEqual(u'Kahler manifolds gravitinos axions constraints noscale',
get_field_content_in_utf8(18, 'keyword', tags))
self.assertEqual(u'In 1962, CERN hosted the 11th International Conference on High Energy Physics. Among the distinguished visitors were eight Nobel prizewinners.Left to right: Cecil F. Powell, Isidor I. Rabi, Werner Heisenberg, Edwin M. McMillan, Emile Segre, Tsung Dao Lee, Chen Ning Yang and Robert Hofstadter.',
get_field_content_in_utf8(6, 'abstract', tags))
@nottest
def test_build_custom_field_content(self):
"""solrutils - build custom field content"""
tags = {'abstract': ['520__%', '590__%']}
self.assertEqual(u"""In 1962, CERN hosted the 11th International Conference on High Energy Physics. Among the distinguished visitors were eight Nobel prizewinners.Left to right: Cecil F. Powell, Isidor I. Rabi, Werner Heisenberg, Edwin M. McMillan, Emile Segre, Tsung Dao Lee, Chen Ning Yang and Robert Hofstadter. En 1962, le CERN est l'hote de la onzieme Conference Internationale de Physique des Hautes Energies. Parmi les visiteurs eminents se trouvaient huit laureats du prix Nobel.De gauche a droite: Cecil F. Powell, Isidor I. Rabi, Werner Heisenberg, Edwin M. McMillan, Emile Segre, Tsung Dao Lee, Chen Ning Yang et Robert Hofstadter.""",
get_field_content_in_utf8(6, 'abstract', tags))
TESTS = []
if CFG_SOLR_URL:
TESTS.extend((TestSolrSearch, TestSolrWebSearch))
if get_external_word_similarity_ranker() == 'solr':
TESTS.extend((TestSolrRanking,
TestSolrSimilarToRecid,
TestSolrWebRanking,
TestSolrWebSimilarToRecid,
TestSolrLoadLogicalFieldSettings,
TestSolrBuildFieldContent,
))
TEST_SUITE = make_test_suite(*TESTS)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_webdeposit.py b/invenio_demosite/testsuite/regression/test_webdeposit.py
index e8ea40fb2..8a37cfab2 100644
--- a/invenio_demosite/testsuite/regression/test_webdeposit.py
+++ b/invenio_demosite/testsuite/regression/test_webdeposit.py
@@ -1,348 +1,348 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
from invenio.testsuite import make_test_suite, run_test_suite, \
InvenioTestCase, make_pdf_fixture
class TestWebDepositUtils(InvenioTestCase):
def clear_tables(self):
from invenio.modules.workflows.models import Workflow, BibWorkflowObject
from invenio.ext.sqlalchemy import db
dep_workflows = Workflow.get(
Workflow.module_name == "webdeposit"
).all()
for workflow in dep_workflows:
BibWorkflowObject.query.filter(
BibWorkflowObject.id_workflow == workflow.uuid
).delete()
Workflow.get(Workflow.module_name == "webdeposit").delete()
db.session.commit()
def setUp(self):
self.clear_tables()
super(TestWebDepositUtils, self).setUp()
def tearDown(self):
self.clear_tables()
super(TestWebDepositUtils, self).tearDown()
#
# Utility methods
#
def login_user(self, username='admin'):
from invenio.legacy.websession_model import User
from invenio.ext.login import login_user, current_user
user_id = User.query.filter_by(nickname=username).one().id
login_user(user_id)
assert user_id == current_user.get_id()
return user_id
#
# Tests
#
def test_deposit_files(self):
from flask import current_app, url_for
from invenio.modules.deposit.loader import \
deposition_metadata
from invenio.modules.workflows.models import Workflow
from invenio.webdeposit_utils import create_workflow, deposit_files, \
get_latest_or_new_workflow
user_id = self.login_user()
# Test for every deposition type
for deposition_type in deposition_metadata.keys():
workflow = create_workflow(deposition_type, user_id)
uuid = workflow.get_uuid()
pdffile = make_pdf_fixture("test.pdf")
with current_app.test_request_context(
url_for(
'webdeposit.upload_file', deposition_type=deposition_type,
uuid=uuid
),
method='POST',
data={
'file': pdffile,
'name': 'test.pdf', }):
deposit_files(user_id, deposition_type, uuid, preingest=True)
workflow = get_latest_or_new_workflow(deposition_type, user_id)
workflow.run()
draft = Workflow.get(
Workflow.id_user == user_id, Workflow.uuid == uuid
).one().extra_data['drafts'][0]
assert len(draft['form_values']['files']) == 1
filemeta = draft['form_values']['files'][0]
assert filemeta['name'] == 'test.pdf'
assert filemeta['content_type'] == 'application/pdf'
def test_workflow_creation(self):
from invenio.modules.deposit.loader import \
deposition_metadata
from invenio.modules.workflows.models import Workflow
from invenio.webdeposit_workflow import DepositionWorkflow
from invenio.webdeposit_utils import get_latest_or_new_workflow, \
get_workflow, delete_workflow, InvenioWebDepositNoDepositionType
user_id = self.login_user()
number_of_dep_types = len(deposition_metadata)
# Test for every deposition type
for deposition_type in deposition_metadata.keys():
# New workflow is created
workflow = get_latest_or_new_workflow(deposition_type,
user_id=user_id)
self.assertTrue(workflow is not None)
# The just created workflow is retrieved as latest
workflow2 = get_latest_or_new_workflow(deposition_type,
user_id=user_id)
self.assertTrue(workflow2 is not None)
self.assertEqual(str(workflow2.uuid), str(workflow.uuid))
# and also retrieved with its uuid
workflow = get_workflow(workflow.uuid, deposition_type)
self.assertTrue(workflow is not None)
# Test get_workflow function with random arguments
deposition_type = deposition_metadata.keys()[-1]
workflow = get_workflow('some_uuid_that_doesnt_exist', deposition_type)
self.assertTrue(workflow is None)
# Create workflow without using webdeposit_utils
workflow = DepositionWorkflow(deposition_type=deposition_type,
user_id=1)
self.assertRaises(InvenioWebDepositNoDepositionType,
get_workflow, workflow.get_uuid(),
'deposition_type_that_doesnt_exist')
# Test that the retrieved workflow is the same and not None
workflow2 = get_workflow(workflow.get_uuid(), deposition_type)
self.assertTrue(workflow2 is not None)
self.assertEqual(workflow2.get_uuid(), workflow.get_uuid())
# Check the number of created workflows
count_workflows = Workflow.get(
Workflow.module_name == "webdeposit"
).count()
self.assertEqual(count_workflows, number_of_dep_types + 1)
uuid = workflow.get_uuid()
delete_workflow(1, uuid)
workflow = get_workflow(uuid, deposition_type)
self.assertTrue(workflow is None)
def test_form_functions(self):
from invenio.modules.deposit.loader import \
deposition_metadata
from invenio.modules.deposit import forms
from invenio.webdeposit_workflow import DepositionWorkflow
from invenio.webdeposit_utils import get_form, \
get_form_status, set_form_status, CFG_DRAFT_STATUS
from invenio.modules.workflows.models import Workflow
for metadata in deposition_metadata.values():
for wf_function in metadata['workflow']:
if 'render_form' == wf_function.func_name:
break
user_id = self.login_user()
deposition_workflow = DepositionWorkflow(deposition_type='Article',
user_id=user_id)
uuid = deposition_workflow.get_uuid()
# Run the workflow to insert a form in the db
deposition_workflow.run()
# There is only one form in the db
workflows = Workflow.get(module_name='webdeposit')
assert len(workflows.all()) == 1
assert len(workflows[0].extra_data['drafts']) == 1
# Test that guest user doesn't have access to the form
form = get_form(0, uuid=uuid)
assert form is None
# Test that the current form has the right type
form = get_form(user_id, uuid=deposition_workflow.get_uuid())
assert isinstance(form, forms['ArticleForm'])
assert str(uuid) == str(deposition_workflow.get_uuid())
# Test that form is returned with get_form function
form = get_form(user_id, deposition_workflow.get_uuid())
assert form is not None
form = get_form(user_id, deposition_workflow.get_uuid(), step=0)
assert form is not None
# Second step doesn't have a form
form = get_form(user_id, deposition_workflow.get_uuid(), step=1)
assert form is None
form_status = get_form_status(user_id, deposition_workflow.get_uuid())
assert form_status == CFG_DRAFT_STATUS['unfinished']
form_status = get_form_status(user_id, deposition_workflow.get_uuid(),
step=2)
assert form_status is None
set_form_status(user_id, uuid, CFG_DRAFT_STATUS['finished'])
form_status = get_form_status(user_id, deposition_workflow.get_uuid())
assert form_status == CFG_DRAFT_STATUS['finished']
def test_field_functions(self):
from invenio.webdeposit_workflow import DepositionWorkflow
from invenio.webdeposit_utils import draft_field_get, draft_field_set
user_id = self.login_user()
workflow = DepositionWorkflow(deposition_type='Article',
user_id=user_id)
workflow.run() # Insert a form
uuid = workflow.get_uuid()
# Test for a field that's not there
value = draft_field_get(user_id, uuid, 'field_that_doesnt_exist')
self.assertTrue(value is None)
# Test for a field that hasn't been inserted in db yet
value = draft_field_get(user_id, uuid, 'publisher')
self.assertTrue(value is None)
draft_field_set(user_id, uuid, 'publisher',
'Test Publishers Association')
value = draft_field_get(user_id, uuid, 'publisher')
self.assertEqual(value, 'Test Publishers Association')
def test_record_creation(self):
import os
from wtforms import TextAreaField
from datetime import datetime
from invenio.legacy.search_engine import record_exists
from invenio.cache import cache
from invenio.config import CFG_PREFIX
from invenio.modules.workflows.models import Workflow
from invenio.bibworkflow_config import CFG_WORKFLOW_STATUS
- from invenio.bibsched_model import SchTASK
+ from invenio.legacy.bibsched.scripts.bibsched_model import SchTASK
from invenio.webdeposit_utils import get_form, create_workflow, \
set_form_status, CFG_DRAFT_STATUS
from invenio.modules.deposit.loader import \
deposition_metadata
from invenio.webdeposit_workflow_utils import \
create_record_from_marc
from invenio.legacy.bibfield import get_record
user_id = self.login_user()
for deposition_type in deposition_metadata.keys():
deposition = create_workflow(deposition_type, user_id)
assert deposition is not None
# Check if deposition creates a record
create_rec = create_record_from_marc()
function_exists = False
for workflow_function in deposition.workflow:
if create_rec.func_code == workflow_function.func_code:
function_exists = True
if not function_exists:
# if a record is not created,
# continue with the next deposition
continue
uuid = deposition.get_uuid()
cache.delete_many("1:current_deposition_type", "1:current_uuid")
cache.add("1:current_deposition_type", deposition_type)
cache.add("1:current_uuid", uuid)
# Run the workflow
deposition.run()
# Create form's json based on the field name
form = get_form(user_id, uuid=uuid)
webdeposit_json = {}
# Fill the json with dummy data
for field in form:
if isinstance(field, TextAreaField):
# If the field is associated with a marc field
if field.has_recjson_key() or field.has_cook_function():
webdeposit_json[field.name] = "test " + field.name
draft = dict(form_type=form.__class__.__name__,
form_values=webdeposit_json,
step=0, # dummy step
status=CFG_DRAFT_STATUS['finished'],
timestamp=str(datetime.now()))
# Add a draft for the first step
Workflow.set_extra_data(user_id=user_id, uuid=uuid,
key='drafts', value={0: draft})
workflow_status = CFG_WORKFLOW_STATUS.RUNNING
while workflow_status != CFG_WORKFLOW_STATUS.COMPLETED:
# Continue workflow
deposition.run()
set_form_status(user_id, uuid, CFG_DRAFT_STATUS['finished'])
workflow_status = deposition.get_status()
# Workflow is finished. Test if record is created
recid = deposition.get_data('recid')
assert recid is not None
# Test that record id exists
assert record_exists(recid) == 1
# Test that the task exists
task_id = deposition.get_data('task_id')
assert task_id is not None
bibtask = SchTASK.query.filter(SchTASK.id == task_id).first()
assert bibtask is not None
# Run bibupload, bibindex, webcoll manually
cmd = "%s/bin/bibupload %s" % (CFG_PREFIX, task_id)
assert not os.system(cmd)
rec = get_record(recid)
marc = rec.legacy_export_as_marc()
for field in form:
if isinstance(field, TextAreaField):
# If the field is associated with a marc field
if field.has_recjson_key() or field.has_cook_function():
assert "test " + field.name in marc
TEST_SUITE = make_test_suite(TestWebDepositUtils)
if __name__ == "__main__":
run_test_suite(TEST_SUITE)
diff --git a/invenio_demosite/testsuite/regression/test_webstyle.py b/invenio_demosite/testsuite/regression/test_webstyle.py
index f1d4f5bb7..242279e83 100644
--- a/invenio_demosite/testsuite/regression/test_webstyle.py
+++ b/invenio_demosite/testsuite/regression/test_webstyle.py
@@ -1,167 +1,167 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebStyle Regression Test Suite."""
__revision__ = "$Id$"
import httplib
import os
import urlparse
import mechanize
from flask import url_for
from urllib2 import urlopen, HTTPError
from invenio.config import CFG_SITE_URL, CFG_SITE_SECURE_URL, CFG_PREFIX, CFG_DEVEL_SITE
from invenio.testsuite import InvenioTestCase, make_test_suite, run_test_suite, nottest
def get_final_url(url):
"""Perform a GET request to the given URL, discard the response body, and
return the final URL after following any redirections."""
response = urlopen(url)
response.read()
return response.url
class WebStyleWSGIUtilsTests(InvenioTestCase):
"""Test WSGI Utils."""
if CFG_DEVEL_SITE:
def test_iteration_over_posted_file(self):
"""webstyle - posting a file via form upload"""
path = os.path.join(CFG_PREFIX, 'lib', 'webtest', 'invenio', 'test.gif')
body = open(path).read()
br = mechanize.Browser()
br.open(CFG_SITE_URL + '/httptest/post1').read()
br.select_form(nr=0)
br.form.add_file(open(path))
body2 = br.submit().read()
self.assertEqual(body, body2, "Body sent differs from body received")
if CFG_DEVEL_SITE:
def test_posting_file(self):
"""webstyle - direct posting of a file"""
- from invenio.bibdocfile import calculate_md5
+ from invenio.legacy.bibdocfile.api import calculate_md5
path = os.path.join(CFG_PREFIX, 'lib', 'webtest', 'invenio', 'test.gif')
body = open(path).read()
md5 = calculate_md5(path)
mimetype = 'image/gif'
connection = httplib.HTTPConnection(urlparse.urlsplit(CFG_SITE_URL)[1])
connection.request('POST', '/httptest/post2', body, {'Content-MD5': md5, 'Content-Type': mimetype, 'Content-Disposition': 'filename=test.gif'})
response = connection.getresponse()
body2 = response.read()
self.assertEqual(body, body2, "Body sent differs from body received")
class WebStyleGotoTests(InvenioTestCase):
"""Test the goto framework"""
def tearDown(self):
from invenio.modules.redirector.api import drop_redirection
drop_redirection('first_record')
drop_redirection('invalid_external')
drop_redirection('latest_article')
drop_redirection('latest_pdf_article')
def test_plugin_availability(self):
"""webstyle - test GOTO plugin availability"""
from invenio.modules.redirector.api import CFG_GOTO_PLUGINS
self.failUnless('goto_plugin_simple' in CFG_GOTO_PLUGINS)
self.failUnless('goto_plugin_latest_record' in CFG_GOTO_PLUGINS)
self.failUnless('goto_plugin_cern_hr_documents' in CFG_GOTO_PLUGINS)
self.failIf(CFG_GOTO_PLUGINS.get_broken_plugins())
def test_simple_relative_redirection(self):
"""webstyle - test simple relative redirection via goto_plugin_simple"""
from invenio.modules.redirector.api import register_redirection
register_redirection('first_record', 'goto_plugin_simple', parameters={'url': '/record/1'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/first_record'), CFG_SITE_URL + '/record/1')
def test_simple_absolute_redirection(self):
"""webstyle - test simple absolute redirection via goto_plugin_simple"""
from invenio.modules.redirector.api import register_redirection
register_redirection('first_record', 'goto_plugin_simple', parameters={'url': CFG_SITE_URL + '/record/1'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/first_record'), CFG_SITE_URL + '/record/1')
def test_simple_absolute_redirection_https(self):
"""webstyle - test simple absolute redirection to https via goto_plugin_simple"""
from invenio.modules.redirector.api import register_redirection
register_redirection('first_record', 'goto_plugin_simple', parameters={'url': CFG_SITE_SECURE_URL + '/record/1'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/first_record'), CFG_SITE_SECURE_URL + '/record/1')
def test_invalid_external_redirection(self):
"""webstyle - test invalid external redirection via goto_plugin_simple"""
from invenio.modules.redirector.api import register_redirection
register_redirection('invalid_external', 'goto_plugin_simple', parameters={'url': 'http://www.google.com'})
self.assertRaises(HTTPError, get_final_url, CFG_SITE_URL + '/goto/google')
def test_latest_article_redirection(self):
"""webstyle - test redirecting to latest article via goto_plugin_latest_record"""
from invenio.modules.redirector.api import register_redirection
register_redirection('latest_article', 'goto_plugin_latest_record', parameters={'cc': 'Articles'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/latest_article'), CFG_SITE_URL + '/record/128')
@nottest
def FIXME_TICKET_1293_test_latest_pdf_article_redirection(self):
"""webstyle - test redirecting to latest PDF article via goto_plugin_latest_record"""
from invenio.modules.redirector.api import register_redirection
register_redirection('latest_pdf_article', 'goto_plugin_latest_record', parameters={'cc': 'Articles', 'format': '.pdf'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/latest_pdf_article'), CFG_SITE_URL + '/record/97/files/0002060.pdf')
@nottest
def FIXME_TICKET_1293_test_URL_argument_in_redirection(self):
"""webstyle - test redirecting while passing arguments on the URL"""
from invenio.modules.redirector.api import register_redirection
register_redirection('latest_article', 'goto_plugin_latest_record', parameters={'cc': 'Articles'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/latest_article?format=.pdf'), CFG_SITE_URL + '/record/97/files/0002060.pdf')
def test_updating_redirection(self):
"""webstyle - test updating redirection"""
from invenio.modules.redirector.api import register_redirection, update_redirection
register_redirection('first_record', 'goto_plugin_simple', parameters={'url': '/record/1'})
update_redirection('first_record', 'goto_plugin_simple', parameters={'url': '/record/2'})
self.assertEqual(get_final_url(CFG_SITE_URL + '/goto/first_record'), CFG_SITE_URL + '/record/2')
class WebInterfaceHandlerFlaskTest(InvenioTestCase):
"""Test webinterface handlers."""
def test_authenticated_decorator(self):
response = self.client.get(url_for('webmessage.index'),
base_url=CFG_SITE_SECURE_URL,
follow_redirects=True)
self.assert401(response)
self.login('admin', '')
response = self.client.get(url_for('webmessage.index'),
base_url=CFG_SITE_SECURE_URL,
follow_redirects=True)
self.assert200(response)
self.logout()
response = self.client.get(url_for('webmessage.index'),
base_url=CFG_SITE_SECURE_URL,
follow_redirects=True)
self.assert401(response)
TEST_SUITE = make_test_suite(WebStyleWSGIUtilsTests,
WebStyleGotoTests,
WebInterfaceHandlerFlaskTest)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/invenio_demosite/testsuite/regression/test_websubmit.py b/invenio_demosite/testsuite/regression/test_websubmit.py
index fd4e455d7..9fbb50d25 100644
--- a/invenio_demosite/testsuite/regression/test_websubmit.py
+++ b/invenio_demosite/testsuite/regression/test_websubmit.py
@@ -1,278 +1,278 @@
# -*- coding: utf-8 -*-
##
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""WebSubmit Regression Test Suite."""
__revision__ = "$Id$"
import os
from logging import StreamHandler, DEBUG
from cStringIO import StringIO
from invenio.ext.logging import register_exception
from invenio.config import CFG_SITE_URL, CFG_PREFIX, CFG_TMPDIR, CFG_PATH_PDFTK
from invenio.base.wrappers import lazy_import
from invenio.testsuite import make_test_suite, run_test_suite, \
test_web_page_content, merge_error_messages, \
InvenioTestCase
from invenio.base.factory import with_app_context
websubmit_file_stamper = lazy_import('invenio.websubmit_file_stamper')
class WebSubmitWebPagesAvailabilityTest(InvenioTestCase):
"""Check whether WebSubmit web pages are up or not."""
def test_submission_pages_availability(self):
"""websubmit - availability of submission pages"""
baseurl = CFG_SITE_URL + '/submit/'
_exports = ['', 'direct']
error_messages = []
for url in [baseurl + page for page in _exports]:
error_messages.extend(test_web_page_content(url))
if error_messages:
self.fail(merge_error_messages(error_messages))
return
def test_publiline_pages_availability(self):
"""websubmit - availability of approval pages"""
baseurl = CFG_SITE_URL
_exports = ['/approve.py', '/publiline.py',
'/yourapprovals.py']
error_messages = []
for url in [baseurl + page for page in _exports]:
error_messages.extend(test_web_page_content(url))
if error_messages:
self.fail(merge_error_messages(error_messages))
return
def test_your_submissions_pages_availability(self):
"""websubmit - availability of Your Submissions pages"""
baseurl = CFG_SITE_URL
_exports = ['/yoursubmissions.py']
error_messages = []
for url in [baseurl + page for page in _exports]:
error_messages.extend(test_web_page_content(url))
if error_messages:
self.fail(merge_error_messages(error_messages))
return
def test_help_page_availability(self):
"""websubmit - availability of WebSubmit help page"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/help/submit-guide',
expected_text="Submit Guide"))
class WebSubmitLegacyURLsTest(InvenioTestCase):
"""Check that the application still responds to legacy URLs."""
def test_legacy_help_page_link(self):
"""websubmit - legacy Submit Guide page link"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/help/submit',
expected_text="Submit Guide"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/help/submit/',
expected_text="Submit Guide"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/help/submit/index.en.html',
expected_text="Submit Guide"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/help/submit/access.en.html',
expected_text="Submit Guide"))
class WebSubmitXSSVulnerabilityTest(InvenioTestCase):
"""Test possible XSS vulnerabilities of the submission engine."""
def test_xss_in_submission_doctype(self):
"""websubmit - no XSS vulnerability in doctype parameter"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/submit?doctype=%3CSCRIPT%3Ealert%28%22XSS%22%29%3B%3C%2FSCRIPT%3E',
expected_text='Unable to find document type: &lt;SCRIPT&gt;alert("XSS")', username="jekyll",
password="j123ekyll"))
def test_xss_in_submission_act(self):
"""websubmit - no XSS vulnerability in act parameter"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL + '/submit?doctype=DEMOTHE&access=1_1&act=%3CSCRIPT%3Ealert%28%22XSS%22%29%3B%3C%2FSCRIPT%3E',
expected_text='Invalid doctype and act parameters', username="jekyll",
password="j123ekyll"))
def test_xss_in_submission_page(self):
"""websubmit - no XSS vulnerability in access parameter"""
self.assertEqual([],
test_web_page_content(CFG_SITE_URL +
'/submit?doctype=DEMOTHE&access=/../../../etc/passwd&act=SBI&startPg=1&ln=en&ln=en', expected_text='Invalid parameters', username="jekyll",
password="j123ekyll"))
self.assertEqual([],
test_web_page_content(CFG_SITE_URL +
'/submit?doctype=DEMOTHE&access=%3CSCRIPT%3Ealert%28%22XSS%22%29%3B%3C%2FSCRIPT%3E&act=SBI', expected_text='Invalid parameters', username="jekyll",
password="j123ekyll"))
@with_app_context()
def WebSubmitFileConverterTestGenerator():
- from invenio.websubmit_file_converter import get_conversion_map, can_convert
+ from invenio.legacy.websubmit.file_converter import get_conversion_map, can_convert
if can_convert('.odt', '.txt'):
## Special test for unoconv/LibreOffice
yield WebSubmitFileConverterTest(os.path.join(CFG_PREFIX, 'lib', 'webtest', 'invenio', 'test.odt'), '.odt', '.txt')
if can_convert('.doc', '.txt'):
## Special test for unoconv/LibreOffice
yield WebSubmitFileConverterTest(os.path.join(CFG_PREFIX, 'lib', 'webtest', 'invenio', 'test.doc'), '.doc', '.txt')
for from_format in get_conversion_map().keys():
input_file = os.path.join(CFG_PREFIX, 'lib', 'webtest', 'invenio', 'test%s' % from_format)
if not os.path.exists(input_file):
## Can't run such a test because there is no test example file
continue
for to_format in get_conversion_map().keys():
if from_format == to_format:
continue
conversion_map = can_convert(from_format, to_format)
if conversion_map:
if [converter for converter in conversion_map if converter[0].__name__ == 'unoconv']:
## We don't want to test unoconv which is tested separately
continue
yield WebSubmitFileConverterTest(input_file, from_format, to_format)
class WebSubmitFileConverterTest(InvenioTestCase):
"""Test WebSubmit file converter tool"""
def __init__(self, input_file, from_format, to_format):
super(WebSubmitFileConverterTest, self).__init__('_run_test')
self.from_format = from_format
self.to_format = to_format
self.input_file = input_file
def setUp(self):
- from invenio.websubmit_file_converter import get_file_converter_logger
+ from invenio.legacy.websubmit.file_converter import get_file_converter_logger
logger = get_file_converter_logger()
self.log = StringIO()
logger.setLevel(DEBUG)
for handler in logger.handlers:
logger.removeHandler(handler)
handler = StreamHandler(self.log)
handler.setLevel(DEBUG)
logger.addHandler(handler)
def shortDescription(self):
return """websubmit - test %s to %s conversion""" % (self.from_format, self.to_format)
def _run_test(self):
- from invenio.websubmit_file_converter import InvenioWebSubmitFileConverterError, convert_file
+ from invenio.legacy.websubmit.file_converter import InvenioWebSubmitFileConverterError, convert_file
try:
tmpdir_snapshot1 = set(os.listdir(CFG_TMPDIR))
output_file = convert_file(self.input_file, output_format=self.to_format)
tmpdir_snapshot2 = set(os.listdir(CFG_TMPDIR))
tmpdir_snapshot2.discard(os.path.basename(output_file))
if not os.path.exists(output_file):
raise InvenioWebSubmitFileConverterError("output_file %s was not correctly created" % output_file)
if tmpdir_snapshot2 - tmpdir_snapshot1:
raise InvenioWebSubmitFileConverterError("Some temporary files were left over: %s" % (tmpdir_snapshot2 - tmpdir_snapshot1))
except Exception, err:
register_exception(alert_admin=True)
self.fail("ERROR: when converting from %s to %s: %s, the log contained: %s" % (self.from_format, self.to_format, err, self.log.getvalue()))
if CFG_PATH_PDFTK:
class WebSubmitStampingTest(InvenioTestCase):
"""Test WebSubmit file stamping tool"""
def test_stamp_coverpage(self):
"""websubmit - creation of a PDF cover page stamp (APIs)"""
file_stamper_options = { 'latex-template' : "demo-stamp-left.tex",
'latex-template-var' : {'REPORTNUMBER':'TEST-2010','DATE':'10/10/2000'},
'input-file' : CFG_PREFIX + "/lib/webtest/invenio/test.pdf",
'output-file' : "test-stamp-coverpage.pdf",
'stamp' : "coverpage",
'layer' : "foreground",
'verbosity' : 0,
}
try:
(stamped_file_path_only, stamped_file_name) = \
websubmit_file_stamper.stamp_file(file_stamper_options)
except:
self.fail("Stamping failed")
# Test that file is now bigger...
assert os.path.getsize(os.path.join(stamped_file_path_only,
stamped_file_name)) > 12695
def test_stamp_firstpage(self):
"""websubmit - stamping first page of a PDF (APIs)"""
file_stamper_options = { 'latex-template' : "demo-stamp-left.tex",
'latex-template-var' : {'REPORTNUMBER':'TEST-2010','DATE':'10/10/2000'},
'input-file' : CFG_PREFIX + "/lib/webtest/invenio/test.pdf",
'output-file' : "test-stamp-firstpage.pdf",
'stamp' : "first",
'layer' : "background",
'verbosity' : 0,
}
try:
(stamped_file_path_only, stamped_file_name) = \
websubmit_file_stamper.stamp_file(file_stamper_options)
except:
self.fail("Stamping failed")
# Test that file is now bigger...
assert os.path.getsize(os.path.join(stamped_file_path_only,
stamped_file_name)) > 12695
def test_stamp_allpages(self):
"""websubmit - stamping all pages of a PDF (APIs)"""
file_stamper_options = { 'latex-template' : "demo-stamp-left.tex",
'latex-template-var' : {'REPORTNUMBER':'TEST-2010','DATE':'10/10/2000'},
'input-file' : CFG_PREFIX + "/lib/webtest/invenio/test.pdf",
'output-file' : "test-stamp-allpages.pdf",
'stamp' : "all",
'layer' : "foreground",
'verbosity' : 0,
}
try:
(stamped_file_path_only, stamped_file_name) = \
websubmit_file_stamper.stamp_file(file_stamper_options)
except:
self.fail("Stamping failed")
# Test that file is now bigger...
assert os.path.getsize(os.path.join(stamped_file_path_only,
stamped_file_name)) > 12695
else:
## pdftk is not available. Disabling stamping-related
## regression tests.
class WebSubmitStampingTest(InvenioTestCase):
pass
TEST_SUITE = make_test_suite(WebSubmitWebPagesAvailabilityTest,
WebSubmitLegacyURLsTest,
WebSubmitXSSVulnerabilityTest,
WebSubmitStampingTest)
for test in WebSubmitFileConverterTestGenerator():
TEST_SUITE.addTest(test)
if __name__ == "__main__":
run_test_suite(TEST_SUITE, warn_user=True)
diff --git a/modules/bibauthorid/lib/Makefile.am b/modules/bibauthorid/lib/Makefile.am
index c8131d4a5..6dfe71e8e 100644
--- a/modules/bibauthorid/lib/Makefile.am
+++ b/modules/bibauthorid/lib/Makefile.am
@@ -1,55 +1,49 @@
##
## This file is part of Invenio.
## Copyright (C) 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
pylib_DATA = \
bibauthorid_cli.py \
bibauthorid_daemon.py \
bibauthorid_least_squares.py \
bibauthorid_personid_maintenance.py \
bibauthorid_scheduler.py \
bibauthorid_tortoise.py \
bibauthorid_cluster_set.py \
- bibauthorid_dbinterface.py \
bibauthorid_matrix_optimization.py \
bibauthorid_prob_matrix.py \
- bibauthorid_searchinterface.py \
bibauthorid_webapi.py \
bibauthorid_comparison.py \
bibauthorid_frontinterface.py \
bibauthorid_merge.py \
bibauthorid_rabbit.py \
- bibauthorid_string_utils.py \
bibauthorid_backinterface.py \
- bibauthorid_config.py \
- bibauthorid_general_utils.py \
- bibauthorid_name_utils.py \
bibauthorid_recipes.py \
bibauthorid_templates.py \
bibauthorid_wedge.py \
bibauthorid_webauthorprofileinterface.py
jsdir=$(localstatedir)/www/js
js_DATA = bibauthorid.js
EXTRA_DIST = $(pylib_DATA) \
$(js_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibcatalog/lib/Makefile.am b/modules/bibcatalog/lib/Makefile.am
index d49436bb9..976739101 100644
--- a/modules/bibcatalog/lib/Makefile.am
+++ b/modules/bibcatalog/lib/Makefile.am
@@ -1,26 +1,25 @@
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = bibcatalog.py bibcatalog_templates.py bibcatalog_system.py \
- bibcatalog_system_rt.py bibcatalog_system_rt_unit_tests.py \
- bibcatalog_system_email.py bibcatalog_system_email_unit_tests.py
+pylib_DATA = bibcatalog_system_rt_unit_tests.py \
+ bibcatalog_system_email_unit_tests.py
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibcirculation/lib/Makefile.am b/modules/bibcirculation/lib/Makefile.am
index b9e7476de..84134aef5 100644
--- a/modules/bibcirculation/lib/Makefile.am
+++ b/modules/bibcirculation/lib/Makefile.am
@@ -1,35 +1,26 @@
## $Id: Makefile.am,v 1.4 2008/08/25 13:23:38 joaquim Exp $
## This file is part of Invenio.
## Copyright (C) 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = bibcirculation.py \
- bibcirculation_cern_ldap.py \
- bibcirculation_config.py \
- bibcirculation_daemon.py \
- bibcirculation_dblayer.py \
- bibcirculation_regression_tests.py \
- bibcirculation_templates.py \
- bibcirculation_utils.py \
- bibcirculationadmin_webinterface.py \
- bibcirculationadminlib.py
+pylib_DATA = bibcirculation_regression_tests.py
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibconvert/etc/Makefile.am b/modules/bibconvert/etc/Makefile.am
index f6b4c0d51..624a5bed2 100644
--- a/modules/bibconvert/etc/Makefile.am
+++ b/modules/bibconvert/etc/Makefile.am
@@ -1,32 +1,31 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
kbdir = $(sysconfdir)/bibconvert/KB
kb_DATA = entdec-to-latin1.kb entdec-to-utf8.kb \
- enthex-to-latin1.kb enthex-to-utf8.kb \
- latex-to-unicode.kb
+ enthex-to-latin1.kb enthex-to-utf8.kb
bfxdir = $(sysconfdir)/bibconvert/config
bfx_DATA = oaidc2marcxml.bfx
xsldir = $(sysconfdir)/bibconvert/config
xsl_DATA = oaidc2marcxml.xsl oaimarc2marcxml.xsl oaiarxiv2marcxml.xsl \
oaidmf2marcxml.xsl authorlist2marcxml.xsl crossref2marcxml.xsl bibtex2marcxml.cfg
EXTRA_DIST = $(kb_DATA) $(bfx_DATA) $(xsl_DATA)
CLEANFILES = *~ *.tmp
diff --git a/modules/bibdocfile/lib/Makefile.am b/modules/bibdocfile/lib/Makefile.am
index 1ab4381e5..6b0804fc1 100644
--- a/modules/bibdocfile/lib/Makefile.am
+++ b/modules/bibdocfile/lib/Makefile.am
@@ -1,35 +1,30 @@
## This file is part of Invenio.
## Copyright (C) 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
bibdocfilepluginsdir = $(libdir)/python/invenio/bibdocfile_plugins
-bibdocfileplugins_DATA = bom_*.py
+bibdocfileplugins_DATA =
pylibdir = $(libdir)/python/invenio
-pylib_DATA = bibdocfile_config.py file.py \
- bibdocfile_templates.py \
- bibdocfile_managedocfiles.py \
- bibdocfile.py \
- bibdocfilecli.py \
- bibdocfile_regression_tests.py
+pylib_DATA = bibdocfile_regression_tests.py
-noinst_DATA = fulltext_files_migration_kit.py icon_migration_kit.py
+noinst_DATA =
EXTRA_DIST = $(pylib_DATA) $(noinst_DATA) $(bibdocfileplugins)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibedit/lib/Makefile.am b/modules/bibedit/lib/Makefile.am
index d5303ee8c..db7243e46 100644
--- a/modules/bibedit/lib/Makefile.am
+++ b/modules/bibedit/lib/Makefile.am
@@ -1,61 +1,58 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
webdir=$(localstatedir)/www/img
jsdir=$(localstatedir)/www/js
-pylib_DATA = bibedit_config.py \
- bibedit_dblayer.py \
- bibedit_engine.py \
+pylib_DATA = \
bibedit_unit_tests.py \
bibedit_regression_tests.py \
bibedit_templates.py \
bibeditcli.py \
- bibedit_utils.py \
bibedit_web_tests.py \
bibeditmulti_templates.py \
bibeditmulti_webinterface.py \
bibeditmulti_engine.py
js_DATA = bibedit_display.js \
bibedit_engine.js \
bibedit_keys.js \
bibedit_menu.js \
bibeditmulti.js \
bibedit_holdingpen.js \
marcxml.js \
bibedit_clipboard.js \
bibedit_template_interface.js \
bibedit_engine_unit_tests.js \
bibedit_engine_unit_tests.conf \
bibedit_refextract.js
web_DATA = bibedit.css
install-data-hook:
## Let's initialize an empty jquery.min.js, so that if the admin does not
## run make install-jquery-plugins, at least when jquery.min.js is
## downloaded by the client browser, no 404 error is raised.
@touch $(jsdir)/jquery.min.js
EXTRA_DIST = $(pylib_DATA) \
$(js_DATA) \
$(web_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibformat/etc/output_formats/Makefile.am b/modules/bibformat/etc/output_formats/Makefile.am
index 0054139bc..d64b4060f 100644
--- a/modules/bibformat/etc/output_formats/Makefile.am
+++ b/modules/bibformat/etc/output_formats/Makefile.am
@@ -1,36 +1,28 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
etcdir = $(sysconfdir)/bibformat/output_formats
-etc_DATA = HA.bfo HB.bfo HC.bfo HD.bfo HM.bfo HP.bfo HS.bfo HX.bfo \
- HDM.bfo HDREF.bfo HDFILE.bfo HDACT.bfo \
- BSR.bfo \
- DCITE.bfo \
- EXCEL.bfo \
- MOBB.bfo MOBD.bfo \
- XD.bfo XE.bfo XE8X.bfo XM.bfo XN.bfo XP.bfo XR.bfo XW.bfo \
- XOAIDC.bfo XO.bfo XOAIMARC.bfo \
- WAPAFF.bfo
+etc_DATA =
tmpdir = $(prefix)/var/tmp
-tmp_DATA = TEST1.bfo TEST2.bfo TEST3.bfo
+tmp_DATA =
EXTRA_DIST = $(etc_DATA) $(tmp_DATA)
CLEANFILES = *.tmp
diff --git a/modules/bibindex/lib/Makefile.am b/modules/bibindex/lib/Makefile.am
index fb86ecb97..c40f5d15f 100644
--- a/modules/bibindex/lib/Makefile.am
+++ b/modules/bibindex/lib/Makefile.am
@@ -1,29 +1,27 @@
## This file is part of Invenio.
## Copyright (C) 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
SUBDIRS = tokenizers
pylibdir = $(libdir)/python/invenio
-pylib_DATA = bibindex_engine.py bibindex_engine_config.py bibindex_engine_unit_tests.py \
- bibindexadminlib.py bibindex_engine_stemmer.py bibindex_engine_stopwords.py \
- bibindex_engine_stemmer_unit_tests.py bibindex_engine_stemmer_greek.py \
- bibindex_engine_tokenizer_unit_tests.py \
- bibindexadmin_regression_tests.py bibindex_engine_washer.py \
- bibindex_regression_tests.py bibindex_engine_utils.py
+pylib_DATA = bibindex_engine_unit_tests.py \
+ bibindex_engine_stemmer_unit_tests.py bibindex_engine_tokenizer_unit_tests.py \
+ bibindexadmin_regression_tests.py
+
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibsched/bin/Makefile.am b/modules/bibsched/bin/Makefile.am
index 11731f3d7..9c4cb29b4 100644
--- a/modules/bibsched/bin/Makefile.am
+++ b/modules/bibsched/bin/Makefile.am
@@ -1,22 +1,22 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
-bin_SCRIPTS = bibsched bibtaskex bibtasklet
+bin_SCRIPTS =
-EXTRA_DIST = bibsched.in bibtaskex.in bibtasklet.in
+EXTRA_DIST =
CLEANFILES = *~ *.tmp
diff --git a/modules/bibsched/bin/bibsched.in b/modules/bibsched/bin/bibsched.in
index 558177fe6..064d3fdd1 100644
--- a/modules/bibsched/bin/bibsched.in
+++ b/modules/bibsched/bin/bibsched.in
@@ -1,34 +1,34 @@
#!@PYTHON@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""BibSched - task management, scheduling and executing system for Invenio
"""
__revision__ = "$Id$"
try:
from invenio.flaskshell import *
- from invenio.bibsched import main
+ from invenio.legacy.bibsched.scripts.bibsched import main
except ImportError, e:
print "Error: %s" % e
import sys
sys.exit(1)
main()
diff --git a/modules/bibsched/bin/bibtaskex.in b/modules/bibsched/bin/bibtaskex.in
index 527494a37..87e76b216 100644
--- a/modules/bibsched/bin/bibtaskex.in
+++ b/modules/bibsched/bin/bibtaskex.in
@@ -1,37 +1,37 @@
#!@PYTHON@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Task Example.
Demonstrates BibTask <-> BibSched connectivity, signal handling,
error handling, etc.
"""
__revision__ = "$Id$"
try:
from invenio.flaskshell import *
- from invenio.bibtaskex import main
+ from invenio.legacy.bibsched.bibtaskex import main
except ImportError, e:
print "Error: %s" % e
import sys
sys.exit(1)
main()
diff --git a/modules/bibsched/bin/bibtasklet.in b/modules/bibsched/bin/bibtasklet.in
index 187a7f815..f8e14d85a 100644
--- a/modules/bibsched/bin/bibtasklet.in
+++ b/modules/bibsched/bin/bibtasklet.in
@@ -1,35 +1,35 @@
#!@PYTHON@
## -*- mode: python; coding: utf-8; -*-
##
## This file is part of Invenio.
## Copyright (C) 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Invenio Bibliographic Tasklet BibTask.
This is a particular BibTask that execute tasklets, which can be any
function dropped into /opt/cds-invenio/lib/python/invenio/bibsched_tasklets/.
"""
try:
from invenio.flaskshell import *
- from invenio.bibtasklet import main
+ from invenio.legacy.bibsched.bibtasklet import main
except ImportError, e:
print "Error: %s" % e
import sys
sys.exit(1)
main()
diff --git a/modules/bibsched/lib/Makefile.am b/modules/bibsched/lib/Makefile.am
index bc4d09db7..bde8c9769 100644
--- a/modules/bibsched/lib/Makefile.am
+++ b/modules/bibsched/lib/Makefile.am
@@ -1,36 +1,31 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
SUBDIRS = tasklets
pylibdir = $(libdir)/python/invenio
-pylib_DATA = \
- bibsched.py \
- bibtask.py \
- bibtaskex.py \
- bibtask_config.py \
- bibtasklet.py \
- bibsched_webapi.py
+pylib_DATA =
+
jsdir=$(localstatedir)/www/js
js_DATA = bibsched.js
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibsched/lib/tasklets/Makefile.am b/modules/bibsched/lib/tasklets/Makefile.am
index 6f2bbd9f9..bebba9368 100644
--- a/modules/bibsched/lib/tasklets/Makefile.am
+++ b/modules/bibsched/lib/tasklets/Makefile.am
@@ -1,25 +1,24 @@
## This file is part of Invenio.
## Copyright (C) 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir=$(libdir)/python/invenio/bibsched_tasklets
-pylib_DATA = __init__.py bst_fibonacci.py bst_send_email.py bst_twitter_fetcher.py bst_run_bibtask.py \
- bst_notify_url.py bst_weblinkback_updater.py
+pylib_DATA =
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/bibupload/lib/Makefile.am b/modules/bibupload/lib/Makefile.am
index 6c30e4ca9..1d17b5910 100644
--- a/modules/bibupload/lib/Makefile.am
+++ b/modules/bibupload/lib/Makefile.am
@@ -1,38 +1,35 @@
## This file is part of Invenio.
## Copyright (C) 2006, 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = bibupload_config.py \
- bibupload.py \
- bibupload_regression_tests.py \
+pylib_DATA = bibupload_regression_tests.py \
batchuploader_webinterface.py \
batchuploader_engine.py \
batchuploader_templates.py \
batchuploader.py \
batchuploader_regression_tests.py \
- bibupload_revisionverifier.py \
- bibupload_revisionverifier_regression_tests.py
+ bibupload_revisionverifier_regression_tests.py
jsdir=$(localstatedir)/www/js
js_DATA = batchuploader.js
EXTRA_DIST = $(pylib_DATA) \
$(js_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/miscutil/lib/Makefile.am b/modules/miscutil/lib/Makefile.am
index b3d630c7d..34493ccff 100644
--- a/modules/miscutil/lib/Makefile.am
+++ b/modules/miscutil/lib/Makefile.am
@@ -1,99 +1,90 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
pylib_DATA = __init__.py \
crossrefutils.py \
- data_cacher.py \
dataciteutils_tester.py \
dbdump.py \
dbquery_regression_tests.py \
dbquery_unit_tests.py \
errorlib_regression_tests.py \
errorlib_webinterface.py \
flaskshell.py \
invenio_connector.py \
invenio_connector_regression_tests.py \
inveniocfg_unit_tests.py \
inveniomanage_unit_tests.py \
miscutil_model.py \
plotextractor.py \
plotextractor_config.py \
plotextractor_converter.py \
plotextractor_getter.py \
plotextractor_output_utils.py \
plotextractor_regression_tests.py \
plotextractor_unit_tests.py \
pluginutils.py \
pluginutils_unit_tests.py \
remote_debugger.py \
remote_debugger_config.py \
remote_debugger_wsgi_reload.py \
sequtils.py \
sequtils_cnum.py \
sequtils_regression_tests.py \
sherpa_romeo_testing.py \
- solrutils_bibindex_indexer.py \
- solrutils_bibindex_searcher.py \
solrutils_bibrank_indexer.py \
- solrutils_bibrank_searcher.py \
- solrutils_config.py \
solrutils_regression_tests.py \
testutils_regression_tests.py \
w3c_validator.py \
- xapianutils_bibindex_indexer.py \
- xapianutils_bibindex_searcher.py \
- xapianutils_bibrank_indexer.py \
- xapianutils_bibrank_searcher.py \
- xapianutils_config.py
+ xapianutils_bibrank_indexer.py
noinst_DATA = testimport.py \
kwalitee.py \
pep8.py
tmpdir = $(prefix)/var/tmp
EXTRA_DIST = $(pylib_DATA) \
$(tmp_DATA) \
testimport.py \
kwalitee.py \
pep8.py \
solrutils \
solrutils/schema.xml \
solrutils/java_sources.txt \
solrutils/org \
solrutils/org/invenio_software \
solrutils/org/invenio_software/solr \
solrutils/org/invenio_software/solr/BitSetFieldCollector.java \
solrutils/org/invenio_software/solr/InvenioFacetComponent.java \
solrutils/org/invenio_software/solr/FieldCollectorBase.java \
solrutils/org/invenio_software/solr/IntFieldCollector.java \
solrutils/org/invenio_software/solr/FieldCollector.java \
solrutils/org/invenio_software/solr/InvenioQueryComponent.java \
solrutils/org/invenio_software/solr/InvenioBitsetStreamResponseWriter.java \
solrutils/org/invenio_software/solr/InvenioBitSet.java \
solrutils/org/invenio_software/solr/StringFieldCollector.java \
solrutils/solrconfig.xml
install-data-hook:
$(PYTHON) $(srcdir)/testimport.py ${prefix}
CLEANFILES = *~ *.tmp *.pyc
clean-local:
rm -rf build
diff --git a/modules/webaccess/lib/Makefile.am b/modules/webaccess/lib/Makefile.am
index 51ab25542..96edecd58 100644
--- a/modules/webaccess/lib/Makefile.am
+++ b/modules/webaccess/lib/Makefile.am
@@ -1,38 +1,37 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
pylib_DATA = \
webaccessadmin_lib.py \
external_authentication_cern.py \
- external_authentication.py \
external_authentication_ldap.py \
external_authentication_cern_wrapper.py \
external_authentication_cern_unit_tests.py \
external_authentication_sso.py \
external_authentication_robot.py \
webaccess_regression_tests.py \
external_authentication_oauth1.py \
external_authentication_oauth2.py \
external_authentication_openid.py
noinst_DATA = collection_restrictions_migration_kit.py
EXTRA_DIST = $(pylib_DATA) $(noinst_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/webcomment/lib/Makefile.am b/modules/webcomment/lib/Makefile.am
index 054366cf9..47eca90a2 100644
--- a/modules/webcomment/lib/Makefile.am
+++ b/modules/webcomment/lib/Makefile.am
@@ -1,27 +1,26 @@
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = webcomment_templates.py \
- webcommentadminlib.py \
+pylib_DATA = webcommentadminlib.py \
webcomment_regression_tests.py \
webcomment_web_tests.py
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/weblinkback/lib/Makefile.am b/modules/weblinkback/lib/Makefile.am
index 3dca7aad9..a16aa1d40 100644
--- a/modules/weblinkback/lib/Makefile.am
+++ b/modules/weblinkback/lib/Makefile.am
@@ -1,30 +1,25 @@
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = weblinkback_config.py \
- weblinkback_dblayer.py \
- weblinkback_regression_tests.py \
- weblinkback_templates.py \
- weblinkback_unit_tests.py \
- weblinkback.py \
- weblinkbackadminlib.py
+pylib_DATA = weblinkback_regression_tests.py \
+ weblinkback_unit_tests.py
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/websearch/lib/Makefile.am b/modules/websearch/lib/Makefile.am
index 629c9c4a4..892b302ec 100644
--- a/modules/websearch/lib/Makefile.am
+++ b/modules/websearch/lib/Makefile.am
@@ -1,45 +1,35 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2010, 2011, 2012, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
pylib_DATA = \
- websearchadminlib.py \
websearch_flask_tests.py \
- websearch_templates.py \
websearch_regression_tests.py \
websearch_web_tests.py \
search_engine_unit_tests.py \
search_engine_utils.py \
search_engine_query_parser.py \
search_engine_query_parser_unit_tests.py \
- websearch_webcoll.py \
websearchadmin_regression_tests.py \
- websearch_external_collections.py \
search_engine_summarizer_unit_tests.py \
- websearch_external_collections_config.py \
- websearch_external_collections_getter.py \
websearch_external_collections_getter_unit_tests.py \
- websearch_external_collections_parser.py \
- websearch_external_collections_searcher.py \
- websearch_external_collections_templates.py \
- websearch_external_collections_unit_tests.py \
- websearch_external_collections_utils.py
+ websearch_external_collections_unit_tests.py
EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/websession/lib/Makefile.am b/modules/websession/lib/Makefile.am
index 8d6142fbe..36851d21d 100644
--- a/modules/websession/lib/Makefile.am
+++ b/modules/websession/lib/Makefile.am
@@ -1,34 +1,30 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
pylib_DATA = session.py webuser_unit_tests.py \
- websession_templates.py \
- webgroup.py websession_config.py \
- webaccount.py \
webaccount_flask_tests.py \
websession_regression_tests.py \
- webgroup_regression_tests.py webuser_regression_tests.py \
- webgroup_unit_tests.py inveniogc.py webuser_config.py \
+ webgroup_regression_tests.py \
+ webuser_regression_tests.py \
+ webgroup_unit_tests.py \
websession_web_tests.py
-noinst_DATA = password_migration_kit.py
-
-EXTRA_DIST = $(pylib_DATA) $(noinst_DATA)
+EXTRA_DIST = $(pylib_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/webstat/bin/webstatadmin.in b/modules/webstat/bin/webstatadmin.in
index 063afccec..0d693e6be 100644
--- a/modules/webstat/bin/webstatadmin.in
+++ b/modules/webstat/bin/webstatadmin.in
@@ -1,33 +1,33 @@
#!@PYTHON@
## -*- mode: python; coding: utf-8; -*-
## This file is part of Invenio.
## Copyright (C) 2007, 2008, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
"""Administers the WebStat framework by creating/deleting custom events as well as triggering caching of event raw data."""
__revision__ = "$Id$"
try:
from invenio.flaskshell import *
- from invenio.webstatadmin import main
+ from invenio.legacy.webstat.admin import main
except ImportError, e:
print "Error: %s" % e
import sys
sys.exit(1)
main()
diff --git a/modules/webstat/lib/Makefile.am b/modules/webstat/lib/Makefile.am
index bcd907002..c2cb8c024 100644
--- a/modules/webstat/lib/Makefile.am
+++ b/modules/webstat/lib/Makefile.am
@@ -1,60 +1,56 @@
## This file is part of Invenio.
## Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir = $(libdir)/python/invenio
-pylib_DATA = webstat.py \
- webstat_config.py \
- webstat_templates.py \
- webstat_engine.py \
- webstatadmin.py \
- webstat_regression_tests.py
+pylib_DATA = webstat_regression_tests.py
lispimagedir = $(libdir)/lisp/invenio
-lispimage_DATA = webstat.clisp.mem webstat.cmucl.core webstat.sbcl.core
+lispimage_DATA =
+#webstat.clisp.mem webstat.cmucl.core webstat.sbcl.core
-FILESLISP = load.lisp \
- webstatlib.lisp
+FILESLISP =
EXTRA_DIST = $(pylib_DATA) $(FILESLISP)
CLEANFILES = $(lispimage_DATA) *~ *.tmp *.pyc *.fas *.fasl *.lib *.x86f *.mem *.core
-webstat.clisp.mem: $(FILESLISP)
- if [ -x "${CLISP}" ]; then \
- (cwd=`pwd`; cd $(srcdir) && \
- ${CLISP} -q -x "(progn (load \"load.lisp\")(saveinitmem \"$${cwd}/webstat.clisp.mem\"))"); \
- else \
- echo "Warning: cannot find CLISP, hoping you have CMUCL or SBCL instead."; \
- touch webstat.clisp.mem; \
- fi
-
-webstat.cmucl.core: $(FILESLISP)
- if [ -x "${CMUCL}" ]; then \
- (cwd=`pwd`; cd $(srcdir) && \
- ${CMUCL} -quiet -batch -eval "(progn (load \"load.lisp\")(ext:save-lisp \"$${cwd}/webstat.cmucl.core\"))"); \
- else \
- echo "Warning: cannot find CMUCL, hoping you have CLISP or SBCL instead."; \
- touch webstat.cmucl.core; \
- fi
-
-webstat.sbcl.core: $(FILESLISP)
- if [ -x "${SBCL}" ]; then \
- (cwd=`pwd`; cd $(srcdir) && \
- ${SBCL} --noinform --eval "(progn (load \"load.lisp\")(sb-ext:save-lisp-and-die \"$${cwd}/webstat.sbcl.core\"))"); \
- else \
- echo "Warning: cannot find SBCL, hoping you have CLISP or CMUCL instead."; \
- touch webstat.sbcl.core; \
- fi
+#TODO: update this!
+# webstat.clisp.mem: $(FILESLISP)
+# if [ -x "${CLISP}" ]; then \
+# (cwd=`pwd`; cd $(srcdir) && \
+# ${CLISP} -q -x "(progn (load \"load.lisp\")(saveinitmem \"$${cwd}/webstat.clisp.mem\"))"); \
+# else \
+# echo "Warning: cannot find CLISP, hoping you have CMUCL or SBCL instead."; \
+# touch webstat.clisp.mem; \
+# fi
+#
+# webstat.cmucl.core: $(FILESLISP)
+# if [ -x "${CMUCL}" ]; then \
+# (cwd=`pwd`; cd $(srcdir) && \
+# ${CMUCL} -quiet -batch -eval "(progn (load \"load.lisp\")(ext:save-lisp \"$${cwd}/webstat.cmucl.core\"))"); \
+# else \
+# echo "Warning: cannot find CMUCL, hoping you have CLISP or SBCL instead."; \
+# touch webstat.cmucl.core; \
+# fi
+#
+# webstat.sbcl.core: $(FILESLISP)
+# if [ -x "${SBCL}" ]; then \
+# (cwd=`pwd`; cd $(srcdir) && \
+# ${SBCL} --noinform --eval "(progn (load \"load.lisp\")(sb-ext:save-lisp-and-die \"$${cwd}/webstat.sbcl.core\"))"); \
+# else \
+# echo "Warning: cannot find SBCL, hoping you have CLISP or CMUCL instead."; \
+# touch webstat.sbcl.core; \
+# fi
diff --git a/modules/webstyle/img/Makefile.am b/modules/webstyle/img/Makefile.am
index 7357d71f8..9a0409712 100644
--- a/modules/webstyle/img/Makefile.am
+++ b/modules/webstyle/img/Makefile.am
@@ -1,329 +1,327 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
imgdir=$(localstatedir)/www/img
img_DATA = add-small.png \
add.png \
answer_bad.gif \
arrow_down.gif \
arrow_down2.png \
arrow_link-icon-15x11-right.gif \
arrow_up.gif \
arrow_up2.png \
at.gif \
aid_90low_right.png \
aid_btn_blue.png \
aid_btn_green.png \
aid_btn_red.png \
aid_denied.png \
aid_granted.png \
aid_info.gif \
aid_minus_16.png \
aid_plus_16.png \
aid_warning_denied.png \
aid_warning_granted.png \
aid_check_gray.png \
aid_check.png \
aid_operator.png \
aid_reject_gray.png \
aid_reject.png \
aid_reset_gray.png \
aid_reset.png \
aid_to_other.png \
aid_to_other_gray.png \
ajax-loader.gif \
application_pdf.png \
balloon_arrow_left_shadow.png \
balloon_bottom_left_shadow.png \
balloon_bottom_shadow.png \
balloon_right_shadow.png \
balloon_top_right_shadow.png \
balloon_arrow_shadow.png \
balloon_bottom_right_shadow.png \
balloon_left_shadow.png \
balloon_top_left_shadow.png \
balloon_top_shadow.png \
bibedit_extract_url.png \
bibedit_tableview.png \
bibedit_textmarc.png \
blankicon.gif \
blue_gradient.gif \
book_cover_placeholder.gif \
bullet_toggle_minus.png \
bullet_toggle_plus.png \
circle_green.png \
circle_red.png \
cross_red.gif \
delete-big.png \
delete-small.png \
delete.png \
description.gif \
diff.png \
document-print.png \
dot.gif \
down-trans.gif \
down.gif \
drop_down_menu_arrow_down_g.gif \
drop_down_menu_arrow_down_b.gif \
drop_down_menu_arrow_down_lb.gif \
drop_down_menu_arrow_down_w.gif \
edit1.gif \
edit-paste.png \
external-icon-light-8x8.gif \
eye.png \
feed-icon-12x12.gif \
file-icon-none-96x128.gif \
file-icon-text-12x16.gif \
file-icon-text-15x20.gif \
file-icon-text-34x48.gif \
file-icon-text-96x128.gif \
forbidden_left.gif \
forbidden_right.gif \
gradient-lightgray-1x100.gif \
gradient_tab-gray-1x23.gif \
gradient_tab_on-gray-1x23.gif \
group_admin.png \
head.gif \
header_background.gif \
help.png \
icon-journal_Athanasius_Kircher_Atlantis.gif \
icon-journal_hms_beagle.gif \
iconcross.gif \
iconcross2.gif \
page_edit.png \
pix.png \
plus_orange.png \
iconeye.gif \
iconpen.gif \
indicator.gif \
journal-template1.gif \
journal-template2.gif \
journal-template3.gif \
journal-template4.gif \
journal-template5.gif \
journal_Athanasius_Kircher_Atlantis.gif \
journal_content.png \
journal_footer.png \
journal_footer2.png \
- journal_header.png \
journal_hms_beagle.gif \
journal_new.png \
journal_virgin_forest.gif \
- journal_water_dog.gif \
keep_sso_connection_alive.gif \
last-right-part-trans.gif \
left-part-topless-trans.gif \
left-part-trans.gif \
left-trans.gif \
left.gif \
line-up-trans.gif \
line.gif \
loading.gif \
logo_white.png \
magnifying_minus.png \
magnifying_plus.png \
mail-icon-12x8.gif \
mainmenu.gif \
merge-small.png \
merge.png \
mergeNC.png \
move.png \
move_from.gif \
move_to.gif \
noway.gif \
okay.gif \
orcid_icon_24.png \
out.gif \
paper-texture-128x128.gif \
paper_clip-72x72.gif \
r.gif \
rcorners-gray-1280x18.gif \
rcorners-gray-1280x60-folded.gif \
red_gradient.gif \
ref_extract.png \
refresh.png \
replace.png \
restricted.gif \
right-part-topless-trans.gif \
right-part-trans.gif \
right-trans.gif \
right.gif \
rss.png \
sb.gif \
sbm_guide_accessnumber.png \
sbm_guide_approvals.png \
sbm_guide_approve_button.png \
sbm_guide_browse.png \
sbm_guide_description.png \
sbm_guide_login.png \
sbm_guide_logout.png \
sbm_guide_modify_button.png \
sbm_guide_revise_button.png \
sbm_guide_submissions.png \
sbm_guide_submit_button.png \
sbm_guide_subnumber.png \
sbm_guide_summary.png \
sciencewise.png \
sclose.gif \
se.gif \
search.png \
site_logo.gif \
site_logo_rss.png \
site_logo_small.gif \
smallbin.gif \
smalldown.gif \
smallfiles.gif \
smallup.gif \
smchk_gr.gif \
smchk_rd.gif \
sn.gif \
sp.gif \
star-icon-30x30.gif \
star_dot-icon-30x30.gif \
star_empty-icon-30x30.gif \
star_half-icon-30x30.gif \
stars-0-0.png \
stars-0-5.png \
stars-1-0.png \
stars-1-5.png \
stars-2-0.png \
stars-2-5.png \
stars-3-0.png \
stars-3-5.png \
stars-4-0.png \
stars-4-5.png \
stars-5-0.png \
summary.gif \
table.png \
table_add.png \
table_delete.png \
table_multiple.png \
table_row_delete.png \
table_row_insert.png \
table_sort.png \
test \
testgif \
test.foo \
test.gif \
tick.gif \
tree_branch.gif \
up.gif \
user-icon-1-16x16.gif \
user-icon-1-20x20.gif \
user-icon-1-24x24.gif \
waiting_or.gif \
warning.png \
webbasket_create.png \
webbasket_create_small.png \
webbasket_delete.png \
webbasket_down.png \
webbasket_extern.png \
webbasket_intern.png \
webbasket_move.png \
webbasket_right.png \
webbasket_ugs.png \
webbasket_up.png \
webbasket_us.png \
webbasket_user.png \
webbasket_usergroup.png \
webbasket_usergroup_gray.png \
webbasket_world.png \
webbasket_ws.png \
wb-add-note.png \
wb-copy-item.png \
wb-create-basket.png \
wb-delete-basket.png \
wb-delete-item.png \
wb-edit-basket.png \
wb-edit-topic.png \
wb-external-item.png \
wb-go-back.png \
wb-move-item-down-disabled.png \
wb-move-item-down.png \
wb-move-item-up-disabled.png \
wb-move-item-up.png \
wb-next-item-disabled.png \
wb-next-item.png \
wb-notes.png \
wb-previous-item-disabled.png \
wb-previous-item.png \
wb-sort-asc.gif \
wb-sort-desc.gif \
wb-sort-none.gif \
wb-subscribe.png \
wb-unsubscribe.png \
white_field.gif \
wsignout.gif \
compare.png \
document-preview.png \
ui-anim_basic_16x16.gif \
aol_icon_24.png \
aol_icon_48.png \
blogger_icon_24.png \
blogger_icon_48.png \
facebook_icon_24.png \
facebook_icon_48.png \
flickr_icon_24.png \
flickr_icon_48.png \
foursquare_icon_24.png \
foursquare_icon_48.png \
google_icon_24.png \
google_icon_48.png \
googleoauth2_icon_24.png \
googleoauth2_icon_48.png \
instagram_icon_24.png \
instagram_icon_48.png \
linkedin_icon_24.png \
linkedin_icon_48.png \
livejournal_icon_24.png \
livejournal_icon_48.png \
myopenid_icon_24.png \
myopenid_icon_48.png \
myspace_icon_24.png \
myspace_icon_48.png \
myvidoop_icon_24.png \
myvidoop_icon_48.png \
openid_icon_24.png \
openid_icon_48.png \
twitter_icon_24.png \
twitter_icon_48.png \
verisign_icon_24.png \
verisign_icon_48.png \
wordpress_icon_24.png \
wordpress_icon_48.png \
yahoo_icon_24.png \
yahoo_icon_48.png \
yammer_icon_24.png \
yammer_icon_48.png
tmpdir=$(localstatedir)/tmp
tmp_DATA = plotextractor_dummy.png
wwwdir=$(localstatedir)/www
www_DATA = favicon.ico \
apple-touch-icon-57-precomposed.png \
apple-touch-icon-72-precomposed.png \
apple-touch-icon-114-precomposed.png \
apple-touch-icon-144-precomposed.png
EXTRA_DIST = $(img_DATA) $(tmp_DATA) $(www_DATA)
CLEANFILES = *~ *.tmp
diff --git a/modules/webstyle/lib/Makefile.am b/modules/webstyle/lib/Makefile.am
index 89fc94876..d47e110f6 100644
--- a/modules/webstyle/lib/Makefile.am
+++ b/modules/webstyle/lib/Makefile.am
@@ -1,36 +1,35 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
pylibdir=$(libdir)/python/invenio
wsgiwebdir=$(localstatedir)/www-wsgi/
pylib_DATA = webdoc.py \
webdoc_unit_tests.py \
webdoc_webinterface.py \
- webstyle_templates.py \
webstyle_regression_tests.py \
webinterface_unit_tests.py \
ckeditor_invenio_connector.py \
httptest_webinterface.py \
ping_webinterface.py \
goto_webinterface.py
wsgiweb_DATA = invenio.wsgi
EXTRA_DIST = $(pylib_DATA) $(wsgiweb_DATA)
CLEANFILES = *~ *.tmp *.pyc
diff --git a/modules/websubmit/lib/Makefile.am b/modules/websubmit/lib/Makefile.am
index 220057969..573b3acaa 100644
--- a/modules/websubmit/lib/Makefile.am
+++ b/modules/websubmit/lib/Makefile.am
@@ -1,47 +1,44 @@
## This file is part of Invenio.
## Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 CERN.
##
## Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
SUBDIRS = functions
pylibdir = $(libdir)/python/invenio
pylib_DATA = \
websubmit_dblayer.py \
websubmit_templates.py \
websubmit_regression_tests.py \
- websubmitadmin_config.py \
- websubmitadmin_dblayer.py \
websubmitadmin_engine.py \
websubmitadmin_templates.py \
websubmitadmin_regression_tests.py \
websubmit_file_stamper.py \
websubmit_icon_creator.py \
- websubmit_file_converter.py \
hocrlib.py \
websubmit_file_metadata.py \
websubmit_web_tests.py \
websubmitadmincli.py
metadataplugindir = $(libdir)/python/invenio/websubmit_file_metadata_plugins
metadataplugin_DATA = __init__.py \
wsm_extractor_plugin.py \
wsm_pyexiv2_plugin.py \
wsm_pdftk_plugin.py
EXTRA_DIST = $(pylib_DATA) $(metadataplugin_DATA)
CLEANFILES = *~ *.tmp *.pyc
